In its broadest sense, an epitope is any region of an antigenic biomacromolecule
that is recognized, or bound, by another biomacromolecule. In immunology the meaning
is more restricted and refers to particular structures recognized by the immune
system in particular ways. An epitope can be defined as the minimal structure
necessary to invoke an immune response (Flower, 2008).
Epitope prediction plays an important role in enhancing immunodiagnostic tests,
reverse vaccinology, allergenicity prediction and antibody production. Epitope
prediction can fairly be described as both the high frontier of immunoinformatic
investigation and a grand scientific challenge (Flower, 2007).
B-cell epitopes are regions of a protein recognized by antibody molecules.
They are divided into two categories: conformational epitopes and continuous
epitopes. Conformational epitopes are discontinuous determinants on a protein
antigen, formed from several separate regions in the primary sequence of a protein
that are brought together by protein folding. Continuous epitopes are linear antigenic
determinants on proteins that are contiguous in amino acid sequence and do not
require folding of the protein into its native conformation for an antibody to bind
to them (Cruse and Lewis, 2003).
One of the most important applications of B-cell epitope prediction is the computational design of immunogenic peptides to produce antibodies specific to a given protein.
This study aimed to: (1) build datasets on which to train epitope prediction
models; (2) build B-cell prediction Models (BM); (3) develop a tool to apply
the models to any protein sequence; (4) predict the epitopes in the case study
(Potato leaf roll virus); (5) select the immunogenic peptide to be injected into
animals to produce antibodies that cross-react with the potato leaf roll virus
and (6) test the obtained antibodies.
MATERIALS AND METHODS
Waikato Environment for Knowledge Analysis (WEKA): The Waikato Environment
for Knowledge Analysis (WEKA) is a leading open-source project in machine
learning. WEKA is a comprehensive collection of algorithms for data mining tasks,
written in Java and released under the GPL, containing tools for data pre-processing,
classification, regression, clustering, association rules and visualization
(Gewehr et al., 2007). WEKA was developed at the University
of Waikato in New Zealand and consists of the WEKA Explorer, WEKA Experimenter,
WEKA Knowledge Flow and the WEKA simple command line interface. WEKA Explorer was
used in this work to apply machine learning algorithms to the epitope datasets
in order to build the B-cell prediction Models (BM).
Datasets: The datasets used for building the epitope prediction models
are sets of epitope and non-epitope peptides obtained from IEDB (Peters
et al., 2005) and from datasets used in other work (El-Manzalawy
et al., 2008a). The datasets were built in ARFF format with a class attribute
taking two values: 1 for positive (epitope) peptides and 0 for negative (non-epitope)
peptides. We built four datasets: LB01-dataset, LB02-dataset, LB03-dataset and
LB04-dataset. Table 1 shows the datasets and the number of instances
in each one.
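As an illustration of the dataset layout described above, the following minimal sketch writes labelled peptides to ARFF text. The relation name, the attribute names (`sequence`, `class`) and the two example peptides are assumptions made for this example only; they are placeholders and are not taken from the LB datasets.

```python
def to_arff(peptides, relation="epitopes"):
    """Return ARFF text for (sequence, label) pairs.

    label is 1 for a positive (epitope) peptide and 0 for a negative one.
    The relation and attribute names are hypothetical, chosen for illustration.
    """
    lines = [
        f"@relation {relation}",
        "",
        "@attribute sequence string",   # the peptide as a single string attribute
        "@attribute class {0,1}",       # nominal class attribute with two values
        "",
        "@data",
    ]
    for sequence, label in peptides:
        if label not in (0, 1):
            raise ValueError(f"label must be 0 or 1, got {label!r}")
        lines.append(f"'{sequence}',{label}")
    return "\n".join(lines) + "\n"


# Made-up placeholder peptides, not real dataset entries.
example = [("ACDEFGHIKL", 1), ("MNPQRSTVWY", 0)]
arff_text = to_arff(example)
print(arff_text)
```

Each data row holds one peptide and its class value, so a file of this shape can be opened as a labelled dataset in a tool that reads ARFF.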
BM models building: A support vector machine with the subsequence string kernel
was used to build models for predicting linear B-cell epitopes, as described
by El-Manzalawy et al. (2008a, b).
Table 2 shows the decay factor (λ) parameter
that was used for building each BM model.
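The role of the decay factor λ can be illustrated with a small sketch. The code below is not the implementation used in this work; it is a plain-Python version of the standard dynamic-programming recursion for the gap-weighted subsequence kernel. The subsequence length `n`, the function names and the toy strings are assumptions made for illustration.

```python
import math


def ssk(s, t, n, lam):
    """Unnormalized subsequence string kernel of order n with decay factor lam.

    Every subsequence of length n shared by s and t contributes
    lam ** (span in s + span in t), so gapped matches are down-weighted.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    ls, lt = len(s), len(t)
    # kp[i][a][b] holds the auxiliary kernel K'_i(s[:a], t[:b]).
    kp = [[[0.0] * (lt + 1) for _ in range(ls + 1)] for _ in range(n)]
    for a in range(ls + 1):
        for b in range(lt + 1):
            kp[0][a][b] = 1.0
    for i in range(1, n):
        for a in range(i, ls + 1):
            kpp = 0.0  # running value of K''_i(s[:a], t[:b]) as b grows
            for b in range(i, lt + 1):
                match = lam * lam * kp[i - 1][a - 1][b - 1] if s[a - 1] == t[b - 1] else 0.0
                kpp = lam * kpp + match
                kp[i][a][b] = lam * kp[i][a - 1][b] + kpp
    # Close each common subsequence on a matching final character pair.
    total = 0.0
    for a in range(n, ls + 1):
        for b in range(n, lt + 1):
            if s[a - 1] == t[b - 1]:
                total += lam * lam * kp[n - 1][a - 1][b - 1]
    return total


def normalized_ssk(s, t, n, lam):
    """Kernel value scaled so that a string compared with itself scores 1."""
    denom = math.sqrt(ssk(s, s, n, lam) * ssk(t, t, n, lam))
    return ssk(s, t, n, lam) / denom if denom > 0 else 0.0


# Toy strings: "cat" and "car" share only the length-2 subsequence "ca".
print(ssk("cat", "car", 2, 0.5))
print(normalized_ssk("cat", "car", 2, 0.5))
```

A smaller λ penalizes subsequences that are spread over long spans more strongly, while a λ close to 1 treats gapped and contiguous matches almost equally.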
Epitopes Model Applier Software (EMAS): Epitopes Model Applier Software
(EMAS) was built on top of the Weka machine learning workbench (Frank
et al., 2004), the Epitopes Toolkit (EpiT) and BioJava (Holland
et al., 2008). EMAS is available at https://sites.google.com/site/epitopesprediction.
Table 1: Number of instances in LB-datasets
Table 2: BM models parameters
Case study: Potato leaf roll virus: The case study was the coat protein
sequence of the Egyptian isolates of potato leaf roll virus (PLRV) (El-Attar
et al., 2010), obtained from NCBI with accession no. ACU80557 (Fig.
1). EMAS and the BM models were used to predict the most immunogenic peptide
in this coat protein amino acid sequence.
Synthetic peptide and immunization: Mice were injected five times, with doses of 50, 70, 150, 200 and 250 μg, at one-week intervals. Each dose was given with an equal volume of complete Freund's adjuvant for the first two injections and incomplete Freund's adjuvant for the remaining injections. Blood was collected one week after the last injection; the antiserum was then separated from the blood and tested using immuno dot-blot analysis and Triple Antibody Sandwich ELISA (TAS-ELISA).
Serological detection of PLRV: PLRV-antiserum was tested using immuno
dot-blot and TAS-ELISA according to the procedures described by Weidemann
(1988), DArey et al. (1989) and El-Araby
et al. (2009).
RESULTS
Epitope prediction models: We built eight models, which are available
at https://sites.google.com/site/epitopesprediction.
The performance of the BM models was evaluated by 10-fold cross-validation and
the area under the Receiver Operating Characteristic (ROC) curve was calculated
for all BM models (Table 3).
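To make the evaluation measure concrete, the sketch below shows one way to split instances into 10 folds and to compute the area under the ROC curve from classifier scores. It is an illustration only and not the procedure run inside WEKA: the function names are hypothetical, and the round-robin fold assignment is an assumption (the source does not say how folds were formed).

```python
def k_fold_indices(n_items, k=10):
    """Yield (train_indices, test_indices) for k-fold cross-validation.

    Items are assigned to folds round-robin; this is an assumption made for
    the sketch, not a description of how WEKA forms its folds.
    """
    folds = [list(range(i, n_items, k)) for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, test


def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive (label 1) is
    scored above a randomly chosen negative (label 0); ties count as 0.5.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative instance")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical scores for four peptides (two epitopes, two non-epitopes).
print(roc_auc([0.9, 0.4, 0.6, 0.1], [1, 1, 0, 0]))
```

An area of 0.5 corresponds to random ranking and 1.0 to a perfect separation of epitopes from non-epitopes, which is why the models can be sorted by this value.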
Epitopes Model Applier Software (EMAS): EMAS, which was developed as
open-source software and released under the General Public License (GPL), is a tool
for applying models to any protein sequence. After downloading EMAS from
https://sites.google.com/site/epitopesprediction, it can be run as shown in
Fig. 2. The steps to perform a prediction are as follows:
(1) Upload the model file
(2) Upload the test data
(3) Adjust the peptide or window length
(4) Choose peptide based
(5) Choose the input format as fasta sequence
(6) Make the output file
(7) Click the predict button to start the prediction process
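The peptide or window length chosen in the steps above determines which candidate peptides are scored. The following sketch shows one plausible reading of that step: a window of fixed length slides along a FASTA sequence and yields every peptide with its 1-based start and end positions. How EMAS does this internally is not described here, so the function names and the windowing behaviour are assumptions.

```python
def read_fasta_sequence(text):
    """Return the residues of the first record in FASTA-formatted text."""
    residues = []
    seen_header = False
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if seen_header:
                break  # stop at the second record
            seen_header = True
            continue
        residues.append(line)
    return "".join(residues)


def sliding_peptides(sequence, window):
    """Yield (start, end, peptide) for every window, using 1-based positions."""
    if window < 1 or window > len(sequence):
        raise ValueError("window must be between 1 and the sequence length")
    for offset in range(len(sequence) - window + 1):
        yield offset + 1, offset + window, sequence[offset:offset + window]


# Toy FASTA record, not the PLRV coat protein sequence.
toy = ">toy\nABCDE\nFG\n"
for start, end, peptide in sliding_peptides(read_fasta_sequence(toy), 3):
    print(start, end, peptide)
```

Under this scheme a window length of 30 starting at position 163 ends at position 192, which is the span of the peptide reported for the PLRV coat protein.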
Epitope prediction for potato leaf roll virus coat protein using EMAS:
The potato leaf roll virus coat protein sequence was retrieved from GenBank
(accession no. ACU80557); EMAS was then run on the PLRV coat protein sequence
once with each of the eight BM models.
Table 3: BM models sorted by area under ROC curve
Table 4: The score of PLRV coat protein (163:192) peptide obtained
by eight BM models
Table 5: Comparison between PLRV predicted epitopes with those previously
obtained by Torrance (1992) and Terradot
et al. (2001)
Underlined sequences correspond to previously detected PLRV
The thirty-amino-acid peptide spanning positions 163 to 192
(Table 5) received a high score with most BM models.
The results matched those of Torrance (1992) and Terradot
et al. (2001), which makes this peptide one of the best candidates to
be immunogenic and capable of producing antibodies that cross-react with PLRV.
Table 4 presents the scores of the eight models for the PLRV
coat protein (163:192) peptide. Table 5 compares the predicted PLRV epitopes
with those previously obtained by Torrance
(1992) and Terradot et al. (2001).
Antiserum production against PLRV-predicted epitopes: The PLRV coat protein peptide (ARMINGVEWHDSSEDQCRILWKGNGKSSDT), spanning positions 163 to 192, was ordered from GenScript Corporation, NJ 08854, USA.
PLRV-antiserum raised against this synthetic peptide was produced using mice for immunization and was serologically tested as described below.
Serological detection of PLRV: The produced PLRV-antiserum was tested using immuno dot-blot and TAS-ELISA. Two more antisera were used for comparison: antiserum raised against PLRV virus particles (viral antiserum) and antiserum raised against PLRV coat protein (CP antiserum).
Immuno dot-blot test: The viral, CP and synthetic peptide antisera reacted strongly against PLRV-infected potato samples (Fig. 3, samples 1, 3 and 5, respectively). However, the reaction against the synthetic peptide was stronger with the synthetic peptide-antiserum than with the viral and CP antisera (Fig. 3, samples 6, 2 and 4, respectively). No reaction was detected against the healthy potato sample using the synthetic peptide-antiserum (samples 7 and 8).
Fig. 1: Amino acid sequence of the PLRV coat protein in FASTA format
Fig. 2: How to run EMAS
Fig. 3: Dot-blot detection of PLRV using PLRV-antiserum produced against
the synthetic peptide in comparison with two different PLRV-antisera. Dots
1, 3 and 5 are PLRV-infected potato samples. Dots 2, 4 and 6 are the synthetic
peptide. Dots 7 and 8 are negative controls. A, B and C are PLRV-antisera
raised against the viral particles, the coat protein and the synthetic peptide,
respectively
TAS-ELISA: PLRV was specifically detected using the synthetic peptide,
viral and CP antisera. No reaction was observed with the negative control (Table
Table: ELISA detection of PLRV-infected potato leaves using PLRV
antiserum in comparison with two different PLRV antisera
1 and 2: Two different samples of PLRV-infected potato leaves.
3: An O.D. reading equal to or greater than twice the absorbance
value of healthy controls was considered positive
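The positivity criterion in the table footnote can be written as a short check. This is a sketch under stated assumptions: when several healthy controls are read, their mean absorbance is used as the reference (the source does not specify how multiple controls are combined), and the function name and the readings in the example are hypothetical, not measured values.

```python
def is_elisa_positive(sample_od, healthy_ods):
    """Return True if the sample O.D. is at least twice the healthy-control O.D.

    Assumption: the reference value is the mean of the healthy-control readings.
    """
    if not healthy_ods:
        raise ValueError("at least one healthy-control reading is required")
    reference = sum(healthy_ods) / len(healthy_ods)
    return sample_od >= 2 * reference


# Hypothetical absorbance readings, for illustration only.
print(is_elisa_positive(0.5, [0.25]))   # sample equals twice the control
print(is_elisa_positive(0.3, [0.25]))   # sample below twice the control
```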
DISCUSSION
Our approach of using computational methods to produce antibodies, by predicting
the most immunogenic peptide in a viral antigen, agrees with Saravanan
et al. (2009), although they used a different algorithm for epitope
prediction. They used the antigenic index (a residue-based predictor), which is calculated
on a weighted scale by considering properties such as surface
probability, hydrophilicity and flexibility of a given set of amino acids in
the range of seven to eleven amino acids in a protein. Thus, the method of Saravanan
et al. (2009) depends on physical and chemical properties only,
an approach reported to have low performance by Blythe
and Flower (2005). However, Saravanan et al. (2009)
overcame the low performance of the prediction method by immunization with multiple
Our method belongs to the epitope-based predictors. We used machine learning algorithms
(SVM and the subsequence string kernel), which were reported to perform well
in predicting linear B-cell epitopes by El-Manzalawy et
al. (2008a, b). This enabled us to use a single
peptide in immunization, although a longer one (30 instead of 11 amino acids), to produce more
In conclusion, the results indicate that our bioinformatics strategy is a powerful
tool for antibody production. The use of computational epitope prediction
eliminated the need to obtain large amounts of viral expressed proteins
or purified virus. The results also indicate that using BM models with EMAS to
design and choose immunogenic peptides is reliable and has the following
advantages: (1) antibodies are produced faster and more cheaply; (2) antibodies
can be produced for any protein whose sequence is known, even if we don't have
the protein itself physically and (3) the produced antibodies can be commercialized
faster and more easily than antibodies produced by viral expressed proteins and cloning
methods, because of intellectual property rights issues related to cloning vectors.