pr7b00477_si_004.zip (7.32 MB)
PTML Model for Proteome Mining of B‑Cell Epitopes and Theoretical–Experimental Study of Bm86 Protein Sequences from Colima, Mexico
dataset
posted on 2017-09-18, 00:00 authored by Saúl G. Martínez-Arzate, Esvieta Tenorio-Borroto, Alberto Barbabosa Pliego, Héctor M. Díaz-Albiter, Juan C. Vázquez-Chagoyán, Humbert González-Díaz
In this work, we developed
a general perturbation theory and machine learning method for data
mining of proteomes to discover new B-cell epitopes useful for vaccine
design. The method predicts the epitope activity εq(cqj) of one query peptide
(q-peptide) under a set of experimental query conditions (cqj). As input, the method takes the sequence of the q-peptide
together with the sequence and epitope
activity εr(crj) of a reference peptide (r-peptide) assayed under similar
experimental conditions (crj). The model proposed here is able to classify 1 048 190
pairs of query and reference peptide sequences from the proteomes of
many organisms reported in the IEDB database. These pairs have variations
(perturbations) in sequence or assay conditions. The model has
accuracy, sensitivity, and specificity between 71 and 80% for training
and external validation series. The retrieved information contains
structural changes in 83 683 peptide sequences (Seq) determined
in experimental assays with boundary conditions involving 1448 epitope
organisms (Org), 323 host organisms (Host), 15 types of in vivo processes
(Proc), 28 experimental techniques (Tech), and 505 adjuvant additives
(Adj). Afterward, we report the experimental sampling, isolation,
and sequencing of 15 complete sequences of the Bm86 gene from the state of
Colima, Mexico. Last, we used the model to predict the epitope immunogenic
scores under different experimental conditions for the 26 112
peptides obtained from these sequences. The model may become a useful
tool for epitope selection toward vaccine design. The theoretical–experimental
results on Bm86 protein may help the future design of a new vaccine
based on this protein.
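
The abstract reports accuracy, sensitivity, and specificity for the training and external validation series. As a reminder of how these three figures relate to a binary epitope/non-epitope classification, here is a minimal Python sketch. The function name `classification_metrics` and the toy labels are illustrative assumptions and are not taken from the dataset or from the authors' software.

```python
def classification_metrics(y_true, y_pred):
    """Return accuracy, sensitivity, and specificity for binary labels.

    Labels are assumed to be 1 (epitope) or 0 (non-epitope); this encoding
    is an assumption made for illustration only.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    def ratio(num, den):
        # Avoid division by zero when a class is absent from the sample.
        return num / den if den else float("nan")

    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "sensitivity": ratio(tp, tp + fn),  # fraction of epitopes recovered
        "specificity": ratio(tn, tn + fp),  # fraction of non-epitopes recovered
    }


# Toy example with made-up labels (not data from the study).
observed = [1, 1, 1, 0, 0, 0, 0, 1]
predicted = [1, 1, 0, 0, 0, 1, 0, 1]
metrics = classification_metrics(observed, predicted)
print(metrics)
```

In a study of this kind the same computation would be run separately on the training series and on the external validation series, so that the two sets of figures can be compared.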