Posted on 2015-12-04, 00:00. Authored by Heeyoun Hwang, Gun Wook Park, Kwang Hoe Kim, Ju Yeon Lee, Hyun Kyoung Lee, Eun Sun Ji, Sung-Kyu Robin Park, Tao Xu, John R. Yates, Kyung-Hoon Kwon, Young Mok Park, Hyoung-Joo Lee, Young-Ki Paik, Jin Young Kim, Jong Shin Yoo
The goal of the Chromosome-Centric
Human Proteome Project (C-HPP)
is to fully provide proteomic information from each human chromosome,
including novel proteoforms, such as novel protein-coding variants
expressed from noncoding genomic regions, alternative splicing variants
(ASVs), and single amino acid variants (SAAVs). In 144 LC/MS/MS
raw files from human hippocampal tissues of control, epilepsy, and
Alzheimer’s disease, we identified novel proteoforms with
a workflow that includes an integrated proteomic pipeline using three different
search engines: MASCOT, SEQUEST, and MS-GF+. With a <1% false discovery
rate (FDR) at the protein level, searches against customized
GENCODE lncRNA databases yielded 11 peptides mapped to
four translated long noncoding RNA (lncRNA) variants, which also mapped to coding proteins
at different chromosomal sites. We also identified four novel ASVs
by searching against customized GENCODE transcript databases. The target
peptides from the variants were validated by comparing their tandem MS fragmentation
patterns with those of the corresponding synthetic peptides. Additionally,
a total of 128 SAAVs paired with their wild-type peptides were identified
with FDR <1% at the peptide level using a customized database from
neXtProt including nonsynonymous single nucleotide polymorphism (nsSNP)
information. Among these results, several novel variants related to
neurodegenerative disease were identified using this workflow, which
could be applicable to C-HPP studies. All raw files used in this study
were deposited in ProteomeXchange (PXD000395).
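
The pairing of SAAV peptides with their wild-type counterparts can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names `single_substitution` and `pair_saavs` and the example sequences are hypothetical, and the sketch assumes that a variant peptide and its wild-type peptide have equal length and differ at exactly one residue.

```python
from typing import List, Optional, Tuple

# (1-based position, wild-type residue, variant residue)
Substitution = Tuple[int, str, str]


def single_substitution(wild_type: str, variant: str) -> Optional[Substitution]:
    """Return the substitution if the two peptides differ at exactly one residue."""
    if len(wild_type) != len(variant):
        return None
    diffs = [
        (i + 1, a, b)
        for i, (a, b) in enumerate(zip(wild_type, variant))
        if a != b
    ]
    return diffs[0] if len(diffs) == 1 else None


def pair_saavs(
    wild_type_peptides: List[str], variant_peptides: List[str]
) -> List[Tuple[str, str, Substitution]]:
    """Pair each variant peptide with every wild-type peptide one residue away."""
    pairs = []
    for variant in variant_peptides:
        for wild_type in wild_type_peptides:
            sub = single_substitution(wild_type, variant)
            if sub is not None:
                pairs.append((wild_type, variant, sub))
    return pairs


# Hypothetical sequences, for illustration only (not data from the study).
wild = ["PEPTIDEK", "SAMPLER"]
variants = ["PEPTVDEK", "TOTALLYDIFFERENT"]
pairs = pair_saavs(wild, variants)
```

A substitution that creates or removes a lysine or arginine changes the tryptic cleavage pattern, so the variant and wild-type peptides would then differ in length; the equal-length check above does not handle that case.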