Addressing Trypsin Bias in Large Scale (Phospho)proteome Analysis by Size Exclusion Chromatography and Secondary Digestion of Large Post-Trypsin Peptides

In the vast majority of bottom-up proteomics studies, protein digestion is performed using only mammalian trypsin. Although trypsin is clearly the best enzyme available, its sole use rarely leads to complete sequence coverage, even for abundant proteins. It is commonly assumed that this is because many tryptic peptides are either too short or too long to be identified by RPLC−MS/MS. We show through <i>in silico</i> analysis that 20−30% of the total sequence of three proteomes (<i>Schizosaccharomyces pombe</i>, <i>Saccharomyces cerevisiae</i>, and <i>Homo sapiens</i>) is expected to be covered by Large post-Trypsin Peptides (LpTPs) with <i>M</i><sub>r</sub> above 3000 Da. We then established size exclusion chromatography to fractionate complex yeast tryptic digests into pools of peptides based on size. We found that secondary digestion of LpTPs followed by LC−MS/MS analysis leads to a significant increase in identified proteins and a 32−50% relative increase in average sequence coverage compared to trypsin digestion alone. Application of the developed strategy to the phosphoproteomes of <i>S. pombe</i> and of a human cell line identified a significant fraction of novel phosphosites. Overall, our data indicate that specific targeting of LpTPs can complement standard bottom-up workflows to reveal a largely neglected portion of the proteome.
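
The <i>in silico</i> estimate described above can be illustrated with a minimal sketch. It is not the pipeline used in this work; it assumes the common trypsin rule (cleavage C-terminal to K or R, except before P), no missed cleavages, unmodified residues, and monoisotopic masses. The function names `tryptic_peptides`, `peptide_mass`, and `lptp_fraction` are hypothetical and introduced only for illustration.

```python
# Monoisotopic residue masses (Da) for the 20 standard amino acids.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565  # mass of H2O added to the summed residues


def tryptic_peptides(seq):
    """Split a protein sequence after K or R unless the next residue is P.

    Assumption: complete digestion, i.e. no missed cleavages.
    """
    peptides, start = [], 0
    for i, aa in enumerate(seq):
        nxt = seq[i + 1] if i + 1 < len(seq) else ""
        if aa in "KR" and nxt != "P":
            peptides.append(seq[start:i + 1])
            start = i + 1
    if start < len(seq):  # C-terminal peptide without a trailing K/R
        peptides.append(seq[start:])
    return peptides


def peptide_mass(peptide):
    """Monoisotopic mass of an unmodified peptide in Da."""
    return sum(MONO[aa] for aa in peptide) + WATER


def lptp_fraction(seq, threshold=3000.0):
    """Fraction of residues located in tryptic peptides heavier than threshold."""
    if not seq:
        return 0.0
    covered = sum(len(p) for p in tryptic_peptides(seq)
                  if peptide_mass(p) > threshold)
    return covered / len(seq)


if __name__ == "__main__":
    # Toy sequence, not a real protein: one short and one long tryptic peptide.
    toy = "AK" + "G" * 60 + "K"
    print(tryptic_peptides(toy))
    print(round(lptp_fraction(toy), 3))
```

For a proteome-wide figure, the covered residue counts and sequence lengths would be summed over all protein entries before dividing, rather than averaging the per-protein fractions.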