Correction of Errors in Tandem Mass Spectrum Extraction Enhances Phosphopeptide Identification

Tandem mass spectrum extraction is more difficult and error-prone for phosphopeptides than for unmodified peptides because of their lower abundance, their lower ionization efficiency, cofragmentation with other high-abundance peptides, and the use of MS<sup>3</sup> on MS<sup>2</sup> fragments with neutral losses. However, there are still no established methods to evaluate the correctness of extraction. Here we propose to identify and correct these errors through the combined use of multiple spectrum extraction tools. We evaluated five free and two commercial extraction tools using Mascot and phosphoproteomics raw data from an LTQ FT Ultra; among them, RawXtract 1.9.9.2 identified the highest number of unique phosphopeptides (peptide expectation value <0.05). Surprisingly, ProteoWizard (v. 3.0.3476) extracted the wrong precursor mass for most MS<sup>3</sup> spectra. Comparison of the top three free extraction tools showed that only 54% of the identified spectra were identified consistently by all three tools, indicating that errors may occur during spectrum extraction. Manual inspection of 258 spectra not identified by all three tools revealed 405 spectrum extraction errors: 7.4% in selecting the wrong precursor charge, 50.6% in selecting the wrong precursor mass, and 42.1% in exporting MS/MS fragments. We then corrected the errors by selecting, for each spectrum, the best extracted MGF file among the three tools and performing another database search. Correcting the errors increased spectrum matches by 22.4% and unique peptide identifications by 12.2% compared with the best single method. Correction of errors in spectrum extraction thus improves both the sensitivity and the confidence of phosphopeptide identification. Analysis of nonphosphopeptide spectra indicates that this strategy applies to unmodified peptides as well. The identification of errors in spectrum extraction should promote the improvement of spectrum extraction tools in the future.
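The per-spectrum selection and the cross-tool consistency check described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that search results are already parsed into a hypothetical mapping from tool name to scan number to match, and that the "best" extraction for a spectrum is the one giving the lowest peptide expectation value; the abstract does not state the selection criterion, so this is an assumption. The names `Match`, `select_best_extraction`, and `consistent_fraction` are illustrative only.

```python
from typing import Dict, NamedTuple


class Match(NamedTuple):
    peptide: str   # peptide sequence assigned by the database search
    expect: float  # peptide expectation value for that assignment


# Hypothetical layout: tool name -> {scan number -> Match}
Results = Dict[str, Dict[int, Match]]


def select_best_extraction(results: Results, max_expect: float = 0.05) -> Dict[int, str]:
    """For each scan, pick the tool whose extraction gave the lowest
    expectation value (assumed criterion), among matches below max_expect."""
    best: Dict[int, str] = {}
    scans = set().union(*(res.keys() for res in results.values()))
    for scan in sorted(scans):
        candidates = [
            (res[scan].expect, tool)
            for tool, res in results.items()
            if scan in res and res[scan].expect < max_expect
        ]
        if candidates:
            best[scan] = min(candidates)[1]
    return best


def consistent_fraction(results: Results, max_expect: float = 0.05) -> float:
    """Fraction of identified scans for which every tool reports the same
    peptide below the expectation-value threshold."""
    per_scan: Dict[int, Dict[str, str]] = {}
    for tool, res in results.items():
        for scan, match in res.items():
            if match.expect < max_expect:
                per_scan.setdefault(scan, {})[tool] = match.peptide
    if not per_scan:
        return 0.0
    n_tools = len(results)
    consistent = sum(
        1
        for peptides in per_scan.values()
        if len(peptides) == n_tools and len(set(peptides.values())) == 1
    )
    return consistent / len(per_scan)


# Toy data with placeholder tool names and made-up values, for illustration only.
example: Results = {
    "tool_a": {1: Match("PEPTIDE", 0.001), 2: Match("SEQ", 0.04)},
    "tool_b": {1: Match("PEPTIDE", 0.002), 2: Match("SEQ", 0.2), 3: Match("ABC", 0.01)},
    "tool_c": {1: Match("PEPTIDE", 0.01)},
}
best_tool_per_scan = select_best_extraction(example)
```

In a real workflow, the spectra chosen this way would be written to a merged MGF file and searched again; that step is omitted here because it depends on the file layout produced by each extraction tool.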