Proteogenomic Definition of Biomarkers for the Large <i>Roseobacter</i> Clade and Application for a Quick Screening of New Environmental Isolates

Whole-cell matrix-assisted laser desorption/ionization time-of-flight (MALDI-TOF) mass spectrometry has become a routine and reliable method for microbial characterization because of its simplicity, low cost, and high reproducibility. Microbial isolates are identified by the resemblance of their low-molecular-weight protein spectra to those of isolates already present in the databases. This is a gold standard for clinicians, who deal with a finite number of well-defined pathogenic strains, but it poses a problem for environmental microbiologists, who face an overwhelming number of organisms yet to be defined. Here we set a milestone for implementing whole-cell MALDI-TOF mass spectrometry to identify isolates from the biosphere. To make this technique accessible for environmental studies, we propose to (i) define biomarkers that always appear as an intense <i>m</i>/<i>z</i> signal in the MALDI-TOF spectra and (ii) create a database of all the possible <i>m</i>/<i>z</i> values that these biomarkers can generate, against which new isolates can be screened. We tested our method on the relevant marine <i>Roseobacter</i> lineage. Shotgun nanoLC-MS/MS proteomics of the small proteome fraction of nine <i>Roseobacter</i> strains, combined with the proteogenomic toolbox, helped us to identify potential biomarkers on the basis of protein abundance and low variability among strains. We show that the DNA binding protein HU and the ribosomal proteins L29 and L30 are the most robust biomarkers within the <i>Roseobacter</i> clade. As with other conserved homologous proteins, the molecular weights of these three biomarkers vary because of sequence variation above the genus level. Therefore, we calculated the <i>m</i>/<i>z</i> values expected for each of the known <i>Roseobacter</i> genera and tested our strategy in an extensive screening of natural marine isolates obtained from coastal waters of the Western Mediterranean Sea. The use of this technique versus standard sequencing methods is discussed.
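
The idea of a biomarker <i>m</i>/<i>z</i> database can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the toy sequences, the genus labels, the optional removal of the initiator methionine, and the 500 ppm matching tolerance are all assumptions made for illustration. The residue masses are standard average masses, not values taken from this study. The sketch computes the expected singly protonated <i>m</i>/<i>z</i> for each biomarker sequence and then looks for observed peaks that fall within the tolerance.

```python
# Standard average residue masses in Da (not taken from the paper).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01524   # average mass of H2O added for the free termini
PROTON = 1.00728   # mass of the charge-carrying proton


def average_mass(sequence: str) -> float:
    """Average mass of an unmodified protein: residues plus one water."""
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER


def expected_mz(sequence: str, charge: int = 1,
                remove_initiator_met: bool = False) -> float:
    """Expected m/z of the [M + nH]n+ ion.

    remove_initiator_met is an assumption of this sketch: it drops a
    leading methionine, a common post-translational event.
    """
    if remove_initiator_met and sequence.startswith("M"):
        sequence = sequence[1:]
    return (average_mass(sequence) + charge * PROTON) / charge


def match_peaks(observed_mz, database, tol_ppm: float = 500.0):
    """Return (peak, label) pairs whose m/z agree within tol_ppm."""
    hits = []
    for peak in observed_mz:
        for label, mz in database.items():
            if abs(peak - mz) / mz * 1e6 <= tol_ppm:
                hits.append((peak, label))
    return hits


# Hypothetical toy sequences; real biomarker sequences would come from
# the genomes of each genus.
toy_sequences = {
    ("GenusA", "HU"): "MNKTELIAKV",
    ("GenusB", "HU"): "MNKSELIAKV",
}
database = {
    label: expected_mz(seq, remove_initiator_met=True)
    for label, seq in toy_sequences.items()
}
```

In this toy case a single Thr/Ser substitution shifts the expected <i>m</i>/<i>z</i> by about 14 Da, which is the kind of sequence-driven mass difference that lets conserved proteins discriminate between genera.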