Datasets for the paper "Label-free data standardization for clinical metabolomics".
1. Mass lists.
File name format: Sample_(sample number)_(mass spectrometer used).txt
Archive: 'Mass lists.zip'
Basic set:
Sample_1_maXis.txt
Sample_2_maXis.txt
Sample_3_maXis.txt
Sample_1_ Apex_Ultra.txt
Sample_2_ Apex_Ultra.txt
Sample_3_ Apex_Ultra.txt
Sample_1_ OrbiTrap_Elite.txt
Sample_2_ OrbiTrap_Elite.txt
Sample_3_ OrbiTrap_Elite.txt
Sample_1_ micrOTOF-Q.txt
Sample_2_ micrOTOF-Q.txt
Sample_3_ micrOTOF-Q.txt
Sample_1_ IFunnel_Q-ToF.txt
Sample_2_ IFunnel_Q-ToF.txt
Sample_3_ IFunnel_Q-ToF.txt
Additional set:
Sample_3_maXis_Wide_range.txt
Sample_3_maXis_High_range.txt
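The naming scheme above can be split into its parts programmatically. The following is a minimal sketch, assuming the pattern Sample_(sample number)_(mass spectrometer used).txt holds for every file; the variable names 'names', 'samples' and 'instruments' are illustrative and not part of the dataset.

```matlab
% Illustrative sketch: split mass-list file names into sample number and
% instrument label. Whitespace after the second underscore is tolerated,
% because some names in the list above contain a space at that position.
names = {'Sample_1_maXis.txt', 'Sample_2_ Apex_Ultra.txt', ...
         'Sample_3_maXis_Wide_range.txt'};
pattern = '^Sample_(\d+)_\s*(.+)\.txt$';

samples = zeros(1, numel(names));
instruments = cell(1, numel(names));
for k = 1:numel(names)
    tok = regexp(names{k}, pattern, 'tokens', 'once');
    samples(k) = str2double(tok{1});   % sample number
    instruments{k} = tok{2};           % instrument label (may include a range suffix)
    fprintf('%s -> sample %d, instrument %s\n', names{k}, samples(k), instruments{k});
end
```

For the additional set, the instrument label keeps the range suffix (for example 'maXis_Wide_range'), since the file name alone does not separate the two.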
2. Raw mass spectra in mzXML format.
File name format: Sample_(sample number)_(mass spectrometer used).mzXL
Archive: 'Mass spectra.zip'
3. Data for the basic set of mass spectra (file ‘matrix.mat’).
Data are presented in Matlab workspace format and include the following:
mz, m/z values for mass peak intensities;
intensities, mass peak intensities, where columns correspond to mass lists (basic set) and rows correspond to m/z values;
normalization_curves, normalization curves for mass lists, where columns correspond to mass lists (basic set) and rows correspond to m/z values;
standardized_intensities, standardized intensities for mass lists, where columns correspond to mass lists (basic set) and rows correspond to m/z values.
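As a quick check of the layout described above, the variables can be loaded and compared in Matlab. This is a minimal sketch, assuming 'matrix.mat' is in the current folder, that 'mz' is a vector, and that the other three variables are numeric matrices; the struct name 'S' is illustrative.

```matlab
% Load only the documented variables from matrix.mat into a struct.
S = load('matrix.mat', 'mz', 'intensities', 'normalization_curves', ...
         'standardized_intensities');

% Rows should correspond to m/z values and columns to mass lists.
assert(size(S.intensities, 1) == numel(S.mz), ...
       'Row count of intensities does not match the length of mz.');
assert(isequal(size(S.intensities), size(S.standardized_intensities)), ...
       'Raw and standardized intensity matrices differ in size.');

fprintf('%d m/z values, %d mass lists\n', ...
        size(S.intensities, 1), size(S.intensities, 2));

% Compare the first mass list before and after standardization.
figure;
plot(S.mz, S.intensities(:, 1), S.mz, S.standardized_intensities(:, 1));
xlabel('m/z');
ylabel('Intensity');
legend('raw', 'standardized');
```

The same steps should apply to 'matrix_ad.mat', which uses the same variable names for the additional set.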
4. Data for the additional set of mass spectra (file ‘matrix_ad.mat’).
Data are presented in Matlab workspace format and include the following:
mz, m/z values for mass peak intensities;
intensities, mass peak intensities, where columns correspond to mass lists (additional set) and rows correspond to m/z values;
normalization_curves, normalization curves for mass lists, where columns correspond to mass lists (additional set) and rows correspond to m/z values;
standardized_intensities, standardized intensities for mass lists, where columns correspond to mass lists (additional set) and rows correspond to m/z values.
5. Dataset for Figure 1 (file ‘Dataset for Figure 1.xlsx’).
The file is in Microsoft Excel format and contains the data for Figure 1.
6. Dataset for Figure 3 (file ‘Dataset data for Figure 3.xlsx’).
The file is in Microsoft Excel format and contains the data for Figure 3.
7. Source code for the SantaOmics algorithm and the data to run it.
The source code is provided as a Matlab script (file ‘SantaOmics.m’), and the data as a saved Matlab workspace (file ‘workspace.mat’). To run the SantaOmics algorithm, load the workspace in Matlab and then run ‘SantaOmics.m’ in the Matlab environment. The algorithm standardizes the mass peak intensities of the initial mass spectra (variable ‘intensities’; n = 15) and writes the result to the variable ‘standardized_intensities’. Depending on the performance of the computer, a run may take from several minutes to tens of minutes.
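The run described above might look as follows in the Matlab command window. This is a sketch only, assuming that both files are in the current folder and that the script reads its input from the base workspace; the variable 'elapsedMinutes' is illustrative and not part of the distributed code.

```matlab
% Load the saved workspace, which provides 'intensities' (n = 15 mass lists).
load('workspace.mat');

% Run the script and time it; a run may take several to tens of minutes.
tic;
run('SantaOmics.m');
elapsedMinutes = toc / 60;

% Check that the documented output variable is present after the run.
assert(exist('standardized_intensities', 'var') == 1, ...
       'standardized_intensities was not found after running SantaOmics.m.');
fprintf('Finished in %.1f minutes; output size: %d x %d\n', ...
        elapsedMinutes, size(standardized_intensities, 1), ...
        size(standardized_intensities, 2));
```

If the output has the same layout as 'intensities', its column count should equal the number of mass lists in the basic set.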