# Datasets for the paper "Label-free data standardization for clinical metabolomics"

**1. Mass lists.**

**File name format: Sample_(sample number)_(mass spectrometer used).txt**

Archive: 'Mass lists.zip'

Basic set:

Sample_1_maXis.txt

Sample_2_maXis.txt

Sample_3_maXis.txt

Sample_1_Apex_Ultra.txt

Sample_2_Apex_Ultra.txt

Sample_3_Apex_Ultra.txt

Sample_1_OrbiTrap_Elite.txt

Sample_2_OrbiTrap_Elite.txt

Sample_3_OrbiTrap_Elite.txt

Sample_1_micrOTOF-Q.txt

Sample_2_micrOTOF-Q.txt

Sample_3_micrOTOF-Q.txt

Sample_1_IFunnel_Q-ToF.txt

Sample_2_IFunnel_Q-ToF.txt

Sample_3_IFunnel_Q-ToF.txt

Additional set:

Sample_3_maXis_Wide_range.txt

Sample_3_maXis_High_range.txt
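The file name format above can be split into its sample number and mass spectrometer label programmatically. The following is a minimal Python sketch, not part of the dataset: the names `MassListName` and `parse_mass_list_name` are hypothetical, and the pattern tolerates optional whitespace after the second underscore in case a file name in the archive contains it.

```python
import re
from typing import NamedTuple


class MassListName(NamedTuple):
    """Parts of a mass list file name (hypothetical helper type)."""
    sample: int
    instrument: str


# Sample_(sample number)_(mass spectrometer used).txt
# \s* is an assumption: it absorbs stray whitespace before the instrument label.
_PATTERN = re.compile(r"^Sample_(\d+)_\s*(.+?)\.txt$")


def parse_mass_list_name(filename: str) -> MassListName:
    """Return the sample number and instrument label encoded in a file name."""
    match = _PATTERN.match(filename.strip())
    if match is None:
        raise ValueError(f"Unexpected mass list file name: {filename!r}")
    return MassListName(sample=int(match.group(1)), instrument=match.group(2).strip())
```

For the additional set, the range suffix stays part of the instrument label, so `Sample_3_maXis_Wide_range.txt` yields sample `3` and label `maXis_Wide_range`.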

**2. Raw mass spectra in mzXML format.**

File name format: Sample_(sample number)_(mass spectrometer used).mzXL

Archive: 'Mass spectra.zip'

**3. Data for the basic set of mass spectra (file ‘matrix.mat’).**

Data are presented in Matlab workspace format and include the following:

**mz**, *m/z* values corresponding to the mass peak intensities;

**intensities**, mass peak intensities, where columns correspond to mass lists (basic set) and rows correspond to *m/z* values;

**normalization_curves**, normalization curves for mass lists, where columns correspond to mass lists (basic set) and rows correspond to *m/z* values;

**standardized_intensities**, standardized intensities for mass lists, where columns correspond to mass lists (basic set) and rows correspond to *m/z* values.
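For readers working outside Matlab, the layout described above (rows are *m/z* values, columns are mass lists) can be checked after loading the file. The sketch below is an illustration under stated assumptions: the helper names `check_layout` and `load_shapes` are hypothetical, `load_shapes` relies on the third-party SciPy function `scipy.io.loadmat`, and it assumes the file was saved in a format that function can read (it does not read Matlab v7.3 files).

```python
MATRIX_VARIABLES = ("intensities", "normalization_curves", "standardized_intensities")


def check_layout(shapes, n_mass_lists):
    """Return the number of m/z values if every matrix is (m/z values x mass lists).

    `shapes` maps a variable name to its shape tuple.
    """
    # mz may be stored as a row or a column vector, so take its longer dimension.
    n_mz = max(shapes["mz"])
    for name in MATRIX_VARIABLES:
        if tuple(shapes[name]) != (n_mz, n_mass_lists):
            raise ValueError(
                f"{name} has shape {tuple(shapes[name])}, "
                f"expected {(n_mz, n_mass_lists)}"
            )
    return n_mz


def load_shapes(path):
    """Read a .mat file and return {variable name: shape}. Requires SciPy."""
    from scipy.io import loadmat  # third-party dependency, imported lazily

    workspace = loadmat(path)
    # Keys starting with "__" are metadata added by loadmat, not Matlab variables.
    return {
        name: value.shape
        for name, value in workspace.items()
        if not name.startswith("__")
    }
```

With SciPy installed, `check_layout(load_shapes("matrix.mat"), n_mass_lists=15)` would confirm one column per mass list of the basic set, assuming the file contains exactly the 15 lists named in section 1.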

**4. Data for the additional set of mass spectra (file ‘matrix_ad.mat’).**

Data are presented in Matlab workspace format and include the following:

**mz**, *m/z* values corresponding to the mass peak intensities;

**intensities**, mass peak intensities, where columns correspond to mass lists (additional set) and rows correspond to *m/z* values;

**normalization_curves**, normalization curves for mass lists, where columns correspond to mass lists (additional set) and rows correspond to *m/z* values;

**standardized_intensities**, standardized intensities for mass lists, where columns correspond to mass lists (additional set) and rows correspond to *m/z* values.

**5. Dataset for Figure 1 (file ‘Dataset for Figure 1.xlsx’).**

The file is in Microsoft Excel format and contains the data for Figure 1.

**6. Dataset for Figure 3 (file ‘Dataset data for Figure 3.xlsx’).**

The file is in Microsoft Excel format and contains the data for Figure 3.

**7. Source code for the SantaOmics algorithm and the data to run it.**

The source code is provided as a Matlab script (file **‘SantaOmics.m’**), and the data are provided as a saved Matlab workspace (file **‘workspace.mat’**). To run the SantaOmics algorithm, load the workspace in Matlab and then run ‘SantaOmics.m’ in the Matlab environment. On completion, the mass peak intensities of the initial mass spectra (variable ‘intensities’; n = 15) should be standardized and written to the variable ‘standardized_intensities’. Depending on the performance of the computer, the algorithm may take from several minutes to tens of minutes to complete.
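The run can also be started without opening the Matlab desktop. The Python sketch below is one possible wrapper, not part of the dataset: it assumes a `matlab` executable on the PATH that supports the `-batch` option (available in recent Matlab releases), that both files are in the current directory, and that saving the result is wanted. The output file name `standardized.mat` and the function names are hypothetical.

```python
import subprocess


def build_matlab_command(workspace="workspace.mat",
                         script="SantaOmics.m",
                         output="standardized.mat"):
    """Build the command line that loads the workspace and runs the script."""
    statement = (
        f"load('{workspace}'); "   # load the saved workspace first
        f"run('{script}'); "       # then evaluate the script
        # Assumption: keep only the standardized result in a new file.
        f"save('{output}', 'standardized_intensities');"
    )
    return ["matlab", "-batch", statement]


def run_santaomics(**kwargs):
    """Run the script non-interactively; may take several to tens of minutes."""
    subprocess.run(build_matlab_command(**kwargs), check=True)
```

Calling `run_santaomics()` blocks until Matlab exits and raises `subprocess.CalledProcessError` if Matlab reports a failure.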