jp050395e_si_001.xls (10.3 MB)
The Classification of Solvents by Combining Classical QSPR Methodology with Principal Component Analysis
dataset
posted on 2005-11-17, 00:00 authored by Alan R. Katritzky, Dan C. Fara, Minati Kuanar, Evrim Hur, Mati Karelson
The results of a quantitative structure−property relationship (QSPR) analysis of 127 different solvent scales
and 774 solvents using the CODESSA PRO program are presented. QSPR models for each scale were
constructed using only theoretical descriptors. The high quality of the models is reflected by the squared
multiple correlation coefficients that range from 0.726 to 0.999; only 18 models have R² < 0.800. This enables
direct theoretical calculation of predicted values for any scale and/or for any organic solvent, including those
previously unmeasured. The molecular descriptors involved in the models are classified and discussed according
to (i) the origin of their calculation (i.e., constitutional, geometric, charge-related, etc.) and (ii) the commonly
accepted classification of physical interactions between the solute and solvent molecules in liquid (condensed)
media. A reduced matrix 774 (solvents) × 100 (solvent scales) was selected for the principal component
analysis (PCA) by taking into account only the solvent scales with more than 20 experimental data points.
The first 5 principal components account for 75% of the total variance. The robustness of the PCA model
was validated by developing comparison models for restricted submatrices of the data and comparing them with
the results obtained for the full data set. The total variance accounted for by the first three PCs, for the
submatrices with the same number of solvent scales but different numbers of solvents, varies from 68.2% to
59.0%. This demonstrates that the total variance described by the first 3 components is essentially stable as
the number of solvents involved varies from 100 to 774. Subsequently, a matrix with 703 diverse solvents
and 100 solvent scales was selected for the general classification of the solvents and scales according to the
scores and loadings obtained from the PCA treatment. Classification of the theoretical molecular descriptors,
derived from the chemical structure alone, according to their relevance to specific types of intermolecular
interaction (cavity formation, electrostatic polarization, dispersion, and hydrogen bonding) in liquid media
enables a more easily comprehensible physical interpretation of the QSPR of molecular properties in liquids
and solutions. The reported QSPR models for solvent scales with theoretical molecular descriptors and the
results of the PCA are potentially of great practical importance, as they extend the applicability of
correlations with empirical solvent scales to many previously unmeasured systems.
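
The PCA step and the submatrix robustness check described above can be illustrated with a minimal sketch. It is not the authors' procedure or the CODESSA PRO workflow: the data are synthetic random numbers, the matrix sizes (200 × 30, with a 100-row subset) are arbitrary, and the function name pca_explained_variance is hypothetical. It assumes a complete matrix with no missing values and standardizes each solvent scale before decomposition.

```python
import numpy as np

def pca_explained_variance(X):
    """PCA via SVD on a column-standardized matrix (rows: solvents, columns: scales).

    Returns the scores, the loadings, and the fraction of total variance
    carried by each principal component.
    """
    # Standardize each scale so that scales with large numeric ranges do not dominate.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    scores = U * s        # coordinates of the solvents on the PCs
    loadings = Vt.T       # contributions of the scales to the PCs
    ratio = s**2 / np.sum(s**2)
    return scores, loadings, ratio

# Synthetic stand-in for a solvents x scales matrix (NOT the published data):
# three hidden factors plus noise, so a few PCs should carry most of the variance.
rng = np.random.default_rng(0)
n_solvents, n_scales, n_factors = 200, 30, 3
latent = rng.normal(size=(n_solvents, n_factors))
mixing = rng.normal(size=(n_factors, n_scales))
X = latent @ mixing + 0.5 * rng.normal(size=(n_solvents, n_scales))

scores, loadings, ratio = pca_explained_variance(X)
cumulative = np.cumsum(ratio)
print(f"first 3 PCs, all rows:   {cumulative[2]:.3f}")

# Robustness check in the spirit of the abstract: same scales, fewer solvents.
subset = rng.choice(n_solvents, size=100, replace=False)
_, _, ratio_sub = pca_explained_variance(X[subset])
cumulative_sub = np.cumsum(ratio_sub)
print(f"first 3 PCs, 100 rows:   {cumulative_sub[2]:.3f}")
```

If the cumulative variance of the leading components changes little between the full matrix and the row subset, the decomposition can be regarded as stable with respect to the number of solvents, which is the kind of comparison the abstract reports for 100 to 774 solvents.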