10.1021/ci800406y.s001
Junmei Wang
Junmei
Wang
Tingjun Hou
Tingjun
Hou
Xiaojie Xu
Xiaojie
Xu
Aqueous Solubility Prediction Based on Weighted Atom Type Counts and Solvent Accessible Surface Areas
American Chemical Society
2009
prioritizing compound libraries
Aqueous Solubility Prediction
solubility model
HTS
RMSE
Weighted Atom Type Counts
0.841 logarithm unit
Solvent Accessible Surface Areas
0.840 logarithm unit
atom type counts
2009-03-23 00:00:00
Dataset
https://acs.figshare.com/articles/dataset/Aqueous_Solubility_Prediction_Based_on_Weighted_Atom_Type_Counts_and_Solvent_Accessible_Surface_Areas/2869495
In this work, four reliable aqueous solubility models, ASM-ATC
(aqueous solubility model based on atom type counts), ASM-ATC-LOGP
(aqueous solubility model based on atom type counts and <i>ClogP</i> as an additional descriptor), ASM-SAS (aqueous solubility model
based on solvent accessible surface areas), and ASM-SAS-LOGP (aqueous
solubility model based on solvent accessible surface areas and <i>ClogP</i> as an additional descriptor), have been developed
for a diverse data set of 3664 compounds. All four models were extensively
validated by various cross-validation tests, and encouraging predictability
was achieved. ASM-ATC-LOGP, the best model, achieves a leave-one-out
squared correlation coefficient (<i>q</i><sup>2</sup>) of 0.832 and a
root-mean-square error (<i>RMSE</i>) of 0.840 log units. In an 85/15
cross-validation test repeated 10,000 times, this model achieves a mean
<i>q</i><sup>2</sup> of 0.832 and a mean <i>RMSE</i> of 0.841 log units.
We believe that these robust models can serve as an
important rule in druglikeness analysis and an efficient filter for
prioritizing compound libraries prior to high-throughput screening
(HTS).
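
The two validation protocols named in the abstract (leave-one-out and repeated 85/15 splits) can be illustrated with a minimal sketch. This is not the authors' code. It assumes an ordinary least-squares linear model, takes <i>q</i><sup>2</sup> to be 1 - PRESS/SS<sub>tot</sub> (a common convention that the abstract does not spell out), and runs on synthetic data. All function names (`fit_linear`, `leave_one_out`, `repeated_split`, `make_toy_data`) are hypothetical.

```python
import numpy as np


def fit_linear(X, y):
    """Ordinary least squares with an intercept (assumed model form)."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict_linear(coef, X):
    """Apply the fitted coefficients, intercept first."""
    return np.column_stack([np.ones(len(X)), X]) @ coef


def q2_rmse(y_true, y_pred, reference_mean):
    """Assumed definitions: q2 = 1 - PRESS / SS_tot, RMSE = sqrt(PRESS / n)."""
    press = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - reference_mean) ** 2)
    return 1.0 - press / ss_tot, np.sqrt(press / len(y_true))


def leave_one_out(X, y):
    """Refit with each compound held out once and predict the held-out value."""
    n = len(y)
    preds = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        coef = fit_linear(X[keep], y[keep])
        preds[i] = predict_linear(coef, X[i:i + 1])[0]
    return q2_rmse(y, preds, y.mean())


def repeated_split(X, y, n_repeats=100, test_fraction=0.15, seed=0):
    """Average q2 and RMSE over random train/test splits (85/15 by default)."""
    rng = np.random.default_rng(seed)
    n = len(y)
    n_test = max(1, int(round(test_fraction * n)))
    scores = []
    for _ in range(n_repeats):
        perm = rng.permutation(n)
        test, train = perm[:n_test], perm[n_test:]
        coef = fit_linear(X[train], y[train])
        y_pred = predict_linear(coef, X[test])
        scores.append(q2_rmse(y[test], y_pred, y[train].mean()))
    mean_q2, mean_rmse = np.mean(scores, axis=0)
    return float(mean_q2), float(mean_rmse)


def make_toy_data(n=200, noise=0.5, seed=1):
    """Synthetic descriptors and responses; stands in for real solubility data."""
    rng = np.random.default_rng(seed)
    weights = np.linspace(-1.0, 1.0, 5)
    X = rng.normal(size=(n, len(weights)))
    y = X @ weights + 1.0 + rng.normal(scale=noise, size=n)
    return X, y


if __name__ == "__main__":
    X, y = make_toy_data()
    print("leave-one-out (q2, RMSE):", leave_one_out(X, y))
    print("repeated 85/15 (mean q2, mean RMSE):", repeated_split(X, y))
```

The number of repeats defaults to 100 here to keep the run short. Passing `n_repeats=10000` would match the count quoted in the abstract, at a proportionally higher cost.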