Solubility Challenge Revisited after Ten Years, with Multilab Shake-Flask Data, Using Tight (SD ∼ 0.17 log) and Loose (SD ∼ 0.62 log) Test Sets
Posted on 2019-05-01
Ten years ago we issued, in conjunction with the Journal of Chemical
Information and Modeling, an open prediction
challenge to the cheminformatics community. Would they be able to
predict the intrinsic solubilities of 32 druglike compounds using
only a high-precision set of 100 compounds as a training set? The
“Solubility Challenge” was a widely recognized success
and spurred many discussions about the prediction methods and quality
of data. Regardless of the obvious limitations of the challenge, the
conclusions were somewhat unexpected. Although contestants employed
the entire spectrum of approaches then available to predict aqueous
solubility and had an extremely tight data set at their disposal, it was
not possible to identify the best methods for predicting aqueous solubility:
a variety of methods and combinations all performed equally well (or
badly). Several authors have since suggested that it is not the
poor quality of the solubility data which limits the accuracy of the
predictions, but the deficient methods used. Now, ten years after
the original Solubility Challenge, we revisit it and challenge the
community to a new test with a much larger database that includes estimates
of interlaboratory reproducibility.
CITE THIS COLLECTION
Llinas, Antonio; Avdeef, Alex (2019). Solubility Challenge Revisited after Ten Years, with Multilab Shake-Flask Data, Using Tight (SD ∼ 0.17 log) and Loose (SD ∼ 0.62 log) Test Sets. ACS Publications. Collection. https://doi.org/10.1021/acs.jcim.9b00345