Prediction of pH-Dependent Aqueous Solubility of Druglike Molecules

Niclas Tue Hansen, Irene Kouskoumvekaki, Flemming Steen Jørgensen, Søren Brunak, Svava Ósk Jónsdóttir

10.1021/ci600292q.s002
https://acs.figshare.com/articles/journal_contribution/Prediction_of_pH_Dependent_Aqueous_Solubility_of_Druglike_Molecules/3045172

In the present work, the Henderson-Hasselbalch (HH) equation was used to develop a tool for predicting the pH-dependent aqueous solubility of drugs and drug candidates. A new prediction method for the intrinsic solubility was developed, based on artificial neural networks trained on a druglike PHYSPROP subset of 4548 compounds. For the prediction of acid/base dissociation constants (pKa), the commercial tool Marvin was used, following validation on a data set of 467 molecules from the PHYSPROP database. The best-performing network for intrinsic solubility predictions has a cross-validated root mean square error (RMSE) of 0.70 log S units, while the Marvin pKa plug-in has an RMSE of 0.71 pH units. A data set of 27 drugs with experimentally determined pH-solubility curves was assembled from the literature to validate the combined pH-dependent model, giving a mean RMSE of 0.79 log S units. Finally, the combined model was applied to profiling the low-pH solubility space of five large vendor libraries.

2006-11-27
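
The way the Henderson-Hasselbalch relation combines an intrinsic solubility with a pKa value can be illustrated with a minimal sketch. It covers only the simplest case, a compound with a single ionizable group, and assumes ideal HH behavior (no solubility limit for the ionized form). The function name, its parameters, and the example values are hypothetical and are not taken from the published tool, whose treatment of multiprotic compounds is not described in this record. In the described workflow, the intrinsic log S would come from the neural network and the pKa from Marvin.

```python
import math


def log_solubility_at_ph(log_s0: float, pka: float, ph: float, kind: str) -> float:
    """Return log S at a given pH for a monoprotic compound (illustrative only).

    log_s0 : intrinsic solubility (log S of the neutral form)
    pka    : pKa of the acid, or of the conjugate acid for a base
    kind   : "acid" or "base"

    Henderson-Hasselbalch form:
      acid: log S = log S0 + log10(1 + 10**(pH - pKa))
      base: log S = log S0 + log10(1 + 10**(pKa - pH))
    """
    if kind == "acid":
        exponent = ph - pka
    elif kind == "base":
        exponent = pka - ph
    else:
        raise ValueError("kind must be 'acid' or 'base'")
    return log_s0 + math.log10(1.0 + 10.0 ** exponent)


if __name__ == "__main__":
    # Hypothetical monoprotic acid: log S0 = -4.0, pKa = 4.5
    for ph in (1.0, 4.5, 7.4):
        print(ph, round(log_solubility_at_ph(-4.0, 4.5, ph, "acid"), 3))
```

For an acid, the predicted solubility stays close to the intrinsic value well below the pKa and rises by about one log unit per pH unit above it. A base shows the mirror-image behavior, which is why low-pH profiling is informative for basic compounds.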