Physicochemical property profile for brain permeability: comparative study by different approaches.

A comparative study of classification models of brain penetration by different approaches was carried out on a training set of 1000 chemicals and drugs, and an external test set of 100 drugs. Ten approaches were applied in this work: seven medicinal chemistry approaches (including "rule of 5" and multiparameter optimization) and also three SAR techniques: logistic regression (LR), random forest (RF) and support vector machine (SVM). Forty-one different medicinal chemistry descriptors representing diverse physicochemical properties were used in this work. Medicinal chemistry approaches based on the intuitive estimation of preference zones of CNS or non-CNS chemicals, with different rules and scoring functions, yield unbalanced models with poor classification accuracy. RF and SVM methods yielded 82% and 84% classification accuracy respectively for the external test set. LR was also successful in CNS/non-CNS (denoted in this study as CNS+/CNS-) classification and yielded an overall accuracy equivalent to that of SVM and RF. At the same time, LR is especially valuable for medicinal chemists because of its simplicity and the possibility of clear mechanistic interpretation.


Introduction
Statistics indicate that the percentage of people over 65 years of age in the population of many industrialized countries is steadily increasing. It has been estimated that the treatment of central nervous system (CNS) diseases affected by aging, such as Alzheimer's disease, Parkinson's disease, brain cancer and stroke, will cost trillions of dollars.
The human brain is a uniquely complex organ, which has evolved a sophisticated protection system to prevent injury from external insults and toxins. Designing molecules that can overcome this protection system and achieve optimal concentration at the desired therapeutic target in the brain is a specific and major challenge for medicinal chemists working in CNS drug discovery [1].
The physicochemical properties needed for blood-brain barrier (BBB) penetration have been extensively studied by many researchers. Early work in this area included that of Young et al. (relationships of logBB with octanol-water logP and H-bond capacity) [2], van de Waterbeemd et al. (modeling of BBB penetration by plotting various combinations of two physicochemical properties, such as molecular size, shape, H-bonding descriptors and polar surface area (PSA), and formulation of various guidelines for use by medicinal chemists, e.g. PSA of <90 Å² and molecular mass of <450 [3][4][5]), Kelder et al. (estimation of the upper limit of PSA for CNS drugs) [6], and Abraham et al. (Linear Free Energy Relationships study) [7][8][9]. A summary of selected literature of this century describing the physicochemical properties required for optimal brain exposure is presented in [1]. Several physicochemical properties have consistently been found to be important for permeability in the CNS, including lipophilicity, expressed by logP; the logarithm of the octanol/water partition coefficient at physiological pH 7.4 (logD); the number of hydrogen bond donors (HBD) and acceptors (HBA); PSA; ionization state (pKa); rotatable bond (RB) count; molecular weight (MW) and others [1][2][3][4][5][6][7][8][9][10].
In 1997, Lipinski proposed his "rule of 5" whereby four physicochemical parameters (MW, HBD, HBA and logP) were utilized for drug-likeness prediction [11]. This publication had a significant influence on the development of medicinal chemistry and has been cited thousands of times. There is also the "rule of fours" (compounds with (N + O) ≥ 8, MW > 400 and acid pKa > 4 are likely to be P-gp substrates, whereas compounds with (N + O) ≤ 4, MW < 400 and base pKa < 8 are likely to be non-substrates) [12].
In the current century, multivariate models that interrogate multiple descriptors simultaneously have been developed. In particular, Wager et al. proposed multiparameter optimization (MPO) specifically for CNS drug discovery [13]. This method utilizes six physicochemical parameters (logP, logD, MW, TPSA, HBD and pKa of the most basic center). Here, instead of hard cutoffs, an algorithm constructs a desirability score (within the limits 0-1) for each of the six properties and sums them into an overall desirability score ranging from 0 (undesirable) to 6 (highly desirable). Multiple physicochemical property graphical analysis using Radar charts was proposed by Ghose et al. [10]. In a recently published perspective [1], it was indicated that "over the recent years MPO methods have been steadily gaining a wider acceptance in drug discovery, with their application to the prospective design of new compounds being particularly appealing. Machine-learning algorithms designed to differentiate CNS from non-CNS penetrating compounds on the basis of their substructure fingerprints have also been reported in recent years. However, these approaches are based on molecular descriptors and fingerprint patterns that are less intuitive and difficult for medicinal chemists".
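The desirability-sum idea behind MPO can be sketched as follows. This is a minimal illustration, not the published scoring function: the helper names are hypothetical, and the breakpoints are illustrative assumptions that are not given in this article.

```python
def ramp_down(x, full, zero):
    """Desirability 1.0 up to `full`, falling linearly to 0.0 at `zero`."""
    if x <= full:
        return 1.0
    if x >= zero:
        return 0.0
    return (zero - x) / (zero - full)


def hump(x, lo_zero, lo_full, hi_full, hi_zero):
    """Desirability 1.0 inside [lo_full, hi_full], ramping to 0.0 outside."""
    if x <= lo_zero or x >= hi_zero:
        return 0.0
    if x < lo_full:
        return (x - lo_zero) / (lo_full - lo_zero)
    if x > hi_full:
        return (hi_zero - x) / (hi_zero - hi_full)
    return 1.0


def mpo_score(logp, logd, mw, tpsa, hbd, pka):
    """Sum of six per-property desirabilities, so the result lies in 0..6."""
    # All breakpoints below are assumed for illustration only.
    parts = [
        ramp_down(logp, 3.0, 5.0),
        ramp_down(logd, 2.0, 4.0),
        ramp_down(mw, 360.0, 500.0),
        hump(tpsa, 20.0, 40.0, 90.0, 120.0),
        ramp_down(hbd, 0.5, 3.5),
        ramp_down(pka, 8.0, 10.0),
    ]
    return sum(parts)
```

A compound that sits inside every preferred zone scores 6, and each property that drifts outside its zone lowers the total gradually instead of failing a hard cutoff.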
(Q)SAR methods (including machine-learning techniques) with the construction of classification and regression models of BBB penetration were also successfully developed during this century. Particular attention is drawn to the publications in this field of Adenot et al. [14], Abraham et al. [15] and Raevsky et al. [16], which showed that hydrogen-bonding properties of compounds play a crucial role in the modeling of BBB penetration. Our review of (Q)SAR approaches is presented in [17].
The main aim of this work is a comparison of different computational approaches to CNS+/CNS- classification using the same datasets. Seven intuitive approaches of medicinal chemists (including the "rule of 5" and MPO) and chemometric methods (logistic regression (LR) and the machine-learning techniques of random forest (RF) and support vector machine (SVM)) were applied in this work to classify CNS and non-CNS chemicals (drugs). The quality of all SAR models was evaluated by the use of strict statistical measures of the performance of a binary classification (CNS+/CNS-): true positive (TP), false negative (FN), true negative (TN), false positive (FP), sensitivity

Datasets
The datasets used in this research were based on published data collected by us, and contain information about 3000 chemicals.
A training set of 1000 chemicals contained 500 CNS+ and 500 CNS- chemicals. Pruning was carried out by a dissimilarity procedure. In this case, all CNS+ and CNS- chemicals (drugs) were separately arranged in order of their Tanimoto dissimilarity indices, and the first 500 CNS+ and CNS- chemicals were selected for our datasets using our program manager CheD [18] and our MolDivs (MOLecular DIVersity & Similarity) program [19]. An external test set was formed from the excellent dataset in the supporting information of the article by Ghose et al. [10]. Fifty CNS+ and 50 CNS- drugs which were not included in the above-mentioned training set were selected by the procedures indicated above. Figure 1 shows the distribution of chemical functional groups in the studied chemicals. It should be noted that the distributions are approximately equal for the training, test and external sets. A comparison of CNS+ and CNS- sets also indicated a higher percentage of hydroxyl, carboxyl and secondary amine chemicals in the CNS- sets compared with the CNS+ sets.
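One possible reading of a dissimilarity-based pruning step is sketched below. It is an assumption-laden illustration, not the CheD/MolDivs procedure: fingerprints are represented as sets of on-bits, and the greedy max-min ordering and the function names are hypothetical choices.

```python
def tanimoto(a, b):
    """Tanimoto similarity of two fingerprints given as sets of on-bits."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def diverse_subset(fps, k):
    """Greedy max-min selection of k fingerprint indices by Tanimoto dissimilarity."""
    n = len(fps)
    if n == 0 or k <= 0:
        return []
    chosen = [0]  # arbitrary seed (assumption): start from the first compound
    # nearest[i] = smallest dissimilarity from compound i to any chosen compound
    nearest = [1.0 - tanimoto(fp, fps[0]) for fp in fps]
    while len(chosen) < min(k, n):
        picked = set(chosen)
        # take the unpicked compound that is farthest from everything chosen so far
        nxt = max((i for i in range(n) if i not in picked), key=lambda i: nearest[i])
        chosen.append(nxt)
        for i, fp in enumerate(fps):
            nearest[i] = min(nearest[i], 1.0 - tanimoto(fp, fps[nxt]))
    return chosen


# Toy fingerprints (hypothetical data)
fps = [{1, 2, 3}, {1, 2, 3, 4}, {7, 8, 9}, {1, 2}]
print(diverse_subset(fps, 3))
```

Applied separately to the CNS+ and CNS- pools with k = 500, such a routine would return two structurally diverse halves of a balanced training set.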

Descriptors
Forty-one molecular physicochemical descriptors which are widely used in medicinal chemistry permeability research were applied in this work. They are: descriptors of the "rule of 5" [11]: MW, number of HBD atoms, number of HBA atoms, Moriguchi octanol-water partition coefficient (MlogP); MPO descriptors [13]: topological PSA using contributions of nitrogen and oxygen atoms (TPSA(N,O)), octanol-water partition coefficient at pH = 7.4 (logDD), acidity constant (pKa); descriptors of the Radar chart [10]: Ghose-Crippen octanol-water partition coefficient (AlogP), number of rotatable bonds (NRB), number of chiral centers (NCC); HYBOT descriptors [20][21][22]: octanol-water partition coefficient (logPP), molecular polarizability (α), which has been used as an important descriptor in 1450 QSAR equations [23]; sum of all positive atomic charges in a molecule (ΣQ+); maximum negative atomic charge in a molecule (maxQ-); maximum positive atomic charge in a molecule (maxQ+); maximum H-bond acceptor atom enthalpy factor in a molecule (maxEa); maximum H-bond acceptor free energy factor in a molecule (maxCa); maximum H-bond donor enthalpy factor in a molecule (maxEd); maximum H-bond donor free energy factor in a molecule (maxCd); sum of H-bond acceptor enthalpy atom factors in a molecule (ΣEa); sum of H-bond donor enthalpy atom factors in a molecule (ΣEd); sum of H-bond acceptor and donor atom enthalpy factors in a molecule (ΣEad); sum of H-bond acceptor free energy atom factors in a molecule (ΣCa); sum of H-bond donor free energy atom factors in a molecule (ΣCd); sum of H-bond acceptor and donor free energy atom factors in a molecule (ΣCad); PSA using enthalpy H-bond acceptor contributions (PSAea); PSA using free energy H-bond acceptor contributions (PSAca); PSA using enthalpy H-bond donor contributions (PSAed); PSA using free energy H-bond donor contributions (PSAcd); PSA using enthalpy H-bond donor and acceptor contributions (PSAe); PSA using free energy H-bond donor and acceptor contributions (PSAc); and also constructed descriptors: ΣQ-/α, ΣEa/α, ΣEd/α, ΣEad/α, ΣCa/α, ΣCd/α, ΣCad/α, maxEa*maxEd, maxCa*maxCd. These physicochemical descriptors are connected with the main intermolecular interactions: steric (MW, α), electrostatic (ΣQ-), hydrogen bonding (ΣCa, ΣCd, ΣCad, TPSA, HBD, HBA) and lipophilicity (MlogP, AlogP and logDD). Data regarding descriptor intervals for training and external sets are included in the Supplementary material (Table S3).

Random forest (RF)
The RF model uses an ensemble of decision trees to perform classification. Each tree is built on a bootstrap sample of the training set. A fixed number of randomly selected variables (descriptors) is considered at each node-splitting stage during tree construction. The decision trees are grown without pruning. Classification is carried out by voting. In our work, we used the computer program rf5new [24], created by Leo Breiman and Adele Cutler, the authors of the RF method. The following parameters were used for classification: jbt (number of trees) = 500, mtry (number of variables randomly selected at each node) = m^0.5 + 0.5 (m is the number of variables), ndsize (number of cases in a node below which the tree will not split) = 1. To estimate the error of the constructed model, we used the out-of-bag method.
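Two ingredients of the RF setup, the per-node variable count and the bootstrap/out-of-bag split, can be illustrated with a short sketch. It does not reproduce rf5new: the function names are hypothetical, and truncating m^0.5 + 0.5 to an integer is an assumption.

```python
import math
import random


def mtry_for(m):
    """Descriptors tried per split: sqrt(m) + 0.5, truncated to an integer (assumed)."""
    return int(math.sqrt(m) + 0.5)


def bootstrap_split(n_samples, rng):
    """Draw one bootstrap sample; return (in-bag indices, out-of-bag indices)."""
    in_bag = [rng.randrange(n_samples) for _ in range(n_samples)]
    drawn = set(in_bag)
    out_of_bag = [i for i in range(n_samples) if i not in drawn]
    return in_bag, out_of_bag


rng = random.Random(0)  # fixed seed so the sketch is repeatable
in_bag, oob = bootstrap_split(1000, rng)
print("mtry for 41 descriptors:", mtry_for(41))
print("out-of-bag compounds for this tree:", len(oob))
```

Each tree is trained on its in-bag compounds, and the out-of-bag compounds, which that tree never saw, supply its share of the error estimate.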

Support vector machine (SVM)
The SVM method can be applied to create classification and regression models. The basis of the method consists of a transition from the initial space of variables to a space of higher dimension with the help of a kernel function and a search for a separating hyperplane with the maximum margin in this space. For constructing the SVM model, the LIBSVM package [25] was used. Both training and test data were scaled in standard deviation units (autoscaling procedure) [26]. During construction of a hyperplane in classification, C-SVC minimization was used. The kernel type was radial basis function.
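Autoscaling can be sketched in a few lines. The text states only that training and test data were expressed in standard deviation units; reusing the training-set means and standard deviations for the test set is an assumption of this sketch, and the function names are hypothetical.

```python
import statistics


def fit_autoscale(rows):
    """Column means and sample standard deviations computed from the given rows."""
    cols = list(zip(*rows))
    means = [statistics.fmean(c) for c in cols]
    stds = [statistics.stdev(c) for c in cols]
    return means, stds


def apply_autoscale(rows, means, stds):
    """Express every value in standard-deviation units of the fitted set."""
    return [
        [(v - m) / s if s > 0 else 0.0 for v, m, s in zip(row, means, stds)]
        for row in rows
    ]


# Toy descriptor matrix: two descriptors, three training compounds (hypothetical data)
train = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
means, stds = fit_autoscale(train)
scaled_test = apply_autoscale([[4.0, 20.0]], means, stds)
print(scaled_test)
```

Scaling matters for a radial basis function kernel because descriptors on large numeric scales, such as MW, would otherwise dominate the distance between compounds.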
To construct SVM models, three parameters were used: penalty parameter C (default 1), loss function parameter ε (default 0.1) and kernel parameter γ (default 1/M, where M is the number of features). To obtain optimized values of the γ and C parameters, a graphical grid search strategy based on three-fold cross-validation (CV) was used. Various pairs of (C, γ) values were tried and the one with the best CV ACC was selected. The interval for C was 2^-5 ... 2^15 and for γ was 2^-15 ... 2^3.
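The grid search over the two intervals can be sketched as below. The exponent step of 2 is an assumption (the text gives only the interval ends), and `cv_accuracy` is a hypothetical user-supplied callback that would train an SVM and return its cross-validated ACC.

```python
def exponent_grid(lo, hi, step=2):
    """Powers of two from 2**lo to 2**hi inclusive (step in the exponent is assumed)."""
    return [2.0 ** e for e in range(lo, hi + 1, step)]


def grid_search(cv_accuracy, c_values, gamma_values):
    """Try every (C, gamma) pair and return (best_acc, C, gamma)."""
    best = None
    for c in c_values:
        for g in gamma_values:
            acc = cv_accuracy(c, g)
            if best is None or acc > best[0]:
                best = (acc, c, g)
    return best


C_GRID = exponent_grid(-5, 15)      # 2^-5 ... 2^15
GAMMA_GRID = exponent_grid(-15, 3)  # 2^-15 ... 2^3
print(len(C_GRID) * len(GAMMA_GRID), "parameter pairs to evaluate")
```

An exhaustive search is affordable here because only two parameters are tuned; a coarse grid can be followed by a finer one around the best pair.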
To reduce linear dependence, the selection of the descriptors was performed by two algorithms.
Algorithm 1 (1) Calculate variation coefficient (VC) and correlation coefficients (CC) for each descriptor with others; (2) choose descriptor with maximum VC; (3) set the CC threshold; (4) estimate nearest neighbors for this descriptor; (5) nearest neighbors with CC higher than the threshold are eliminated.
Algorithm 2 (1) Calculate variation coefficient (VC), correlation coefficients (CC) for each descriptor with others, and correlation coefficient with the activity (CCA); (2) choose descriptor with maximum CCA; (3) set the CC threshold; (4) estimate nearest neighbors for this descriptor; (5) nearest neighbors with CC higher than the threshold are eliminated.
Selections of the descriptors were made for the CC threshold in the range [0.50-1.00] in increments of 0.05.
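The two selection algorithms differ only in how the leading descriptor is ranked, so both can be sketched with one routine. Repeating the steps until no descriptors remain and comparing the absolute value of CC with the threshold are assumptions of this sketch; the function names are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long value lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def select_descriptors(columns, priority, threshold):
    """Keep the top-ranked descriptor, drop its correlated neighbours, repeat.

    columns:  dict name -> list of descriptor values
    priority: dict name -> ranking score (VC for Algorithm 1, CCA for Algorithm 2)
    """
    remaining = sorted(columns, key=lambda name: priority[name], reverse=True)
    kept = []
    while remaining:
        lead = remaining.pop(0)
        kept.append(lead)
        # eliminate neighbours whose |CC| with the lead exceeds the threshold
        remaining = [
            name for name in remaining
            if abs(pearson(columns[lead], columns[name])) <= threshold
        ]
    return kept


# Toy descriptor columns and ranking scores (hypothetical data)
columns = {"a": [1, 2, 3, 4], "b": [2, 4, 6, 8.1], "c": [1, -1, 1, -1]}
priority = {"a": 3.0, "b": 2.0, "c": 1.0}
print(select_descriptors(columns, priority, 0.50))
```

Running the routine for thresholds from 0.50 to 1.00 in steps of 0.05 yields a family of progressively larger descriptor sets to compare.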

Results and discussion
Empirical intuitive binary CNS+/CNS- classification
There are many publications in which binary classification of permeability is characterized by medicinal chemists by means of rules such as the following: improved chance of CNS penetration if MW < 450 and PSA < 90 Å² [5]; upper PSA limit for most CNS drugs of <60-70 Å² [6]; proposed cutoff to avoid P-gp efflux liability: MW < 400 [12]; attributes of a successful CNS drug candidate: MW < 450; cLogP < 5; HBD < 3; HBA < 7; RB < 8; H-bonds < 8; pKa 7.5-10.5; PSA < 60-70 Å² [27]; compounds with TPSA < 60 Å² and pKa < 8 are less likely to be P-gp substrates [28]. Such rules are guidelines for structural properties of drug-like compounds. In the authors' opinion [29]: "rules are effective and efficient means of rapidly assessing structural properties. The fastest method for evaluating the drug-like properties of a compound is to apply 'rules'".
Table 1. Statistical protocol of CNS+/CNS- classification by some intuitive approaches of medicinal chemists for 1000 chemicals and drugs of the training set (500 CNS+/500 CNS-) and 100 drugs (50 CNS+/50 CNS-) of the external test set.
Rules are a set of guidelines for the structural properties of compounds that have a higher probability of being well absorbed after oral dosing. The values for the properties associated with rules are quickly counted from an examination of the structure or calculated using software that is widely available. These guidelines are not absolute, nor are they intended to form strict cutoff values for which property values are drug-like and which are not drug-like. Nevertheless, they can be quite effective and efficient. However, such statements do not give complete information about the ACC of classification of whole datasets, which requires exact data on the ratio of true and false assignments within the active and inactive classes. Our attempt to put the above-mentioned statements on a strict statistical basis is presented in Table 1. Published cutoff descriptor values of seven known intuitive approaches were used in this table to estimate their CNS+/CNS- discriminating power for the same datasets: training (1000 chemicals) and external (100 chemicals).
The "rule of 5", or what has become known as the "Lipinski rules", is the best known and most popular. "These rules are a set of property values that were derived from classifying the key physicochemical properties of drug-like compounds. The rules were used at Pfizer for a few years prior to their publication and since then have become widely used. The impact of these rules in the field has been very high. This acceptance can be attributed to many factors: the rules are easy, fast and have no cost to use; the '5' mnemonic makes the rules easy to remember; the rules are intuitively evident to medicinal chemists; the rules are a widely used standard benchmark; the rules are based on solid research, documentation and rationale; the rules work effectively" [29].
As is obvious from Table 1, the binary CNS+/CNS- classification model based on the "rule of 5" placed 397 of the 500 CNS+ chemicals of the training set below the indicated cutoff values, and so classified them correctly. However, 302 CNS- chemicals also fell in this region. Clearly, this is because of the rather high cutoff values. As a result, although the SE of this model is good (SE = 0.794 for the training set and 0.940 for the external set), SP is very poor (SP = 0.396 and 0.220, respectively) and overall ACC is rather modest (ACC = 0.595 and 0.580).
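The training-set figures quoted for the "rule of 5" model follow directly from the confusion-matrix counts, as the short check below shows. The standard definitions SE = TP/(TP + FN), SP = TN/(TN + FP) and ACC = (TP + TN)/N are assumed; the function name is hypothetical.

```python
def binary_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity and overall accuracy from confusion-matrix counts."""
    se = tp / (tp + fn)
    sp = tn / (tn + fp)
    acc = (tp + tn) / (tp + fn + tn + fp)
    return se, sp, acc


# Training set: 397 of 500 CNS+ fall below the cutoffs (TP, so FN = 103),
# and 302 of 500 CNS- fall there too (FP, so TN = 198).
se, sp, acc = binary_metrics(tp=397, fn=103, tn=198, fp=302)
print(f"SE={se:.3f} SP={sp:.3f} ACC={acc:.3f}")  # SE=0.794 SP=0.396 ACC=0.595
```

Reporting SE and SP alongside ACC is what exposes the imbalance: a model can reach moderate ACC while misclassifying most of one class.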
The MW cutoff value in model 2 is lower than that in model 1, the logP descriptor is changed to logD, and a PSA descriptor is incorporated, leading to better classification compared with model 1. However, it should be noted that the model is unbalanced for the training set (SE = 0.438; SP = 0.832). Model 3 is even more unbalanced, with correct recognition of almost all CNS+ and false recognition of almost all CNS- chemicals. Model 4, using the composite descriptor clogP - (N + O), is quite well balanced, with a total prediction ACC of 0.700. Models 5 and 6 recognize and predict CNS+ chemicals very poorly. Model 7, which was indicated in [1] as promising for drug design, gave a total prediction ACC close to 50%, which could be obtained simply by random allocation to the two classes.
Other obvious disadvantages characterize all intuitive approaches to CNS+/CNS- classification: arbitrary choice of cutoff descriptor values, poor discriminating power, absence of any analysis of the balance between correctly recognized CNS+ and CNS- chemicals, and disregard of any validation procedures. These intuitive models are often called "rules", which presumes a universal character, although they are often constructed on the basis of small datasets [13]. Hence, the application of such "rules" to all of chemical space, containing at least 10^180 chemicals [33], seems far from realistic.

SAR binary CNS+/CNS- classification by RF and SVM
At present, a strict chemometric binary classification can be achieved by a whole set of classical and modern methods including discriminant analysis, Naïve Bayes, LR, classification and regression tree, k-nearest neighbor, RF, SVM, Gaussian process, stochastic gradient boosting and others. Yee et al. [34] have reviewed this area, and an example of successful application of the above methods for SAR binary classification has been reported by Zang et al. [35]. Table 2 contains results of CNS+/CNS- classification of the studied training and external sets by RF and SVM.
Both methods gave good results of CNS+/CNS- classification. The best RF model was constructed using 17 descriptors connected with volume-related terms, electrostatic terms and hydrogen bonding. CNS+/CNS- classification of the external test set has excellent statistical criteria. Classification by SVM for the external test set using 26 descriptors was even a little better. Results of CV for both models are also quite satisfactory.

Logistic regression as an alternative to intuitive CNS+/CNS- classification approaches
The unsatisfactory predictive power of the "intuitive" approaches of medicinal chemists for CNS+/CNS- chemical classification leads to the problem of selecting an appropriate SAR method (or methods) from the above-indicated wide set of modern powerful chemometric techniques that can be used in medicinal chemistry. The availability in this work of a suitably large descriptor pool suggests the use of a machine-learning technique such as RF or SVM for CNS+/CNS- classification. However, as pointed out by Rankovic [1], "these approaches are less intuitive and difficult for medicinal chemists to translate into optimization hypotheses". In other words, medicinal chemists prefer to carry out simple and understandable calculations and have models with clear mechanistic interpretation. That is why we selected the LR method in this work for detailed SAR analysis of the CNS+/CNS- classification power of single descriptors and of multiple descriptors. LR is very similar to linear regression, and in our view, it satisfies the above requirements of medicinal chemists. This method is used to model the probability of the occurrence of some event as a function of a linear combination of a set of predictors.
The following equation calculates the probability Y: $Y = 1/(1 + e^{-z})$, where $z = a_0 + a_1 x_1 + a_2 x_2 + \ldots + a_n x_n$, $a_0$ is an intercept, and $x_1, x_2, \ldots, x_n$ are descriptors with their corresponding regression coefficients $a_1, a_2, \ldots, a_n$. Given an unknown compound, LR calculates the probability that the compound belongs to a certain target class. For example, in predicting whether an unknown compound is CNS+ or CNS-, LR tries to estimate the probability of the compound being a CNS+ chemical. If the calculated Y is >0.5, then it is more likely to be CNS+. It is important for medicinal chemists that, in a similar way to multiple linear regression, the regression coefficients in LR can describe the influence of molecular descriptors on the outcome of the prediction. When the coefficient has a large value, it shows that the molecular descriptor strongly affects the probability of the outcome, whereas a zero value coefficient shows that the molecular descriptor has no influence on the outcome probability. Likewise, the sign of each coefficient affects the probability as well; that is, a positive coefficient increases the probability of an outcome, while a negative coefficient will result in the opposite. As indicated in [34], LR generally cannot handle high dimensional data well, especially if the dataset is large (e.g. more than 60 000 compounds) with large class imbalance (e.g. active:non-active = 1:68), which can be common in a training set intended for virtual screening. Such LR models commonly require increased computational time and do not perform well. The size of our datasets and the balance ratio of CNS+/CNS- chemicals permit the use of LR, so we have carried out a detailed analysis of binary classification by this method. Successful recent applications of LR in SAR for binary classification of active/inactive chemicals are described in [36][37][38].
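How an LR model turns descriptor values into a class label can be sketched as follows, assuming the standard logistic form Y = 1/(1 + exp(-z)) with z = a0 + a1*x1 + ... + an*xn. The function names and the coefficient values in the usage lines are hypothetical and are not taken from the models in this work.

```python
import math


def lr_probability(intercept, coefficients, descriptors):
    """Y = 1 / (1 + exp(-z)) with z = a0 + a1*x1 + ... + an*xn."""
    z = intercept + sum(a * x for a, x in zip(coefficients, descriptors))
    return 1.0 / (1.0 + math.exp(-z))


def classify(intercept, coefficients, descriptors, cutoff=0.5):
    """Label a compound CNS+ when the predicted probability exceeds the cutoff."""
    y = lr_probability(intercept, coefficients, descriptors)
    return "CNS+" if y > cutoff else "CNS-"


# Hypothetical one-descriptor model with a negative coefficient on an
# autoscaled polarity descriptor: higher polarity lowers the CNS+ probability.
print(classify(0.0, [-1.0], [2.0]))   # polar compound
print(classify(0.0, [-1.0], [-2.0]))  # less polar compound
```

Because Y = 0.5 exactly when z = 0, a single-descriptor model has the decision threshold x = -a0/a1, which is how an LR fit yields a cutoff value for one descriptor.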
First of all, in this work, we used LR for CNS+/CNS- classification by single descriptors. It should be noted that when only a small number of descriptors is applied, the LR method is even preferable to machine-learning techniques from a methodological point of view. The supplementary material of this article includes data on the CNS+/CNS- classification ability of the 41 single descriptors used in this work (Table S1).
For the 1000-compound training set, total classification ACC for the whole descriptor set is in the range 0.529-0.682 and for the 100-compound external test set the range is 0.490-0.760.
The best classification results were obtained with descriptors TPSA (ACC = 0.760 for the external test set), PSAc (0.750), PSAe (0.740), HBD (0.740), ΣCd (0.730) and ΣCad (0.720). All these descriptors directly or indirectly describe hydrogen bonding ability. For example, ΣCad is a thermodynamic descriptor and reflects the total H-bond ability of a molecule [21]. (T)PSA is defined simply as the part of a molecular surface that is polar and can participate in hydrogen bonding. PSA descriptors for the prediction of drug permeability are very popular for several reasons. First, PSA is very easy to interpret, with the notion of "molecular polar surface" and its influence on interactions with a molecule's environment similar to a medicinal chemist's own intuition (and is probably also a good approximation of physical reality). Second, PSA is easy to calculate. For TPSA, the calculation is particularly easy and fast, requiring only the identification of polar fragments and then a table lookup to find respective fragment contributions [39]. The high correlation of 3D PSA and TPSA for 34 810 drug-like molecules (r² = 0.982) helps to confirm the scientific meaning of this descriptor [40].
We expected good CNS+/CNS- classification ability of the above-mentioned descriptors because hydrogen bonding is now considered one of the most important (if not the most important) intermolecular interactions in general and in drug transport in living organisms in particular.
We have also found that descriptors connected with lipophilicity (MlogP, AlogP, logPP) have poor CNS+/CNS- discriminating ability, as first shown by Ghose et al. [10]. Single descriptors connected with volume-related terms (MW, α) and electrostatic interactions (maxQ+, maxQ-) gave accuracies of CNS+/CNS- classification in the range 0.550-0.650. The results of CNS+/CNS- classification by LR using two popular single descriptors, TPSA and MlogP, are given in Table 3.
It is clear from Table 3 that the model using TPSA alone with LR has better statistical criteria compared with models which were obtained by intuitive approaches. The situation with the clogP descriptors was found to be much more complex. A valuable help in the interpretation of those results was obtained from a plot of logP distribution (Figure 2).
Clearly, this graph of MlogP distribution shows that there is very strong overlapping of CNS+ and CNS- clusters, although the cluster of CNS+ objects with a maximum at ~2.5 logP units is situated to the right of the cluster of CNS- chemicals with a maximum at ~1.5. The mean value of those maxima is very close to the threshold estimation of 1.71 by LR procedures (Table 3).
Further improvement of CNS+/CNS- classification by means of LR is possible by simultaneous application of a number of descriptors. However, it is necessary to be very careful because of possible high pair-wise collinearity of descriptors. For example, the best single descriptors TPSA and PSAc are highly correlated (r = 0.872), as are HBD and ΣCd (r = 0.937). For our dataset containing 1000 chemicals and drugs, using a maximum permissible level of mutual descriptor correlation of r ≤ 0.50 meant that only 13 of the 41 descriptors were kept.
As to the mechanistic interpretation of the results, it is necessary to emphasize that chemical transport in the CNS is a very complex process including solvation, absorption, metabolism, interaction with P-glycoprotein and other phenomena. Hence, it is highly unlikely that any single descriptor can ensure satisfactory CNS+/CNS- classification. Roughly, three main components may contribute to chemical transport in the CNS: transmembrane diffusion, efflux effects connected pre-eminently with P-glycoprotein interactions, and influx effects. The majority of CNS drugs are small molecules that cross the BBB via the transcellular passive diffusion route [1]. Kerns and Di [29] make the following comment about transmembrane diffusion: "Molecules with a larger molecular size (i.e. higher MW) do not pass through the tightly packed region as readily as smaller molecules. Molecules with higher lipophilicity typically are more permeable than less lipophilic molecules through the highly non-polar central core of the lipid bilayer membrane. Molecules then move through the side chains and polar head groups of the other leaflet of the bilayer and are rehydrated by water molecules and form hydrogen bonds again". In [5], an equation of transmembrane diffusion on the basis of MW, HBA and HBD descriptors was proposed. We therefore first used those three descriptors for CNS+/CNS- classification of our dataset. The equation parameters of the discriminating function (Table S2) of model 12, based on those three descriptors, demonstrate the predominant contribution of HBA, followed by MW. The HBD contribution is no larger than its standard deviation. Classification results for model 12 are included in Table 4, showing that three out of four chemicals may be correctly predicted using this model, which describes transmembrane diffusion.
Further improvement of CNS+/CNS- classification is possible by application of descriptors connected with other components of blood-brain and brain-blood chemical transport. Table 4 contains results of CNS+/CNS- classification by the two best models obtained by LR for the sets considered. Parameters of the LR discriminating functions for the three SAR models 12-14, containing three, four and seven descriptors, are given in Table S2. Model 13 is based on four descriptors, including the best single descriptor TPSA, which makes the maximal contribution to classification in this model, with an autoscaled coefficient value k = -1.008. The second significant descriptor is maxCa*maxCd (k = 0.458). This descriptor characterizes the contributions of the two most powerful H-bond acceptor and donor functional groups. The third significant descriptor is maxQ- (k = 0.301). This descriptor relates to electrostatic potential. The last descriptor, maxCa (k = 0.254), characterizes the H-bond ability of the strongest acceptor. Model 13 has good classification parameters (overall prediction ACC of 0.810) with balanced SE and SP values. Model 14 is based on seven descriptors without TPSA. Three descriptors (maxQ-, maxCa and maxCa*maxCd) are the same as in model 13. Instead of TPSA, four descriptors (MW, HBD, NCC and logD), together with those already indicated, ensure the same total ACC as in model 13. In the case of model 14, it is clear that increasing the negative contributions of MW and HBD leads to more CNS- activity. The contributions of each of the seven descriptors to classification are approximately equal.
In order to analyze how the classification results change when the contributions of transport processes are included or excluded, the "rule of fours" of Didziapetris et al. [12] was used to eliminate a number of P-glycoprotein substrates from the training set of model 13. As a result, the descriptor coefficient values of the discriminating function changed substantially: the contributions of TPSA and maxCa*maxCd decreased markedly, the contribution of maxQ- increased substantially, whilst the contribution of maxCa was unaltered. Total ACC was slightly better as a result of better recognition of CNS+.
Examination of Tables 2 and 4 shows that the overall ACC is approximately equal for the three SAR methods used (LR, RF and SVM). At the same time, LR is a simple method with a strict chemometric platform and allows the investigation of the contribution of each descriptor to CNS+/CNS- classification. Hence, we propose the LR method in medicinal chemistry as a reasonable alternative to the above-mentioned intuitive approaches which apply arbitrary cutoffs.
Four CNS+/CNS- classification models (models 8, 9, 13 and 14) were developed in this work, with overall accuracies of prediction of 0.81-0.84. Each model is independent and based on its own set of descriptors and calculation method. Analysis of those data showed that 41 of the 50 CNS+ drugs were correctly recognized by all four models, and one additional drug was correctly recognized by three models. For CNS- drugs, the numbers are 30 and 6, respectively. So, 78% of drugs from the external test set were reliably classified by joint consideration of the four models. This allowed the construction of consensus models from different combinations of three models (Table S4). The consensus model based on models (9), (13) and (14) can be considered the optimal consensus model not only because of its high ACC (81%) but also because it has the best balance of correctly predicted CNS+ and CNS- drugs. The last column in Table S4 contains the results obtained for the external set studied in this work with a program for BBB+/BBB- classification by a Bayesian approach [41], which is openly accessible on the Internet [42]. The overall ACC obtained in this case is only 0.675.
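A consensus of three models can be sketched as a simple majority vote. The voting rule is an assumption, since the text does not state how the three predictions are combined, and the predictions below are toy data rather than results from Table S4.

```python
def consensus(votes):
    """Majority label among an odd number of per-model CNS+/CNS- predictions."""
    plus = sum(1 for v in votes if v == "CNS+")
    return "CNS+" if plus > len(votes) / 2 else "CNS-"


def accuracy(predicted, actual):
    """Fraction of compounds whose predicted label matches the actual label."""
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


# Hypothetical predictions of three models for four compounds
models = {
    "model_9": ["CNS+", "CNS+", "CNS-", "CNS-"],
    "model_13": ["CNS+", "CNS-", "CNS-", "CNS+"],
    "model_14": ["CNS+", "CNS+", "CNS+", "CNS-"],
}
actual = ["CNS+", "CNS+", "CNS-", "CNS-"]

per_compound = list(zip(*models.values()))
predicted = [consensus(v) for v in per_compound]
print(predicted, accuracy(predicted, actual))
```

In this toy case each model makes one error on a different compound, so the vote corrects all of them; a consensus helps only to the extent that the models err on different compounds.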

Conclusion
Medicinal chemistry rules for the design of new effective drugs, although probably useful for property optimization in sets of related compounds, are intuitive (i.e. subjective) extreme models because the cutoff takes only one part of the data into account and does not consider other parts. Because no single physicochemical parameter can be used to predict properties related to the very complicated phenomenon of brain penetration, more complex multivariate models that interrogate multiple descriptors simultaneously are required. Such well-known methods of medicinal chemists as the "rule of 5" and MPO did not demonstrate in this work any essential advantage over single-descriptor models. However, LR and the modern machine-learning techniques of RF and SVM enabled stable, predictive SAR models to be obtained with strict validation procedures. Because LR is simple to use and allows mechanistic interpretation, with a description of the influence of molecular descriptors on the outcome of the prediction, this method may be very useful for medicinal chemists.