Identification of potential CAMKK2 inhibitors based on virtual screening and molecular dynamics simulation

ABSTRACT CAMKK2 inhibitors have therapeutic effects on cancer, metabolic disorders, diabetes and osteoporosis. However, high-quality CAMKK2 inhibitors are still lacking. Natural product databases were screened for potential inhibitors using bioinformatics strategies such as pharmacophore models, molecular docking, molecular dynamics simulation. Two hit compounds, STOCK1N-53910 and STOCK1N-66485, were got via Lipinski rule, Veber rule, PAINS rule, pharmacophore models and molecular docking. The results of molecular docking showed hit compounds had similar binding patterns to the active control. Hit compounds had low toxicity by prediction of ADME/T. Molecular dynamics simulation and energy decomposition showed the binding energy of hit compounds was better than active control, and had higher affinity. These results suggest STOCK1N-53910 and STOCK1N-66485 may inhibit CAMKK2 and could be as candidate drugs for further research.


Introduction
Calcium/calmodulin-dependent protein kinase kinase 2 (CAMKK2) is a serine/threonine kinase. CAMKK2 is located on chromosome 12q24.2 and encodes a protein kinase with a molecular weight of 66∼68 kDa.
CAMKK2 regulates the phosphorylation of CaMKI, CaMKIV and AMPK and other CaM kinases [1]. It is one of the upstream activators of AMP-dependent protein kinase. On the one hand, CAMKK2 can promote ATP production by activating catabolicrelated pathways as an important cellular energy sensor. On the other hand, it can also preserve ATP by closing certain biosynthetic pathways. Ca 2+ is an important second messenger and regulates a wide range of signal transduction events by binding to CaM [2]. When combine, the two regulate cellular responses by activating a series of downstream proteins. These lead to the regulation of many important physiological and pathological processes [3,4].
Studies have shown inhibition of CAMKK2 may be effective in treating diseasea such as cancer [5], metabolic disorders, diabetes [6] and osteoporosis [7].
Abnormal activation and overexpression of CAMKK2 are associated with multiple cancer types, including prostate cancer, breast cancer, ovarian cancer, gastric cancer and liver cancer [8][9][10][11]. In many cancer cell lines and tumour models, down-regulation or drug inhibition can reduce the proliferation, migration and invasion of CAMKK2 [8,12,13]. CAMKK2 participates in the metabolism and regulation of inflammatory factors, thereby influencing the growth of cancer [9].
Liraglutide is a peptide-1 receptor agonist similar to glucagon. When Liraglutide is used to inhibit CAMKK2, it reduces food intake and promotes weight loss [14].
CAMKK2 also plays an important role in the maintenance of bone tissue. CAMKK2 is expressed in both osteoblasts and osteoclasts in bone diseases [7]. Inhibiting CAMKK2 stimulates bone formation and reverses agerelated decline in bone health by promoting osteoblast differentiation and inhibiting osteoclast differentiation [15].
In general, these findings suggest CAMKK2 is an attractive drug target. However, high-quality CAMKK2 inhibitors are still lacking, which hinders progress in this field. Virtual screening is a fast and cheap technology to find potential drugs, compared with traditional drug discovery. So in order to find inhibitors of CAMKK2, the following work was carried out. Firstly, we filtered small natural molecule library based on some rules. Then, we used pharmacophore model and molecular docking to screen. The remaining compounds were predicted by ADME/T and the pharmacokinetic and pharmacokinetic characteristics of hit compounds after docking were explored. Finally, conducting molecular dynamics simulation and binding free energy calculation to find potential active compounds.

Preparation of ligand
InterBioScreen is one of the popular large natural product libraries in the world. Download databases (https://www. ibscreen.com) and save them in SDF format and MOL2 format, respectively.  [16]. According to Linpinski rule [17], Veber rule and PAIN rule, deleted the compounds that do not meet the characteristics of synthetic drugs or false positive compounds. The main tools were Open Drug Discovery Toolkit (ODDT) (https://github.com/oddt) [18] and Drug Likeness Tool (Dru-LiTo) (http://www.niper.gov.in/pi_dev_tools).

Pharmacophore model
Pharmacophore method has been widely used in drug discovery and as a classical computer-aided drug design method [19]. Pharmacophore is a model based on pharmacodynamic characteristic elements. There are seven kinds of pharmacodynamic characteristic elements, including hydrogen bond donor, hydrogen bond receptor, positive and negative charge centre, aromatic ring centre, hydrophobic group, hydrophilic group and geometric conformation volume collision. Pharao [20] and Pharmer [21] have been established and successfully applied to various drug discovery projects at present. However, the comprehensive analysis and correct understanding of the pharmacological characteristics of anchoring are still lacking up to now. Based on this, AncPhore tool (https://ancphore. ddtmlab.org) [22] was selected for virtual screening which combines pharmacophore characteristic analysis and anchored pharmacophore. The pharmacophore model established by AncPhore is shown in Supporting Information Fig.S1. Pharmacophore characteristics include hydrogen bond acceptor (HA), negatively charged centre (NE) and hydrophobicity (HY) (Supporting Information Fig.S2).
Meanwhile, pharmacophore validation is the important criteria for robustness in identifying the active compounds in decoys. The pharmacophore model was validated by selecting decoy set from DUD-E website (http://dude.docking.org). The reliability of the model was evaluated by Güner-Henry method and the results were based on enrichment factor (EF) and goodness of hit score (GH) [23]. When GH > 0.80, it indicates the model has good prediction ability employing the formula. The EF and GH value of the evaluation model were determined by equations: D is total number of compounds in decoy set; A is total number of actives in decoy set; H t is the total hits; H a is the active hits.

Molecular docking
Molecular docking is a process to find out the best binding mode by energy matching, geometric complementarity and chemical environment matching between ligand and active pocket [24]. There are some molecular docking software the commonly used including FlexX [25], AutoDock [26], LeDock [27] and Glide [28]. The conformation sampling ability is evaluated based on the root mean square deviation (RMSD) of the molecular docking conformation and the original conformation of the crystal structure. To verify the reliability of the docking method, the ligand was extracted from original crystal structure, and then the ligand was re docked into the docking pocket by Ledock software. When RMSD is less than or equal to 2.0 Å, it is considered the method was rationality. The re-docked ligand can superposition well with original ligand, and the RMSD value of GSK-650394 is 0.172 Å (Supporting Information Fig.S4), which indicates the docking method selected is reliable.

Prediction of ADME/T
The ADME/T (absorption, distribution, metabolism, excretion and toxicity) properties of compounds are significant in drug design and drug screening [29]. Early ADME/T performance evaluation can effectively solve the problem of species differences, significantly improve the success rate of drug development, reduce drug development costs, reduce drug toxicity and side effects, and guide the clinical rational use of drugs [30]. ADMETlab 2.0 [31] (https://admetmesh.scbdd.com) was used to predict the pharmacokinetic/toxicity properties of hit compounds.

Molecular dynamics simulations
To further explore the binding stability of hit compounds and proteins obtained by virtual screening, the complex systems after molecular docking was selected for 100 ns molecular dynamics simulation. All simulations were performed using GROMACS (v.2018.8) [32,33] software package, the protein was applied by use AmberFF99SB force field. The topological file of ligand was created by the antechamber module [34] in the AmberTools18 (https:// ambermd.org/AmberTools.php) program under the GAFF force field, and partial charge was assigned by the AMI-BCC model [35]. To maintain the electrical neutrality of the system, sodium ions and chloride ions were to replace some water molecules to produce solvent box of 0.15 mol/ L NaCl.
Firstly, the steepest descent method was used to minimise the energy of the system to ensure that there was no solid geometric conflict or improper. Then, balanced the NVT and NPT of each system. The atomic number, volume and temperature were kept constant for 100 ps uilibrium in the NVT ensemble. Then, the system continued to balance for 100 ps, keeping the atomic number, pressure and temperature remained unchanged in the NPT ensemble. The system was simulated with MD for 100 ns at 300 K and 1 atm with a step size of 2.0 fs. The Particle Mesh Ewald (PME) [36] was used to calculate the long-range electrostatic interaction. Root mean square displacement (RMSD), root mean square fluctuation (RMSF), radius of gyration (Rg) and hydrogen bond were analysed based on the trajectory of MD simulation.

Molecular mechanics Poisson-Boltzmann surface area
The binding free energy of complex system was calculated by Molecular Mechanics Poisson-Boltzmann Surface Area (MM/ PBSA) [37,38]. Using Gmx_mmpbsa (https:// github.com/ Jerkwin/gmxtools) program [39] to calculate binding free energy. According to the results of dynamic simulation, the last 10 ns equilibrium segment trajectory was selected for combined free energy analysis. Taking every 100 ps as the node, 100 snapshots were taken to perform the combined free energy and energy decomposition calculation. The calculation formula is as follows: ΔG bind is the binding free energy; ΔE vdw is the van der Waals action; ΔE ele is the electrostatic or coulomb action; ΔG PB,sol is the solvation free energy of polar components; ΔG nonpolar, sol is the solvation energy of nonpolar components; T is the temperature and ΔS is the entropy of the system.
The binding free energy was finally decomposed into the above four parts by MM/PBSA method.

Virtual screening
69,067 compounds from InterBioScreen datebase were screened by multiple rounds of virtual screening ( Figure 1). Firstly, the InterBioScreen datebase was screened based on 'Lipinski rule' and 'Veber rule', and 59,768 compounds were obtained. Then, the obtained compounds were refiltered by PAINS rule, and got 53,162 compounds ( Figure  1(A)). The pharmacophore model showed (Figure 1(B)) the ligand formed hydrogen bond interaction with the amino acid residue Val270 of CAMKK2. At the same time, EF value was 33.31 and GH value was 80.63 based on 533 decoy compounds in the Güner-Henry method (Supporting Information Table. S1), which indicated the pharmacophore model had specific and good prediction ability. 178 compounds were obtained through Ancphore screening. Finally, Ledock software was used for virtual screening and two hit compounds were obtained. The structure of hit compounds was shown in Figure 1(C). GSK-650394 was active control, STOCK1N-53910 and STOCK1N-66485 were hit compounds.

Molecular docking
Molecular docking showed the compound matched well with CAMKK2. GSK-650394 forms hydrogen bond interaction with Val270 of CAMKK2, salt bridge interaction with Lys194 and hydrophobic interaction with Phe267 (Figure 1(D)), which was basically consistent with the pharmacophore model generated by Ancphore.
Compounds interacted with CAMKK2 in the same binding pocket, and the compounds had similar effects with key amino acid residues (Figure 2). STOCK1N-53910 formed hydrogen bond interaction with Lys194, Val270 and Asn317 of CAMKK2 and salt bridge interaction with Lys194 (Figure 2(B)). STOCK1N-66485 formed hydrogen bond interactions with Tyr176, Lys194 and Val270 of CAMKK2 and salt bridge interaction with Lys194, respectively (Figure 2(C)). We found the docking scores of hit compounds were higher than GSK-650394 ( Table 1). The formation of these bonds will inhibit CAMKK2, thereby blocking the development of disease.

Prediction of ADME/T
The human intestinal absorption (HIA) of hit compounds was small, indicating these compounds should not be taken orally. Drugs flow to the other parts of the body through the blood and then metabolised in the liver [40]. A group of enzymes from the cytochrome P450 family can break down compounds and excrete bile and urine from human body [41]. Specific bioactive substances are as substrates for these enzymes and then were metabolised by the corresponding CYP 450 enzymes. On the contrary, some bioactive substances act as inhibitors of these enzymes and interfere with the biodegradation process [42,43]. The results reflected that the screened compounds STOCK1N-53910 and STOCK1N-66485 played the role of CYP 2C9 antagonism and GSK-650394 acted as CYP 2D6 antagonist. And hit compounds likely had a short half-life. Compared with compound GSK-650394, the hit compounds have smaller human hepatotoxicity and rat oral acute toxocity ( Table 2).

Molecular dynamics simulation
The above three complexes were simulated for 100 ns, respectively. The RMSD, RMSF, Rg and hydrogen bond (H-bond) of the complexes were monitored and analysed to evaluate the stability of the binding between the compound and CAMKK2 (Figure 3).

RMSD
The RMSD of complexes describes the change in conformation compared with the initial structure simulation, which can be used to check the convergence of the simulation and the stability of the complexes. The complex RMSD fluctuated or increased significantly at the beginning. The system fluctuated gently after 25 ns, and the fluctuation range of the three system curves was less than 0.1 Å. Therefore, the three systems reached equilibrium (Figure 3(A)).

RMSF
RMSF can reveal the fluctuation information of the average position of protein residues during the simulation period, and can be used to investigate the flexibility and binding effect of the complex system at the active site. Amino acid 159 was converted to amino acid 182 for convenience of observation. The RMSF (Figure 3(B)) of the two complexes showed similar kinetic fluctuation behaviour with active control, ranging from 0.05 Å to 0.5 Å. The residues Glu174, Ser175, Val278, Pro279, Glu393, Arg394 and Ile395 of STOCK1N-66485 have significant RMSF fluctuations. This is because they belong to the flexible region of protein, and other regions have low fluctuation.

Rg
The Rg is a criterion of system compactness [44]. When the Rg value remains stable, the protein is stable in the simulation process. Due to the dynamic characteristics of proteins, there are some fluctuations (Figure 3(C)). However, in the whole simulation process, the value of Rg fluctuates in the range of 1.9-2.0 Å during the whole simulation process. The values of all systems are consistent, which means the conformation of system remains steady.

H-bond
Hydrogen bonding is one of the most important non-bonding interactions to maintain the stability and plays a key role in the process of molecular recognition. To explore the molecular recognition mechanism between hit compounds and CAMKK2, the number of hydrogen bonds was studied. The  number of hydrogen bonds changed during the 100 ns simulation, because some weak hydrogen bonds would form and disappear easily ( Figure 3D). In the process of MD simulation, hit compounds formed the same or more hydrogen bonds with CAMKK2 comparing with GSK-650394, which is consistent with the results of molecular docking. These results show that the combination of the screened compounds with CAMKK2 is more stable.

Free energy landscape
The free energy landscape (FEL) is a kind of probability distribution function of free energy obtained by principal component analysis and statistics based on MD equilibrium trajectories, which predicts the conformational changes in the protein by different energy states [45]. FEL is a probability distribution function of free energy obtained by principal component analysis and statistics based on MD equilibrium trajectory to predict the conformational changes of proteins in different energy states. FEL analysis was performed by calculating the radius of gyration, RMSD and Gibbs free energy of trajectories. The purple area indicates the lowest energy, while the brown area indicates the higher energy. The cartoon below FEL is the lowest energy conformation (Figure4). The lowest energy conformation of these compounds is similar to the conformation after molecular docking, indicated the docking method is reliable.

MM/PBSA
To further analyse, screen and exclude false positive events, MM/PBSA method was used to evaluate the binding stability of the two compounds in the binding pocket. In general, the lower binding energy is the more stable composite system. The calculation results show the binding free energies of GSK-650394, STOCK1N-53910 and STOCK1N-66485 are, respectively, −14.712, −46.190 and −58.710 kJ/mol, respectively ( Table 3). The binding free energy of hit compounds is significantly lower than GSK-650394, indicating the binding between hit compounds with CAMKK2 is likely to be stronger than GSK-650394 with CAMKK2. The electrostatic force plays a major role in the stability of the composite system. Although van der Waals force is not as important as electrostatic interaction, it is negative, which means it is beneficial to the binding energy. The polar interaction is positive, which indicates it unfavourable to the binding energy. We performed the energy decomposition of amino acid residues to analyze the contribution of different residues based on the results of MD simulation and binding free energy. It can be seen that Ile171, Val179, Ala192, Lys194, Glu236, Val249, Phe267, Leu269, Val270, Leu319, Ala329, Asp330 have a great impact on the binding effect ( Figure 5). Among them, amino acid residues Ile171, Val179, Ala192, Val249, Phe267, Leu269, Val270, Leu319 and Ala329 are the most favourable for the binding of the compounds, while residues Lys194 and Asp330 are not conducive to the binding of compounds to CAMKK2. Firstly, Val179 has the largest contribution to energy, which is mainly due to the contribution of van der Waals force. Secondly, Val270 also contributes to the binding energy, which is mainly due to hydrogen bond interaction. The contribution of Leu319 to the binding energy is due to the influence of van der Waals, which is consistent  with its main formation of hydrophobic interaction. These are consistent with molecular docking results, indicating the docking method selected in this paper is reliable.

Conclusion
Lipinski rule, Veber rule and PAINS were used to filter out the compounds that do not meet the properties of proprietary drugs or false active, which could not only eliminate the interference of non-drug compounds to the results as soon as possible, but also reduce the unnecessary waste of resources in the later period. Then, virtual screening based on pharmacophore and molecular docking was performed for the drugs that met the patent rules. Finally, two hit compounds were selected for further research. During the MD simulation, the dynamic binding of complexes was evaluated by RMSD, RMSF value, cyclotron radius (Rg) and hydrogen bond (H-bond). Binding free energy was performed to further evaluate the binding mode of CAMKK2. All the results provide a reference for the discovery of new CAMKK2 inhibitors and help to guide the further finding of other lead compounds.

Disclosure statement
No potential conflict of interest was reported by the author(s).

Declaration of intererst
Because in completing this article, Miss.Linan Zhao and I have same contribution. with the consent of all authors, authors intend to declare us as co-first authors.