Network pharmacology based high throughput screening for identification of multi targeted anti-diabetic compound from traditionally used plants

Abstract The incurable Type 2 diabetes mellitus (T2DM) has now been considered a pandemic with only supportive care in existence. Due to the adverse effects of available anti-diabetic drugs, there arises a great urgency to develop new drug molecules. One of the alternatives that can be considered for the treatment of T2DM are natural compounds from traditionally used herbal medicine. The present study undertakes, an integrated multidisciplinary concept of Network Pharmacology to evaluate the efficacy of potent anti-diabetic compound from traditionally used anti-diabetic plants of north east India and followed by DFT analysis. In the course of the study, 22 plant species were selected on the basis of their use in traditional medicine for the treatment of T2DM by various ethnic groups of the north eastern region of India. Initially, a library of 1053 compounds derived from these plants was generated. This was followed by network preparation between compounds and targets based on the docking result. The compounds having the best network property were considered for DFT analysis. We have identified that auraptene, a monoterpene coumarin for its activity in the management of Type 2 diabetes mellitus and deciphered its unexplored probable mechanisms. Molecular dynamics simulation of the ligand–protein complexes also reveals the stable binding of auraptene with the target proteins namely, Protein Kinase C θ, Glucocorticoid receptor, 11-β hydroxysteroid dehydrogenase 1 and Aldose Reductase, all of which form uniform interactions throughout the MD simulation trajectory. Therefore, this finding could provide new insights for the development of a new anti-diabetic drug. Communicated by Ramaswamy H. Sarma


Introduction
Diabetes mellitus (DM) is a chronic metabolic disease and has emerged as a major global health burden (WHO Library Cataloguing-in-Publication Data Global Report on Diabetes, 2016). The disease may lead to serious complications leading to mortality. According to the World Health Organization (WHO) fact sheets published on 15th May, 2020, 1.6 million DM related fatality has been recorded in 2016 (WHO Report on Diabetes, 2020). International Diabetes Federation (IDF) has reported 463 million diabetes cases in the year 2019. Based on the current trend, IDF estimated 1.5 fold global increase i.e. from 463 to 700 million diabetic cases by 2045 (IDF Diabetes Atlas-Ninth Edition, 2019). DM is characterized by high blood glucose level (hyperglycemia) for a prolonged period of time. Based on the body's insulin secretion or defective response to insulin, DM is classified as Type 1 and Type 2 diabetes mellitus (T2DM) where the later (i.e. T2DM) represents 90% of all the diagnosed DM cases (Y. Zheng et al., 2018). The major threat lies in the long-term complications of DM. Although available drugs provide supportive care by reducing the blood sugar levels, current anti-diabetic drugs pose serious side-effects like weight gain, gastrointestinal disturbances and diarrhoea, etc. (Bailey & Day, 2003;Miller et al., 2014). Considering the complex and multi-factorial nature of DM, developing a multi-targeted therapy may provide a promising solution. In this regard, traditional herbal medicine, due to their diverse chemical entity may be a better alternative for the treatment of T2DM. It is worth mentioning that from 1981 to 2010, 1073 new drugs were approved by US Food and Drug Administration (FDA) of which 34% were from natural origin (Newman & Cragg, 2012).
In this continuous effort towards the development of cost-effective and low toxicity drugs, network pharmacology provides a synergistic and holistic approach to elucidate the complex action mechanism of herbal medicine. Previous studies based on network pharmacology have illustrated anti-diabetic mechanism of natural products to some extent (Khanal et al., 2019;Li et al., 2017). There is substantial documentation across literature and folklore on plant species from northeast India being used for the treatment of diabetes. However, there exists a substantial lack of understanding on the mechanism of action of these traditional antidiabetic herbal medicines. In this study, we have employed a series of computer assisted methods, especially the Network Pharmacology approach in order to identify potential natural product (NP) from traditionally used plants with anti-diabetic properties in North East (NE) India for the development of superior drugs to ameliorate T2DM. Additionally, we have also deciphered the probable pathways in which the NP may work to combat diabetic complications.

Materials and methods
Natural product library development NE India, a biodiversity hotspot, has several plant species that have been traditionally used by different ethnic groups for the treatment of diabetes and hyper urination. For our study, we selected 22 species of plants (Supplementary Table  S1) on the basis of available literature as well as personal interviews with traditional healers, which are being used by the indigenous people of NE India. The existing pharmacological citation of the plants as anti-diabetic was also considered during the selection of the plants (Supplementary Table  S1). A non-redundant NPs library comprised of 1053 compounds from the chosen 22 different species was created on the basis of available literature. All the compounds were sketched using Marvin Sketch v6.0 and energetically optimized using CHARMm based force field at BIOVIA Discovery Studio (DS) v4.5 (D. . CHARMm based smart minimizer used 1500 steps of Steepest Descent followed by Conjugate Gradient algorithms with a convergence gradient of 0.001 kcal mol À1 . Diverse conformation option was applied and 250 conformations were generated using BEST generation module of DS by applying Poling Algorithm at an energy threshold of 15 kcal mol À1 . The principle of rigorous energy minimization in both torsional and Cartesian space that is employed in this option ensured the best coverage of conformational space by application of the poling algorithm Niu et al., 2013). A set of 925 small molecules from Drug Bank database was also retrieved and used as the reference dataset.

Generation of ADMET plot
For preliminary screening of drug-like compounds from the NP library, ADMET plot was generated using the ADMET descriptors algorithm and toxicity prediction extensible protocol of DS v4.5. Preliminary parameters including AlogP, Absorption 95, Absorption 99, Blood Brain Barrier 95 and Blood Brain Barrier 99 were checked. NPs with suitable druglike properties were only considered for the second phase of screening.

Collection and preparation of T2DM drug targets
Twenty-one protein molecules involved in the development of insulin resistance and T2DM were retrieved based on available literature. Three-dimensional structures of these proteins were retrieved from Protein Data Bank (PDB) of Research Collaboratory for Structural Bioinformatics (RCSB) ( Table 1). All the structures were prepared and optimized using Steepest Descent Algorithm (200 steps) at Protein Preparation module of DS v4.5. Potential Ligand binding site of all the structures were further computed using the Edit and Built binding site tool of DS v4.5. The resolution of protein targets ranged from 1.55 to 3.0 Å.

Chemical space analysis
Principal component analysis (PCA) was conducted on both the NPs and the reference dataset using the Library Analysis module of DS v4.5. It is a statistical method of transforming a large set of data into a new coordinate of a three-dimensional system. The variance of the data which was maximized on the first coordinate represented the first principal component. Similarly, another variance was maximized on the second coordinate in the PCA model. Herein, we have used the NPs library for Space analysis along with the drug library as reference dataset

Molecular docking
Molecular docking describes the quality and strength of interaction between small molecules or ligands and target receptor proteins during an interaction (Mahanta et al., 2020). In our study, in silico molecular docking analysis was performed using LibDock docking algorithms of Discovery Studio (DS) v4.5 software (Gogoi et al., 2016). Taking the advantages of LibDock methodology for rapid docking analysis of combinatorial libraries, a docking study was performed using the listed 21 proteins. A total of 419 compounds which satisfied the parameters of ADMET plot were used for the docking analysis against the listed 21 protein targets. To select the binding sites of the target proteins, "define and edit binding site" module under "receptor ligand interaction" tool of DS v4.5 was used. The docking results were analysed on the basis of positive LibDock score. LibDock provides positive value of binding affinity score (Bastikar et al., 2020) and higher LibDock score gives better binding of the ligand to its target proteins (Alam & Khan, 2018). Compounds which obtained more than 100 LibDock score were further used to generate a network between NPs-T2DM target proteins.

Network pharmacology
Based on the integration of System Biology and Computational Biology, the advent of network pharmacology plays a tremendous role in the development of new drug formulation. Network pharmacology predicts the association of a drug to its targets in the body. The drug-target network analysis also reveals the polypharmacological properties of the drug targeting various disease targets. In this study, we developed a network between NPs of traditional plants and the target proteins. The network comprised of nodes and edges where the nodes represent molecular species (compounds and proteins), and the edges specify intermolecular interactions between the nodes (compound-target or targettarget interactions). The "NPs -Target network" (NPs-TN) was developed based on combining NPs and all the 21 drug targets. In addition, important parameters such as Degree, Stress, Betweenness centrality, Closeness Centrality and Average Shortest Path Length were also computed to establish the NPs -Target relationship. This relationship was developed with Cytoscape v3.4.0 which is open-source software to visualize the biomolecular interaction networks and the biological pathways. It also helps in integrating these networks with annotations, gene expression profiles and other state data and makes it a unified conceptual framework (Shannon et al., 2003). The networking parameters were also analysed using the Network Analyzer Plugin. Compounds with Degree ! 10 were considered for further DFT (Density Functional Theory) computational analysis .

Density functional theory computation
Density functional theory (DFT) is a computational quantum mechanical modelling method to examine the electronic structure of matter along with the orbital energy values of compounds (Kavitha et al., 2015). Here, we employed the DFT computation using the Dmol3 software (DS v4.5) to determine the most reactive compounds as T2DM inhibitors (http://accelrys.com/products/datasheets/dmol3.pdf). Compounds with Degree !10 from the Networking Analysis were subjected to DFT analysis. Frontier Orbital energy descriptor, namely, HOMO and LUMO were calculated for all the screened compounds by using B3LYP module of DS . Further, compounds with least band energy gap (HOMO-LUMO) were selected as potential compounds with multiple targeted behaviours.

Pharmacokinetic study
To obtain the promising lead molecules in the process of drug development, analysis of the pharmacokinetic properties of the compounds is crucial. In this study, we analyzed the pharmacokinetic properties of the 3 best natural compounds post DFT analysis using the pkCSM online server (PkCSM, n.d.) (http://biosig.unimelb.edu.au/pkcsm/). This is a novel approach for the early evaluation of pharmacokinetic and toxicity property of small molecules with rapid and easy methods (Pires et al., 2015).

Redocking
Prior to perform molecular dynamics simulation study, the docking validation of the best ligand to its target proteins were carried out by re-docking analysis to confirm the binding of ligand to the active site of target protein (Shivanika et al., 2020;Gogoi et al., 2021). For that the target proteins were retrieved from PDB and prepared the protein for redocking analysis. Prior to redocking, the energy of the proteins was minimized. The energy minimized complex of protein and co-crystal ligand was considered as reference in this study. Then the co-crystal ligands were removed from their respective proteins and allowed them to redock with their proteins using the similar protocol used in the molecular docking study. After docking, the RMSD values of all the docked poses were calculated by comparing with the reference complex. Finally, the average RMSD were calculated.

Molecular dynamics simulation
Molecular Dynamics Simulation analysis for 50 ns was performed to investigate the stability of the best docked complexes (Guo et al., 2010). The MD Simulation was studied using the Desmond module of Schr€ odinger suite 2020-3 with the OPLS3 force field in the explicit solvent system. For solvation, the orthorhombic boundary condition with 10 Å buffer region was used. The molecular system was solvated with crystallographic water (TIP3P) molecules. During the process, the system was neutralised by the addition of Na þ ions with removal of overlapping water molecules. To maintain the constant temperature of the system at 300 K, an ensemble of Nose-Hoover thermostat was used. Similarly, barostat was applied to maintain the constant pressure at 1 bar. A hybrid energy minimization algorithm was utilized with 1000 steps of steepest decent protocol. This process was followed by conjugate gradient algorithms during the process of MD simulation.

Results and discussion
Preliminary ADMET profiling and principal component analysis (PCA) Initially, the entire library of 1053 compounds were screened for the preliminary ADME/Toxicity using ADMET plot as depicted in Figure 1. On filtering, 419 compounds exhibited drug like criteria and considered for further physicochemical property analysis using PCA analysis to visualize the distribution in the chemical space of NPs and the reference dataset comprised of DrugBank compounds. Eight physicochemical properties or molecular descriptors of the selected compounds were used in the PCA analysis and compared with the reference dataset. The overlapping distribution of NPs and drug compounds indicate similar properties in both groups ( Figure 2). Qualitative comparison of the NPs and the reference drug compounds may provide valuable input on the molecular properties and the likelihood of being a candidate molecule. From the study, it was found that most of the NPs followed the Lipinski's "rule of five". As observed in Table 2, the means of AlogP, number of rings, Molecular Fractional Polar Surface area (MFPSA) and hydrogen bond acceptor (HBA) of the NPs were similar to those from the Drugbank compounds suggesting that the compounds have good oral drug like properties. The mean number of rotatable bonds for the NPs was considerably lower than that of the reference molecules. It might indicate the rigidity of the NPs and may aid in gaining ligand affinity during interaction with protein. Moreover, the mean number of aromatic rings of the NPs was lower as compared to those in the DrugBank compounds. It has been earlier suggested that fewer aromatic rings may increase the developability of a compound and increase aqueous solubility (Ritchie & Macdonald, 2009).

Molecular docking analysis
Docking is a suitable approach to understand the atomic interactions of novel compound to its specific targets. In this investigation, 419 compounds were docked with 21 different T2DM target proteins using the LibDock module of DS v4.5. To get the compounds with best binding affinities, we considered the compounds with >100 LibDock scores against the target proteins (Chen et al., 2014). Of the 419 compounds, it was found that a total of 158 compounds had LibDock score >100 against the target Glucagon Synthase Kinase 3b. This target was found to be the most potential target among all the 21 putative target proteins followed by Protein Kinase C h (157 compounds), 11-b Hydroxysteroid Dehydrogenase 1 (103 compounds), Aldose Reductase (101 compounds). Figure 3 represents the docking profiles of the compounds against all the target proteins. The docking results showing LibDock score > 100 is listed in the Supplementary Table S2.

NPs-Target network analysis
Several complex diseases, including T2DM, involve the interaction of multiple genes as well as functional proteins. To determine the interactions between the compounds of the traditionally used medicinal plants of the NE India with the proteins involved in T2DM pathogenesis, network pharmacology approach was used. Since several NPs showed LibDock score > 100 to different protein targets, we developed NPs-Target network (NTN). Based on the LibDock result, a total of 216NPs were considered for Network preparation with 21 T2DM protein targets. As shown in Figure 4A, the prepared network comprised of 237 nodes and 1283 edges with average degree of 5.93. In this context, "degree" refers to the number of neighbours of a node. We considered the compounds having degree > 10 for the further study and found that of 216 NPs, only 32 NPs satisfied the criteria (Table3). It suggests that, the 10 NPs can bind to more than 15 different target proteins out of 21 T2DM target proteins with LibDock score > 100. Among these 10 compounds, C150 showed its binding affinity with 20 different receptors with LibDock score > 100. Additionally, the average of Betweenness Centrality, i.e. the primary parameter for evaluating the importance of the node in a network, was found to be0.001. The average of Closeness Centrality was 0.431 (Supplementary Table S3). The network between the target proteins and the NPs having degree > 10 is depicted in the Figure 4B.

DFT computation
In order to find out the most reactive molecule, DFT analyses of 32 compounds were performed. DFT-based descriptors  including atomic charges, molecular orbital energies, frontier orbital densities, and atom-atom polarizabilities are very useful in predicting the reactivity of atoms in molecules (Gogoi et al., 2016). The energy difference between Highest Occupied Molecular Orbital (HOMO) and Lowest Unoccupied Molecular Orbital (LUMO) of a compound is known as HOMO-LUMO gap (Band gap). The lower the HOMO-LUMO gap, molecules are considered as more reactive with lower kinetic stability. At this energetically unfavourable state, neither electrons are added to a high-lying LUMO nor extracted from a low-lying HOMO forming the active complex of a reaction. The result of the DFT computation of 32 compounds is listed in Table 4. Compounds having band gap (Ha) <0.2 were considered as the best reactive compounds among all the compounds considered for the study. From the DFT analysis, it was observed that auraptene has the lowest HOMO-LUMO gap in comparison to the other molecules suggesting the presence of strong inhibitory activity of auraptene followed by anisessine and adhatodine, respectively .

Pharmacokinetic study
Based on the reactivity, the best 3 molecules (top 10%) were considered for pharmacokinetic analysis using pkCSM online tool. Table 5 depicts the pharmacologically important parameters including absorption, distribution, metabolism, excretion and toxicity properties of the compounds. For the proper functioning of a drug at the target, the drug molecule has to enter into the systemic circulation from the extravascular site of administration (Tomioka, 2014). In this regard, Caco-2 (human epithelial colon cancer cells) permeability is studied to check intestinal absorption in terms of permeability co-efficient of a drug and other molecules. Higher Caco-2 cell permeability predicts the good human oral bioavailability (Yazdanian et al., 1998). In this study, to    check the permeability of the NPs, Caco-2 permeability was assessed and, it was observed that all the 3 compounds namely auraptene, anisessine and adhatodine show good Caco-2 permeability. P-gp (P-glycoprotein) plays an important role in distributing the drug molecule without their accumulation in a particular tissue. Inhibition and induction of Pgp induces the drugdrug interaction which may cause adverse drug reactions (Lin & Yamazaki, 2003;Palleria et al., 2013). From the toxicity point of view, none of the compounds showed the carcinogenic property for AMES toxicity test. In case of cardiovascular toxicity, the human Ether-a-gogo (hERG) related gene was investigated and observed that none of the compounds exhibited inhibitory effect against hERG I. However, the compounds showed inhibition for hERG II. Also, all the compounds revealed very low oral rat acute toxicity (LD 50 ).

Binding site analysis
From the above studies, we propose auraptene to be is the best candidate for the treatment of T2DM targeting multiple targets with good affinity followed by anisessine and adhatodine. Upon analysing the binding of auraptene with the target proteins, we observed that auraptene has the good binding affinity for 13 different proteins involved in T2DM with LibDock score > 100. The binding pattern of auraptene with these 13 targets have been elucidated and presented in Figure 5. From the docking result, it was observed that auraptene has the highest LibDock affinity towards Glucocorticoid receptor (PDB ID: 3K22) followed by Aldose reductase (PDB ID: 3RX3), 11-b hydroxysteroid dehydrogenase 1 (PDB ID: 3PDJ) and so on. The figures obtained from molecular docking analysis show that auraptene formed several non-covalent interactions with different target proteins. The overall interactions of auraptene with different target proteins are listed in Table 6 Interestingly, the compound auraptene has already been reported to have anti-diabetic activity by Takahashi et al. (2011). They showed that auraptene exhibited anti-diabetic activity through PPARa pathway. However, we did not observe high affinity binding of auraptene with PPARa and the LibDock score was found to be 94.212. Therefore, investigating pathways involving Glucocorticoid receptor, Aldose reductase or 11-b hydroxysteroid dehydrogenase for auraptene's anti-diabetic activity could be considered a promising direction.

Molecular dynamics simulation analysis
As observed from the above analyses, auraptene binds to 13 T2DM target proteins and therefore, to validate the binding     analysed by Molecular Dynamics simulation analysis for 50 ns. The change in RMSD values of the complexes were analysed by the RMSD plots showed in Figure 6. From Figure  6A, it has been observed that the RMSD of protein backbone became stable at 1.6 Å from approximately 28 ns onwards in case of 1XJD-auraptene complex, whereas the ligand was stabilised at 3.9 Å from 40 ns onwards. In case of 3K22-auraptene complex, protein backbone was stabilised at 1.8 Å during the entire period of MD simulation and ligand was stabilised at 7.5 Å from 28 ns onwards ( Figure 6B). Similarly, in case of 3PDJ-auraptene complex, protein complex was stabilised from 15 ns onwards at 8 Å whereas the ligand was stabilised at 5.5 Å from 5 ns onwards ( Figure 6C). The RMSD of 3RX3-auraptene was also analysed and is depicted in Figure 6D. From the figure, it is observed that the backbone of the protein was stabilised at 1.95 Å after 15 ns with slight shifting between 35 ns to 45 ns. The ligand was stabilised at 4 Å from 8 ns onwards. After MD simulation for 50 ns, it has been observed that auraptene interacted with various residues of their target proteins forming various interacting forces like H-bonds, hydrophobic interactions, ionic interactions etc. The interacting residues of target proteins 1XJD, 3K22, 3PDJ and 3RX3 with auraptene during the simulation is depicted in the Figure 7A. On analysing the intearcting residues it has been observed that during the course of MD simulation Lys 409 of 1XJD no longer interacted with auraptene which was observed before the simulation. Rather, auraptene interacted with some additional amino acid residues of 1XJD which were not present prior to the simualtion. Similarly, interaction of auraptene with Leu627 of 3K22 and Thr122, Ala65, Val142 of 3PDJ were alterd during the simulation period. In case of 3RX3, more changes in interacting residues have been observed. Residues Pro261, Pro211, Leu212 and Lus262 were not found in the interacting residues of 3RX3 with auraptene after 50 ns MD simulation. Instead, few more interacting residues were observed as depicted in the Figure 7A. In MD simulation, the total contacts of auraptene with different residues of the target proteins were also investigated and the resulting plots are shown in Figure 7B. From the figure it has been observed that the no of total contacts were incresed in case of 1XJD after 40 ns suggesting the formation of strong interaction between auraptene and 1XJD. In case of 3K22, the no of total contacts were reduced after 8 ns and again increased from 28 ns till the end. The no of total contacts were observed to be reduced from 10 ns onwards in 3PDJ. Unlike 1XJD, 3K22 and 3PDJ, no changes in total contacts of 3RX3 were observed.

Conclusions
The treatment of T2DM calls for a holistic approach considering the complex and interlinked mechanisms involved in the progression of the disease. Although, a few drugs are presently available in the market for the treatment of diabetes, most of them have unavoidable side effects with limited effectiveness. Therefore, the treatment of T2DM continues to remains a challenge for medical fraternity. Due to their structural diversity, natural compounds from plants provide a limitless source for developing new potent lead entities (Amirkia & Heinrich, 2015). However, available studies are inadequate to understand the mode of actions and to validate the complex interactions of plant compounds and proteins used in the traditional Indian medicinal system for the treatment of T2DM. In this study we have demonstrated the probable multi-targeting therapeutic choices for the treatment of T2DM based on Network Pharmacology and DFT based approach. Using multiple computational approaches, we identified auraptene as the promising anti-diabetic compound followed by anisessine and adhatodine. Molecular dynamics simulation study also revealed stable binding of auraptene to its target proteins suggesting it as a probable anti-diabetic agent. However, the dosage and the effectiveness of auraptene necessitates further in vitro and in vivo validation.
for Biotechnology and Bioinformatics, Dibrugarh University and Head, Department of Molecular Biology and Biotechnology, Tezpur University, Assam, India.