Identification of natural products against enoyl-acyl-carrier-protein reductase in malaria via combined pharmacophore modeling, molecular docking and simulations studies

Abstract Plasmodium falciparum is counted as one of the deadly species causing malaria. In that respect, enoyl acyl carrier protein reductase is recognized as one of the attractive druggable targets for the identification of antimalarials. Thus, from the structural proteome of ENR, common feature pharmacophores were constructed. To identify the representative models, all the hypotheses were subjected to validation methods, like, test set, enrichment factor, and Güner-Henry method, and the selected representative hypotheses were used to screen out the drug-like natural products. Further, the screened candidates were advanced to molecular docking calculations. Based on the docking score criteria and presence of essential interaction with Tyr277, seven candidates were shortlisted to conduct the HYDE and QSAR assessment. Further, the stability of these complexes was evaluated by employing molecular dynamics simulations, molecular mechanics‐generalized born surface area approach‐based free binding energy calculations with the residue-wise contribution of PfENR to the total binding free energy of the complex. On comparing the root mean square deviation, and fluctuation plots of the docked candidates with the reference, all the candidates displayed stable behavior, and the same outcome was depicted from the secondary structure element. However, from the free energy calculations, and residue-wise contribution conducted after dynamics, it was observed that out of seven, only five candidates sustain the binding with Tyr277 and cofactor of PfENR. Therefore, in the current work, the hybrid study of screening and stability lead to the identification of five structurally diverse candidates that can be employed for the design of novel antimalarials. Communicated by Ramaswamy H. Sarma


Introduction
According to World Health Organization (WHO), in 2019, around 229 million malaria cases and over 400,000 deaths were reported because of malaria (World Health Organization, 2020). These statistics represent malaria as one of humankind's fatal diseases (World Health Organization, 2020). Malaria is caused by four parasitic protozoans, and among them, Plasmodium falciparum is the most lethal one (Rich et al., 2009). It has developed resistance/multidrug resistance against the current antimalarials, which has intensified the need for the development of new potential antimalarial drugs (White, 2004). In this pursuit, past studies have shown that malaria parasites consist of an essential organelle, i.e. apicoplast, with a wide range of metabolic pathways (Ralph et al., 2001;Wilson, 2002). Among these pathways, the fatty acid biosynthesis-II (FAS-II) pathway can be exploited for antimalarial drug discovery (Surolia et al., 2004). FAS-II pathway is involved in the de novo biosynthesis of the fatty acid (Surolia et al., 2004). In this pathway, the enoyl-acyl carrier protein reductase (ENR) enzyme is necessary for the initiation of type II fatty acid biosynthesis. It catalyzes the NADH-dependent reduction of enoyl-acyl carrier protein into a saturated acyl chain (Heath & Rock, 1995;Ralph et al., 2001). Moreover, the FAS-II pathway is absent in the mammalian cells (Lu et al., 2005;Ralph et al., 2001). Therefore, considering its importance in the parasitic body, and its absence in the host makes it an attractive druggable target for the exploration of the novel antimalarial inhibitors. Thus, various crystal structures of the ENR enzyme were reported with different inhibitors. This opens the possibility for the use of various computational approaches in the discovery of new inhibitors (Kumar et al., 2015;Manhas et al., 2018;Shah & Siddiqi, 2010). Various studies report the successful utilization of different in silico methods in combination with in vitro studies on various enzymes. For instance, Rahul Singh et. al. have identified four quinoline-based molecules targeting the lactate dehydrogenase enzyme of Plasmodium falciparum (Singh et al., 2021). They have employed molecular docking studies on the 30 in-house synthesized quinolone molecules. Based on the outcome of docking studies, they have conducted molecular dynamics simulations, and Molecular Mechanics Poisson Boltzmann surface area (MM-PBSA) calculations on the four shortlisted candidates, and finally concluded with one of the molecules offering a new interaction mechanism. Kumar et al. have also evaluated the antiplasmodial activity of the synthesized novel quinolone-based molecules via in vitro and in silico studies (Kumar et al., 2019). They perform molecular docking studies and decipher the binding mechanism of the molecules against the heme and Falcipain-2 enzyme of Plasmodium falciparum. Vijay Kumar Bhardwaj and his group targeted the two chemokine receptors, i.e. CCR5 and CXCR4 to identify the new bioactive molecules based on the plant origin that could inhibit the functioning of both chemokine proteins . They employed computational studies in the combination of thermodynamics parameters to study the binding between the inhibitors and receptors for the better evaluation of the new potent molecules. Moreover, numerous structure-guided antimalarial inhibitors have been reported using several computational, biochemical, and structural approaches (Freundlich et al., 2007;Manhas et al., 2018;Perozzo et al., 2002). Among the studies reported on the ENR enzyme, for the first time, we are conducting multicomplexbased pharmacophore modeling on the enzymatic proteome of ENR. This method considers all the possible interaction patterns present in the crystallized protein-ligand complexes (Manhas et al., 2019). The technique will decipher the important chemical entities present in the binding pocket of the enzyme which helps in screening out the candidates possessing the same chemical features. Therefore, it enhances the possibility of retrieving malaria inhibiting candidates. The approach used in the current research aims at obtaining top-scored hypotheses, which can be utilized as an essential screening tool against the natural product database. Prior to the screening, the database was subjected to various drug-likeness and pharmacokinetic parameters to retrieve drug-like candidates. Subsequently, the selected drug-like compounds were subjected to molecular docking, HYDE assessment, QSAR studies, molecular dynamics simulations, molecular mechanics-generalized born surface area approach-based free binding energy (MM-GBSA) calculations with the residue-wise contribution of PfENR to the total binding free energy of the complex to identify the potent antimalarial inhibitor (Scheme 1). We anticipate this study may help to advance the knowledge to synthesize new potent antimalarial drugs.

Selection and preparation of the proteinligand complexes
There are many protein-ligand complexes of PfENR reported in the RCSB Protein Database (Bernstein et al., 1977) as shown in the ESI Table 1 (Supplementary material). The selection of the complexes to perform multicomplex-based pharmacophore modeling was based on the criteria used in the recent publication (Manhas et al., 2019). The first criteria involve the selection of only those complexes that possess known inhibitory activity, and the second criteria ensure the selection of the complexes with unique ligands only. In the case of two complexes having the same inhibitor, the complex with a good resolution value was picked. The main aim for the selection of the compounds with the reported inhibitory activity was to obtain the models from the inhibitors having the capability to inhibit the enzyme (Supplementary material, ESI Table 1). Based on the presence of inhibitors, chain A from each complex was selected for the protein preparation process. The protein preparation process was conducted using the Protein Preparation Wizard of Accelrys Discovery Studio version 4.0 (Accelrys 4.0, S. D., USA, 2021). During preparation, the missing atoms, residues, and loops were added, crystallographic water, alternate conformation were removed, acid dissociation constant, and protonation was assigned to the proteins at the physiological pH (Supplementary material, ESI Table 2).

Pharmacophore generation and validation
Before pharmacophore construction, the prepared complexes were distributed into two separate groups (Supplementary material, ESI Table 1). These groups were defined based on the nature of the inhibitory activity, i.e. IC 50 and K i, and were represented as Group 1 and Group 2 respectively. After that, each group's respective complexes were subjected to superimposition based on protein main chain atom to retrieve a common coordinate file. This step was performed using the Align and Superimpose protein protocol of Accelrys Discovery Studio version 4.0 (Accelrys 4.0, S. D., USA, 2021). The prepared file incorporates all the crucial interactions for the Scheme 1. Flow chart representing the workflow employed in the current study.
inhibition of the ENR enzyme. Within the file, the reference protein was selected based on the higher crystallographic resolution. On superimposition, it was observed that all the ligands gathered at a common interaction site in the protein ( Figure 1). Subsequently, the pharmacophore construction was performed from the superimposed file having the bioactive conformation of the inhibitors. Pharmacophore generation was done by using the Common Feature Pharmacophore Generation module of Accelrys Discovery Studio version 4.0 (Accelrys 4.0, S. D., USA, 2021). This protocol employs the HipHop algorithm to search for the common features present within the complex. The process of construction was performed at default parameters. With the inbuilt LUDI program's help in Discovery studio, the common interactions present in the complex were converted into Catalyst supported features like hydrogen bond donor (D) and acceptor (A); ring aromatic (R); hydrophobic (H); hydrophobic aromatic (Y) and aliphatic (Z); and positive (P) and negative ionizable (N) groups. Pharmacophore validation was executed to obtain a model having the capability to differentiate between the actives and presumed inactive. This step helps in evaluating the quality of the models. Hence, all the hypotheses were subjected to a couple of validation processes. First, they were validated by using the test set process, and later by employing the enrichment factor (EF) and G€ uner-Henry (GH) scoring method (G€ uner, 2000). In the test set validation method, the statistical parameters like specificity (model ability to pick actives), sensitivity (model ability to exclude inactive), and receiver operator characteristic (ROC) curves (predicts the accuracy of hypothesis in the selection of the actives) (Fawcett, 2006) were retrieved. Based on the outcome obtained from the test set method, models were subjected to the secondary process (EF and GH). Two datasets were prepared to accomplish the primary validation procedure, i.e. active (60) and inactive (319) datasets (Supplementary material, ESI Table 3). The active molecules were retrieved from the literature; however, the inactive were retrieved from the in-build ZINC all dataset (Irwin & Shoichet, 2005) using the Molecular Descriptor Based module of DecoyFinder (Cereto-Massagu e et al., 2012). DecoyFinder is a graphical tool that helps in the identification of decoy compounds for a given dataset of active ligands. In our study, we use the Molecular Descriptor Based module of the tool which finds molecules with a similar number of hydrogen bond acceptors, hydrogen bond donors, logP value, molecular weight, and the number of rotational bonds. Though it searches for the molecules having the same features, but, they are chemically different as they are defined by the maximum threshold of Tanimoto value between the actives and decoy molecules. Moreover, to ensure the chemical diversity in the decoy set, we use the maximum Tanimoto value threshold, i.e. Active ligand vs Decoy Tanimoto threshold as 0.90, Decoy vs Decoy Tanimoto threshold as 0.90. Therefore, we obtained 432 decoys for the dataset of 15 actives from the in-build ZINC all dataset. For conducting the EF and GH studies, a single dataset was generated, which was comprised of 15 experimental actives and 432 corresponding decoys (Supplementary material, ESI Table 4). Thus, the dataset of 447 molecules was prepared using the Prepare-Ligand protocol of Discovery Studio version 4.0 (Accelrys 4.0, S. D., USA) at default parameters. CHARMm force field (Brooks et al., 1983) based BEST conformation search method was employed in the conformation generation step. After this step, the prepared dataset was mapped over the models using the Ligand-Pharmacophore Mapping module of Discovery Studio version 4.0 (Accelrys 4.0, S. D., USA, 2021) to procure the mapped actives and presumed inactive. The hypotheses which show acceptable values of GH and EF studies were selected as the representative pharmacophores. These models were then subjected to conduct the virtual screening process.

Database preparation and virtual screening
In the present study, the natural product database, which contains 229,358 molecules was collected from the online Universal Natural Product Database (UNPD) (Gu et al., 2013). This database was accessed in 2017 as mentioned in the original paper (Gu et al., 2013), and was employed to search for the inhibitors against the druggable targets of malaria (Manhas, Dubey et al., 2020;Manhas, Kumar et al., 2020). As per the studies conducted by Gu et al., this database is one of the largest non-commercial and openly accessible databases of the natural product prepared for the in silico screening of the drugs. Although, this dataset is no longer accessible via the link provided by the Gu et al. (2013), however, a copy of the chemical structures of the UNPD molecules is still available on the ISDB website (http://oolonek. github.io/ISDB/) (Sorokina & Steinbeck, 2020). The dataset was prepared as per drug-likeness (Lipinski's rule of five (Lipinski et al., 1997) and Veber's rule (Veber et al., 2002)) and pharmacokinetic ADMET filters by using Calculate Molecular Property menu of Discovery Studio version 4.0 (Accelrys 4.0, S. D., USA, 2021). After the drug-likeness and ADMET screening, the dataset was subjected to the Catalyst algorithm-based Build 3D Database module of the Discovery Studio version 4.0 (Accelrys 4.0, S. D., USA, 2021) for the preparation of the database. This step ensures the selection of unique ligands only (structural duplicates were removed), the addition of the missing atoms, and the correction of bond types and atoms. Consequently, the prepared UNPD dataset was screened over shortlisted pharmacophores by employing the Search 3D Database protocol of Discovery Studio version 4.0 (Accelrys 4.0, S. D., USA, 2021). Finally, the screened molecules were saved in SYBYL mol2 file format (SYBYL, 1994), which was then subjected to perform the molecular docking calculations.

Molecular docking, HYDE assessment, and QSAR calculations
Docking studies were conducted by the FlexX module (Rarey et al., 1996(Rarey et al., , 1997 of the LeadIt 2.1.8 program (LeadIt, 2021). This module uses Incremental Construction (IC) based algorithm in which the molecules were disintegrated first and then assembled incrementally. During the process, IC provides conformational flexibility to the ligand scaffold in the receptor's binding cavity. Thus, several poses were generated having different binding free energy, which is calculated by utilizing the modified form of B€ ohm's scoring function (B€ ohm, 1994). In the docking calculation, the protocol was validated by performing the redocking. After redocking, the docking score of the redocked pose having a lower RMSD value was selected as the threshold to prioritize the docked compounds. Docking studies were performed on 3LT0. Though numerous complexes were reported, based on the resolution criteria selection, 3LT0 was selected (Supplementary material, ESI Table 1). The FlexX module (Rarey et al., 1996(Rarey et al., , 1997 of the LeadIt 2.1.8 program (LeadIt, 2021) prepares the protein's binding cavity with a defined radius of 6.5 Å to conduct the docking studies. The top 50 poses were selected from the docking calculation. The hits possessing higher docking scores than the cut-off criteria, and having crucial interactions were selected. The finalized docked complexes were then subjected to HYdrogen bond and DEhydration energies (HYDE) assessment using the HYDE module (Reulecke et al., 2008) of the LeadIt 2.1.8 program (LeadIt, 2021). This module is based on the empirical scoring function that incorporates the effects like hydrogen bonding, hydrophobic, and desolvation effect during the complex formation, and interpret the outcome as ligand binding affinity which helps in examining the stability of the complex. Further, these HYDE assessed molecules were subjected to QSAR studies and molecular dynamics simulation studies to scrutinize the predicted inhibitory activity and protein-ligand stability respectively.

QSAR studies
QSAR studies help to predict the biological activity of untested novel inhibitors. In the QSAR studies, the models were generated by using the reported biological active inhibitors of the respective enzyme. In the present study, to conduct the 3D-QSAR studies, a dataset of 148 molecules that displayed binding activity against the PfENR enzyme was collected from the binding database (https://www.bindingdb. org/bind/index.jsp). The molecules in the dataset showed inhibitory potencies ranging from 35 to 920000 nM, which was converted to molar values. Using the formula, pIC 50 ¼ 9 À log 10 [IC 50 ], these molecules were then converted to pIC 50 values. The 3D structures of the ligands were created with the Maestro's builder panel and then optimized with the LigPrep module of the Schrodinger software 2020-4 (LigPrep, 2020). At a pH of 7.0, partial atomic charges were assigned and potential ionization states were generated. The OPLS-2005 force field (Jorgensen et al., 1996) was employed to optimize the generation of the ligands with low energy. Each ligand was then subjected to an energy minimization process until it passes a root mean square deviation (RMSD) limit of 0.01 Å. The resulting molecules that were used for ligands alignment were chosen automatically after clustering by the largest common Bemis-Murcko scaffold by utilizing the Structure Alignment Builder of PHASE module of Schrodinger 2021-4 (Dixon et al., 2006). These aligned molecules were then employed to generate the 3D-QSAR model. The PHASE module provides a framework for developing the 3D-QSAR models by utilizing the ligand activities. Partially least square regression was used to represent a high number of binaryvalued variables in PHASE 3D-QSAR models. The generated models were then evaluated by using several parameters, like, standard deviation of regression (SD), regression coefficient (R 2 ), regression cross validation ratio (R 2 -CV), ratio of model variance to observed activity variance (F), significance level of variation (P), stability (S), root means square error (RMSE), and Pearson-R value (P-R). Based on the validation parameter, the best model was selected to predict the activities of the docked inhibitors. Thereafter, molecular dynamics simulations were conducted on the prepared docked complexes to check their binding stability.

Molecular dynamics simulations
All the simulations were conducted by using OPLS-2005 force field parameters (Jorgensen et al., 1996) on the . The systems were prepared using a TIP3P water model (Jorgensen et al., 1983) added in a 10 Å 3 orthorhombic box to perform the simulations. Furthermore, the systems were neutralized by incorporating the respective number of anions. Primarily, the systems were relaxed by using the six-step default relaxation protocol. In the protocol, the first two steps were dedicated to energy minimization, which occurs with and without restraints on the solute atoms. This method employed the convergence threshold of 1.0 kcal/mol/Å with the steepest descent method with limited memory Broyden-Fletcher-Goldfarb-Shanno (LBFGS) algorithms. The remaining four steps were involved in system equilibration. In the first two steps, simulations were performed with NVT and NPT ensembles having restraints on the heavy atoms (solute) with the simulation temperature of 10 K for 12 ps of Berendsen dynamics. The final two steps were performed using NPT ensemble on the heavy atoms with and without restraints for 12 ps and 24 ps, respectively, using Berendsen dynamics. After the relaxation protocol, the production run of 120 ns was performed using the NPT ensemble with Martyna-Tuckerman-Klein barostat  and Nose-Hoover thermostat (Nos e, 1984). During the simulations, Smooth particle-mesh Ewald (PME) approximation (Darden et al., 1993) was exercised to study long-range electrostatic interactions, and the MSHAKE algorithm (Lambrakos et al., 1989) was implemented to study non-bonded interactions with the cut-off of 9 Å. However, to check the short-range bonded, non-bonded, and long-range non-bonded interactions, RESPA integrator was applied (Tuckerman et al., 1992) for the time step of 2.0 fs (shortrange bonded and non-bonded interactions), and 6.0fs (long-range non-bonded interactions). The coordinates and energies were saved at the time duration of 10 ps and 1.2 ps, respectively. To analyze the results, root-mean-square-deviation (RMSD) and fluctuation (RMSF) of protein and ligand were studied. Moreover, secondary structure elements (SSE) were monitored to check the stability of the docked protein backbone during the simulations. Along with that, the protein-ligand interactions were also studied.

Binding free energy calculation
Binding free energy (DG bind ) of the protein-ligand complexes helps in the qualitative estimation of the inhibitory affinity towards the PfENR enzyme. Based on the outcome of the molecular dynamics simulations, frames generated from the last 20 ns (100 ns to 120 ns) trajectory were exploited to calculate the binding free energy (MM-GBSA) of the complexes. The free energy calculations were conducted by using the MM-GBSA method of Prime-MMGBSA V 6.6 module of maestro V 13.0 (Li et al., 2011) at default parameters. The Prime module uses the OPLS-2005 (Jorgensen et al., 1996) force field method with VSGB 2.1 solvation model (Li et al., 2011) to perform the minimization of the protein-ligand complex. The calculation of the DG bind was conducted in a three-step default procedure, i.e. minimization of the protein-ligand complex, receptor only, and ligand only. In general, the minimized energy of the complex is subtracted from the minimization energy of ligand and protein to get the DG bind of the protein-ligand complex. DG bind can also be calculated as the addition of the molecular mechanics part (DE MM ) and solvation free energy. DE MM calculates the columbic and Van der Waals interactions, and covalent and hydrogen bonding with the protein and ligand. The second part, solvation-free energy includes the non-polar and polar part contribution. The solvent-accessible surface area (DG SA ) contributes to the non-polar part, and an approximate solution of the Poisson-Boltzmann equation, i.e. Generalized Born electrostatic solvation which contributes to the polar part of the system.

Pharmacophore generation and validation
The pharmacophore models generated from the two superimposed files of Group 1 and 2 ( Figure 1) are shown in Table 1. From the outcome, it can be depicted that Group 1 and 2 generate four and five feature hypotheses respectively. Among the models of Group 1, one 'Z' group was commonly identified, and in Group 2, 'Y', 'Z', and 'H' features were commonly observed (Table 1). Though most of the models (Group 1 and 2) have the same chemical features, but, they have different spatial orientations and inter-feature distance, as shown in the ESI Table 5 (Supplementary material). The difference in the alignment of models and variance in interfeature distance enhance the probabilities of screening diverse candidates. Furthermore, all the hypotheses were subjected to validation protocols (test set, GH, and EF studies) to select the representative model/s for the virtual screening process.  Table 6) that all the generated models from Group 1 enzymes possess well-defined sensitivity, i.e. they are efficient in screening their active candidates. However, only models 1, 2, 3, 5, and 9 display the capacity (>50%) to exclude the presumed inactive (specificity). On observing the ROC plots, it seems that all the models can screen out their active first, followed by presumed inactive. In the case of the parameters obtained for the Group 2 enzymes, it was observed that all pharmacophore models were able to display reasonable specificity and sensitivity along with ROC plots. Thus, the models can screen in the actives and reject the inactive. Based on the threshold stated in our previous studies (specificity and sensitivity > 0.50, and AUC > 0.70), 15 models (5 from Group 1 and 10 from Group 2) were listed to conduct the secondary validation (GH and EF studies).
For secondary validation of the models, we prepared a dataset of 447 molecules, and among 447, 15 compounds were the experimental inhibitors of the ENR enzyme (Supplementary material, ESI Table 4). On screening, the prepared dataset over the primary validated hypotheses, various parameters like total hits, active hits, % yield of actives, the ratio of actives, enrichment factor, false negatives, false positives, and Goodness of hit score (Table 2) were retrieved.
From the outcome of the secondary validation parameter (Table 2), it was observed that out of 15, only 11 models were able to display acceptable results. The shortlisting was done based on the Goodness of hit (>70%) (Lu et al., 2011). GH value determines the performance of the hypotheses. It is clear from Table 2 that hypothesis 3 from Group 1 and all models from Group 2 have the highest accuracy in collecting the actives from the prepared dataset of mixed actives and inactives. Moreover, the models mentioned above displayed the GH score of > 0.72, which satisfies their selection. With the help of EF studies also, we can predict the reliability of the hypotheses in recognition of the experimental actives. It is evident from Table 2 that all models have the probability in-between 21.52 and 27.81 in selecting the actives before inactive. Therefore, based on the primary and secondary validation, we picked 11 pharmacophore models as the representative models to conduct the virtual screening process (Table 2 and Figure 2).

Database preparation and multicomplex-based pharmacophore screening
The database was prepared with the focus on including the molecules with desirable toxicity, pharmacokinetic and physicochemical properties. To obtain the focused dataset, we employ the Lipinski rule of five (Lipinski et al., 1997), Veber rule (Veber et al., 2002), and ADMET studies. Thus, finally, we come up with 15,771 drug-like molecules out of the retrieved 229,358 UNPD molecules. As per literature, the probability of retrieving novel anti-malarial increases on applying the drug-like rules (Yamashita & Hashida, 2004). Therefore, the prepared dataset of 15,771 molecules was subjected to the virtual screening process. All eleven models were used to perform the screening as it increases the chance of retrieving diverse candidates. Based on the cut off criteria of fit value (>3.5), from virtual screening, it was observed that model 3 from Group 1 screen out 291 molecules, whereas, ten models (1 to 10) from Group 2 screened out 115,77,78,84,73,283,141,144,132, and 133 molecules respectively. These compounds were collected in a single file which was further subjected to preparation via Prepare Ligand protocol of Discovery Studio version 4.0 (Accelrys 4.0, S. D., USA, 2021). This step helps in removing the structural duplicates from the dataset. Thus out of 1551 screened molecules, a sum of 570 unique candidates were retrieved which were further subjected to molecular docking studies.

Molecular docking studies, HYDE assessment, and QSAR modeling
A molecular docking study was performed on the binding site of 3LT0 to search for the inhibitors that can bind within the active site of the ENR protein. However, before the docking, redocking was done on 3LT0. The objective behind the redocking was to validate the docking software robustness along with the finding of candidates with higher docking scores. From the redocking calculations, the RMSD of 0.574 Å with the docking score of À20.85 kcal/mol was observed between the experimentally crystallized and redocked pose of FT1. This score was considered as the cut-off criteria to screen out the top-scored docked candidates. Also, in the 2D-interaction plot of the redocked pose, the binding with the crucial amino acid (Tyr277) was observed ( Figure 3). Subsequently, 570 candidates were docked into the active site of 3LT0-FT1, and only 346 molecules were able to bind within the active site of the protein. On shortlisting the docked compounds based on the cut-off criteria of the docking score, it was observed that only 16 molecules possess a higher docking score (Supplementary material, ESI Figure 1 and ESI Table 7). According to the 2D-interaction plots of all the shortlisted candidates (Supplementary material, ESI Figure 1), it is obvious that out of 16, only 7 showed the presence of the interaction with the crucial amino acid, Tyr277 as mentioned in the literature (Perozzo et al., 2002). The 2D-interaction plot of the seven prioritized docked candidates is shown in Figure 4A and B, and their docking score is presented in Table 3. From the Table 3, it is obvious that UNPD283 displayed the higher docking score of À29.69 kcal/mol, followed by UNPD92 (À28.47 kcal/mol), UNPD24 (À24.83 kcal/ mol), UNPD17 (À23.88 kcal/mol) UNPD116 (À21.44 kcal/mol), UNPD398 (À21.10 kcal/mol), and UNPD96 (À21.08 kcal/mol). As seen in Table 3, the docking score of the protein-ligand complex was composed of the parameters like match score, lipo score, clash score, ambig score, and rot score. Match score represents the contribution from the matched interacting groups, lipo score displays the lipophilic contact area, ambig score signifies the lipophilic-hydrophilic contact areas, clash score shows the clash penalty, and rot score denotes the ligand rotational entropy.

Interaction analysis
On examining the binding pose of the shortlisted candidates depicted in Figure 4A and B, it was observed that all the candidates were docked within the common binding position which is similar to the reference 3LT0. Most importantly all the candidates displayed hydrogen bonding interaction with the crucial amino acid Tyr277, where ligands form HBD interaction with the hydroxyl group of the Tyr277 residue. As depicted in the literature, the interaction with the Tyr277 residue can lead to the inhibition of the ENR enzyme (Nicola et al., 2007;Perozzo et al., 2002).
Apart from binding with Tyr277, several other hydrogen bondings were also witnessed. Common hydrogen bonding with Ala219 was also observed in all ligands apart from the 3LT0-UNPD24 complex. In the case of the 3LT0-UNPD283 complex, the ligand form HBA interaction with the amine Table 2. List of GH scoring and EF parameters for the validation of the shortlisted pharmacophore models from the primary validation method.  group of Ala219, and HBD interaction with the carbonyl group of Ala219. In the 3LT0-UNPD92 complex, the ligand form bifurcated bonding with the carbonyl group of Ala219. In the 3LT0-UNPD24 complex, the ligand showed HBD interaction with the carbonyl group of Asn218 and Ala319 residues of the binding cavity. In the 3LT0-UNPD17 and 3LT0-UNPD116 complexes, the ligand showed HBA interaction with the amine group of Ala219. In the 3LT0-UNPD398 complex, the ligand displayed HBD interaction with the carbonyl group of Asn218 and form bifurcate HBA bonding with the amine group of Ala219. In the 3LT0-96 complex, the ligand displayed the HBA interaction with the amine group of Asn218 and Ala219. It also displayed HBD interaction with the carbonyl group of Ala319. In addition, some common amino acids, like, Ala217, Asn218, Ala219, Val222, Ala319, and Ile323 were interacting with the ligands through van der Waal's forces of interactions ( Figure 4A and B). They were defining the hydrophobic interactions. Moreover, all the complexes also showed common interaction with the NAD501. The interaction of the ligands with NAD (Nad501) is also considered important for enhancing the affinity of the compounds towards the enzyme binding site (Nicola et al., 2007).
The outcomes of docking studies showed that seven out of 16 docked candidates showed crucial interactions within the binding pocket of 3LT0. There are studies reported that propose the interaction of the inhibitor with the residue Tyr277 prohibits the interaction of the same amino acid with the natural substrate. Therefore, the compounds can be considered as successful ENR inhibitors as their interaction with Tyr277 blocks the role of the residue in catalysis. Therefore, the seven shortlisted drug-like candidates can show the biological activity against the enzyme. These seven molecules were then subjected to HYDE assessment to check their affinity toward the protein. Their outcome was depicted in terms of free energy of binding, and ligand efficiency which helps in differentiating the active and nonactive molecules (Table 3). It is clear from Table 3 that all seven candidates displayed good binding scores (DG) and ligand efficiency. The docking score and HYDE score follows the same trend, i.e. UNPD283> UNPD92 > UNPD24 > UNPD17 > UNPD116 > UNPD398 > UNPD96. Thus, we can say that these compounds can be easily ligated within the active site of the ENR protein. Further, these shortlisted candidates along with the reference 3LT0 were subjected to QSAR studies to check the predicted inhibitory activity of the docked candidates.

QSAR modeling
The comprehensive statistics of the generated 3D-QSAR models based on the random test set selection approach is summarized in Supplementary material Table 8. The aligned data set correlated by using PLS factor four to improve the model. Based on a statistical analysis that includes the RMSE and SD value, the best PLS model was picked. The PLS factor model 3 was chosen as the best model since the RMSE and SD values (0.45 and 0.329, respectively) demonstrate a minimum value, which follows the criteria to be a reliable model. Model 3 also displayed the F values of 29.0 and a small variance ratio (P) that statistically enhanced the predictability of the model. Furthermore, the low regression standard deviation and RMSE suggest that the dataset used to generate the model was adequate for QSAR analysis. Also, the validity of each model was predicted using the estimated correlation coefficient (R 2 ) for a randomly chosen test set of various architectures. Thus, we selected model three to predict the activity of PfENR docked molecules, and we compare it with the reference. On observing the predicted activity in terms of pIC 50 , it was observed that UNPD283 (4.598) and UNPD92 (4.593), displayed higher activity than Reference (4.506), whereas, UNPD24 (4.405), UNPD17 (4.461), UNPD116 (4.443), UNPD398 (4.409), and UNPD96 (4.405) predicted activity near to the reference. Overall, from the outcome of QSAR studies, we can say that these candidates especially, UNPD283 and UNPD92 can be used as novel chemical starting points for further ENR inhibitor structural optimization. After QSAR studies, these docked candidates were subjected to molecular dynamics simulations.

Molecular dynamics simulations
Molecular dynamics simulations were performed on the selected seven docked candidates along with the reference (3LT0) and apo-protein. Their dynamic behavior was observed and compared to 3LT0 and apo-protein. Hence, a sum of nine simulations were performed for the time duration of 120 ns. The behavior of the candidates was studied via protein and ligand RMSD, RMSF, SSE analysis, and several contacts formed during the molecular dynamics simulation. From Figure 5, it can be observed that most of the complexes have attained stable behavior by the end of the simulations. By the RMSD plot of ligand and protein of the Reference, it can be observed that the complex has attained stability after 65 ns of simulation. In the case of the UNPD283 complex, the ligand had attained stability in the first 5 ns of the simulation, whereas, in the case of the protein RMSD, an increase in the deviation was observed at 45 ns, which got stable after 55 ns of the simulation run. In the UNPD92 complex, the protein chain attain the equilibrated behavior after 90 ns of the run, and the ligand showed stable variation apart from 60 and 90 ns where there was a decrease and increase reported in the fluctuation respectively. After that, the ligand UNPD92 displayed stable behavior. In the UNPD24 complex, ligand attains the stable deviation after 80 ns, whereas, protein displayed the increase in the deviation only. In the UNPD17 complex, both the ligand and protein displayed stable deviation after 90 ns of simulation. In the case of the UNPD116 complex, the protein attain the stable behavior after 70 ns, and the ligand showed a stable deviation up to 70 ns, thereafter it showed a larger deviation but in a lower RMSD range. In the case of the UNPD398 complex, the protein showed stable variation after 80 ns, and the ligand showed stable behavior after 50 ns of the run. In the UNPD96 complex, the protein and ligand both displayed stable variation after 90 ns of the simulation.

Protein backbone stability
Protein backbone RMSD and RMSF calculations were performed to check the conformational change in the backbone of the ENR protein in the presence of docked ligands. From the Figure 5, the average RMSD value of protein backbone for the Apo, Reference, UNPD283, UNPD92, UNPD24, UNPD17, UNPD116, UNPD398, and UNPD96 was recorded as 2.39 Å, 2.99 Å, 2.68 Å, 2.85 Å, 2.49 Å, 2.60 Å, 2.56 Å, 3.09 Å, and 2.44 Å respectively. Hence, it can be depicted that the docked complexes (except UNPD398) displayed deviation values lesser than or near to the apo-protein, and lesser than the selected Reference. Thus, it can be concluded that the protein chain of docked complexes was stable, and no larger conformational change was observed. However, the residual fluctuation of the protein chain was measured by calculating the protein-RMSF and is shown in the ESI Figure 2 (Supplementary material). From the protein RMSF analysis, it can be understood that all the amino acids which were involved in the interaction displayed lower deviation, which suggests the stable interaction of the ligand with the protein (Supplementary material, ESI Figure 2). The average RMSF value of the protein backbone for the Apo, Reference, UNPD283, UNPD92, UNPD24, UNPD17, UNPD116, UNPD398, and UNPD96 was observed as 1.00 Å, 1.35 Å, 1.32 Å, 1.31 Å, 1.18 Å, 1.07 Å, 1.16 Å, 1.37 Å, and 1.05 Å respectively. Thus, it is obvious from the outcome that the protein chain of the docked ligands (except UNPD398) showed a deviation value  lesser than or near to the reference. Also, the RMSF value of the main residues involved in bonding with the inhibitors, i.e. Tyr277 and Ala219 was analyzed and are represented in ESI Figure 2 (Supplementary material). From the data obtained, it can be observed that except for UNPD116, the fluctuation with respect to the Tyr277 decreases in the docked complexes. Whereas, in the case of binding with Ala219, the fluctuation was decreased in all the complexes. The decrease in the fluctuations of the crucial amino acids depicts the stable binding of the ligand in the active site of the protein. As per the previous calculations conducted on the PfENR-triclosan complex and its derivatives, it was depicted that the equilibrated system of the ENR complex displayed a protein-RMSD value in between 2.5 Å to 3.5 Å, and protein-RMSF value lesser than 2.0 Å (Lindert & McCammon, 2012). The current study's findings were according to the previously published benchmarked values (Lindert & McCammon, 2012).
In addition, to study the flexibility of the protein-backbone in the presence of the docked ligands, the SSE plots were generated (Supplementary material, ESI Figure 3). On observing the SSE elements values, it was seen that %Helix increases or remains in the same range of the reference (except UNPD398 and UNPD96), which depicts the folding of alpha-helix during the simulation run (Supplementary material, ESI Figure 3 and Table 4). However, in the case of %Strand, in comparison to the Reference, UNPD92, UNPD24, and UNPD 398 displayed marginal higher value (folding of beta-strand), and the rest all displayed marginal lower value (the unfolding of beta-strand) ( Table 4). The %Total SSE displayed the values near to the Reference (except UNPD96), which overall depicts that apart from UNPD96, other display decreases in the fluctuations in the presence of the UNPD ligands respectively (Table 4). Thus, the protein-backbone has attained stability in the presence of the UNPD molecules.

Ligand stability
To access the binding stability of the ligand in the active site of the protein, ligand RMSD was computed ( Figure 5). The average RMSD of Reference, UNPD283, UNPD92, UNPD24, UNPD17, UNPD116, UNPD398, and UNPD96 was 0.99 Å, 2.16 Å, 1.54 Å, 1.96 Å, 2.01 Å, 1.39 Å, 1.95 Å, and 2.33 Å respectively. All the natural products displayed a similar type of dynamic behavior within the window range of 0.2 Å to 2.7 Å. However, in the case of UNPD96, significant deviations were recorded in the range of 0.36 Å to 3.0 Å ( Figure 5). The maximum deviation is observed may due to the maximum number of rotatable bonds present in the ligand. Ligand RMSF plots were also generated to observe the fluctuations of the atoms of the ligands (Supplementary material, ESI Figure 4). The average fluctuation of Reference, UNPD283, UNPD92, UNPD24, UNPD17, UNPD116, UNPD398, and UNPD96 was 0.74 Å, 0.85 Å, 0.32 Å, 1.36 Å, 1.41 Å, 0.71 Å, 0.55 Å, and 0.63 Å respectively. Moreover, all the atoms which were involved in forming the bonding with the residues displayed lower fluctuations (Supplementary material, ESI Figure  4). Thus, the outcome depicts the acceptable range of the stability parameters, which defines the stable binding of the ligands.

Protein-ligand interactions
It is obvious from the protein-ligand interaction plot obtained in the ESI Figure 5 (Supplementary material), that there are several contacts witnessed during the simulation. These contacts incorporate all types of interactions formed during the simulation, i.e. hydrogen bonding, hydrophobic, ionic, or water bridges. The number of contact formed by Reference, UNPD283, UNPD92, UNPD24, UNPD17, UNPD116, UNPD398, and UNPD96 were in the range of 0-8, 3-9, 3-9, 0-12, 0-9, 0-6, 0-12, and 0-8 respectively. These contacts formed in between the ligands and protein depict the stable dynamic behavior of the docked natural products as explained by ligand RMSD and RMSF plots. Thus, all the ligands display stable binding within the active site of the ENR protein.
3.5. Binding free energy calculations and residue wise energy contribution The binding free energy of all the complexes was calculated by using the MM-GBSA method. The estimated value of DG bind calculated for Reference, UNPD283, UNPD92, UNPD24, UNPD17, UNPD116, UNPD398, and UNPD96 was À123.75 ± 7.37 kcal/mol, À127.87 ± 6.26 kcal/mol, À141.97 ± 7.86 kcal/mol, À121.42 ± 7.36 kcal/mol, À113.28 ± 6.51 kcal/mol, À134.11 ± 8.41 kcal/mol, À134.59 ± 8.46 kcal/mol, À128.20 ± 5.84 kcal/mol respectively. Therefore, from the outcome, it was observed that in comparison to the reference, UNPD92, UNPD398, UNPD116, UNPD96, and UNPD283 showed the highest protein-ligand binding energy. The overall energy order energy order was found to be UNPD92 > UNPD398 > UNPD116 > UNPD96 > UNPD283 > Reference > UNPD24 > UNPD17. To know the contribution of the important residues in binding with the ligands, per-residue wise energy decomposition was conducted by using the Prime-MMGBSA V 4.0 module of maestro (Li et al., 2011) at default parameters. ESI Figure 6 (Supplementary material) displayed the energy-wise plots generated from the crucial binding site residues of the PfENR protein. As depicted in the literature, interaction with Tyr277 played a major role in showing the inhibition of the enzyme. From the ESI Figure 6 (Supplementary material), it is clear that in comparison to the Reference, the contribution of Tyr277 to total binding energy increases in UNPD283, UNPD92, UNPD24, UNPD116, UNPD398, and UNPD96 complex, however, in the case of UNPDF17, the interaction was absent. Apart from the interaction with Tyr277, it was also reported that the ligand must form interaction with the oxidized cofactor of the ENR enzyme, i.e. NAD (Nad501) (Massengo-Tiass e & Cronan, 2009) to hinder its function. On observing the contribution of Nad501 towards the DG bind (Supplementary material, ESI Figure 6), it was witnessed that five inhibitors (UNPD283, UNPD92, UNPD116, UNPD398, and UNPD96) out of seven displayed higher contribution in comparison to the Reference. Therefore, taking into account of the presence of crucial interactions (Tyr277 and Nad501), and DG bind , we can say that 5 (UNPD283, UNPD92, UNPD116, UNPD398, and UNPD96) out of 7 candidates can be considered as potent PfENR inhibitors.

Conclusions
In this work, multicomplex-based pharmacophore modeling was applied on the PfENR proteome. Later the study was combined with molecular docking, HYDE assessment, molecular dynamics simulations, MM-GBSA calculations, and residue-wise contribution analysis to identify the natural product inhibitors against the ENR enzyme. To conduct the multicomplex-based VS, the UNPD database was used in this study which was prepared as per the drug-likeness properties and acceptable ADMET parameters. Based on the validation studies conducted on the generated pharmacophores, eleven hypotheses were selected to conduct the VS process. After screening, a sum of 570 molecules was selected based on the fit value to perform molecular docking studies. In the binding studies, it was observed that out of 570, only 346 were able to form bonding in the active site of the ENR protein. On further evaluation, it was observed that only 16 displayed a higher docking score than the reference, and among 16, only 7 were involved in displaying interaction with the crucial amino acid Tyr277. As reported in the literature the interaction of the molecules with Tyr277 enhances the possibility of the molecule being a potent ENR inhibitor. Further, they were subjected to HYDE assessment, and HYDE analysis showed that all the ligands can be easily ligated in the binding pocket of the enzyme. On conducting QSAR studies on these molecules, the predicted inhibitory activity was calculated, which showed that these compounds dislay the activity similar to the Reference. Thereafter, molecular dynamics simulations were conducted on these seven candidates along with the reference and apo-protein. It was observed that the protein and ligand RMSD and RMSF plots displayed stable behavior during the simulation. Moreover, SSE elements depict the decrease in the fluctuation of the protein backbone, thus, indicating the stability in the presence of the docked candidates. After dynamics, the free binding energy and residue-wise calculations were conducted on the seven docked molecules. From the outcome, it was observed that in the case of UNPD283, UNPD92, UNPD116, UNPD398, and UNPD96, the interaction with Tyr277 and Nad501 cofactor was witnessed, and their contribution was higher in comparison to the Reference to the total binding free energy, which makes these compounds suitable for the inhibition of the ENR enzyme. Moreover, these compounds are structurally diverse from each other, and they can be utilized as the template for the development of ENR inhibitors. However, in vitro and in vivo testing of the compounds is needed to confirm the results. This methodology may be applied to the screening of various chemical repositories, thus, making it useful for scaffold building of the potential candidates.