Screening of phytochemicals as potential anti-breast cancer agents targeting HER2: an in-silico approach

Abstract Breast cancer is the most common cancer among women around the world. Human Epidermal growth factor Receptor-2 (HER2) is a membrane tyrosine kinase overexpressed in 30% of human breast cancers; thus, it serves as an important drug target. Currently available HER2 inhibitor lapatinib targets the ATP binding site of the cytoplasmic kinase domain, blocking autophosphorylation and activation of HER-2. However, it causes side effects like diarrhea, nausea, rash and possible liver toxicity. As phytochemicals have fewer side effects and are relatively affordable, they offer an effective alternative. Hence, we aimed to identify potential phytochemicals that could act as HER2 inhibitors employing computational methods such as molecular docking, molecular dynamic simulation, and ADMET prediction. Out of 1500 phytochemicals docked to the ATP binding site of the HER2 kinase domain, luxenchalcone, rhinacanthin Q, subtrifloralacton D, and 7,7″-dimethyllanaraflavone exhibited higher binding affinity than the reference inhibitor and satisfied the Lipinski’s rule of five. Analysis of molecular dynamics simulation trajectory showed that Rhinacanthin Q, subtrifloralacton D, and 7,7″-dimethyllanaraflavone formed a stable and compact complex without vast conformational fluctuations. MM/PBSA binding free energy analysis revealed that Rhinacanthin Q, subtrifloralacton D, and 7,7″-dimethyllanaraflavone have high binding affinity to HER2. Therefore, Rhinacanthin Q, subtrifloralacton D, and 7,7″-dimethyllanaraflavone could be potential bioactive molecules to act as inhibitor of HER2 protein. Eventually, experimental studies are needed to evaluate the potentials of these phytochemicals further. The development of drug for HER2 positive breast cancer could be accelerated with the findings of our research. Communicated by Ramaswamy H. Sarma


Introduction
Breast cancer is one of the most prevalent and lethal cancers boasting a high death rate among women (Benson & Jatoi, 2012). An estimated 2.26 million new instances of female breast cancer were identified in 2020 with 685,000 breast cancer-related deaths (GLOBOCAN, 2020). Normally there are various signaling pathways such as estrogen receptor pathway, Human Epidermal growth factor Receptor 2 (HER2) pathway and Wnt/b-catenin pathway which control normal breast growth by regulating cell proliferation, cell death and cell differentiation (Feng et al., 2018). When signaling molecules such as estrogen receptor, progesterone receptor, androgen receptors, and HER2 are overexpressed or underexpressed, it results in an aberrant cell cycle leading to cancer (Aftimos et al., 2018).
HER2 is a part of the human epidermal growth factor receptor family of tyrosine kinases (Iqbal & Iqbal, 2014), which has an extracellular ligand-binding region, a transmembrane region, and a cytoplasmic kinase domain (Hsu & Hung, 2016). The kinase domain of HER2 is activated when a ligand binds to the extracellular portion of the protein (Wieduwilt & Moasser, 2008). This results in the activation of intracellular signaling pathways essential in cell proliferation, such as the Mitogen-Activated Protein Kinase (MAPK) and the Phosphatidylinositol 3-Kinase (P13K)-Akt pathways (Schlessinger, 2004).
Since molecular studies have shown that HER2 is expressed at low levels in normal cells, it is a desirable drug target (Furrer, 2018). Trastuzumab and lapatinib are the two most successful drugs targeting HER2. Trastuzumab is a monoclonal antibody, which binds to the extracellular domain of HER2 (Bartsch et al., 2007) whereas lapatinib is a small molecule inhibitor that competes with ATP for the HER2 protein kinase domain (Scaltriti et al., 2009). Although trastuzumab can reduce HER2 expression, it has been reported to cause cardiotoxicity and medication resistance (Guarneri et al., 2006) while lapatinib treatment can cause adverse side effects such as diarrhea, rash, nausea and possible liver toxicity (Amir et al., 2010;Geyer et al., 2006).
The current drugs' side effects prompted us to look for an alternative to discovering new drugs. Since phytochemicals are reasonably safe and economical, they offer an effective and feasible alternative (Jacobs, 2018). This study aims to identify potential phytochemicals that could act as HER2 inhibitors employing computational methods such as molecular docking, pharmacokinetics and molecular dynamic simulation studies.

Target protein preparation
The crystal structure of HER2 protein was taken from the RSCB Protein Data Bank (PDB ID: 3PP0). The co-crystallized ligand 03Q and water molecules were removed with the help of Discovery Studio Visualizer. The missing amino acid residues were added using MODELLER (Eswar et al., 2006). The missing sidechain atoms were added with the help of Swiss-PdbViewer (Guex & Peitsch, 1997). Polar hydrogens and Kollman charges were added, and the protein was converted into PDBQT format using AutoDock tools (Morris et al., 2009). Swiss-PdbViewer was employed to perform protein structure minimization in order to relieve the steric clashes caused by the addition of hydrogen atoms and missing residues.

Preparation of ligands
NPACT (Naturally Occurring Plant-based Anti-cancer Compound-Activity-Target) database was downloaded to be used as a ligand library (Mangal et al., 2013). 03Q was added to the library as a reference ligand. Open Babel was implemented to convert the ligands into 3D PDBQT format as well as to add polar hydrogens (O'Boyle et al., 2011). Open Babel was used to minimize ligands using conjugate gradient algorithm and MMFF94 force field (Halgren, 1996).

Molecular docking
The interaction between ligands and protein was studied using a molecular docking tool, AutoDock Vina (Trott & Olson, 2010). The ligands were docked into the ATP binding site of the HER-2 kinase domain. The center coordinates (x, y, z) of binding sites were acquired through Discovery Studio Visualizer as (35.18, 44.25, À10.96). The dimensions of the grid box were set as X: 26, Y:26, and Z:26 (unit of dimensions is Å). The exhaustiveness of the search was set to 20, and the number of CPUs to use was set to 12. The number of binding modes that could be generated was limited to 20. Moreover, the maximum energy difference between the best and worst binding modes was set to 4. Validation of docking protocol was done by performing the docking of the co-crystallized ligand at the active site of the protein. Therefore, docking was performed with reference molecule 03Q, and the RMSD value between experimental and docked reference was calculated to validate the docking protocol. The analysis of docking result was carried out using Discovery Studio v19.1.0.18287 (Biovia, 2017). parameters include gastrointestinal absorption, blood barrier permeant, Cytochrome P50 3A4 (CYP3A4) inhibition and Cytochrome P50 2D6 (CYP2D6) inhibition. All these parameters were predicted using open web servers pkCSM (Pires et al., 2015), SwissADME (Daina et al., 2017) and PreADMET (https://preadmet.bmdrc.kr/).

Toxicological predictions
The phytochemicals were also tested for mutagenicity through Ames toxicity and hepatotoxicity using PreADMET server. The tests were subjected to being plausible in mammals and can alert us about the potential toxicological effects of the compounds (Ramos et al., 2020).

Prediction of the cardiotoxicity
The prediction of cardiotoxicity of the selected ligands was done using the PreADMET web server. The prediction strategy for the cardiac toxicity risk depends upon the inhibition property of the human Ether-a-go-go-Related Gene (hERG) (Lee et al., 2019). PreADMET web server generates cardiotoxicity alerts with the classification of low risk, medium risk, high risk, and ambiguous for the hERG inhibition property (Radchenko et al., 2017).

Prediction of toxicity lethal dose (LD50)
Toxicity lethal dose prediction was carried out using the webserver Protox-II (http://tox.charite.de/protox_II). The prediction mechanism is the analysis of 2D similarity to compounds with known LD50 values and the identification of fragments over-represented in toxic compounds (Supandi & Merdekawati, 2018). The results are depicted as average lethal dose (LD50) in mg/kg of weight with the toxicity classes of I, II, III, IV, V, and VI. VI being non-toxic to I being highly lethal when ingested (Banerjee et al., 2018).

Molecular dynamics simulation
Best docked ligands following Lipinski's rule of five were taken for molecular dynamics simulation analysis using GROMACS 2019.6 (Spoel et al., 2005). Protein interactions were approximated by the CHARMM36 force field (Huang & MacKerell, 2013). Ligand parameters were generated using CHARMM General Force Field (CGenFF) (Vanommeslaeghe et al., 2010). The protein-ligand complexes were solvated in the dodecahedron box with simple point charge (SPC) water (Mark & Nilsson, 2001). 2 Cl À were added for the overall neutrality of the system. Energy minimization of the system was performed by using the steepest descent algorithm for 50,000 steps. Equilibration was performed in two 500 ps phases; the first phase was performed with a constant number of particles, volume, and temperature (NVT), whereas the second phase was performed with a constant number of particles, pressure, and temperature (NPT). Temperature and pressure were set at 300 K and 1 bar, respectively, and controlled by a modified Berendsen thermostat and Berendsen barostat (Berendsen et al., 1984). Covalent bonds were constrained using the LINear Constraint Solver (LINCS) algorithm (Hess et al., 1997). A cut-off distance of 1.2 nm was used for calculations of Lennard jones and Coulomb interactions (Halliday et al., 2013;Lennard-Jones, 1931). Long-range electrostatics was calculated using Particle Mesh Ewald (PME) method (Essmann et al., 1995). The final production step of molecular dynamics simulation was carried out for 100 ns, each step of 2 fs. Root mean square deviation (RMSD), root mean square fluctuation (RMSF), Radius of gyration (Rg), number of hydrogen bonds, and Solvent Accessible Surface Area (SASA) were calculated to analyze the results from the simulation.
2.6. MM/PBSA (molecular mechanics/Poisson-Boltzmann surface area) binding free energy analysis The binding free energy DG bind was calculated by MM/PBSA approach through the g_mmpbsa script for the selected molecules in order to investigate binding affinity toward the HER2 protein (Kumari et al., 2014). The MD simulation trajectory of last 10 ns was considered for the DG bind calculation. The detailed procedure of DG bind is described below. Equation (1) was used to calculate the DG bind .
Here, the G complex represents the total free energy of protein and ligand complex. Individual binding energy of protein and ligand in the solvent can be defined as G protein and G ligand , respectively. The separate binding energy of each component, such as complex (G), protein (G) and ligand (G), is calculated as per Equation (2).
The E MM describes the average molecular mechanics (MM) potential energy in a vacuum. The temperature and entropy are represented by the T and S, respectively. The free energy of solvation is denoted by G solvation . Further, the E MM is calculated using Equation (3).
The term, E bonded represents the bonded interaction such as bond length, bond angle and dihedral angle. On the other hand, the E nonbonded explains the nonbonding interactions, including the electrostatic and van der Waals interactions. The term G solvation in Equation (2) is the energy required to transfer a solute from a vacuum to a solvent. This term can be calculated by Equation (4).
The electrostatic and nonelectrostatic contribution to the solvation free energy is represented by the G polar and G nonpolar , respectively.
The last ten ns of data were used for the analysis. The frames were selected at a regular interval of 100 ps covering a wide range of trajectory to cover different conformational space of the docked complexes for better structure-function correlation (Padhi et al., 2020).

Validation of docking
Firstly, we validated the docking procedure before conducting virtual screening using Autodock Vina. Validation was done by finding how closely the lowest energy pose of the co-crystallized ligand (03Q) predicted by Autodock Vina resembles an experimental binding mode determined by X-ray crystallography. The superposition of the docked pose and the experimental binding pose presented in supplementary Figure S2 shows that the conformations of these two poses are similar. The RMSD between the heavy atoms of two poses is 1.1 Å. The previous study suggested that the RMSD value should be below 2.0 Å for the procedure to be reliable (Bourne et al., 2003).

Binding energy of phytochemicals
Docking analysis was performed on 1500 phytochemicals to find out the best candidate based on binding energy. The binding energy of the reference ligand (03Q) to the HER-2 ATP binding site was À11.2 kcal/mol, which was used as cutoff to filter phytochemicals. Twelve compounds exhibited a higher binding affinity to the receptor than the reference ligand (Table 1). Luxenchalcone had the highest binding affinity with binding energy of À12.2 kcal/mol.

Lipinski rule of five
The phytochemicals should have specific chemical and physical properties in order for them to be orally active drugs. Lipinski's rule of five was used to determine if the best phytochemicals have drug-likeness properties. Among twelve phytochemicals, the phytochemicals which comply with Lipinski's rule were luxenchalcone, rhinacanthin Q, subtrifloralactone D and 7,7 00 -dimethyllanaraflavone. The molecular properties of these phytochemicals are presented in Table 2.

ADMET prediction
The ADMET properties of four compounds are summarized in the Table 3. Based on Ames toxicity, all phytochemicals were found to be non-toxic. Luxenchalcone was the only phytochemical to be hepatotoxic. The gastrointestinal absorption of rhinacanthin Q, subtrifloralactone D and 7,7 00dimethyllanaraflavone was high, whereas it was low for luxenchalcone. P-glycoprotein (P-gp) is a membrane transporter protein that acts as a physiological barrier by exporting drugs and toxins out of the cell (Finch & Pillans, 2014). Out of four phytochemicals, subtrifloralactone D was the only compound that was a substrate of P-gp. Furthermore, the blood-brain barrier which defends the brain from toxins and pathogens by allowing certain substances to cross from the bloodstream into the brain (Weiss et al., 2009) was taken into account to assess the permeability of the selected phytochemicals. BBB penetration is presented as concentration ratio of steady-state of radiolabelled compounds in the brain (Cbrain) and peripheral blood (Cblood). BBB permeability of luxenchalcone (0.052), rhinacanthin Q (0.026) and subtrifloralactone D (0.087) were relatively lower than that of 7,7 00dimethyllanaraflavone (0.55). Cytochrome p50 enzymes are required for the metabolism of drugs and foreign substances (Lynch & Price, 2007). Lynch & Price have also mentioned that CYP3A4 and CYP2D6 are the most significant cytochrome p50 enzymes. Our four selected phytochemicals do not inhibit CYP3A4 and CYP2D6.
hERG inhibition is another important parameter to assess the toxic property of compounds (Lee et al., 2019). Drug compounds may create an undesirable blockage of the K þ ion channel of the human ether-a-go-go-related gene (hERG). This blockage results in long QT syndrome (LQTS): a lethal cardiac side effect (Bhat & Houghton, 2018). While assessing the hERG inhibition, 7,7 00 -dimethyllanaraflavone, rhinacanthin Q and luxenchalcone possessed a medium risk of cardiotoxicity while subtrifloralactone D showed ambiguous nature. It is clear to see that subtrifloralactone D requires further investigation on its hERG inhibition to assess the cardiotoxic effect.
While assessing the oral lethal dose via Protox-II server (Table 4), rhinacanthin Q and subtrifloralactone D have LD50 value of 500 mg/kg and 1000 mg/kg respectively with both possessing a toxic class of IV which is considered harmful if ingested. Meanwhile, 7,7 00 -dimethyllanaraflavone and luxenchalcone have an LD50 value of 4000 mg/kg and 2100 mg/ kg, respectively, with both being classified as a class V which may be or may not be toxic when swallowed. This implies that more preferable clinical studies, should be conducted before declaring these phytochemicals as potential antibreast cancer leads. Two compounds 7,7 00 -dimethyllanaraflavone and luxenchalcone both being in the class V toxicity, could be non-toxic when orally ingested. However, our studies are not enough to validate this statement. Hence further wet lab tests should be carried out to investigate its toxic nature.

Molecular interactions of best phytochemicals with the HER2 kinase domain
Non-covalent interactions between the four phytochemicals following Lipinski rule of five and HER2 kinase domain are listed in Table 5. These interactions are presented in 2D and  Figures 1 and 2, respectively. 7,7 00 -dimethyllanaraflavone showed H-bond interactions with SER728, LYS753 and ASP863 amino acid residues. Similarly, luxenchalcone exhibited H-bonding with PHE731, GLY732 and LYS753 amino acid residues. Likewise, rhinacanthin Q formed Hydrogen bonds with amino acid residues SER783, THR798 and THR862. Subtrifloralactone D also interacted with the protein through hydrogen bonding with the same amino acid residues as that of rhinacanthin Q along with one more SER783 residue. Meanwhile 03q formed hydrogen bonds with residues MET801 and ASP863. All four ligands formed an average of 3 Hydrogen bonds with the HER2 protein with subtrifloralactone D forming 4 H-bonds. These hydrogen bond interactions are also complemented by strong hydrophobic interactions. Hydrophobic interactions help keep the protein stable as well as maintain its biological activity. These interactions decrease the surface area of the protein structure and reduces the unwanted effects with water (Kellis et al., 1988). From Figures 1 and 2, we observed that the ligands occupied the ATP binding site of the protein and made non-covalent interactions with the residues of catalytic regions like DGF motif, P-loop and a-helix C. Residues of non-catalytic regions were also involved in interactions with ligands. 03Q interacted with ASP863 of the DFG motif through conventional hydrogen bond formation. 7,7 00 -dimethyllanaraflavone formed conventional H-bond with SER728 of P-loop and ASP863. Luxenchalcone formed conventional H-bond with two residues from P-loop: PHE731 and GLY732. Rhinacanthin Q and subtrifloralactone D did not form any conventional hydrogen bonds with important catalytic regions of the HER2 kinase domain. Numerous hydrophobic interactions were also observed between the phytochemicals and the catalytic residues. Four selected phytochemicals and 03Q formed pi-alkyl interaction with VAL734 of P-loop. LEU726, another P-loop residue, formed pi-alkyl interaction with all phytochemicals except 7,7 00 -dimethyllanaraflavone. Luxenchalcone interacted with GLY729 of P-loop through   conventional hydrogen bonds or establishing hydrophobic interactions, all of the four ligands depicted strong interactions with the catalytic region of the protein. Since these ligands have a high binding affinity with the protein which is also supported by the interaction findings, they can be viewed as a potential inhibitor for the protein.

Molecular dynamics simulation
Best four phytochemicals following the Lipinski rule of five were taken for the molecular dynamics analysis. We calculated the RMSD value for the backbones of all residue to verify the stability of complexes ( Figure 3A). 03Q complex showed a stable RMSD throughout the simulation with an average RMSD value of 0.28 nm ( Table 6). The RMSD of the 7,7 00 -dimethyllanaraflavone complex reached 0.4 nm within five ns. However, it decreased and then became stable till the end (average RMSD $ 0.35 nm). The RMSD value of the luxenchalcone complex remained stable for initial 35 ns with an average RMSD of $ 0.275. The value increased to $ 0.35 nm, where it stabilized for a short time before showing fluctuations. However, the value attained equilibrium at around 80 ns which was maintained till the end (average RMSD $ 0.31 nm). The RMSD of the subtrifloralactone D complex increased steadily to $0.39 nm during the initial 18 ns of simulation and remained unstable till 50 ns. However, the RMSD attained stability after 50 ns with an average value of $0.33 nm. Rhinacanthin Q complex had a stable RMSD throughout the simulation (average RMSD ¼ 0.32 nm).
We calculated the RMSF of the backbone atoms of residues in protein-ligand complexes to inspect the flexibility of the protein backbone ( Figure 3B). All complexes presented similar RMSF graph. N-terminal and C-terminal amino acids displayed high fluctuations in all protein-ligand complexes. Other residues with high flexibility in all complexes were residues from loop regions such as PHE731, GLU744, ARG756, LYS762, LYS883, and ARG929.
The radius of gyration was calculated to evaluate the overall compactness of the ligand-bound complex structure of the HER2 kinase domain ( Figure 3C). Rhinacanthin Q complex, subtrifloralactone D complex, and 03Q complex exhibited stable Rg throughout the simulation with an average Rg value of 2.03 nm, 2.04 nm, and 2.03 nm, respectively ( Table  6). The radius of gyration of the 7,7 00 -dimethyllanaraflavone complex reached 2.1 nm within five ns. However, it dropped quickly and attained equilibrium, which was maintained until 100 ns (average Rg $ 2.03 nm). The luxenchalcone complex showed a stable Rg for the first 35 ns with an average value of $2.01 nm. However, the Rg fluctuated till 75 ns, which might be due to conformational change in the complex structure. The Rg achieved stability after 75 ns which was maintained till 100 ns with an average value of around 2.02 nm.
We also examined the number of hydrogen bonds between the ligands and the receptor ( Figure 3D). Subtrifloralactone D formed one to four hydrogen bonds with the receptor; however, three hydrogen bond interactions were stable, which could be seen consistently throughout the simulation. Rhinacanthin Q and 7,7 00 -dimethyllanaraflavone formed one or two hydrogen bonds with the receptor, out of which one hydrogen bond was observed more consistently. The hydrogen bond plot of luxenchalcone showed a maximum of seven hydrogen bonds and a minimum of one hydrogen bond.
SASA is the surface area of protein available for the surrounding water to interact with the protein ( Figure 3E). The average SASA values of complex formed by luxenchalcone, rhinacanthin Q, 7,7 00 -dimethyllanaraflavone, subtrifloralactone D, and 03Q were 162.19 nm 2 , 160.63 nm 2 , 160.12 nm 2 , 159.39 nm 2 , and 159.17 nm 2 , respectively ( Table 6). The SASA value of the luxenchalcone complex was the highest, which indicated the slight expansion of the protein volume when luxenchalcone was bound to it. The reference complex had the lowest SASA value, suggesting a negligible expansion of the HER-2 kinase domain. The complexes of rhinacanthin Q, 7,7 00 -dimethyllanaraflavone, and subtrifloralactone D had lower SASA values similar to that of the reference complex.
Further, we employed Visual Molecular Dynamics (VMD) software (Humphrey et al., 1996) to analyze the hydrogen bond formation during MD simulation. Only hydrogen bonds with occupancy > 5% are listed in Table 7. The majority of hydrogen bonds seen during docking (Table 5) were also observed during MD simulation.

Interactions obtained from molecular dynamics simulation
To check the stability of the selected molecules in the binding pocket of HER2, we extracted MD simulation conformations at different intervals and visualized the interactions between protein and ligands (supplementary Figure S5). The detailed information on the interaction is illustrated in Table 8. The superposition of the ligand conformation at the beginning (10 ns) and at the end (90 ns) of the simulation can give us information about the stability of the complex. The orientation of 03q at 90 ns was more or less similar to its orientation at 10 ns with H-bonding from residue ASP863 missing at the end of the simulation. Meanwhile the residue MET801 was seen to be forming H-bonds at both the intervals implying that the interaction is conserved throughout the simulation. This can be further supported by the hydrogen occupancy data mentioned earlier (Table 7) where MET801 had a hydrogen bond occupancy of 55.38% in 03Q. According to these data, MET801 is a conserved amino acid residue for Hbonding in 03Q. Since 03Q is the reference ligand, we can confer its magnitude of orientation and conformation with other ligands to comparatively analyse their stability.
Subtrifloralactone D has similar conformation during both 10 ns and 90 ns (supplementary Figure S5). THR862 and ASP863 are the residues depicted to be conserved sites for conventional H-bonding interactions as they are seen interacting with subtrifloralactone D during 10ns as well as 90ns. Hydrogen occupancy study also complements this finding as the conserved residues ASP863 and THR862 both have high H-bond occupancy of 72.12% and 42.16% respectively. Hydrophobic interactions with VAL734, LYS753, MET774, LEU785, LEU796 are present in both 10 ns and 90 ns.
Rhinacanthin Q is seen to have almost no change in its orientation or conformation as depicted in supplementary Figure S5. The conventional hydrogen bond interaction regarding Rhinacanthin Q involves the residues THR862, SER783 and ASP863. Here interaction with THR862 is seen to be conserved meanwhile SER783 and ASP863 are seen to be interacting exclusively at 10 ns and 90 ns respectively. Hydrogen bond occupancy studies shows THR862 having 40.41% of H-bond occupancy in Rhinacanthin Q complex. As for its hydrophobic interactions, as depicted in the Table 8, interactions with VAL734, ALA751, LYS753 and LEU852 are conserved.
Luxenchalcone is seen here to be the odd ball among the four ligands. Its conformation and orientation at 10 ns are different from the one obtained at 90 ns. This implies a slight discomfort in its binding stability but nevertheless the luxenchalcone complex is depicted to be stable since there are some instances of conserved conventional H-bonding interactions with residues GLY729 and PHE731. Other residues involved in conventional H-bonding in luxenchalcone include MET801 and ASP863 at 10 ns and GLY732, SER783 and GLU770 at 90 ns. Also, some hydrophobic interactions in luxenchalcone are conserved throughout the simulation. GLY727, ALA751, LEU796 and LUE852 are the conserved residues involved in hydrophobic interaction in luxenchalcone. Finally, 7,7 00 -dimethyllanaraflavone has conserved conventional H-bond interactions with residues LYS753 and ASP863. According to supplementary Figure S5, 7,7 00 -dimethyllanaraflavone has the same orientation and conformation at both 10 ns and 90 ns. These data are further complemented by the Hydrogen occupancy study where both LYS753 and ASP863 have high H-bond occupancy of 24.21% and 30.76% respectively. Also, its hydrophobic interactions are conserved as VAL734 and ALA751 is visualized to form hydrophobic interactions in both 10 ns and 90 ns. In hindsight, all of the mentioned ligands have conserved interactions throughout the time frame of the simulation. They had more or less the same orientation and conformation at both 10 ns and 90 ns implying its binding stability with the protein 3.7. MM/PBSA calculations Our binding energy analysis from MD simulation trajectories have shown that rhinacanthin Q (À255.85 ± 27.62 KJ/mol), subtrifloralactone D (À268.91 ± 31.13 KJ/mol) and 7,7 00 -dimethyllanaraflavone (À249.07 ± 35.56 KJ/mol) have a higher binding affinity towards HER-2 inhibition site as compared to the native co-crystal ligand inhibitor 03Q (À229.67± À26.39 KJ/mol) as reflected in Table 9.
Interestingly, the ligand, luxenchalcone, that showed the highest binding affinity (À12.2 kcal/mol) during docking displayed the lowest binding affinity (À214.34 ± 39.67 KJ/mol) as per MM/PBSA free energy. This may be because luxenchalcone breaks away from protein frequently during the simulation. Results indicate van der Waals, electrostatic, and SASA energy negatively contribute to the total interaction energy while only polar solvation energy positively contributes to the total free binding energy. Considering the negative contribution, the contribution of van der Waals interaction is much more than that of the electrostatic interactions for all the cases. The contribution of SASA energy is relatively less as compared to the total binding energy. The high negative value of van der Waals energy also points to the massive hydrophobic interaction between the ligands and the HER-2 kinase domain.

Discussion
Computer-aided drug design (CADD) is essential in drug discovery because it can significantly reduce costs and labor. Molecular docking, molecular dynamics and pharmacokinetics tools have become important techniques in CADD because of their reliable prediction. Molecular docking predicts the interaction of the protein with the ligand and calculates the binding energy between them. In this study, 1500 phytochemicals were docked to the ATP binding site of the HER-2 kinase domain. Out of twelve phytochemicals exhibiting high binding affinity to the receptor than the reference ligand, only four were found to have druglikeness properties. The four compounds, along with their binding energies, are luxenchalcone (À12.2 kcal/mol), rhinacanthin Q (À11.6 kcal/mol), subtrifloralactone D (À11.4 kcal/mol), and 7,7 00 -dimethyllanaraflavone (À11.3 kcal/mol).
Non-covalent molecular interactions play a crucial role in shaping and stabilizing the docking complexes (Wade & Goodford, 1989). Conventional hydrogen bonds are strong non-covalent interactions that play a vital role in molecular recognition and overall stability of ligand-receptor complexes (Bulusu & Desiraju, 2020). Subtrifloralactone D formed four conventional hydrogen bonds with the receptor, while all other phytochemicals formed three conventional hydrogen bonds. The best four phytochemicals were also stabilized by hydrophobic interactions as they had multiple rings in their structure. Hydrophobic interactions are weaker than conventional hydrogen bonds; however, multiple hydrophobic interactions contribute significantly to the binding of ligand to the receptor.
In the experimental binding pose determined by X-ray crystallography, reference ligand 03Q interacted with two residues of the DFG motif: ASP863 and PHE864. A previous study reported a 50-fold improvement in inhibition of the enzyme when inhibitors targeted the DFG motif (Peng et al., 2013). Hence, ASP863 and PHE864 could be crucial residues involved in the inhibition of the HER-2 kinase domain. Both docking and MD results showed ASP863 forming hydrogen bond will all phytochemicals except luxenchalcone, where bond formed by subtrifloralactone D had the highest occupancy of 72.12%. The only interaction PHE864 formed with phytochemicals during docking was hydrophobic interaction with rhinacanthin Q. However, PHE864 formed hydrogen bond with subtrifloralactone D (occupancy ¼ 14.73%) and luxenchalcone (occupancy ¼ 21.06%) during MD simulation. All phytochemicals interacted with the residues of the P-loop through hydrophobic interactions. Luxenchalcone formed conventional hydrogen bonds with two residues of P-loop: PHE731 and GLY732 with occupancy 10.32% and 38.75%, respectively. As phytochemicals made interactions with crucial catalytic regions such as DFG motif and P-loop, they showed good potential in inhibiting HER2.
Comparing and analysing the interaction results obtained from docking and MD simulations, we can draw a few contrasting observations and information. 03Q formed conventional H bonding with the catalytic site of the protein at 2 amino acid residues namely MET801 and ASP863. But from our MD results, in Table 7, we observed that GLN799 also constitutes a significant portion when considering hydrogen bond occupancy in the protein ligand complex of 03q and the HER2 protein. Similarly, docking results (Table 5) did not show THR798 as a residue for conventional Hydrogen bonding with luxenchalcone which was contrastingly depicted in MD results (Table 7) where the residue constituted for a mere 5.39% of hydrogen bond occupancy. ASP863 was a major residue forming hydrogen bond in all the ligands except luxenchalcone during MD simulation (Table 7). However, ASP863 formed hydrogen bond with only 7,7 00dimethyllanaraflavone and 03Q during docking. The most notable contrast came while analysing the data of subtrifloralactone D. During MD simulation, hydrogen bond formed by ASP863 had a prominent 72.12% occupancy. Moreover, subtrifloralactone D had formed conventional hydrogen bond through ASP863 in 10 ns as well as 90 ns (Table 8). But during docking, subtrifloralactone D did not have ASP863 as a residue for H bonding. The MD results depicted THR863 as a residue forming H bond in subtrifloralactone D (Table 8) which was not observed during docking (Table 5).
From this, we can see that a few of our findings from the docking studies and molecular interactions do not align head to toe with the findings from the molecular dynamics simulation. Although most of the information from both the analysis conclude how these selected ligands form a good binding potential with the protein, some minor and a few major observations are conflicted when considering the information obtained from the simulation. This entails the importance of performing MD simulation. Without the results of MD simulation, we would have missed out on key data that explains the binding interaction between the ligand and the protein. In hindsight, the results obtained from docking studies alone is not enough for validating the binding potential of the ligand. Molecular docking provides static poses of the most favoured conformations of molecules in the binding pocket of a protein to present a stable complex. The static images are not able to present the other crucial features involved in providing stability to a protein. These features include the flexibility of residues and secondary structural elements (Purohit, 2014). The conformational changes arising from the dynamic behaviour of a protein might affect its actual biological functioning (Bhardwaj & Purohit, 2020). The actual movement and structural perturbations of a protein in its biological environment could be visualized by MD simulations. Overall information about the stability of protein backbone when bound with a ligand or small molecule can be analysed by RMSD parameter. Lower value of RMSD throughout the molecular dynamics simulation suggests the high stability of protein-ligand system and higher RMSD value infer comparatively low stability of the system. The reference complex (03Q) has the lowest average RMSD value (0.28 nm) among all ligands with very few fluctuations indicating that 03Q is stabilized inside the pocket and doesn't change its orientation in the active site of HER2. RMSD of luxenchalcone complex varied most of the time. When we visualized the trajectory of luxenchalcone complex in PyMOL, we found out that luxenchalcone broke away from the protein many times during the simulation which may have caused the RMSD to fluctuate. Rhinacanthin Q complex and 7,7 00 -dimethyllanaraflavone complex showed low fluctuations in RMSD like the reference complex. This minimal fluctuation in the RMSD trajectories depicts that these protein complexes were stable and comparable to experimental structures. RMSD of subtrifloralactone D complex was fluctuating during the first half of the simulation. We did not observe subtrifloralactone D breaking away from the protein; however, we observed that subtrifloralactone D had several conformational changes during the simulation, which explains the fluctuating RMSD value.
The RMSF value can describe changes in the conformation of the protein due to binding with ligand during MD simulation. Higher degrees of RMSF values infer the protein structure will attain greater flexibility. On the other hand, marginal flexibility of the protein-ligand system results in lower RMSF values. All the complexes showed similar average values as compared to reference complexes, inferring that the complexes are stable. The RMSF results represented that all complexes were thermodynamically stable, and hence, these predicted compounds have the potential to inhibit the catalytic activity of HER2 protein. The terminal residues exhibit highest fluctuation and can be clearly observed in Figure 3B. These residues are located in the flexible loop region of this protein. The rigid structures of protein like helix and sheets showed low RMSF value. Throughout the 100 ns simulation, the low RMSF value was found for amino acids from the catalytic DFG motif (ASP863, PHE864, GLY865). This describes the residues' reduced movements which prevented the binding site to undergo drastic structural changes and made it less susceptible to rearrangement. Since the shape of the site was not altered during the simulation, this allowed the ligands to stay and fully latch themselves inside the cavity.
The Rg parameter is defined as the mass-weighted root mean square distance of a set of atoms from their typical center of mass (Lobanov et al., 2008). Rg provides information about the overall dimensions and compactness of protein structure. Large variation in Rg value indicates that the protein-ligand system is not stably folded and compact throughout the simulation. Uniform variation of Rg value infers the compactness and stability of the protein during simulation. All phytochemical complexes exhibit an average Rg value that is similar to that of the reference complex (2.03 nm). However, some phytochemical complexes show big fluctuations in Rg, unlike the reference complex. The Rg value of 7,7 00 -dimethyllanaraflavone complex fluctuated greatly during the first 10 ns. This is because the complex was adapting to the environment and various conformational changes and folding occurred during this period. However, the stable nature of Rg from 10 ns onward suggests the stable folding of the protein along with the ligand. The luxenchalcone complex exhibits fluctuation in Rg throughout the simulation because the ligand luxenchalcone breaks away from protein several times, which may have hindered the complex to fold stably. The rhinacanthin Q complex and the subtrifloralactone D complex have shown stable Rg value throughout the simulation which is similar to that of the reference complex (03Q), suggesting that both complexes exist in stable folded conformation throughout the simulation.
The SASA information will be helpful to analyse, whether the ligand retained inside the shallow binding pocket or it expels out from the binding cavity. The value of SASA for the complexes of rhinacanthin Q (160.63 nm 2 ), 7,7 00 -dimethyllanaraflavone (160.12 nm 2 ) and subtrifloralactone D (159.39 nm 2 ) were similar to that of the reference complex (159.17 nm 2 ). However, the SASA of luxenchalcone complex (162.19 nm 2 ) was higher, indicating luxenchalcone may expel out of binding cavity during simulation.
MM/PBSA is the most commonly used methods to support the results of molecular docking. The binding energies measured by this method are more efficient than the AutoDock Vina scores. For every complex, negative overall binding energies were obtained indicating spontaneity of their binding. The MM/PBSA results confirmed that the identified potent molecules bind at the active site of HER-3 with a higher affinity as compared to reference (03Q) and results in the formation of a higher stable HER2-inhibitor(s) complex than HER2-03Q complex. We can observe from the results that the MM/PBSA energies do not necessarily follow the docking scores. The difference in the scores could be due to the inclusion of polar and non-polar solvation energies in the MM/PBSA method. Luxenchalcone had the highest binding affinity in docking results; however, MM/PBSA free energy analysis show that it has the lowest binding affinity. Subtrifloralactone D, 7,7 00 -dimethyllanaraflavone and rhinacanthin Q show good binding affinity to HER-2 as depicted by low MM/PBSA binding free energy. Van der Waals interactions showed more impact on the binding of HER-2 kinase domain to the selected compounds. Therefore, it suggests the importance of the non-covalent interactions in these complexes.
In order to be a promising drug candidate, the phytochemicals should have good pharmacokinetic properties. Since, subtrifloralacton D is the substrate of P-gp, it is effluxed out of cells, reducing its oral bioavailability. However, subtrifloralacton D could be derivatized to improve its pharmacokinetics. All compounds were not found to inhibit p50 enzymes CYP3A4 and CYP2D6. Further, blood brain barrier permeability evaluation depicted low values for all ligands except 7,7 00 -dimethyllanaraflavone which had a BBB value of about 0.55. 7,7 00 -dimethyllanaraflavone performed poorly regarding the BBB permeability prediction while other ligands proved to be good drug candidates which do not permeate through the brain.
The cardiotoxicity assessment through hERG inhibition requires further investigations especially since subtrifloralactone D showed ambiguous nature whereas the other three ligands showed medium risk of cardiotoxicity. Finally, the oral lethal dose prediction data helped to identify that subtrifloralactone D and rhinacanthin Q were toxic when swallowed whereas luxenchalcone and 7,7 00 -dimethyllanaraflavone, both being in the LD50 classification of V, were either toxic or non-toxic when ingested. This means that luxenchalcone and 7,7 00 -dimethyllanaraflavone requires further studies regarding its oral toxicity and if they turn out to be non-toxic, then they can be treated as oral drugs. Conclusively, all four ligands do not classify as class VI of LD50 classification which is the ideal qualification for a drug candidate as it identifies as being non-toxic. However, being classified as a class IV or class V suggests that these compounds are orally toxic when introduced in relatively large amounts considering the standard dosages of oral drugs. This would mean a possibility that the four ligands can be used in very low dosages to compensate for their toxicity.

Conclusion
In summary, molecular docking, drug-likeness filter, molecular dynamics, and pharmacokinetics are successfully employed to determine the best phytochemicals against the HER-2 kinase domain. ASP863 and PHE864, the residues from the DFG motif, play an essential role in inhibiting the HER-2 kinase. Rhinacanthin Q showed high binding affinity (both autodock vina and MM/PBSA) and formed stable complex with HER-2 as depicted by RMSD, RMSF and Rg analysis. Although luxenchalcone has the highest binding affinity as per docking results, MM/PBSA binding energy is the lowest. Moreover, it doesn't form stable complex with HER-2 as depicted by RMSD, RMSF and Rg analysis. Subtrifloralacton D and 7,7 00 -dimethyllanaraflavone exhibited good binding affinity during docking as well as MD simulation. Both formed stable complex with HER-2 as portrayed by RMSD, RMSF and Rg analysis. Therefore, rhinacanthin Q, subtrifloralacton D, and 7,7 00 -dimethyllanaraflavone show promise and can be used to design effective anticancer drugs against HER2 kinase domain. However, further experimental studies are required to better assess these phytochemicals. Rhinacanthin Q can be extracted from the roots of the plant Rhinacanthus nasutus (snake jasmine) (Wu et al., 1998). Deprea subtriflora species of the Solanaceae family contains abundant subtrifloralactone D in its extract (Su et al., 2003). 7,7 00 -dimethyllanaraflavone can be found in the plant extracts of the Ouratea and Luxemburgia genus (de Carvalho et al., 2007).