In silico identification of small molecule protein-protein interaction inhibitors: targeting hotspot regions at the interface of MXRA8 and CHIKV envelope protein

Abstract Chikungunya virus (CHIKV) is an arthritogenic arbovirus responsible for re-emerging epidemics of Chikungunya fever around the world for centuries. Chikungunya has become endemic in Africa, Southeast Asia, the Indian subcontinent, and subtropical regions of the Americas. The unavailability of antiviral therapy or vaccine against the CHIKV and its continuous re-emergence demands an urgent need to develop potential candidate therapeutics. CHIKV entry into the host cell is mediated by its envelope proteins engaging the cellular receptor MXRA8 to invade the susceptible cells. We report here two essential target binding sites at the CHIKV E1-E2 proteins by identifying hotspot regions at the E1-E2-MXRA8 binding interface. Further, we employed high throughput computational screening to identify potential small molecule protein-protein interaction (PPI) modulators which could effectively bind at the identified target sites. Molecular dynamics simulations and binding free energy calculations confirmed the stability of three compounds, viz., ZINC299817498, ZINC584908978, and LAS52155651, at both the predicted interface binding sites. The polar and charged residues at the interface were responsible for energetically holding the ligands at the binding sites. Altogether, our findings suggest that the predicted target binding sites at the E1-E2 dimer could be essential to block the receptor interaction as well as the fusion process of the CHIKV particles. Thus, we identified a few small molecule PPI inhibitors with great potential to block the E1-E2-MXRA8 interaction and act as promising templates to design anti-CHIKV drugs. Communicated by Ramaswamy H. Sarma


Introduction
Chikungunya virus (CHIKV) is a re-emerging arthropod-borne alphavirus belonging to the Togaviridae family of viruses. It was first identified in Tanzania in 1952 following an outbreak on the Makonde plateau and since then has caused frequent outbreaks in tropical countries of Africa and Southeast Asia (Robinson, 1955). It re-emerged in Kenya in 2004, widely spreading to the La reunion island and the Indian ocean. The epidemic then spread from the Indian Ocean to India and affected more than 1.5 million people in 2005-2006. However, the recent re-emergence of the virus has been documented in Western countries and temperate zones around the world (Geographic Distribution j Chikungunya virus j CDC). In 2014 over 1 million suspected cases of Chikungunya infection were reported to the Pan American Health Organization (PAHO). Overseas travel was identified as one of the major risk factors for the rapid global spread of the disease. Currently, chikungunya has been identified in over 60 countries throughout Asia, Africa, Europe, and the Americas. The most recent outbreaks are reported in Sudan (2018), Yemen (2019), and more recently in Cambodia and Chad (2020) (Chikungunya Fact Sheet). CHIKV transmission mainly occurs through the bite of an infected Aedes aegypti and Aedes albopictus mosquitoe, although cases reporting vertical transmission of the virus have also been reported (G erardin et al., 2008). Chikungunya infection is characterized by severe arthralgia accompanying fever, headache, rashes, joint swelling, and conjunctivitis. The infection is sometimes associated with severe neurological, cardiovascular, and haemorrhagic manifestations (Moiz eis et al., 2018).
CHIKV is an enveloped virus owning a single strand positive sense RNA of about 11.8 kb in length. Its genome encodes two polyproteins, the non-structural proteins (nsP1-4) and structural proteins. The structural polyprotein precursor is proteolytically cleaved into a capsid protein (C), small polypeptides 6K, E3, and the envelope glycoproteins E1 and E2 (Li et al., 2012). The CHIKV invasion by endocytosis into the host cell is mediated by the E1 and E2 glycoproteins, organized as trimers of heterodimers to form an icosahedral symmetry at the virion surface (Voss et al., 2010). These two glycoproteins, particularly E2, carry the main antigenic determinants and is responsible for receptor binding, whereas the E1 protein has a hydrophobic fusion loop at one end to mediate the membrane fusion. After the receptor binding, the virion is endocytosed and the low pH of the endosome induces an irreversible conformational change in the E1-E2 heterodimer; as a result, E2 dissociates and E1 forms homotrimers exposing the fusion loop (FL) to initiate the fusion of viral membrane and the endosomal membrane (Li et al., 2010). Hence, the envelope proteins of alphaviruses are classified as class II fusion proteins (Kielian et al., 2010;Kielian, 2006).
The foremost step in a viral infection is the binding of the virus to its host cell receptor. Since CHIKV infects a wide range of cell types, its cellular receptor may be expressed ubiquitously among various cell types (van Duijl-Richter et al., 2015). Several studies have been carried out to determine the receptor and attachment factors for CHIKV (Schnierle, 2019). Glycosaminoglycans, prohibitin, and phosphatidylserine-mediated virus entry enhancing receptors are among the previously identified CHIKV receptors (Silva et al., 2014;Wintachai et al., 2012;Moller-Tank & Maury, 2014). A study published in the year 2018 has identified MXRA8 (Matrix Remodelling Associated 8) as a cellular receptor for multiple arthritogenic alphaviruses, such as Chikungunya, Ross River, Mayaro, and O'nyong nyong viruses (Zhang et al., 2018). However, recently the molecular basis of the MXRA8 binding to the envelope protein of CHIKV has also been elucidated, along with the crystal structure of mouse MXRA8 and human MXRA8 (hMXRA8) in complex with the CHIKV envelope proteins (Song et al., 2019). The authors have described the structure of hMXRA8 and the atomic details of the interaction between the receptor and the CHIKV E1-E2 protein. MXRA8 is an adhesion molecule expressed in a diverse set of cells, for instance, epithelial, myeloid, and mesenchymal cells (Yonezawa et al., 2003;Jung et al., 2008). It has two Ig-like domains (D1 and D2) connected with two hinge loops and an interdomain disulphide bond. The receptor MXRA8 binds with the CHIKV E protein involving both its domains and the hinge loops, interacting with the residues from both E1 and E2 ( Figure 1a) (Song et al., 2019). A single MXRA8 protein binds to two adjacent E1-E2 dimers at distinct sites. Although the interface between the receptor and a single E1-E2 dimer involves domain A and domain B of E2, interacting with the D2 domain and the hinge region of MXRA8; while the FL and domain II of E1 protein interact with the D1 domain and hinge loop 1 of the receptor (Figure 1a). In context with the viral infection, the binding of MXRA8 and the CHIKV envelope protein facilitates the attachment and entry of the virus into the host cell. This interaction is critical to direct the virus into the endosome and establish the subsequent infection of the virus. Hence, targeting the specific receptor binding regions on the CHIKV E1-E2 dimer appears as a promising strategy to develop antivirals against CHIKV.
Inhibition of protein-protein interactions (PPIs) between the viral envelope protein and the host cell receptor or within the envelope proteins is an attractive strategy to block the viral infection at an initial stage (Teissier et al., 2011). Moreover, various small molecule inhibitors have been reported against several viral envelope proteins as the therapeutic intervention of viral infection (Verma et al., 2020). CHIKV structural proteins are a potential target of interest to block the entry and internalization of the virus particles. Broad spectrum antiviral compounds such as chloroquine and arbidol interfere with the cell entry step of CHIKV as chloroquine prevents endosomal acidification disrupting endosome mediated internalization, and the latter may interfere during attachment and entry of CHIKV into the host cell (Khan et al., 2010;Delogu et al., 2011). Further several experimental and computational studies have revealed molecules targeting envelope protein E2 or the virus-cell binding of CHIKV particles (Subudhi et al., 2018;Deeba et al., 2017;Mounce et al., 2017). These small molecules either directly bind to the envelope protein of the virus or interfere with the attachment of the virus, subsequently blocking the viral replication. A few examples for such compounds are Phenothiazine, Bafilomycin, Doxycycline, Curcumin, Arbidol, Demethoxycurcumin and Epigallocatechin gallate (EGCG) (Delogu et al., 2011;Deeba et al., 2017;Weber et al., 2015).
The advances in understanding the virus-host interactome and PPIs has shed light in the direction of developing small molecule PPI inhibitors as effective antivirals (de Chassey et al., 2014). In fact, from the past decade, various small molecule therapeutics have been developed for protein complexes, and several inhibitors have reached clinical trials (Arkin et al., 2014). The significant involvement of MXRA8 D1, D2, and hinge loop region in the interaction with E1-E2 dimer potentiates the hypothesis to target the critical receptor binding regions on the envelope protein. In the present work, we employed a computational approach using high throughput computational screening along with molecular dynamics (MD) to identify potential compounds which can interfere with the interaction between the CHIKV E1-E2 dimer and the receptor MXRA8. Our study reports an essential binding pocket as a hotspot in the E1-E2 dimer, which is critical for the hinge loop binding of the MXRA8. This hotspot region can be used to design modulators of the receptor binding as well as to restrict the movement of FL during the fusion process. We have identified a few small molecule PPI inhibitors with the potential to block the interaction between E1-E2 dimer and MXRA8.

Identification of hotspot regions at the PPI interface to determine the active sites
The crystal structure of CHIKV envelope protein bound to its receptor provides important structural insights into their interaction. The 3D crystal structure was retrieved from the RCSB protein databank (https://www.rcsb.org/) having PDB ID 6JO8. The detailed interaction between the interface of the two binding entities, E1-E2, and hMXRA8, has revealed the critical regions of the PPI. The receptor binding regions at the E1-E2 dimer were critically evaluated as already described in the literature (Song et al., 2019). Further, we categorized them into different hotspot regions following the literature reported regions with the highest number of contacts between the MXRA8 and E1-E2. The MXRA8 hinge loop 1 binding region and D2 binding regions at the E1-E2 were classified as hotspot. Residues at the hotspot regions were then used to define the active sites for further studies. We selected two different sites, site 1 and site 2, at the envelope protein to design the inhibitors with adequate binding energy.

Selection and preparation of chemical libraries
The discovery of the ability of natural products and small molecules to inhibit PPIs led us to use them as chemical libraries to design PPI inhibitors against CHIKV envelope protein (Jin et al., 2018;Bojadzic & Buchwald, 2018). A library of commercially available natural compounds was downloaded from the ZINC database. This database has a collection of millions of commercially available compounds which can be used for computational screening (http://zinc.docking.org/) (Irwin & Shoichet, 2005). A total of 27256 molecules were initially present in the natural compound library. We also retrieved PPI modulators libraries from ChemDiv, Asinex, and Enamine containing 210386, 21136, and 59373 compounds, respectively. We extracted the known inhibitors of the CHIKV E2 or virus-cell binding from the literature. Although many compounds are known to inhibit CHIKV but for this study we selected compounds that specifically are reported to inhibit the E2 protein or the receptor binding step of viral entry. The selected inhibitors are Phenothiazine, Bafilomycin, Doxycycline, Curcumin, Arbidol, Demethoxycurcumin and Epigallocatechin gallate (EGCG). The 3 D structure of all the inhibitors was downloaded from the PubChem compound database (https://pubchem.ncbi.nlm.nih.gov/). In order to use these chemical compounds for docking, we need to optimize the ligands with correct molecular mechanics parameters. Therefore, all the four different sets of libraries and the known inhibitors were prepared using the LigPrep module of the Schr€ odinger suit (version 2019.1) (LigPrep j Schr€ odinger). The ligand structures were optimized using the OPLS3e force field with desalting and tautomer generation. The inbuilt Epik module of the Schr€ odinger suit (version 2019.1) was used to generate all the possible ionization states of the ligand molecules at pH 7 ± 2 (Epik j Schr€ odinger; Shelley et al., 2007). Lastly, for every ligand, at most 32 possible stereoisomers were generated. After the preparation, the total number of output conformations were 598435, 535239, 89540, and 200481 for the natural compound library, ChemDiv PPI, Asinex PPI, and Enamine PPI, respectively.

Molecular docking of chemical libraries against the selected sites at E1-E2 dimer
In the next step, we employed molecular docking approach to predict the binding affinities and model the interactions between the selected ligand molecules and the active sites at the E1-E2 dimer. We used the Glide module of the Schr€ odinger suit (version 2019.1) to perform molecular docking based virtual screening of selected datasets (Glide j Schr€ odinger; Halgren et al., 2004;Friesner et al., 2004). The coordinates of the receptor hMXRA8 (Chain M/N/O) from the PDB file (PDB ID 6JO8) were removed, and only the coordinates of the protein E1 (Chain B) and Togavirin/E2-E3 (Chain A) were taken to define the protein for docking. Further, to use this protein molecule as a receptor in the docking process, the structure needs to be processed and minimized. This step was accomplished using the protein preparation wizard of the Schr€ odinger suit (version 2019.1) (Protein Preparation Wizard j Schr€ odinger). Pre-processing of the protein includes the removal of unwanted water molecules, the addition of missing hydrogen atoms, and fixing missing side chains. Finally, restrained energy minimization was performed with force field OPLS3e to remove steric clashes and strained bonds/angles, allowing heavy atoms displacement up to 0.3 Å. Separate receptor grids were generated for two distinct active sites in the envelope protein. The receptor grid box of size 20 Å 3 was generated around the selected active site residues, individually for both site 1 and site 2. The X, Y and Z coordinates for grid box at site 1 were 25.536, 47.310 and 38.310, respectively. The grid box generated around site 2 has the X, Y and Z coordinates, 17.441, 53.629 and 35.783, respectively. We started with the virtual screening of the known inhibitors, Phenothiazine, Bafilomycin, Doxycycline, Curcumin, Arbidol, Demethoxycurcumin and EGCG to get their binding pose and binding energy at site 1 as well as site 2. Docking for these compounds was performed in extra precision (XP) mode to achieve higher accuracy (Friesner et al., 2006). The other selected chemical libraries, viz. natural compounds, ChemDiv PPI, Asinex PPI, and Enamine PPI, were docked both at site1 and site 2 separately with three different accuracy levels. The screening was first performed in the High throughput virtual screening (HTVS) mode to enrich the large datasets, followed by the Standard precision (SP) mode with higher accuracy and the XP mode for extensive sampling and advanced scoring. This enrichment of the datasets at every level was achieved by studying only 10% of the compounds at the next higher accuracy level. The ligand Table 1. Interactions between MXRA8 and E1-E2 at the hot spot regions. Number in the parentheses represents the number of atom-toatom contacts between the hMXRA8 residues and CHIKV E1-E2 residues. Hydrogen bonds are mentioned separately and highlighted in red color.

Post-docking optimization and interaction analysis
After the successful screening of all the datasets against the two active sites at the CHIKV envelope protein, compounds with desirable results were considered for further enrichment. Several studies have revealed that the consensus scoring method whereby more than one scoring algorithm is used to predict the binding affinity is a more superior method in terms of accuracy than using only one scoring algorithm (Houston & Walkinshaw, 2013). Hence, a consensus scoring function was applied along with detailed interaction analysis, which was followed by the final filtering of compounds. Further, conformational optimization of the protein-ligand complexes was achieved using MD simulation combined with Molecular mechanics to predict the binding free energy of the complexes.

Consensus scoring and filtering
The molecular docking programs used in the previous section have different docking algorithms with different scoring methodologies and hence, it is difficult to directly compare the docking scores of Glide and GOLD software. Therefore, to circumvent this problem we used a consensus scoring function, X-score (version 1.2.1), to evaluate the binding affinities of the molecules independent of docking (Wang et al., 2002). It predicts the negative logarithm of dissociation constant (-logK d ) of the ligand to the given protein molecule by utilizing three different empirical scoring functions (HPScore, HMScore, and HSScore) to finally calculate the binding energy of the ligand. The same subset of molecules from each library which were selected after molecular docking at two different binding sites were used for consensus scoring and interaction analysis.
The pose of the ligands at their respective binding sites were analysed using PyMol and LigPlot þ (version v1.4.5) (PyMOL j pymol.org; Laskowski & Swindells, 2011). Detailed interaction analysis was performed for the docked complexes generated through Glide docking. The final filtering was done among all the ligands from different datasets to select fewer ligand molecules on the basis of their Glide gscore value, GOLD fitness score, X-score, and the involvement of critical residues from hotspot regions. Essentially, the compounds which were common and binds both at site 1 and site 2 were determined for further molecular mechanics analysis.

Molecular dynamics simulation and binding free energy prediction
MD simulation associated structural optimization was carried out by taking into account the flexibility of the protein molecule. It provides dynamical structural as well as energetics information of the protein-ligand complex (Liu et al., 2018). Therefore, to assess the binding stability of the known inhibitor and the identified PPI modulators which could effectively bind at both site 1 and site 2; nanosecond (ns) MD simulations were performed. We carried out these simulation studies using GROMACS (version 2019.4) simulation package (Berendsen et al., 1995). We designed total 10 different simulation systems, and the parameters for all these systems in every step of MD simulation were identical. The protein molecule was parameterized using the AMBER99SB force field to generate a gromacs compatible topology file (Hornak et al., 2006). All the ligand structures were parameterized with AMBER force field and BCC ligand charges using the ACYPE server (Acpype Server; Sousa Da Silva & Vranken, 2012). Subsequently, a triclinic box was generated with protein at the centre and at least 1.0 nm distant from the box edge.
The system was solvated using TIP3P water models, and the required number of water molecules were replaced by Na þ / Clcounter ions to neutralize the system (Mahoney & Jorgensen, 2000). After the system was assembled, each simulation system was subjected to energy minimization by the steepest descent minimization algorithm with 50000 steps. Before equilibrating the systems, a positional restraint was applied to the ligand molecules. The equilibration of the systems was performed in two phases, NVT (Number of particles, volume, and the temperature kept constant) and NPT (Number of particles, pressure, and the temperature kept constant) equilibration. Both of these phases were carried out for 100 picoseconds (50000 steps), maintaining the temperature and pressure at 300 K and 1.0 bar, respectively, with Particle Mesh Ewald (PME) algorithm to calculate long range electrostatic interactions. Finally, for each system after equilibration, a 100 ns (50000000 steps) production run was performed at 2 femtoseconds (fs) time step, and trajectories were saved after every 10 ps (5000 steps).  Table 4. Envelope protein residues involved in the interaction with the PPI modulators at binding site 1 (A) and site 2 (B). Number in parenthesis are the atomto-atom contacts between the residue and ligand. Non-bonded contacts include all atom-to-atom contacts (Hydrophobic contacts also) between 2.9-3.9 Å.

Selection of high affinity small molecule PPI inhibitors
Molecular docking based virtual screening of natural products library and small molecule PPI libraries was performed to identify compounds having high affinity for the hotspot regions in the envelope protein. Initially, the compounds were ranked according to the Glide gscore (binding energy) in their respective libraries for both site 1 and site 2. Further, GOLD score prediction, binding affinity prediction (X-score), and interaction analysis were performed for 20 top ranking compounds from each library. Hence total 160 compounds were analyzed (20 Â 4 þ 20 Â 4) to finally select a few small molecule PPI inhibitors with a greater possibility to interfere with the protein-receptor interactions. Finally, each dataset was individually examined, and compounds were shortlisted on the basis of all the components, viz, lowest binding energy (Glide gscore), high fitness score (GOLD score), more negative binding affinity (X-score), and the involvement of critical residues from hotspot regions. Considering all these factors, we shortlisted 20 PPI inhibitors, 10 from each binding site mentioned in Table 2. The binding energy for natural compounds was much lower at both the sites as compared to the PPI modulators from Enamine, ChemDiv, and Asinex libraries. Since the scoring function of Glide is relative in terms of the binding energies, compounds from different data sets cannot be compared to each other solely on the basis of their gscore. Therefore, the compounds were selected independently with different cut-off values for different datasets at both sites. The binding energy for ligands docked at site 1 was between À17.268 to À5.937 kcal/mol with the fitness score in the range 75-47, whereas for ligands at site 2 the range of binding energy was À11.477 to À4.35 kcal/mol and fitness score was between 70-39. The compounds with ID S1C2/S2C1 (ZINC299817498), S1C4/ S2C3 (ZINC584908978), and S1C9/S2C10 (LAS52155651) obtained a binding pose at both site 1 and site 2. The binding energies, predicted binding affinities, and the fitness scores of these compounds implicate the possibility of better binding potential and selectivity towards site 1 than site 2 for all the three compounds ( Table 2). The binding free energies of the known inhibitors (Phenothiazine, Bafilomycin, Doxycycline, Curcumin, Arbidol, Demethoxycurcumin and EGCG) obtained through molecular docking are reported in Table 3. Doxycycline was selected as the benchmark compound on the basis of the Glide gscore and the reported activity against the CHIKV envelope protein.
Analyzing the atom-to-atom contacts of ligand and the interface residues was also an essential criterion to shortlist the PPI inhibitors. However, the number of interactions with the interface residues was found to be higher in the case of natural products, essentially due to their larger size. The residues at the interface involved in the hydrogen bond interactions and other non-bonded interactions with the ligands are specified in Tables 4A and 4B. The interaction details for the literature reported 7 known inhibitors are mentioned in Supplementary Table 1. A higher number of interactions were observed for the proposed inhibitors than the literature reported inhibitors at both the sites. The binding pose and interactions for the known inhibitors at site 1 and site 2 are shown in Supplementary Figures S1 and S2. While analyzing the binding pose of compounds docked at site 1, it was observed that all the ligands interact with the Trp89 of the E1 FL. The residue, Trp89, forms two hydrogen bonds with Arg65 in the hinge loop of MXRA8 and encounters a conformational change in its side chain (Song et al., 2019). The ligands also interact with the other residues in the FL, Phe87, Met88, Gly90, and Gly91, which are involved in the receptor binding. The compounds S1C3, S1C6, S1C8, S1C9, and S1C10 interacts with Gly91, which is a conserved residue in the E1 fusion loop critical to drive the membrane fusion process (Kuo et al., 2012). Other residues with the highest number of interactions are Cys92, His93, Asp135, Asn136, His187, and Ser188 from E2 domain A. The ligands binding at site 2 essentially interact with the E2 domain A and domain B residues. However, the ligands S2C4, S2C6, and S2C7 also interact with fusion loop residues of E1. The residues, His137, Met138, Asp241, Thr243, Thr255, Asn257, and Gln259, are the common residues forming critical interactions with the ligands bound at site 2. The binding poses of Doxycycline and the compounds ZINC299817498, ZINC584908978, LAS52155651 interacting both at site 1 and site 2 are shown in Figures 3-6, respectively. The compounds at site 1 were packed in a V-shaped cavity involving the tip of the fusion loop. At site 2, the compound LAS52155651 was bound in a sandwich pose between domain B and domain A of E2. A similar pose was observed for Doxycycline, bound in a cleft between domain B and domain A of E2. The natural compounds ZINC299817498 and ZINC584908978, when bound at site 2, are oriented towards the surface of E2 domain B. Moreover, the stability of these compounds at both sites was analyzed through MD simulation studies and binding free energy calculations.

Stability of the docked poses of compounds binding at both site 1 and site 2
The dynamics of ligands bound at two different active sites within the PPI interface were monitored to evaluate the stability of the protein-ligand complexes. Throughout the 100 ns simulation, the E1-E2 in the unbound form (without receptor or ligand) has a root mean square deviation (RMSD) of backbone atoms in the range 0.18 to 0.5 nm (Figure 7). The protein experienced more deviation in the last 20 ns of the simulation period. However, the deviation of protein backbone atoms in the receptor bound form (E1-E2 þ MXRA8) was between 0.2 to 0.4 nm, and hence, the system appears to be stable. Further, we analyzed the stability of protein complexed with ZINC299817498, ZINC584908978, LAS52155651, and Doxycycline both at site 1 and site 2 (Figure 7a and b). At site 1, the RMSD of backbone atoms of all the inhibitor bound protein complexes lies in the same range of the receptor bound form. The overall trajectory of small molecule PPI inhibitor LAS52155651, when bound at site 1, was similar to that of the benchmark compound Doxycycline. Moreover, ZINC299817498 and ZINC584908978 bound proteins experienced deviations and fluctuations during 20-60 ns of simulation time; however, both the complexes were stable in the last 40 ns. In contrast, at site 2, all these inhibitors bound protein complexes experienced more deviations in the backbone atoms than at site 1. The overall range of the RMSD remained the same, but there were a greater number of fluctuations observed for natural compound bound protein complexes at site 2. Although, when the RMSD of ligands were computed, we observed that the reported PPI inhibitors were more stable at site 1 than site 2 (Figure 7c and d).
Next, the radius of gyration (R g ) was measured, which is a characteristic to indicate the protein structure compactness (Lobanov et al., 2008). It was observed that the R g of the envelope protein dimer in the unbound form and the inhibitor bound form lies in the range 3.9 to 4.0 nm (Figure 8a and b). The consistent values of R g for all the complexes throughout the simulation indicate that ligand binding keeps the protein structure compact. Hydrogen bonds are the major contributor in the protein-ligand interactions while predicting the binding energy of the complex. Hence, the number of hydrogen bonds is another important factor influencing the stability of the protein-ligand complex. Throughout the simulation, the hydrogen bond network of ligands was consistent at site 1 in comparison to site 2 (Figure 8c and d). The average number of hydrogen bonds was significantly higher in the case of natural compounds ZINC299817498 and ZINC584908978, irrespective of the binding site. The root mean square fluctuation (RMSF) of Ca atoms of each residue from E1 and E2 protein chains were calculated. The residue fluctuations of protein-ligand complexes were compared with the apo-protein and the receptor bound form. The E1 FL (res 83-98) region experiences a shift after the receptor binding, and the RMSF value for these residues is lower in the receptor bound form. A similar trend of residue fluctuation can be observed for all the protein ligand complexes at site 1, suggesting that ligand binding at site 1 may stabilize the fusion loop. The selected PPI inhibitors, when bound at site 1, involve residues from E2 domain A (res  and domain B (res 237-295). The residues 82-95, 130-137, 181-188, and 241-243 interacts with the ligands binding at site 1, and no major fluctuations were observed at the region corresponding to these residues (Figure 9a and c).
The same set of PPI inhibitors, when bound at site 2, also lowers the RMSF value of FL residues in comparison to the apo-form of the protein (Figure 9b). Upon receptor binding, the residues of E2 domain B are stabilized and experience low fluctuation during the simulation. However, the apo-protein and the ligand bound protein residues at the E2 domain B region have RMSF values between 0.2-0.4 nm (Figure 9d). The domain A of the E2 chain is also involved in the ligand binding at site 2; hence, the RMSF values of receptor bound form and the ligand bound protein residues are less than 0.2 nm (Figure 9d). These observations suggest that the PPI inhibitors were stable at both sites; although, few parameters reflected increased stability at site 1.
Further, we performed essential dynamics to determine the principal motions of the protein in complex with the proposed inhibitors. The MD trajectory was used to construct a covariance matrix which gives a set of eigenvectors or principal components (PCs) with their corresponding eigenvalues. The dynamics of the protein backbone along PC1 and PC2 for all the complexes are shown in Figure 10a and b. The E1-E2 protein in the apo form and in complex with the receptor explores a large conformational subspace along both PC1 and PC2. Comparatively, smaller cluster of motions are observed for the protein-inhibitor complexes at both the sites. The collective motion of the protein backbone in complex with the inhibitor Doxycycline, ZINC299817498, ZINC584908978 and LAS52155651 binding at site 1 occupy relatively similar conformational space. However, the protein dimer explores a conformationally wide phase space along both PC1 and PC2, when the inhibitors Doxycycline and LAS52155651 binds at site 2. In contrast the protein complexed with ZINC299817498 and ZINC584908978 at site 2, remains confined to relatively smaller region. Finally, the free energy landscape (FEL) of all these complexes were compared to evaluate their lowest energy conformation ensembles. The FEL plot for the apo-protein and the receptor bound protein shows an overall single cluster of conformations achieving the local minima. Doxycycline in complex with E1-E2 dimer generates a cluster of stable populations at both the sites; however, the structural ensembles are distributed in a broad conformational space along PC1 for site 2. The FEL for proposed inhibitor ZINC299817498 at site 1 shows three population of conformations confined to the same energy basin. Whereas, at site 2, two small clusters are observed for ZINC299817498 with a transition barrier of < 2 kcal/mol. The FEL plot for the protein in complex with ZINC584908978 shows populations of low energy conformers clustered at different energy basins with transition barrier of < 4 kcal/mol. The E1-E2 dimer in complex with compound LAS52155651 shows ensembles at wide conformational space with energy difference $ 2 kcal/mol. Therefore, the conformational dynamics and the FEL results are consistent with the structural parameters described above, reflecting stable structural conformations of the inhibitors at site 1.

Estimated free energy of binding between proteinligand complexes
A detailed analysis of binding free energy was executed through MM-PBSA in order to understand the molecular interactions and energetics associated with the binding of PPI inhibitors at different receptor binding sites. The MM-PBSA method precisely calculates different energy components of the potential energy and the solvation energy terms. The binding free energy, when conjugated with the MD simulation, emerge as an efficient method to understand the biomolecular association. This analysis was performed on the trajectories of the selected PPI inhibitors (ZINC299817498, ZINC584908978, LAS52155651) and Doxycycline complexed with the protein. The binding energy was averaged over 1000 frames of the 100 ns simulation with a solute and solvent dielectric constant value of 2 and 80, respectively. The contribution of electrostatic energy and van der Waals energy to the total binding free energy of Doxycycline when bound at site 1 and site 2 is comparable ( Figure 11). However, the high binding affinity of Doxycycline towards site 1 with the value of DG bind, À34.76 kcal/mol in comparison to site 2 with DG bind, À24.76 kcal/mol is by virtue of polar solvation energy (Supplementary Table 2). At site 1, the binding energy of PPI inhibitor ZINC299817498 has the equal contribution of van der Waals energy as well as the electrostatic energy. Although for the other two compounds, ZINC584908978 and   Free energy landscape plots of the E1-E2 protein in apo form (c), receptor bound form (d) and inhibitor complexes at site 1 (e) and site 2 (f). The free energy is given in kcal/mol and indicated by the color code panel from lower to higher energy. LAS52155651 van der Waals energy has a more favorable contribution to their respective binding energies. On the other hand, when these compounds bind to site 2, both the non-bonded energy terms are comparable with no significant difference in their values. The compound LAS52155651, when bound at site 2, has higher electrostatic energy, DE ele À22.00 kcal/mol than the van der Waals energy DE vdw À16.91 kcal/mol, predominantly because of the polar and charged amino acid residues at the binding site. The polar and charged amino acid residues are usually present on the protein surface and play a crucial role in the stabilization of PPIs (Hu et al., 2000). The equal proportion of non-polar residues involved in the ligand binding at the interface reflects the equivalent contribution of hydrophobic interactions or van der Waals interactions. These interactions all together contribute to the specificity and the affinity of the PPI inhibitor binding at the hot spot regions. As illustrated in Figure  11, the total binding energy of the compound LAS52155651 is more negative (DG bind À34.60 and À30.33 kcal/mol) than the other two natural compound PPI inhibitors ZINC299817498 (DG bind À17.15 and À5.84 kcal/mol) and ZINC584908978 (DG bind À21.59 and À14.75 kcal/mol). Interestingly, the van der Waals energy and the electrostatic energy of the natural compounds when bound at either of the sites is higher (more negative); however, the unfavorable contribution of polar solvation energy to the total binding energy supports the fact that solvation effects play an important role in determining the binding affinity of a ligand. The total binding free energy for all the ligands was more negative when bound at site 1 of the protein. Thus, the binding free energy analysis revealed that all the three PPI inhibitors bind with a stronger affinity at site 1 in comparison to site 2.
Further, to quantify the energy contribution of each binding site residue in molecular interaction with the PPI inhibitors, we employed free energy decomposition per residue. Figure 11. Estimated binding free energy (DG bind ) of Doxycycline and the selected PPI modulators (ZINC299817498, ZINC584908978 and LAS52155651) when bound at site 1 (a) and site 2 (b) by MMPBSA method.
The plot for the total binding energy contribution of residues involved in interaction with each compound at both the sites is shown in Figure 12. At site 1, the residue Trp89 of the E1 FL energetically favors the binding of all the ligands. Additionally, other fusion loop residues, Met88, Gly90, and Gly91, positively contribute to the binding energy. The binding of ZINC299817498 and ZINC584908978 at site 1 shows that the amino acid residues, Cys86, Cys92, Ser94, Pro95, and Asn136 contributed the most to their total binding energy. The residue wise free energy plot of compound ZINC299817498, when bound at site 2 (Figure 12e), shows the equal contribution of both polar and non-polar residues. The compound ZINC584908978 possess high van der Waals energy DE vdw at both the sites, essentially due to the contribution of hydrophobic residues Ile185 (DG bind À1.83 kcal/ mol) at site 1, Met138 (DG bind À1.56 kcal/mol) and Leu244 (DG bind À1.33 kcal/mol) at site 2. The favorable binding of Doxycycline at both sites shows a significant contribution of the Asp135 with DG bind À5.01 kcal/mol at site 1 and DG bind À7.15 kcal/mol at site 2. Indeed, the compound LAS52155651 is predominantly stabilized in both the binding pockets through the charged residues Asp and Glu at various positions, Asp85 (DG bind À5.76 kcal/mol), Glu88 (DG bind À7.11 kcal/mol), Asp135 (DG bind À2.99 kcal/mol) at site 1 and Asp241 (DG bind À2.24 kcal/mol) at site 2. It is interesting to note that the residue Asp241 has a favorable contribution to the binding of Doxycycline and LAS52155651 at site 2, whereas an unfavorable contribution is seen in the case of natural compounds. Thus, the higher contribution of charged residues to the total binding energy of Doxycycline and LAS52155651 collectively results in lower binding energies of these compounds at both the binding sites. However, a significant contribution of hydrophobic residues in the case of natural compounds, ZINC299817498 and ZINC584908978, typically drives van der Waals energy for stabilizing the ligand in the binding site.

Conclusion
Protein-protein interactions are predominant in any viral infection, ranging from the initial binding of the viral envelope protein to the host receptor to seizing the host cellular machinery by viral proteins. Mapping and identifying viral-host protein interactions provides a better understanding of the viral pathogenesis and develop new avenues for the treatment of viral infections. Substantial progress has been achieved from the past decade to develop peptides and small molecule PPI modulators targeting viral-host protein interactions (Loregian et al., 2002). The present study demonstrates the identification of small molecule PPI inhibitors targeting the hotspot regions of the CHIKV envelope protein involved in MXRA8 binding. We employed the 'hot spot region' approach to select two essential binding sites critical in receptor binding. After molecular docking of small molecules, including natural compounds at both the sites, we shortlisted 10 high affinity inhibitors from each site. For further selection of reliable positive hits, we identified three compounds ZINC299817498, ZINC584908978, and LAS52155651, which could bind with more than one pocket at the PPI interface to achieve adequate binding energy. These selected small molecule PPI inhibitors were energetically stable at both the binding sites, suggesting their significant potential as E1-E2-MXRA8 interaction inhibitors. The critical E1 fusion loop residues, Trp89 and Gly91, were involved in the ligand binding, positively contributing to the total binding free energy of the ligands, thus, revealing the diversity of the binding site. This V-shaped binding pocket (site 1) included the tip of the fusion loop, contributing to the ligand stability, supporting the hypothesis that the inhibitors targeting this site could essentially restrict the conformational change in the FL during the fusion process. Thus, our study represents a good template for designing inhibitors against the CHIKV receptor binding and fusion process. However, to validate our results, the identified PPI inhibitors needs to be evaluated experimentally against the CHIKV. Figure 12. Energy contribution of each interacting residue to the total binding free energy of the ligand. Binding energy of residues interacting with a) ZINC299817498, b) ZINC584908978, c) Doxycycline and d) LAS52155651 at site 1. Binding energy of residues interacting with e) ZINC299817498, f) ZINC584908978, g) Doxycycline and h) LAS52155651 at site 2. The residues labelled in black are the E2 protein residues and the E1 protein fusion loop residues are highlighted in red color.