In silico drug designing for COVID-19: an approach of high-throughput virtual screening, molecular, and essential dynamics simulations

Abstract Severe acute respiratory syndrome-coronavirus2 (SARS-CoV2), a new coronavirus has emerged in Wuhan city of China, December 2019 causing pneumonia named Coronavirus disease-19 (COVID-19), which has spread to the entire world. By January 2021, number of confirmed cumulative cases crossed ∼104 million worldwide. Till date, no effective treatment or drug is available for this virus. Availability of X-ray structures of SARS-CoV2 main protease (Mpro) provides the potential opportunity for structure-based drug designing. Here, we have made an attempt to do computational drug design by targeting main protease of SARS-CoV2. High-throughput virtual screening of million molecules and natural compounds databases were performed followed by docking. After that, the protein-ligand complexes were optimized and rescoring of binding energies were accomplished through molecular dynamics simulation and Molecular mechanics Poisson Boltzmann surface area approaches, respectively. In addition, conformational effect of various ligands on protein was also examined through essential dynamics simulation. Three compounds namely ZINC14732869, ZINC19774413, and ZINC19774479 were finally filtered that displayed better binding affinities than N3 (known) inhibitor and formed conformationally stable complexes. Hence, the current study features the potential novel inhibitors against main protease of SARS-CoV2 which might provide an effective therapeutic strategy against COVID-19. Communicated by Ramaswamy H. Sarma


Introduction
The emergence of severe acute respiratory syndrome (SARS) in 2002-03 from Guangdong, China and middle east respiratory syndrome (MERS) in 2012-13 from Jeddah, Saudi Arabia toll up to $10,617 cases with a mortality of $1640 as per the records of World Health Organization (WHO) by January 2020 (Al-Osail & Al-Wazzah, 2017;Xu et al., 2004) (Supporting material Table S1). A similar viral infection erupted out in Wuhan city of China in December-2019 and spread throughout the world. WHO declared pandemic and named this as 2019-novel coronavirus disease (COVID-19) . By January 2021, the number of confirmed cases were increasing in USA, India, Brazil and placed them in top 3 distressed countries (Supporting material  Table S2).
Coronavirus (CoV) consist of positive-sense single-stranded RNA (ssRNA) as a genetic material, comprising about 26-32 kb genome size (Angelini et al., 2013). CoV is spherical in shape with spikes projecting on its outer surface, which help it to adhere on the host cell, followed by structural changes in the host cell that allows entry of virus (Guo et al., 2020). The virus usually affects the lungs and cause severe respiratory disorders such as pneumonia and acute respiratory distress syndrome. Additionally, it can also cause other systemic disorders such as diarrhea and acute cardiac injury (Rothan & Byrareddy, 2020).
As an effective drug discovery strategy, there are many potential targets that are accessible for the sighting of an antiviral drug and one such target is main protease (M pro ) (Anand et al., 2003). M pro or 3C-like protease (3CL pro ) is a highly conserved nonstructural protein 5 (Nsp5) encoded by ORF1. It plays a crucial role in processing of polyproteins that regulates the viral life cycle by forming component of replication or transcription machinery. The inevitable role of M pro in viral replication cycle makes them potential target for the discovery of antiviral drugs against coronavirus (Hilgenfeld, 2014). Recently, X-ray diffracted crystal structure of SARS-CoV2 M pro provides an excellent target for structurebased drug designing against COVID-19 (Jin et al., 2020). M pro of SARS-CoV2 forms dimer (protomer A and B) and this dimerization is essential for its catalytic activity. Each protomer of M pro further consists of 3 domains, such as domain I: comprises about 8-101 amino acids long and domain II which is 102-184 amino acids long, having an antiparallel b-barrels. Domain III is 201-303 amino acids long segment, comprising an antiparallel globular cluster of five a-helices (Figure 1(A)). Domain II and III are connected by a large loop that provides the site for substrate binding. Additionally, the protein has Cys-His (Cysteine-Histidine) catalytic dyad in between domain I and II which also forms binding pocket together with a large loop.
Computational methods such as molecular docking and dynamics simulation are promising approaches to discover the potential novel inhibitors for various diseases (Bhardwaj et al., 2020a(Bhardwaj et al., , 2020bKamaraj & Purohit, 2016;Rajendran et al., 2018). In the present study, high-throughput virtual screening target MD (molecular dynamics) optimized tertiary structure of M pro of SARS-CoV2 was carried. Hit compounds were docked and high scored docking complexes were optimized and refined by MD simulations. Rescoring of binding free energy was done and conformational study was further analyzed by essential dynamics simulation. Hence, three ligands were filtered that showed preferential binding affinities with M pro .

Structure preparation and validation
The tertiary structure of Coronavirus2 (CoV2) main protease (M pro ) was taken from Protein data bank (PDB:6LU7) (Rose et al., 2011). X-ray crystallographic structure of CoV2 with N3 inhibitor complex was diffracted at a resolution of 2.1 Å . Three-dimensional structure of protein was extracted and cleaned in PyMOL (The PyMOL Molecular Graphics System, Version 1.3 Schrodinger, LLC). The structure was energy minimized in SwissPDB viewer (Guex & Peitsch, 1996). After that, stereochemical properties of minimized protein structure were assessed through SAVES (structure analysis and verification server) and ProSA (protein structural analysis) web servers (L€ uthy et al., 1992;Wiederstein & Sippl, 2007).

Molecular dynamics simulation protocol
Protein was subjected to atomistic MD simulation by GROMACS 5.0 in conjunction with Gromos54a7 force field (Oostenbrink et al., 2004;Van Der Spoel et al., 2005). Initially, protein topology was prepared using pdb2gmx module and structure was placed in triclinic box with periodic distance 1.5 nm between structure and edge of the box. Protein was solvated using SPC/E (simple point charge/extended) water model and counterions were added to neutralize the system. After that, the system was energy minimized by steepest descent method. NVT and NPT equilibration simulations were performed for 100 and 500 ps, respectively. All bonds were constrained using LINCS (linear constraint solver) algorithm (Hess et al., 1997). The temperature of 300 K and pressure of 1 bar were maintained using V-rescale (modified Berendsen thermostat) and Parrinello-Rahman barostat, respectively. Particle Mesh Ewald (PME) method was used to process the electrostatic interactions. Finally, production simulation of 50 ns was carried whereas 0.002 ps of time steps were selected and trajectories were recorded at every 10 ps.

High-throughput virtual screening and molecular docking
High-throughput virtual screening was performed against million molecules and natural compounds databases of ZINC server by employing RASPD tool (Irwin & Shoichet, 2005;Mukherjee & Jayaram, 2013). Top scored compounds from each library were further docked individually by Autodock Vina (Trott & Olson, 2010). Protein and ligand structures preparation and different file format conversions were accomplished through MGL Tools1.5.6 and OpenBabel tools, respectively (Morris et al., 2009;O'Boyle et al., 2011). Drug likeliness of screened compounds were monitored through Lipinski filter (Lipinski, 2004). In Autodock Vina, the receptor molecule was prepared in Auto Dock tools in which polar hydrogens were added and nonpolar hydrogens were merged. Ligand molecules were prepared with the addition of Gasteiger charges. Grid box of 28 Â 28 Â 28 dimensions and size of x, y, z-coordinates (-13.820, 21.037, 73.265) were prepared to cover the target binding site. Dimension and size of the binding pocket were taken from the experimentally known binding site in 3D structure available at PDB . Ligand binding pocket was also verified through Metapocket server (Huang, 2009). ADMET (Absorption, distribution, metabolism, excretion, and toxicity) properties of top scored ligands were predicted through admetSAR and SwissADME servers (Daina et al., 2017;Shen et al., 2010). The toxicity profile of different ligands were assessed through OSRIS molecular property explorer tool (https://www.organicchemistry.org/prog/peo/). Binding affinity was measured in kilocalorie per mol and top-scored protein-ligand complexes were taken as input for MD simulation.

MD simulation and binding free energy calculation
MD simulation of top scored protein-ligand complexes were carried through GROMACS suite for 50 and 100 ns time periods. Protein and ligand topologies were derived from Gromos54a7 and Prodrg server, respectively (Sch€ uttelkopf & van Aalten, 2004). Procedure of MD simulation for complexes was same as used in case of apo protein (M pro ). Molecular mechanics Poisson Boltzmann surface area (MM/PBSA) approach was used to calculate the binding free energies of protein-ligand complexes. Binding free energies of all protein-ligand complexes were computed from last 20 ns MD simulated trajectories by g_mmpbsa tool in GROMACS (Kumari et al., 2014) as given in Equation (1).
where G complex is the total free energy of protein À ligand complex and G protein and G ligand are the total free energies of individual protein and ligand in solvent, respectively. Total binding free energy was computed from electrostatic, polar, van der Waals, and solvent accessible surface area energies.
In addition, energy decomposition on residues basis was also calculated as given in Equation (2).
where A bound i and A free i are the energies of ith atom from x residue in bound/unbound forms respectively. n is the total number of atoms in the residue. Energy contribution summed over all residues is equal to the binding energy of that particular residue, i.e., DG binding ¼ P m x¼0 DR BE X , where m is the total number of residues in protein-ligand complexes (Kumar et al., 2019). Since, we are comparing different ligands with the same receptor therefore, the entropy term was not included in our analyses. All energies were measured in kilojoule per mol.

Essential dynamics and free energy landscape studies
Essential dynamics (ED) approach was applied for both apo and complex protein systems to understand the overall motion of the protein that are significant to biological functions (Berendsen & Hayward, 2000). After removing the translational and rotational motions, the covariance matrix was constructed and diagonalize as given in Equations (3) and (4).
4 where q i and q j represent mass-adjusted Cartesian coordinates of particles i and j, respectively, where . . . h i is the ensemble average of all MD simulation structure sampled over the course of simulation. ED analysis was restricted to backbone atoms to avoid the statistical noise. In Equation (4C) is symmetric matrix, A and k represented the eigenvectors and associated eigenvalues, respectively. Eigenvectors and eigenvalues were obtained through gmx covar and analyzed by gmx anaeig tools of GROMACS, respectively.
Free energy landscape (FEL) analysis was performed using first two principle components (PC1 and PC2) or eigenvectors as given in Equation (5).
where k b is Boltzmann constant, T is temperature, and P ðPC1, PC2Þ is the probability distribution of molecular system along with principle component 1 and 2. FEL was constructed by gmx sham module of GROMACS using the reaction coordinates of PC1 and PC2.

Analysis and data presentation
Root mean square deviation (RMSD), root mean square fluctuation (RMSF), radius of gyration (Rg), hydrogen bond (H-bond), solvent accessible surface area (SASA) and distance analysis were calculated through GROMACS utility. Secondary structure plots were calculated by do_dssp tool from the trajectories of MD simulation (Kabsch & Sander, 1983). Twodimensional plots of protein-ligand complexes were plotted in LigPlot þ tool (Laskowski & Swindells, 2011

Structure and dynamics of SARS-CoV2 main protease (m pro )
Three-dimensional structure of CoV2 M pro was elucidated by X-crystallography at a resolution of 2.1 Å . M pro consists of 3 domains such as Domain I, II, and III ( Figure 1(A)). Domains II and III were connected by a large loop that provides the site for substrate binding. Tertiary structure taken from PDB was cleaned and validated by inspection of stereochemical properties. About 99% and 0.4% residues were placed in favored and disallowed regions of Ramachandran map, respectively. Verfiy3D and ERRAT scores suggested that structure was of better 3D quality (Supporting material Figure S1(A) and Table S3). MD simulation was performed to investigate the stability and dynamics of M pro protein. RMSD (root mean square deviation) was plotted at function of time and showed stable behavior throughout the simulation with an average value $0.28 nm (±0.04) except small deviation at 42 ns ( Figure 1B). RMSF or root mean square fluctuation of amino acid was analyzed and found no major fluctuations, except the region comprised $140-145 residues which showed $0.5 nm of RMSF (Figure 1(C,D)). Protein compactness was estimated by measuring Rg (radius of gyration), plotted against function of time and showed steady behavior with an average Rg value $2.23 nm (±0.01) (Supporting material Figure S1(B)). The overall structure of protein remained stable during MD simulation. But what will be the dynamics about protein local structures, such as secondary structure of protein? Therefore, we wanted to check the local stability of protein estimated through the inspection of secondary structures formation during MD simulation. Different moieties such as a-helices, b-sheets, coils, bends, and turns were examined and found that coils content was decreased concomitantly with an increased in the b-sheets at the end of simulation period while a-helices remained stable ( Figure  2(A,B)). Additionally, the loop comprises $15 residues (185-200) connecting domains II and III, interrupted with coils and bends in between and at the end of simulation periods (Figure 2(A)). MD simulation results suggested that protein remained stable with minor alterations in disordered regions.

Virtual screening and docking
High-throughput virtual screening was performed through RASPD tool and $1 million compounds each from million molecule and natural compound libraries were screened from ZINC database. The database contains libraries of small and natural compounds available for virtual screening during docking. It supports tautomeric forms, multiple protonation models, suppliers and 3D conformational sampling. Top RASPD scores of $100 compounds from each library were further docked individually by Autodock Vina tool (Supporting material Tables S4 and S5). X-rays structure of M pro (protein-N3 inhibitor complex) was taken as a reference for the binding site during docking . Binding pocket of protein was also verified through computational methods. Binding site of protein was comprised of about 38 residues with a high Z score (14.2) and composed of both hydrophilic and hydrophobic residues (Supporting material Figure S2A). Asparagine142 (present at turn between 4th and 5th b-sheets of domain II), Histidine164 (present at 6th b-sheet of domain II), Proline168 (present at the turn between 6th and 7th b-sheets of domain II) and Glutamine189 (present at loop connecting domains II and III) act as key residues for ligand binding . In addition to that, residues located at the cleft or surface of pocket were also assisted in binding of ligand. Approximately 6 compounds namely ZINC14732869 (-8.5 kcal/mol), ZINC19774413 (-8.8 kcal/mol), ZINC12338080 (-8.6 kcal/mol), ZINC123845408 (-9.3 kcal/mol), ZINC19774479 (-8.7 kcal/mol), and ZINC31 (-7.3 kcal/mol) exhibited higher docking scores than N3 (known) inhibitor (-7.2 kcal/mol). We had also docked hydroxychloroquine that showed À5.6 kcal/ mol docking score. Low docking scores exhibited by novel compounds over N3 (known) and hydroxychloroquine inhibitors demonstrating that novel compounds have higher binding affinities toward the M Pro protein. Docking of novel inhibitors were superimposed with N3 inhibitor to check the similarities between common binding mode of all compounds and found that known and novel compounds have similar orientation of binding (Supporting material Figure  S2B). Drug likeliness of novel compounds were examined through Lipinski rule of 5 and observed that all novel compounds obeyed the rule of 5, implying that newly screened compounds have drug-like properties (Supporting material Table S6). All known and novel compounds were bonded by both hydrophilic and hydrophobic residues, out of which N3 inhibitor formed 16 bonds (hydrophobic:10; hydrophilic:6), hydroxychloroquine formed 10 bonds (hydrophobic:9; hydrophilic:1), ZINC14732869 formed 16 bonds (hydrophobic:14; hydrophilic:2), ZINC19774413 and ZINC12338080 also formed 16 bonds (hydrophobic:15; hydrophilic:1), ZINC123845408 formed 12 bonds (hydrophobic:9; hydrophilic:3), ZINC19774479 formed 15 bonds (hydrophobic:14; hydrophilic:1) and ZINC31 formed 10 bonds (hydrophobic:9; hydrophilic:1) (Figure 3(A-H) and Supporting material Table S6). Quantitative analysis of different bonding revealed that the compounds namely ZINC14732869, ZINC19774413, ZINC12338080, and ZINC19774479 formed fair number of bonds with receptor molecule, similar to a known inhibitor.
Top scored ligand compounds were further carried for ADMET (Absorption, distribution, metabolism, excretion, and toxicity) analysis. ADMET properties were predicted by admetSAR and SwissADME servers and found that almost all compounds were having better absorption and distribution, good metabolic and excretion profiles (Supporting material Table S7). The toxicity profile of all compounds were also predicted through OSIRIS molecular property explorer tool and found that all reported novel ligands were nontoxic and noncarcinogens (Supporting material Table S8). Top scored protein-ligand complexes along with known and hydroxychloroquine were further optimized and validated through MD simulation.

MD simulation of protein-ligand complexes and
rescoring of binding free energy MD simulation is advantageous for studying the stability of binding pose obtained from docking, to rescore the overall binding energy as well as energy contribution by different residues and to study the conformational changes occurring in protein due to ligand binding/unbinding (Chen, 2015;Kumar et al., 2017;Zhao & Caflisch, 2015). Here, MD simulations of 50 and 100 ns were carried to monitor the proteinligand stability and rescored the binding energies of top docking complexes. RMSD of protein-ligand complexes were . RMSD of all proteins in different complexes remained stable and consistent. Since, the binding pocket of protein has large volume ($38 residues) which consists of small turns and sheets therefore, we were interested to know the RMSD behavior of pocket in apo and complex form of proteins. RMSDs of pocket remain steady and slightly increased as compared to the RMSDs of whole protein (Supporting material Figure  S3  RMSDs of novel compounds implying that they were stabled in the binding surface. To reveal the internal fluctuations of protein in presence of ligands, the RMSFs of residues were measured ( Figure  4(E,F); Supporting material Figure S4). Moderate fluctuations of Ser46 (Serine), Glu47 (Glutamate), Asp48 (Aspartate), Met49 (Methionine), Leu50 (Leucine), Tyr154 (Tyrosine), Asp197, Trp218 (Tryptophan), Arg222 (Arginine), Gly275 (Glycine), Met276, Asn277 (Asparagine), Gly278, and Arg279 were observed in almost all the protein-ligand complexes with an average RMSFs ranging from 0.35 (minimum) to 0.59 (maximum). three-dimensional structure inspections of these residues were investigated and observed that these residues occupied at loop or turn regions of protein. RMSF result suggested that moderate fluctuations were found in loop and turn regions of protein indicating that these regions help in assisting the binding of ligands, as no such fluctuations were observed in apo protein. Fluctuations of different ligands were also monitored at atomic level and found that N3 and hydroxychloroquine along with ZINC12338080 showed higher mobility consistent with RMSD result (Supporting material Figure S5). Rest of the compounds displayed fluctuations at flanking or extreme ends, which normally existed.
Protein secondary structural changes in presence of known and novel ligands were examined through DSSP method. b-sheets content were slightly increased in case of protein complexed with N3, ZINC19774413 and ZINC19774479 with concomitantly decreased in the bend contents (Supporting material Figure S6(A,D,G) and Table S9). Coil contents of protein-bound with hydroxychloroquine, ZINC12338080, ZINC123845408, ZINC19774479, and ZINC31 were increased, while bend and turns were slightly decreased (Supporting material Figure S6(B,E,F,G,H) and Table S9#x00029;. Secondary structures of Protein-ZINC14732869 complex remain unaffected (Supporting material Figure 6C and Table S9). a-helices remained stable and unaffected during binding of all ligands (Supporting material Figure S6 and Table S9). Coils were mostly found in domain I and b-sheets were restricted to domains I and II which formed binding cleft, therefore, variation in the coils and sheets aid in providing the stable regime for proper ligand binding.

Energy decomposition and protein-ligand interaction studies
Binding energy results suggested that 4 novel compounds namely ZINC14732869, ZINC19774413, ZINC19774479, and ZINC31 along with hydroxychloroquine displayed higher binding affinities than N3 inhibitor. Here, we are interested to know how many residues contribute in binding energy ( Figure 5 and Supporting material Figure S7)? Energy distribution on residues basis has showed that protein complexed with N3, ZINC19774479 and ZINC31 compounds exhibited a smaller number of residues contributed to total binding energy ( Figure 5(A,E,F) and Supporting material Figure  S7(A,E,F)). Also, the magnitude of energy contribution by these residues were lower from minimum (Thr25: -1.2 kJ/mol) to maximum (Asp48: -1.7 kJ/mol) in case of protein-N3 complex ( Figure 5(A) and Supporting material Figure  S7(A)), while protein -ZINC19774479 (Met165: -8.1 kJ/mol) and -ZINC31 (Leu50: -4.8 kJ/mol) complexes showed maximum values of energy contribution ( Figure 5(E,F) and Supporting material Figure S7(E,F)). Rest of protein-ligand complexes along with hydroxyquinoline contained a fair number of residues contributing in binding free energy ( Figure 5(B-D) and Supporting material Figure S7(B-D); Supporting material Table S6). ZINC14732869, ZINC19774413 compounds bonded with a maximum number of residues with lowest binding energies ( Figure 5(C,D) and Supporting material Figure S7(C,D); Supporting material Table S6). Energy decomposition results demonstrated that Leu27, Met49, Leu50 and Met165 were the key residues in ligand binding in all protein-ligand complexes. Protein-ligand interactions were monitored through the inspection of morphology of binding pocket, hydrophobic interactions, and number of hydrogen bonds formed during simulation. Above results indicating that hydroxychloroquine and 4 novel compounds such as ZINC14732869, ZINC19774413, ZINC19774479, and ZINC31 had high binding affinities than N3 inhibitor. From the above results, it is also clear that van der Waal energy was the major contributor for binding free energy (Table 1) of all ligands and hydrophobic residues (Methionine and Leucine) played a major role in protein-ligand binding (Figure 3 and Supporting material Table S6). Binding pocket analysis of protein complexed with known and novel ligands showed that all ligands were well accommodated in the binding pocket of protein and remained stable during MD simulation except known ligand that orient against the surface of the binding pocket (Supporting material Figure S8(A-F)). Moreover, hydrogen bond (H-bond) formation was higher in known ligand with an average of 4-5 H-bonds were existed during MD simulation by contrast to 3-4 H-bonds were found in novel ligands (Supporting material Figure S9(A-F)). Similar results were also observed during docking as hydrophilic interactions were more in known compounds as compared to the hydrophobic interactions which were mainly found in novel ligands (Figure 3 and Supporting material Table S6). Protein-ligand interactions study demonstrated that novel compounds were mainly stabilized by hydrophobic and partially by hydrophilic interactions. RMSD, RMSF, Rg, SASA, and binding energy results suggested that novel compounds such as ZINC14732869, ZINC19774413, ZINC19774479 were most stable and potent inhibitors for M pro . Similar patterns of RMSD, RMSF and Rg were also observed when simulation time was increased up to 100 ns (Supporting material Figure S10). A consistent and steady behavior of proteins RMSD were found in all protein-ligand complexes while ligand RMSD showed higher values and large deviations in case of N3 and hydroxychloroquine as compared to novel ligands.

Protein conformation and free energy landscape analyses
Ligand-induced conformational changes were existed during MD simulation. Superpositions of best binding affinity complexes such as protein-ZINC14732869, -ZINC19774413, -ZINC19774479 along with N3 and hydroxychloroquine from the final snapshots of MD simulation were analyzed and show that different ligands induced different conformational changes, even the starting conformation of proteins were same during the MD simulation (McNicholas et al., 2011) ( Figure 6). Conformational changes were more pronounced in turn (between 4th and 5th b-sheets of domain II), loop (domains II and III) and a-helices (domain III). To gain further insight into the conformational changes induced during binding and unbinding of known and novel ligands, the principle component analysis (PCA) or essential dynamics was performed and stable conformation (minimum DG) of all complexes were obtained through free energy landscape study. Principle components analysis was performed to identify the dominant motions that occurred in different complexes, where major dominant motions were achieved in first few eigenvectors (10) or PCs (Supporting material Figure  S11(A)). PCA indicated that first 3 PCs (principle components) accounted for 72. 50, 77.68, 81.80, 76.79, 79.84, and 69.19% of the motions observed in apo protein (M pro ), protein-N3, -hydroxychloroquine, -ZINC14732869, -ZINC19774413, and -ZINC19774479 complexes, respectively (Supporting material Figure S11(A)). Conformational behaviors of different protein-ligand complexes were monitored by plotting first two PCs in phase space (Supporting material Figure S11(B-F)). Protein-N3 complex showed unequal space in phase space as compared to protein-hydroxychloroquine, -ZINC14732869, -ZINC19774413, and -ZINC19774479 complexes, indicating that protein-N3 complex is still exploring the conformation and led to unstable conformation. Conformational changes at structural level of different complexes were examined by analyzing the low energy structures from PC1 and PC2 in free energy landscape (FEL) plot. The 3D and 2D plots of FEL were constructed to show different energy barriers (Figures 7, 8 and Supporting material Figure S12). FEL results suggested that protein complexed with N3 (RMSD:1.78 Å) and hydroxychloroquine (RMSD:1.78 Å) ligands undergo large conformational changes during MD simulations (Figure 7(B,E)), as compared to protein complexed with -ZINC14732869 (RMSD:1.77 Å), -ZINC19774413 (RMSD:1.44 Å), and -ZINC19774479 (RMSD:1.59 Å) ligands (Figure 8(B,E,H)). Additionally, conformational changes were mostly restricted to turn connecting 4th and 5th b-sheets of domain II, loop connecting domains II and III and a-helices in domain III as major motions were accompanied in those regions.
Porcupine structures were generated from respective PCs to elucidate the nature of motions occurred in protein-ligand complexes and found that different rotational motions were existed in different protein-ligand complexes (Figures 7 and 8). Arrows of porcupine structures indicated direction of motions and length of arrows denoted amplitude of motions. Interestingly, turns of domain II and a-helices of domain III face each other, indicating the expansion of binding cleft or increase in the volume of binding pocket that results in instability of ligands in case of protein-N3 and -hydroxychloroquine complexes (Figure 7(C,F)). In contrast, turns of domain II and a-helices of domain III moves against each other in protein-ZINC14732869, -ZINC19774413, and -ZINC19774479 complexes, indicating the contraction of binding cleft or restricted the volume of binding pocket, that provide stable regime to ligand binding (Figure 8(C,F,I)). Asn142, His164, Pro168, and Gln189 are the key residues of protein which maintain morphology of binding cleft and also help in binding of the ligand . Asp142 present at turn between 4th and 5th b-sheets of domain II, His164 lied at 6th b-sheet of domain II, Pro168 occurred at turn connecting 6th and 7th b-sheets of domain II and Glu189 present at loop connecting domains II and III, as these residues assisted in the proper binding of ligands (Figure 9).
Conformational difference at global level of proteins were observed during PCA analysis. Hence, it is important to unravel the conformational stability of binding pocket to make sure the stable binding of ligand. Therefore, conformation of binding pocket of protein in various complexes were elucidated by measuring the distance between key residues that formed the binding cleft ( Figure 9G). Residues such as Asn142, His164, Pro168, and Gln189 are located at the binding pocket of protein and pairs like Asn142-Gln189, His164-Gln189, and Pro168-Gln189 provide shape to the binding cleft as also discussed above. Distance between Asn142-Gln189, Pro168-Gln189, and His164-Gln189 residues were remained unchanged and showed a consistent pattern in all known and novel compounds during the entire simulation period (Figure 9(B,C,E,F)). Nonsignificant results of distance analysis were found which indicated that both known and novel compounds have not affected the conformation of binding pocket. The PCA results demonstrated that large conformational changes exist in proteins when complexed with N3 and hydroxychloroquine as compared to novel ligands. Moreover, residues near binding site showed large motions that might play an important role by providing proper orientation for ligand binding.

Discussion
In December 2019 an outbreak of novel coronavirus was reported from Wuhan, China and its dissemination developed as epidemic . In March 2020, because of rapidly spreading of disease from Wuhan to the entire world, WHO (World Health Organization) declared pandemic and named as COVID-19. In the absence of an appropriate randomized controlled trial of any drug, some measures suggested by various health institutions including WHO to curtail the infection of SARS-CoV2, such as self-quarantine, maintaining respiratory hygiene, social distancing etc. In the awake of this current pandemic, preventive medicine on the basis of preliminary data like hydroxychloroquine, a derivative of chloroquine being used against COVID patients (Sinha & Balayla, 2020). However, the mode of action of hydroxychloroquine is not fully understood, but it is known to increase the pH of endosome which arrest its maturation and thus prevents the virus entry . The recent study found that the medication of hydroxychloroquine cause side effect like cardiac arrest (Chorin et al., 2020). In the current scenario, the coarse and cost of drug development by conventional methods are the major hurdles for drug discovery against the current pandemic. In short of time, repurposing of already approved drugs is the only approach to identify the potential drug candidates against such disease. In this situation, in silico methods of drug . Protein 3D structure was displayed in cartoon mode and residues were labelled in stick mode. M pro (apo) and protein complexed with N3, hydroxychloroquine, ZINC14732869, ZINC19774413, and ZINC19774479 were shown in white, black, red, mustard, green, and blue color, respectively in 2D graphs.
designing play a critical role to minimize the drug development hurdles. Few in silico studies have been performed which identified various drug and phytochemical compounds targeting the viral proteins (Bhardwaj et al., 2020;Elfiky, 2020;Ton et al., 2020). The X-ray structure of main protease of CoV2 virus has been submitted on PDB in complex with N3 inhibitor . The inhibitor was identified through structure-based virtual screening against the library consisting of clinical trial drug candidates, already approved drugs and natural compounds. None of the studies have provided the details of binding analysis of drugs and its effect on protein structure and conformation, which is a crucial parameter for precise or effective drug development.
In this study, the experimentally resolved 3D structure of M pro was downloaded from PDB and stereochemical evaluation was done after energy minimization. M pro consists of 3 domains having mixtures of helices, sheets, and loop. Stereochemical geometry was assessed through measuring the percentage of residues located at the favored region of Ramachandran plot and found high percentage (99.7%) of residues at favored or allowed regions. Furthermore, high Verify3D (94.4%) and ERRAT (96.5%) scores of M pro suggested that protein had better 3D and good structure quality. Protein structure was optimized through MD simulation and different parameters such as RMSD, RMSF, and Rg were measured. RMSD and RMSF are important parameters for measuring the stability of protein both at function of time and amino acid residues, respectively. RMSD of structure showed a consistent pattern and no major fluctuations were observed during RMSF analysis (Figure 1). The overall dimension of M pro protein was measured by Rg analysis and found that Rg remained stable and consistent throughout the simulation period. RMSD, RMSF, and Rg results suggested that M pro protein showed stable and steady behavior during an entire simulation period. MD optimized structure was targeted for high-throughput virtual screening and docking.
Nearly, 1 million compounds from million molecule and natural compound libraries were screened and top hit compounds were docked individually. Docking analysis showed that compounds ZINC14732869, ZINC19774413, ZINC12338080, ZINC123845408, ZINC19774479, and ZINC31 exhibited À8.5, À8.8, À8.6, À9.3, À8.7, and À7.3 kcal/mol of docking scores, respectively while N3 or known inhibitor exhibited docking score of À7.2 kcal/mol. Lowest docking scores of novel compounds compared with known suggested that these compounds displayed high binding affinities. Further, physiochemical properties or drug likeliness of novel compounds were examined. Computational prediction of ADMET (Absorption distribution metabolism excretion toxicity) properties of new drugs are helpful to minimize the failure of drugs during clinical trials. Therefore, ADMET properties of novel compounds were predicted and found that all drugs were had good absorption, distribution, metabolism and excretion properties and are less toxic with minimum side effects. Moreover, novel compounds such as ZINC14732869, ZINC19774413, ZINC12338080, ZINC123845408, ZINC19774479 were having good intestinal absorption properties, nonmutagenic, noncarcinogens, and have a high drug-like properties. Top binding modes obtained from docking were optimized and refined by MD simulation. Stabilities of all protein-ligands complexes were compared during 50 and 100 ns MD simulations time. RMSD was measured for backbone atoms from the initial structure with reference to equilibrium structure. Low values and consistent pattern of RMSD of protein complexed with novel compounds were existed as compared to the RMSD of protein complexed with known (N3) and hydroxychloroquine compounds. Moreover, ligand RMSD of known and hydroxychloroquine compounds showed higher values as well as large deviations as compared to the RMSD of all novel ligands, suggesting that known inhibitor formed an unstable complex with M pro in contrast to novel inhibitors (Figure 4). Effect of drug on local protein structure were examined through RMSF analysis of protein and found that moderate fluctuations were observed in all protein-ligand complexes. Fluctuations were more pronounced at the flanking (N-and C-terminals) and turn regions of the protein. Protein compactness or globularity was examined in presence of known and novel compounds by means of Rg analysis. Rg of protein in all complexes showed consistent and stable patterns, suggested that all compounds have little or no effect on overall dimensions and compactness of protein. The secondary structure of proteins showed variations with changing in the sheets and coils content throughout the simulation.
Rescoring of binding free energies of top scored docking complexes along with known and hydroxychloroquine were analyzed using MD simulation trajectories and different binding energy components were measured. Binding affinity of ligands toward protein by means of docking provide a preliminary information about the inhibition ability of ligand compounds. During docking, binding modes of different ligands are predicted, followed by an estimation of binding affinity (Chen, 2015). However, MD simulation is advantageous in predicting the binding free energy and detailed binding modes of proteinligand complexes (Kumar et al., 2019). Binding free energy by means of MD simulation was calculated by MMPBSA method and results suggested that ZINC14732869 (-151.3 ± 3.3 kJ/mol), ZINC19774413 (-160.4 ± 9.2 kJ/mol), ZINC19774479 (-136.7 ± 7.6 kJ/mol) compounds interact with M pro with high binding affinities as compared to N3 (-39.3 ± 16.1 kJ/mol) and hydroxychloroquine (-125.4 ± 12.2 kJ/mol) compounds, consistent with the docking results (Table 1). Energy decomposition and protein-ligand interaction studies demonstrated that Leu27, Met49, Leu50, and Met165 were the key residues in ligand binding in all protein-ligand complexes and both hydrophobic and hydrophilic forces played an important role in ligand binding ( Figure 5). Moreover, a slightly higher number of H-bonds formed by known inhibitor while novel inhibitors formed large number of hydrophobic bonds and they (novel) were mainly stabilized by hydrophobic interactions. As we know, hydrophobic interactions act as major forces for providing the binding strength of biological macromolecules. Further, all novel ligands displayed minimum values of binding free energies, implying that these ligands interact to receptor molecule with maximum binding strength. Conformationally stable binding of ligand is necessary for effective drug development. Therefore, we elucidated the conformational changes induced by different ligands and examined the stability of protein-ligand complexes through PCA and FES studies. PCA studies suggested that first 3 PCs accounted majority of protein motions. First 2 PCs were mapped on essential subspace and found that, protein-N3 complex showed large conformational changes as compared to the rest of protein-ligand complexes (Figures 7  and 8). Major motions were observed in loop and turns regions spanning near the binding pocket of proteins and a-helices of protein domain studied through porcupine plots. Motions in the residues those were located near to binding pocket was existed which probably assisting in stable binding of ligands ( Figure 9). MD simulation technique is a promising approach for discovering the novel drug and help in understanding the dynamics and conformational changes induced by the ligands (Dong et al., 2018;Durrant & McCammon, 2011;Kumar et al., 2020;Simmerling et al., 2002). In the current study, MD simulation was utilized for discovering new drug candidates against rapid growing disease such as COVID-19. Further, experimental testing of newly filtered drugs is required. We reckon that the newly discovered novel drugs from this computational study will provide the effective therapeutics against COVID-19.

Conclusions
In this study, we have attempted an in silico analysis of potential drugs for COVID-19. Molecular docking and dynamics simulation techniques were employed to discover the novel and potent inhibitors for SARS-CoV2 main protease and compared with known inhibitors. Prior to docking, 3D structure of main protease was optimized and refined by MD simulation. Highthroughput virtual screening against two ligand compounds libraries was performed followed by docking. Further, drug binding modes were inspected through MD simulation and rescoring of binding energy was carried out through MM/PBSA method. A total of 4 novel compounds namely ZINC14732869, ZINC19774413, ZINC19774479, and ZINC31 displayed lowest binding free energies than N3 inhibitor. Novel compound complexes were mainly stabilized by hydrophobic interactions. Furthermore, the effect of drugs on structure and conformation of proteins were revealed by essential dynamics simulation and found that ZINC14732869, ZINC19774413, and ZINC19774479 formed conformationally stable binding. From this study, the compounds namely ZINC14732869, ZINC19774413, and ZINC19774479 were prioritized as with promising binding affinities over other known inhibitors that may have a huge impact in development of effective therapeutics against COVID-19 patients.

Disclosure statement
No potential conflict of interest is reported by the authors.