Isatin-based virtual high throughput screening, molecular docking, DFT, QM/MM, MD and MM-PBSA study of novel inhibitors of SARS-CoV-2 main protease

Abstract Coronavirus disease 2019 (COVID-19), caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), is a rapidly growing health care emergency across the world. One of the viral proteases called main protease or Mpro, plays a crucial role in the replication of SARS-CoV-2. As the structure of Mpro of SARS-CoV-2 is similar to the Mpro of SARS-CoV-1 (responsible for SARS outbreak between 2002 and 2004), we hypothesize that the inhibitors of SARS-CoV-1 Mpro can also inhibit the Mpro of SARS-CoV-2. To test this hypothesis, a total of 79 isatin derivatives, which inhibited Mpro activity under in vitro conditions, were selected from the literature, and then screened through AutoDock Vina. The chemical features of the top 5 isatin derivatives with low binding energies (−8.5 to −8.2 kcal/mol) were used to screen similar types of compounds from several small-molecule libraries containing 15856508 compounds. A total of 1,609 compounds with similarity score ≥ 6 were screened and then subjected to docking as well as ADME analysis. Among the compounds screened, 4 ligands form Zinc drug-like library (ZINC000008848565, ZINC000009513563, ZINC000036759789 and ZINC000046053855) showed good ADMET properties, low binding energy (−8.4 to −8.6 kcal/mol), low interaction energy (−72.62 to −50.01 kcal/mol) and high structural stability with Mpro. Hence, the selected ligands might serve as the lead candidates for further wet laboratory validation, optimization and development. Communicated by Ramaswamy H. Sarma


Introduction
Coronavirus disease 2019 , caused by the novel pathogen severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), has already caused huge devastations in terms of the health care crisis, and economic depression across the world. In one year, the virus has infected more than 90.8 million people with 1.94 million deaths worldwide. As the mortality rate of COVID-19 (approximately 3.7%) is comparatively higher than the common flu (1%), COVID-19 is considered a global emergency (Mehta et al., 2020). Without a specific treatment such as vaccines or drugs, the control of transmission of the disease relies strongly upon non-pharmaceutical interventions such as social distancing and good hygienic practices. Even with stringent non-pharmaceutical strategies, a large number of cases have been reported every day. Hence, the demand for the development of novel therapeutics is on the rise, especially in developing countries that lack adequate health care infrastructure and policies.
Several studies stated that the primary mode of virus spread is through person-to-person transmission, especially by respiratory droplets produced by an infected individual during coughing or sneezing (Wu et al., 2020;Xu et al., 2020). Upon its entry into the lungs, SARS-CoV-2 binds to one of the cell surface receptors called angiotensin-converting enzyme 2 (ACE2) with its spike protein. After binding with the receptor, the viral envelop fuses with the cell membrane of lung epithelial cells (Rothan & Byrareddy, 2020). Once entered into the cell, SARS-CoV-2 releases its mRNA like genome in the cytoplasm and get it translated into several proteins. The viral genome comprises of around 14 open reading frames (ORF) and encodes various types of proteins responsible for survival and virulence of the virus (Astuti, 2020). Among the proteins encoded in the viral genome, non-structural proteins (which do not constitutes any of the structural components of virion) play a vital role in the replication of the virus inside the host cells. Some of the important functions associated with viral non-structural proteins include blockage of host mRNA translation, inhibition of interferon signaling, viral genome proofreading, and so on. Viral proteases such as main protease (Mpro) or chymotrypsin-like protease (3CLpro) and papain-like protease (PLpro), are non-structural viral proteins responsible for the cleavage of polyproteins into various functional proteins . Hence, these proteases are widely used targets for the discovery of new drugs against viral replication.
Among the proteases encoded by the SARS-CoV-2 genome, Mpro cleaves the polyprotein at 11 unique sites to produce various non-structural proteins. As Mpro gene is located at the 3 0 end of the viral genome, which is known for its redundant variability, it is considered the best target for screening potential inhibitors of SARS-CoV-2 replication (Ul Qamar et al., 2020). Mpro is a polypeptide made up of 306 amino acids and it has a molecular weight of 33.8 kDa.
Mpro cleaves viral polyprotein only at specific sites that contain a recognition sequence, Leu-Glnj(Ser, Ala, Gly)j (indicates cleavage position), and such cleavage specificity is not observed in any human proteases. Hence, the drugs that inhibit Mpro activity are expected to be less inhibitory toward human proteases . The structure of Mpro is highly conserved among the coronaviruses. Mpro of SARS-CoV-2 shows 96% sequence similarity with SARS-CoV-1, which caused the SARS epidemic of [2002][2003][2004], and 87% with MERS-CoV (Middle East Respiratory Syndrome Coronavirus) (Ul Qamar et al., 2020). Moreover, the catalytic dyad of Mpro of SARS-CoV-1 and SARS-CoV-2 are identical and made up of two residues: His-41 and Cys-145.
Currently, a large number of studies depict the use of computational resources to identify potent inhibitors of Mpro. Computer-aided drug discovery makes use of advanced computational techniques to screen a large number of ligands against a selective target in a relatively shorter time than in vitro methods. Besides, the type of interaction between the ligand and target can be studied at ease with computational techniques (Song et al., 2009). Recently, Kandeel and Al-Nazawi (2020) have used computational screening to repurpose various FDA approved drugs against Mpro of SARS-CoV-2 and found a combination of drugs (Ribavirin, telbivudine, vitamin B12 and nicotinamide) that could be used for the treatment of COVID-19. Similarly, a study by Das et al. (2020) described the use of molecular docking to identify the interaction between Mpro and various ligands including natural and synthetic compounds such as rutin, ritonavir, emetine and hesperidin. The data collected through in vitro assays can be combined with computational methods to select and screen ligands, that show structural and functional similarities with compounds, that show high-affinity binding under in vitro conditions. In this way, a large number of ligands with a comparatively high affinity toward a target can be identified and further evaluated under wet lab conditions. One of the recent studies shows ebselen as a potent inhibitor of Mpro (Jin et al., 2020). Many well-known drugs such as boceprevir, manidipine, bedaquiline, lercanidipine and efonidipine showed effective inhibition of Mpro at concentrations below 50 mM (Ghahremanpour et al., 2020).
In this study, we have used virtual screening to identify ligands that show high-affinity binding toward Mpro of SARS-CoV-2. Due to the existence of structural similarity between the Mpro of SARS-CoV-1 and SARS-CoV-2, we hypothesize that the ligands showing high-affinity binding toward the active site of Mpro of SARS-CoV-1 are likely to bind with the Mpro of SARS-CoV-2. Based on this hypothesis, we have surveyed the literature for the ligands that could inhibit Mpro of SARS-CoV-1 under in vitro conditions. Numerous studies indicated that various derivatives of isatin exhibit excellent binding affinity toward Mpro of SARS-CoV-1 (Chen et al., 2005;Liu et al., 2014;Selvam et al., 2008;Zhou et al., 2006). Isatins are a diverse group of organic molecules found in many natural products, especially from plants of genus Isatis. Many derivatives of isatin are pharmacologically active molecules with anti-bacterial, anti-cancerous, anti-viral and antiinflammatory properties. Also, isatin occurs naturally in humans as a metabolic derivative of adrenalin (Singh & Desta, 2012). Hence in this study, isatin derivatives that showed high-affinity binding toward the Mpro of SARS-CoV-1 were evaluated against Mpro of SARS-CoV-2 using molecular docking simulations. The best performing ligands were used as model compounds for high throughput screening of ligands from various compound libraries by the chemoinformatics approach. The screened ligands were further evaluated for their binding affinities toward Mpro, ADMET (Absorption, Distribution, Metabolism, Excretion and Toxicity) properties and quantum chemical properties under in silico conditions. The DFT (Density Functional Theory) calculations were carried out to predict the ligand's quantum chemical properties. DFT offers more insight into the ligand's electronic structure in the absence of experimental data, which is necessary to show the receptor-ligand reactivity. Finally, the QM/MM (Quantum mechanics and molecular mechanics), MD (Molecular Dynamics) and MM-PBSA (Molecular mechanic/ Poisson-Boltzmann surface area) methods were employed to determine the interaction energy as well as the stability of protein and ligand complexes.

Data collection
The crystal structure of SARS-CoV-2 Mpro (Chain A) in apo form was downloaded from the RCSB protein data bank (PDB id: 6M03). All the ligands used in preliminary screening were collected from the literature. The ligands were categorized into four groups such as A, B, C and D, based on the literature from which the ligands were collected. Groups A, B, C and D contain 26 (Chen et al., 2005), 37 (Liu et al., 2014), 10 (Zhou et al., 2006) and 6 (Selvam et al., 2008) isatin derivatives, respectively (Table S1). The ligands were represented by alphanumeric characters indicating group name and ligand number in the group. The backbone structure of the ligands in various groups is depicted in Figure 1.
To compare the effectiveness of selected isatin derivatives against Mpro, two effective inhibitors of Mpro of SARS-CoV-2, manidipine (PubChem ID: 4008) and boceprevir (PubChem ID: 10324367) were included in this study. Manidipine is a calcium channel blocker, while boceprevir is an inhibitor of hepatitis C virus protease.

Binding site prediction
To identify the presence of cavities and binding pockets around the SARS-CoV-2 Mpro, binding site analysis was carried out using the online tool metaPocket 2.0 (Zhang et al., 2011). This tool makes use of 8 different predictors such as LIGSITE, SURFNET, Q-SiteFinder, POCASA, Fpocket, GHECOM, ConCavity and PASS to find binding pockets on the surface of the protein. Since all of these predictors using different scoring systems, a common ranking system was introduced by metaPocket to rank binding sites. Initially, the z-score of the individual pocket sites predicted by each of the predictors was calculated. Then, the pocket sites were clustered together based on their spatial similarity. Once the clustering was completed, the total z-score of the individual clusters was generated along with the list of residues near each of the clusters. The cluster with top z-score could be the active site of the protein and other clusters with relatively low zscore could be the allosteric sites.

Molecular docking simulations
All molecular docking simulations were carried out in python prescription (PyRx), a virtual screening tool that makes use of AutoDock tools (Morris et al., 2009) andOpenBabel (O'Boyle et al., 2011) to generate input files for docking simulations and ligand preparation such as energy minimization. Before running the docking calculations, water molecules present in the Mpro PDB file were removed and then the polar hydrogens along with Kollman charges were added to the structure. Later, the PDBQT file of the receptor was obtained by selecting the PDB file as AutoDock macromolecule in PyRx platform. Meanwhile, the SMILES (Simplified Molecular Input Line Entry System) notation of the ligands were generated after drawing the ligand structures in ACD/Chemsketch v2019. The SMILES notations were further converted into PDB files with 3 D coordinates using the online SMILES translator and structure file generator tool. After the conversion, the PDB files of the ligands were opened in PyRx, and then the energy minimization was performed in the Open Babel control panel. The energy minimized ligand PDB files were then converted to PDBQT files after selecting them as AutoDock ligands.
After the preparation of receptor and ligand files, molecular docking simulations were performed under AutoDock vina (Trott & Olson, 2010) wizard of PyRx. Before docking, the grid box was centered and constructed around the binding pockets (P1-P3) predicted by the metaPocket tool (P1 (Center: ). Also, the receptor was set as a rigid structure while the ligands were set as flexible structures. After the completion of docking simulations, the output files containing the coordinates of atoms in the protein-ligand complex were used for further simulation.

Ligand selection and library screening
The ligands with lowest binding energies were used as model ligands to screen compounds with similar chemical features in various small molecule libraries using the online tool SwissSimilarity (Zoete et al., 2016). This tool permits the users to perform ligand-based screening of several smallmolecule libraries including FDA approved (n ¼ 1516), experimental (n ¼ 4788), investigational (n ¼ 504), ChEMBL (n ¼ 177000), zinc drug-like (n ¼ 10639400), zinc lead-like (n ¼ 4328000) and zinc fragment-like (n ¼ 705300). In this study, combined chemical features of the ligands such as FP2 fingerprints, electroshape and spectrophores were used to screen ligands from small-molecule libraries. The screened ligands with similarity score ! 6 were further subjected to molecular docking simulations.

ADMET analysis
The ADME properties of the screened ligands with low binding energies were studied using the Web-based tool SwissADME (Daina et al., 2017). It includes physicochemical, lipophilicity, water-solubility, pharmacokinetics and drug-likeness properties. The compounds that pass through the druglikeness filter including Lipinski, Ghose, Veber, Egan and Muegge filter, were further evaluated for their bioactivity and toxicity. The bioactivity of the selected compounds was predicted using cheminformatics software Molinspiration, It predicts the bioactivity score for important targets of drugs such as GPCR ligand, ion channel modulators and protease inhibitors. The bioactivity scores were calculated based on Bayesian statistics which compare various ligands active on the selected target with the structure of inactive compounds. The oral rat median lethal dose (LD 50 ) value of the ligands were computationally determined using the offline tool T.E.S.T (Toxicity Estimation Software Tool) v.4.2. Finally, the toxicity profile of the ligands including hepatotoxicity, carcinogenicity, mutagenicity and cytotoxicity was predicted using ProTox-II web server .

Quantum chemical calculations
Many studies indicate that several quantum chemical factors including bandgap energy, play a crucial role in determining the efficiency of the drug (Rozhenko, 2014). A bandgap is defined as the difference between the energies of HOMO (Highest occupied molecular orbital) and LUMO (Lowest unoccupied molecular orbital) of the molecule. Various properties of the drug including strength, stability, and molecular reactivity can be determined by computing the bandgap energy between HOMO and LUMO of the drug (Pradhan & Sinha, 2018).
In this study, band gap energy between HOMO/LUMO and electrostatic potential (ESP) surface of the selected ligands were calculated using Gaussian 09 W software. The density functional theory calculations (DFT) were carried out using Becke's three-parameter exchange potential and Lee-Yang-Parr correlation functional theory (B3LYP) with 6-31 G as a basis set. After the Gaussian calculations, molecular orbitals and electrostatic potential surfaces of the selected ligands were visualized in GaussView v.6.0.

QM/MM calculations
To perform quantum mechanical and molecular mechanical calculations, a two-layer ONIOM (Our own N-layered Integrated molecular Orbital and molecular Mechanics) methodology was utilized. All ONIOM calculations were performed in Gaussian 09 W software. To run ONIOM calculations, the ligand was selected as high layer while the macromolecule was chosen as low layer. For high layer, the density functional theory method was employed using B3LYP with 6-31 þ G(d) as basis set. For low layer, the MM method was employed with UFF (Universal Force Field). By subtractive QM/MM scheme, the total ONIOM energy (E total ) of the ligand-protein system is given by the equation (Farrokhpour et al., 2015), where E QM,Model is the QM energy of the model system (ligand), E MM,Real is the MM energy of the real system (protein þ ligand) and E MM,Model is the MM energy of the model system (ligand). Using the supramolecular approach, the interaction energy (E int ) for the protein-ligand complex was computed through the following equation, where E QM,high-layer is the energy of the high layer (ligand) and E MM,low-layer is the energy of the low layer. The interaction energy, E int was derived from single point energy calculations of the protein-ligand complex.

MD simulations
The molecular dynamics (MD) simulations were carried out to determine the stability of protein-ligand complexes (Bhardwaj et al., 2020). In order to run the MD simulations, Groningen Machine for Chemical Simulation (GROMACS) v.5.1.2 was utilized (Abraham et al., 2015;Hess et al., 2008;Van Der Spoel et al., 2005). The topology of Mpro was prepared using the 'pdb2gmx' scripts, and the ligand topologies were procured from the GlycoBioChem PRODRG1 server. The ligand topologies were further integrated with the protein topology to build the topology of protein-ligand complexes. The energy minimization of protein-ligand complexes was carried out using GROMOS96 42a1 force field. Following the energy minimization, the protein-ligand system was solvated by a cubic box of water molecules (single point charge water model). After solvation, the net charge of the system was neutralized by adding four Na þ ions. The energy minimization of the system was then performed with a maximum of 50,000 steps using the steepest descent minimization algorithm. The equilibration of the energy minimized system was achieved in two steps: NVT (constant number of particles, volume and temperature) and NPT (constant number of particles, pressure and temperature). Before the equilibration, a positional restrain was placed on the protein-ligand complex to ensure the free movement of solvent molecules across the unit cell. Both NVT and NPT equilibrations were carried out for 100 ps with energies and coordinates of the systems saved for every 1.0 ps. During equilibration, the bonds to hydrogen were constrained using LINear Constraint Solver (LINCS) algorithm (Hess et al., 1997). Particle Mesh Ewald method was used with a short-range electrostatic cutoff of 1.2 nm and Fourier spacing of 0.16 to determine the electrostatic interactions in the system (Essmann et al., 1995). Also, the temperature coupling of two groups: Protein-ligand and water-ions, was achieved at a reference temperature of 300 K using a modified Berendsen thermostat algorithm (Berendsen et al., 1984). For phase two equilibration, isotropic pressure coupling was attained at a reference pressure of 1 bar. After equilibration, production MD for the system containing protein-ligand complex was carried out for 100 ns with a time step of 2 fs, and the structural coordinates were saved for every two ps. Finally, the trajectory files generated during the MD run were used to determine the root mean square deviations (RMSD) and the number of hydrogen bonds formed between the protein and ligands. After MD simulation, the PDB files of the receptor-ligand complexes were generated and the interactions between receptor and ligand were visualized using Discovery Studio Visualizer v.2019.

MM-PBSA calculations
The binding free energy of protein-ligand complexes was calculated by combining MD data with molecular mechanic/ Poisson -Boltzmann surface area. The 'g_mmpbsa' script of GROMACS (Kumari et al., 2014) was utilized to calculate the free energies of MM-PBSA. The binding free energy calculated by MM-PBSA method is the sum of potential, polar and non-polar solvation energies. The binding free energy (DG Binding ) between protein and ligand was calculated using the following equation, where G complex is the total binding energy of the protein-ligand complex. G protein and G ligand represents the binding energies of free receptor and unbound ligand, respectively. The change in free energy associated with any entity (receptor, ligand or receptor-ligand complex) is given by the following equation, where E MM is the vacuum potential energy (including bonded and non bonded interactions such as van der Waals and electrostatic interactions), T is the temperature of the system, S is the entropy of the system and G solvation is the solvation free energy(including polar and non-polar solvation energies)

Ligand binding sites
Results obtained from binding site prediction tool metaPockted indicate the presence of more than one binding site on the Mpro of SARS-CoV-2. A total of 7 metaPocket clusters have been identified using 6 base methods including PAS (PASS11), LCS (LIGSITE CS), FPK (Fpocket), SFN (SURFNET), GHE (GHECOM) and CON (ConCavity). The top 3 binding clusters predicted by metaPocket are named as P1, P2 and P3 (Figure 2), and their z-scores were found to be 15.12, 14.35 and 3.13, respectively. The cluster P1 consists of 6 pocket sites (GHE-2, FPK-2, SFN-2, PAS-3, LCS-1 and CON-2), surrounded by 32 residues, and it is located at the junction between domain I and domain II in chain A of Mpro. Several studies demonstrated that the clefts and pockets near the cluster P1 are favorable positions for ligand occupancy (Mittal et al., 2020;Pant et al., 2020), especially near the residues His-41 or Cys-145 or both. In contrast, some recent studies have shown that non-catalytic sites such as the cluster P2 supports the binding of several ligands. Few important drug candidates including lopinavir showed highaffinity binding toward some of the residues near the cluster P3 (Muralidharan et al., 2020). Since the z-score of cluster P2 is not significantly lower than the z-score of cluster P1, it is most likely of the residues near the cluster P2 to support the high-affinity binding of the ligands.

Preliminary ligand docking simulations
All docking simulations were carried out around the clusters predicted by the metaPocket Tool. A total of 82 ligands were included in the docking simulations. Among the ligands used in the docking study, 51 ligands showed binding affinity values less than À7.0 kcal/mol. All the studied ligands preferred the cluster P1 for binding. A list of binding energy values for all the ligands' best docking poses is tabulated in Table S2.  After completing docking simulations, 5 ligands with top binding energy values were selected to study the ligand-protein interactions. The selected ligands include B27 (À8.5 kcal/ mol), D3 (À8.4 kcal/mol), B17 (À8.3 kcal/mol), B37 (À8.3 kcal/ mol) and B26 (À8.2 kcal/mol). The binding energy values obtained for manidipine and boceprevir are À7.2 and À6.9 kcal/mol, respectively.
From molecular docking simulations, it was found that the majority of isatin derivatives collected from the literature exhibits high-affinity binding toward the Mpro of SARS-CoV-2. This proves our hypothesis that the compounds (isatins) interacting and binding with the Mpro of SARS-CoV-1 could also bind with the Mpro of SARS-CoV-2. Many natural compounds including ursolic acid, carvacrol, and oleanolic acid showed similar binding energies as observed in our study with Mpro of SARS-CoV-2 .

High throughput library screening and docking
The chemical features of the ligands with top binding energies (B27, D3, B17, B37 and B26) such as FP2 fingerprints, electroshape and spectrophores were used to screen several small-molecule libraries using the SwissSimilarity tool. Out of 15856508 chemical compounds in various small molecule libraries, a total of 1,609 compounds with similarity score ! 6 (Zinc drug-like ¼ 1128, Zinc lead-like ¼ 397, ChEMBL ¼ 83 and FDA experimental ¼ 1) were screened out. The screened compounds were further subjected to a molecular docking simulation study. The binding energy values of the ligands screened through high throughput screening were recorded in the range of À8.7 to À5.6 kcal/mol for Zinc drug-like, À7.9 to À5.5 kcal/mol for Zinc lead-like and À8.0 to À5.4 kcal/mol for ChEMBL library (Table S3). For the ligand screened from the FDA experimental drug library (DrugBank ID: DB08213, IUPAC name: 1-methyl-5-(2-phenoxymethyl-pyrrolidine-1-sulfonyl)-1H-indole-2, similarity score ¼ 0.67), a binding energy of À6.8 kcal/mol was observed.
Chemoinformatic approaches are widely used to screen potent ligands from a large set of compounds in various small molecule libraries and databases. Recently, Rond on-Villarreal and L opez (2020) has used a chemoinformatics approach to screen potent neuroprotective/neurotoxic molecules from a natural compound library for the effective inhibition of a-synuclein aggregation. One of the important advantages of chemoinformatics is that it can be integrated with other theoretical and experimental approaches to derive significant outcomes. A study by Shukla et al. (2019), had demonstrated the application of chemoinformatics and pharmacokinetics to identify small molecule inhibitors of glycogen synthase kinase 3 from a subset of ZINC database. In our study, the chemoinformatics approach was successfully implemented to screen the novel inhibitors of Mpro of SARS-CoV-2. Some of the ligands derived through high throughput screening showed excellent binding energies than the ligands collected from the literature. Many of the ligands that showed high binding energies belong to the Zinc drug-like library.

ADME screening
The compounds screened through the chemoinformatics approach with binding energy values less than À8.0 kcal/mol and all the ligands collected from the literature were subjected to the ADME screening process. In ADME screening, 5 drug-likeness filters such as Lipinski, Ghose, Veber, Egan and Muegge filter were used to screen the compounds depending on their pharmacokinetic activity. Drug-likeness filters make use of different molecular properties such as molecular weight, hydrogen bond acceptor, hydrogen bond donor, logP (octanol-water partition coefficient), molar refractivity, rotatable bonds and polar surface area to determine the biological activity of the ligand. A brief description of all the drug-likeness filters used in this study is available elsewhere (Tian et al., 2015).
Among the ligands collected from the literature, 56 ligands including the top-ranking compounds passed through all the drug-likeness filters. In contrast, only 20 compounds that are selected through high throughput screening and molecular docking experiments could pass through all the drug-likeness filters. This difference occurred because the majority of the compounds selected through high throughput screening experiments have molar refractivity greater than 140. To satisfy the Ghose rule, molar refractivity of the molecules should lie within 40-130. Molar refractivity is defined as the measure of total polarizability of a molecule and it influences the London dispersive force which plays a key role in protein-ligand interaction (Padron et al., 2002). All the compounds that passed through drug-likeness filters also possess better aqueous solubility and high gastrointestinal absorption. Some of the important physicochemical properties of the ligands tested for drug-likeness such as molecular weight (MW), the number of rotatable bonds (RB), hydrogen bond acceptor (HBA), hydrogen bond donor (HBD), molar refractivity (MR), total polar surface area (TPSA), logP and logS, and so on are given in Tables S4 and 1.

Bioactivity and toxicity
The ligands that can pass through all the drug-likeness filters with binding energy values lesser than À8.3 kcal/mol (screened from the library) were evaluated for their bioactivity and toxicity properties. Here, the bioactivity refers to the ability of the molecule to interact with various naturally occurring proteins in human. A bioactivity score between À0.5 and 0 indicates moderate activity, score value greater than 0 indicates high activity, and score value lesser than À0.5 refers to complete inactivity. Compounds collected from literature with top binding affinities showed little to moderate activity toward ion channel modulation, kinase inhibition and nuclear receptor binding. Except for B27 and B26, all the ligands collected from literature are predicted to maintain little to moderate activity toward G-protein coupled receptor (GPCR) binding, protease inhibition, and enzyme inhibition. Among the ligands that passed through ADME screens, 4 ligands had binding energy values lesser than À8.3 kcal/mol. The selected ligands are ZINC000008848565 (À8.6 kcal/mol), ZINC000009513563 (À8.4 kcal/mol), ZINC000036759789 (À8.4 kcal/mol) and ZINC000046053855 (À8.4 kcal/mol). Based on the bioactivity scores (Table 2), all of the selected ligands are completely inactive in modulating ion channels. In addition, ZINC000009513563, ZINC000036759789 and ZINC000046053855 depicted complete inactivity toward nuclear receptor binding. Also, ZINC000036759789 showed complete inactivity toward kinase inhibition. All the ligands maintained their bioactivity score between À0.3 and À0.1 for GPCR binding and it indicates little to moderate activity. Similarly, all the ligands, except ZINC000036759789, were predicted to possess little to moderate activity toward protease inhibition. The control drug, boceprevir, showed very high protease inhibition activity.
Based on the estimates by T.E.S.T software, the oral rat median lethal dose (LD50) value of ZINC000008848565, ZINC000009513563, ZINC000036759789 and ZINC000046053855 are found to be 1249. 90, 2082.35, 666.42 and 354.14 mg/kg, respectively. The oral rat LD 50 value of B27, D3, B17, B37 and B26 are predicted to be 935. 54, 1901.70, 1809.42, 708.85 and 700.00 mg/kg, respectively. Also, the oral rat LD 50 value of manidipine and boceprevir are 872.34 and 1184.58 mg/kg, respectively. A large LD50 value indicates that the compounds selected are relatively safe at high concentrations. Moreover, the calculations by ProTox-II webserver (Table  3) indicated that all the selected ligands except for B37, are less likely to possess hepatotoxic, carcinogenic, immunotoxic and cytotoxic characteristics. The ligand B37 scored positive for immunotoxicity. Isatin derivatives are known to possess several biological activities, including anti-thrombolytic, muscle relaxant, fibrinolytic, immunosuppressive and anti-allergic activity (Pakravan et al., 2013). Hence, isatins are pharmacologically active molecules. Some of the synthetic derivatives of isatin even showed cytoprotective effects against lead, iron and copper toxicity (Benhangi et al., 2019).

Quantum molecular properties-HOMO and LUMO energies
The frontier molecular orbitals (HOMO and LUMO) of the structure optimized ligands were identified using the B3LYP/ 6-31G(d,p) basis set. The energy difference between HOMO and LUMO, called bandgap energy, is an indicator of chemical reactivity. Also, the energies of HOMO and LUMO can help determine the susceptibility of the ligand toward electrophilic and neutrophilic attacks. Apart from this, the chemical stability and electron conductivity of the ligand depends on the bandgap energy (Azam et al., 2018). The bandgap energy of the selected ligands is observed in the range of 1.837 eV-5.218 eV (Figure 3). This low value of bandgap energies suggests that all the ligands selected in this study are highly reactive due to the possibility of rapid electronic transitions (Malkhasian & Howlin, 2016). When comparing the bandgap energies of the ligands, all the ligands selected from the ZINC drug-like library recorded high values than the ligands selected from the literature. This indicates that the ligands selected from the literature are highly reactive than the ligands screened from the library. The control inhibitors also demonstrated high bandgap energy values. For B27, the calculated HOMO and LUMO are located at the piperazine and isatin moiety, respectively. In the case of D3, HOMO covers almost the entire structure of the ligand while LUMO covers the isatin moiety, sulfonyl moiety and aromatic ring. For the ligands B17, B27 and B26, the LUMO orbitals are located primarily at the isatin moiety. In contrast, the HOMO orbitals for B17, B27 and B26 are found covering the furan ring and piperazine moiety, b-naphthyl methyl moiety and piperazine moiety, respectively. The HOMO orbitals of ZINC000008848565 and ZINC000009513563 are located at the piperazine and sulfonyl moiety whereas, the LUMO orbitals are located at the isatin moiety and isoxazole ring respectively. In the case of ZINC000036759789, the HOMO orbitals enclose the isatin moiety while the LUMO orbitals were observed to cover the benzene ring near the sulfonyl moiety. For ZINC000046053855, the HOMO orbitals cover the majority of the compound while the LUMO orbitals occupy the isatin moiety of the ligand. With manidipine, the HOMO orbitals cover the dihydropyridine moiety. The LUMO orbitals in manidipine are present on the nitrophenyl moiety. The HOMO orbitals of boceprevir cover the carbamoyl and valine moiety, while LUMO orbitals cover the butanamide moiety.
The frontier molecular orbitals of the selected ligands implied that the regions near the isatin moiety are more prone to donate electrons than the regions near the piperazine moiety, which are prone to accept electrons. Apart from the bandgap energy, the HOMO and LUMO energies of the ligand can be used to calculate various global reactivity descriptors such as ionization potential, electron affinity, electronegativity (v), chemical potential (l), hardness (g) and electrophilicity index (x) ( Table 4). The ionization potential and electronegativity of the molecule corresponds to the HOMO and LUMO energies (with a positive sign). Among various global reactivity descriptors, the electrophilicity index of the ligands plays a key role in determining the biological activity (Kosar & Albayrak, 2011). A large electrophilicity index (3.185 to 11.750) suggests that the selected ligands are good electrophiles. When comparing the electrophilicity indexes of the selected ligands, the library selected ligands are significantly less electrophilic than the ligands selected from the literature and control inhibitors.

Quantum molecular properties-molecular electrostatic potential
The molecular electrostatic potential (ESP) surface was determined by mapping the electrostatic potential of the ligand  on to its total electron density at B3LYP/6-31G(d,p) level ( Figure 4). ESP is widely used as an important descriptor for identifying the sites that are susceptible to nucleophilic and electrophilic attacks. ESP helps to predict various biological recognition processes including hydrogen bond formation between the ligand and receptor (Kosar & Albayrak, 2011). The ESP surfaces are depicted with a color scheme in which the red color indicates a negative potential, the blue color indicates a positive potential and the green color indicates a zero potential across the ligand surface. In this study, the electrostatic potential on the surface of the selected ligands is observed in the range of -4 Â 10 À2 to 4 Â 10 À2 Hartree. The areas around the oxygen atoms of the sulfonyl group and isatin moiety are colored in deep red (negative potential) and it indicates high electron density around this region. In contrast, the regions around the hydrogen atoms of isatin moiety are colored in deep blue (positive potential) and it indicates low electron density. Hence, the oxygen atoms (sulfonyl group and isatin moiety) and hydrogen atoms (isatin moiety) are likely to be the major hydrogen bond acceptor and donor sites in the selected ligands. The oxygen atoms in the nitrophenyl group and dicarboxylate moiety act as the significant hydrogen bond acceptors within manidipine. In boceprevir, the oxygen atoms and hydrogen atoms in the butanamide and carbamoyl moiety are predicted to be the hydrogen bond acceptor and donor groups.

ONIOM results
One of the important advantages of the ONIOM model is that it can integrate any number of molecular orbital and molecular mechanics methods (Altun et al., 2008). In the case of docking software, the interaction energy between the receptor and ligands will be calculated solely based on the molecular mechanics of the ligand and receptor. Hence, a realistic depiction of the interaction between ligand and receptor cannot be derived by docking software. On the contrary, the ONIOM model assumes ligand as the quantum mechanical part of the protein-ligand system. So, an accurate description of protein-ligand interaction can be predicted using ONIOM calculations. In this study, the QM/MM method was employed to evaluate the protein-ligand complexes derived from molecular docking studies. The interaction energies (E int

MD simulation analysis
MD simulations are implemented to validate the results of molecular docking. Since static conformations of protein-ligand complexes are obtained from docking studies, many vital pieces of information, including the complex's stability during the interaction, cannot be determined (Singh et al., 2015). To determine the binding mode stability of the Mpro with the isatin derivatives of high binding affinity, MD simulations up to 100 ns were performed. In this study, the isatin derivatives with Autodock vina binding affinity values < À8.2 kcal/mol were selected to proceed with MD simulations. It includes all the library selected compounds and two compounds selected from literature (B27 and D3). Apart from this, the MD simulations were carried out for Mpro-manidipine and Mpro-boceprevir complexes. For all the selected compounds, the RMSD calculations were made to determine the ligand and protein's stability during the simulation. All the compounds selected in this study showed an average RMSD value of less than 0.250 nm ( Figure  6). Compared to library selected isatin derivatives, the complex formed by the ligands selected from the literature showed better RMSD values.
The average RMSD values of Mpro complexed with the ligands ZINC000008848565, ZINC000009513563, ZINC000036759789 and ZINC000046053855 are 0.232 nm, 0.167 nm, 0.168 nm and 0.174 nm, respectively. For Mpro-ZINC000008848565 complex, a sharp deviation in RMSD of protein backbone was observed at 55 ns. Whereas, the ligand RMSD was observed in the acceptable range with a maximum deviation of 0.3 nm. In the case of Mpro-ZINC000009513563 complex, the stabilization in RMSD value of protein backbone and ligand was recorded after 15 and 10 ns of simulation. After 80 ns of simulation, the fluctuations were observed in the RMSD of ZINC000009513563. For Mpro-ZINC000036759789 complex, a rise in the deviation of RMSD of protein backbone to high values was recorded after 87 ns of simulation. The ligand RMSD of Mpro-ZINC000036759789 complex was observed to undergo high fluctuations after 10 ns of simulation. The Mpro-ZINC000046053855 complex exhibited minimal deviations for protein backbone while the stabilization in ligand RMSD was seen after 32 ns of simulation. The average RMSD values of Mpro complexed with the ligands B27 and D3 are 0.175 nm and 0.190 nm, respectively. A maximum deviation in protein RMSD of Mpro-B27 complex was recorded at 50 ns (0.35 nm), while the minimum deviation in ligand RMSD was observed after 52 ns. For Mpro complexed with D3, a raise in protein RMSD value was observed at 80 ns of simulation. The ligand RMSD plot of the Mpro-D3 complex contains multiple peaks of deviation including maximum deviations at 27, 59 and 95 ns. The maximum protein RMSD values for Mpro-manidipine complex are recorded at 8, 28 and 40 ns. Also, the maximum values for ligand RMSD are recorded at 10, 25 and 45 ns. In Mproboceprevir complex, the maximum changes in protein RMSD are seen at 45 and 93 nm. Also, the ligand RMSD calculated for Mpro-boceprevir complex showed deviations till 45 ns and attained equilibrium. The lower overall RMSD fluctuations and less significant difference in average RMSD values of the selected Mpro-isatin complexes indicate that the selected complexes are highly stable.

Hydrogen bond formation
MD simulations play a crucial role in determining the rise and fall of hydrogen bonds during protein-ligand interactions. The hydrogen bonds formation is one of the indicators of protein-ligand complex stability (Rao et al., 2020). In this study, the formation of hydrogen bond between protein and ligands was observed throughout the simulation period of 100 ns (Figure 7). All the compounds selected in this study shared at least one hydrogen bond with the Mpro during MD simulation. The ligand ZINC000009513563 formed the highest number of hydrogen bonds (7) with Mpro of SARS-CoV-2. The highest number of hydrogen bonds formed by the ligands ZINC000008848565, ZINC000036759789, ZINC000046053855, B27 and D3 are 4, 1, 3, 5 and 2, respectively. Finally, the highest number of hydrogen bonds formed by the control inhibitors manidipine and boceprevir during the simulation are 10 and 7, respectively. These results indicate similar performances in terms of H-bond formation by both library and literature selected isatin derivatives.

Protein-ligand interactions
After completing molecular dynamic simulations, the nonbonded interactions between the Mpro and selected ligands were analyzed (Figure 8). Among the preferred compounds, the maximum number of hydrogen bond interactions were observed between ZINC000009513563 and Mpro of SARS-CoV-2. A total of 5 hydrogen bonds were identified between Mpro and ZINC000009513563. The residues of Mpro that shared hydrogen bond with ZINC000009513563 are Thr-26, Asn-142, Cys-145 and Glu-166. Apart from ZINC000009513563, molecules such as D3 and boceprevir formed a maximum of 4 hydrogen bonds with Mpro. The residues involved in forming hydrogen bonds with D3 and boceprevir are Thr-24, Thr-26, Asn-119, Asn-146, Glu-166, Cys-145 and His-164. In other studied compounds, the residues of Mpro involved in sharing hydrogen bond with the ligands are Thr-26, Ser-46, Gly-143 and Glu-166. Besides hydrogen bond, carbon-hydrogen bond was identified between Mpro and the compounds ZINC000008848565, ZINC000009513563, ZINC000046053855, manidipine and boceprevir. The common residues involved in forming carbon-hydrogen bonds with the ligands are Thr-24, Thr-25, Thr-26, Thr-45, Phe-140 and Met-165. Also, many residues such as Leu-21, His-41, Leu-50, Cys-145, His-163 and His-172 are involved in forming alkyl and pi-alkyl interactions. In compounds ZINC000008848565, ZINC000009513563, ZINC000046053855, B27 and D3, one of the aromatic ring is associated with the formation of pi-sulfur interaction with Cys-145. Also, residues such as Leu-141, Met-165 and Glu-166 are implied in other pi interactions such as amide pi-stacking and pi-anion interaction with the ligands B27 and manidipine. Finally, numerous van der Waals interactions were observed between the selected ligands and Mpro. Some residues that formed van der Waals interaction with the ligands include  From protein-ligand interaction, it was observed that all the above ligands interact with at least one residue in the catalytic dyad of Mpro (His-41 and Cys-145). The interaction between ligand and active site residues is crucial for the inhibition of Mpro. Various studies exemplified the need for the active site binding of the ligand. A recent study by Havranek & Islam (2020) demonstrated a high-affinity interaction between the ligand (macrocyclic tissue factor-VIIa inhibitor) and active site residues of the Mpro (Havranek & Islam, 2020).

MM-PBSA binding free energy calculations
One of the critical advantages of MD simulation is that it can be used to predict binding free energy between the receptor and ligand. In this study, the energy liberation during bond formation was predicted as the cumulative sum of electrostatic energy, Van der Waal's energy, polar solvation energy and Solvent Accessible Surface Area (SASA) energy (Table 5). All the energies, except polar solvation energy, contributed favorably toward the overall binding free energy of the complexes. Among the complexes studied, the lowest binding free energy was recorded for the complex Mpro-B27 (À46.716 ± 6.945 kcal/mol). The highest binding free energy was observed for the complex Mpro-boceprevir (À14.812 ± 3.692 kcal/mol). The literature selected isatin derivatives complexed with Mpro depicted significantly lower binding free energies than library selected isatin derivatives complexed with Mpro of SARS-CoV-2.

Conclusion
The interaction between various isatin derivatives and Mpro of SARS-CoV-2 was studied in this work using molecular docking, QM/MM and MDstudies. Among the isatin derivatives collected from the literature, 5 ligands (B27, D3, B17, B37 and B26) showed high-affinity binding toward the Mpro of SARS-CoV-2. This proved our hypothesis that the compounds having a high binding affinity toward Mpro of SARS-CoV-1 are likely to bind to the Mpro of SARS-CoV-2. Also, a total of 1609 compounds (out of 15856508 library compounds) having chemical features similar to the selected ligands were screened out. Among the compounds screened, 4 ligands (ZINC000008848565, ZINC000009513563, ZINC000036759789 and ZINC000046053855) from Zinc druglike library with high binding affinity and better ADME properties were selected. The predicted LD 50 values of all the selected ligands indicated that the ligands screened from both the literature and Zinc drug-like library are less toxic in nature. The DFT calculations performed over the selected ligands depicted the electrophilic characteristics as well as the locations of hydrogen bond donor (O atoms in sulfonyl and isatin moiety) and acceptor (H atoms in isatin moiety) in the ligands. The interaction energies of protein-ligand complexes predicted by QM/MM calculations are partially in agreement with the binding affinities calculated by the docking software. Finally, the MD simulations indicate high structural stability of the selected Mpro-isatin complexes.