Virtual screening and structure-based 3D pharmacophore approach to identify small-molecule inhibitors of SARS-CoV-2 Mpro

Abstract The outbreak caused by a coronavirus 2 has required quick and potential treatment strategies. The main protease enzyme Mpro plays an important role in the viral replication which renders it an important target for discovering SARS-CoV-2 inhibitors. In this study, 3D pharmacophore structure-based virtual screening and molecular docking were conducted using MOE and Bristol University Docking Engine (BUDE). Around 400,000 molecules of ZINC15 database were docked against the crystal structure of main protease, followed by 3D pharmacophore filtration. Six top-ranked hits (ZINC58717986, ZINC60399606, ZINC58662884, ZINC45988635, ZINC54706757 and ZINC17320595) were identified based on their strong spatial affinity and forming H-bonds with key residues H41, E166, Q189 and T190 of the binding pocket of Mpro SARS-CoV-2. The 6 hits subjected to molecular dynamics simulations for 100 ns followed by binding free energy calculations using MM-PBSA technique. Interestingly, three hits showed free binding energy (ΔGbinding) lower than tert-butyl N-[1-[(2S)-1-[[(2S)-4-(benzylamino)-3,4-dioxo-1-[(3S)-2-oxopyrrolidin-3-yl]butan-2-yl]amino]-3-cyclopropyl-1-oxopropan-2-yl]-2-oxopyridin-3-yl]carbamate (α-ketoamide 13 b) (ΔGbinding) −76.67 ± 0.5 kJ/mol which suggested their potential against SARS-CoV-2. The best binding free energy candidates, ZINC58717986 (ΔGbinding) −98.41 ± 0.7 kJ/mol. The second-best hit candidate, ZINC54706757 (ΔGbinding) −83.4 ± 0.6 kJ/mol and the third one ZINC17320595 (ΔGbinding) −78.85 ± 0.5 kJ/mol. Per residue decomposition free energy indicates H41, S46, H164, E166, D187, Q189 and Q192 are hot spot residues while residues M49, M165, L167 and P168 contribute to the hydrophobic interactions. The pharmacokinetic study suggests that the selected 6 hits possess drug-like properties. The 3D pharmacophore virtual screening, molecular dynamics and MM-PBSA approaches facilitated identification 3 promising hits with low free binding energy as SARS-CoV-2 inhibitors. Communicated by Ramaswamy H. Sarma Graphical Abstract


Introduction
The new pathogenic human coronavirus (CoV) was reported for first time in December 2019 as the main cause of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) (Zhou et al., 2020). The virus was named SARS-CoV-2, because its RNA genome is 82% identical to the SARS coronavirus (SARS-CoV) (Zheng, 2020), both viruses and MERS-CoV belong to the coronaviridae family. The three viruses are known by their ability to cause severe respiratory syndrome in human in contrast to other human coronaviruses (HCoV-NL63, HCoV-229E, HCoV-OC43 and HCoVHKU1) which cause mild respiratory infection (Gil et al., 2020). By March 2020, World Health Organisation declared that the coronavirus disease-2019 (COVID-19) is a world pandemic as around 73 million cases were confirmed over 225 countries (Cucinotta & Vanelli, 2020) (https://covid19.who.int/). In response to the social and economic crisis caused by COVID-19 outbreak, FDA authorized remdesivir for treatment COVID-19 (Thorlund et al., 2020) and clinical trials were evaluated for marketed drugs as lopinavir, ritonavir (Sanders et al., 2020) and hydroxychloroquine (Sanders et al., 2020). However, the efficacy of these drugs still needs to be further confirmed.
There are numerous COVID-19 vaccines currently undergoing research and development, until January 2021 three vaccines are approved by the Medicines and Healthcare products Regulatory Agency (MHRA); Pfizer/BioNTech vaccine, Moderna vaccine and Oxford University/Astra Zeneca vaccine (UK medicines regulator gives approval for first UK COVID-19 vaccine). The phase III data showed that the vaccines have more than 90% effectiveness in preventing COVID-19 (Kesselheim et al., 2021). However, the efficacy of vaccines measured in clinical trials sometimes imperfectly predict the effectiveness in real-world. Also, It is not known how long the vaccinated person is protected, and if the vaccine is safe for the pregnant and breastfeeding women (Caserotti et al., 2021). Besides, the growing vaccine hesitancy (Dror et al., 2020) and anti-vaccination views against the new vaccines may weaken vaccination uptake rates and reduce the chance of herd immunity (Murphy et al., 2021). For that, there is still a demand for specific and potent therapeutics against coronovirus-2 to reduce the severity of the deadly disease.
Coronaviruses express two proteases Main protease (M pro ) also known as 3-chymotrypsin-like protease (3CL pro ) and Papain-like protease (PL pro ) (Osipiuk et al., 2021). M pro is an essential for processing the polyproteins that are translated from the viral RNA making it an attractive target for inhibition viral replication (Rut et al., 2020). Moreover, The M pro operates at around 11 cleavage sites on the virus polyprotein, and until now there is no human proteases have a similar cleavage specificity are identified , that makes developing M pro inhibitors is likely to be effective and safe. The recently released SARS-CoV-2 M pro crystal structures provides an insight information for rational drug design for SARS-CoV-2 (Wang, 2020), a peptidomimetic inhibitor (N3) (Jin et al., 2020) and tert-butyl N-[1-[(2S)-1-[[(2S)-4-(benzylamino)-3,4-dioxo-1-[(3S)-2-oxopyrrolidin-3-yl]butan-2-yl]amino]-3-cyclopropyl-1-oxopropan-2-yl]-2-oxopyridin-3 yl]carbamate (a-ketoamide 13 b) that has been recently crystallized with SARS-CoV-2  promote the virtual screening of available libraries for development SARS-CoV-2 M pro inhibitors (Al-Khafaji et al., 2021). M pro consists of two protomers (A and B) associate to form a dimer, each protomer contains three domains: domain I (residues 8-101), domain II (residues 102-184) and domain III (residues 201-303) (Somboon et al., 2021). Domain III connected to domain II by a long loop region (residues 185 À 200), dimerization of M pro regulates through salt bridge between Q290 of a protomer A and R4 of a promoter B, the M pro dimerization is an important for the catalytic activity as it helps in shaping the S1 pocket (Shi & Song, 2006). The substrate-binding site is in a cleft between domain I and domain II, the binding site formed of four binding sites S1, S2, S3 and S4 (Anand et al., 2002). S1 subsite formed of F140, L141, N142, C145, H163, E166, H172 while S2 subsite which is hydrophobic consists of H41, M49, H164 and M165. S3 formed of Q189, and S4 formed of L167, F185 and Q192 and the main chain of Q189 (Jin et al., 2020). Computer-aided drug design (CADD) employing a molecular docking, virtual screening, 3D pharmacophore structure-based drug design and molecular dynamics have played an essential role for rapid discovering and developing SARS-CoV-2 M pro inhibitors (Das et al., 2021).
The success in our previous work (Elseginy et al., 2021) in discovering SARS-CoV-2 Papain like protease inhibitor using 3D pharmacophore virtual screening technique encourage us to develop COVID-19 inhibitors by targeting M pro of SARS-CoV-2 employing 3D pharmacophore structure-based virtual screening technique and molecular dynamics (MD) simulation. Molecular docking study were carried out to 400,000 commercial compounds from ZINC15 database  then the compounds filtered through 3D pharmacophore using Bristol University Docking Engine (BUDE) (McIntosh-Smith et al., 2015). Molecular docking performed prior to pharmacophore filtration to improve the efficiency and accuracy of filtration process. The compounds showed binding score better than tert-butyl N-[1-[(2S)-1-[[(2S)-4-(benzylamino)-3,4-dioxo-1-[(3S)-2-oxopyrrolidin-3-yl]butan-2-yl]amino]-3-cyclopropyl-1-oxopropan-2-yl]-2-oxopyridin-3 yl]carbamate (a-ketoamide 13 b) were selected and undergo visual inspection. MD for 100 ns was performed for the selected hits and the control compound (a-ketoamide 13 b) and the free binding energy were calculated using molecular mechanics Poisson-Boltzmann surface area (MM-PBSA) approach  to identify the hits with lowest binding energy. The ligand À residues interaction profile and the decomposition of binding free energy into different components were analyzed. The finding of this study provides insight into rationally designing potent inhibitors of SARS-CoV-2 main protease.

Computational method
2.1. Docking screening 2.1.1. Prepare the ligands A library of 400,000 compounds were download from ZINC15 database, the ligands were protonated, partial charges were assigned by MOE using Gastiger method and saved as mol2 files. The energy of ligands was minimized by MMFF94Â force field until RMSD gradient of 0.05 kcal/mol Å was reached. The ligands were kept flexible during the docking simulation.

Prepare the protein
The crystal structures of APO (6M03) and SARS-CoV-2 main protease (M pro ) complexed with peptidomimetic inhibitor (N3) and a-ketoamide 13 b (PDB: 6LU7, 6Y2G, respectively) were download from protein data bank (www.rcsb.org), PDB structures employed for this study were selected based on their availability and their high resolution (6LU7; 2.16 Å and 6Y2G; 2.20 Å) and absence the mutations.
Hydrogen atoms were added for the crystal structures, water molecules and ions was removed with MOE and the protonation state of each protein residue was determined at pH 7 using the PDB2PQR server (Dolinsky et al., 2004) (http://nbcr-222.ucsd.edu/pdb2pqr2.0.0/). The missing loop regions and missing residues were inserted with MODELLER program (Eswar et al., 2006) via the UCSF Chimera graphical interface (Pettersen et al., 2004). The protein was kept rigid during the docking.

Molecular docking
In this study, two docking studies were performed, first one, used before pharmacophore filtration, 400,000 commercial compounds were docked using BUDE programme against M pro complexed with peptidomimetic inhibitor N3 (PDB code: 6LU7). The binding site is located at (X ¼ À11.45, Y ¼ 12.85, Z ¼ 70.08) which is identified up on the center of N3. Before virtual screening, the accuracy of the screening protocol was validated by re-docking the peptidomimetic ligand N3 in the crystal structure, through which we found that the re-docked pose obtained a minimum RMSD < 1.0 Å against the original binding pose in the crystal structure which indicated the reliability of the screening method used in this study. For further confirmation of the accuracy of our docking protocol, several SARS-CoV-2 crystal structures (PDB codes: 6LZE, 6XQS, 6XQT, 6XQU, 7ABU, 7AKU, 7AQI, 7AWS, 7AXM and 7L0D) were download from https://www.rcsb.org/ and prepared as mentioned before. The co-crystalized ligands were removed from the crystal structures and redocked in the putative binding sit of each protein. The second docking was carried using MOE against M pro complexed with a-ketoamide 13 b (PDB:6Y2G), the site map tool which implements in MOE was used to determine the binding site, a-ketoamide inhibitor 13 b was re-docked with MOE against crystal structure (PDB:6Y2G). Placement tool were selected to be Triangle Matcher and London dG for scoring, the rescoring affinity dG was used for refinement. The control compounds in virtual screening protocol (peptidomimetic N3 and a-ketoamide13b) were re-docked with BUDE and showing docking score (À64.4 and À65.32 kJ/mol), respectively, and re-docked with MOE showing docking score (À10.64 and À10.35 kcal/mol), respectively.

Pharmacophorevirtual screening
The crystal structure of M pro complexed with the a-ketoamide 13 b was used to generate 3D pharmacophore using Protein-Ligand Interaction Fingerprints (PLIF) tool which is implemented in MOE. The pharmacophore composed of 6 features, two H-bond donors (F1, F2), two H-bonds acceptors (F3, F4) and two hydrophobic centers (F5, F6). The resulting 3D pharmacophore validated its ability to distinguish true active compounds from decoys by screening a set of active compounds (30 compounds) (Fourches et al., 2010;Principi & Esposito, 2020;Zhang et al., 2021) and decoy compounds (300 compounds) obtained from Database of Useful Decoys (DUDe: http://dude.docking.org) . Molecules that came out from docking with BUDE and showed binding score N3 were screening against the constructed 3D pharmacophore. Around 1000 molecules passed the 3D pharmacophore filtration, and 818 molecules were obtained after removing the duplicate. For consistency, the 818 were re-docked by BUDE and MOE against M pro complexed with (a-ketoamide 13 b) inhibitor, and the molecules that showed binding score lower than a-ketoamide 13 b with MOE and BUDE (À10.35 kcal/mol, À65.35 kJ/mol, respectively) were selected and visualized using Pymol (DeLano, 2002). For accuracy and consistency as each software used different criteria to calculate the binding energy score, the BUDE docking score of the compounds were compared with BUDE docking score of native ligand (a-ketoamide 13 b) and MOE docking score of the compounds were compared with MOE docking score of 13 b. Six hits were selected upon their good interactions with the binding pocket residues. The shortlisted molecules were screened for Pan Assay Interference compounds (PAINS) using the online PAINS filters at ZINC (docking.org), and the 6 hit molecules passed the filter.

Drug likeness
The physicochemical and pharmacokinetic parameters of the 6 hits and control compound a-ketoamide 13 b were calculated using SwissADME online tool (Kuhn et al., 2005). Lipinski's parameters including molecular weight (MW), number of hydrogen bond donors, number of hydrogen bond acceptors, the partition coefficient log P and log S were calculated. Besides, topological polar surface area (TPSA), number of rotatable bonds, GI absorption, blood brain barrier (BBB) penetration and effect the 6 hits on CYP1A2. The toxicity study of the promising hits and control compound a-ketoamide 13 b were carried out using pkCSM (Pires et al., 2015), and several predicators were calculated. The chemical structure of the hits was converted to SMILES format using MOE and sent to http://www.swissadme.ch/ and pkCSM (unimelb.edu.au) for analysis.

Molecular dynamics (MD) simulations
The top scoring docked poses obtained from docking against M pro complexed with a-ketoamide 13 b was taken as the initial poses for simulation. The MD simulations were carried out with GROMACS 5.1.2 software (Berendsen et al., 1984). The topology file of protein was generated using Pdb2gmx and forcefield Amber99-SB-ildn was employed (Lindorff-Larsen et al., 2010). The topology files of the 6 hits were created using Acpype (Da Silva & Vranken, 2012) under the GAFF force field (Wang et al., 2010). A cubic periodic box was defined with a minimum margin of 1 nm, and each system was solvated and the box was filled with TIP3P water (Jorgensen et al., 1983). The sodium and chloride ions were added to the system to give an ionic strength of 0.15 M and an overall neutralize the system. The long-range electrostatics and Van der Waals (rvdw) interactions were calculated using the Particle Mesh Ewald (PME) method (Manhas et al., 2017), and the cut-off distance for the short-range Coulombic interactions and VDW was set to 1.2 nm (Wang et al., 2010). The steepest descent algorithm was used to minimized the energy of each system then each system was heated gradually from 0 to 310 K under NVT condition for 100 ps and then equilibrated with NPT conditions for 100 ps. Finally, 100 ns molecular dynamics simulations were performed for the 7 systems (6 hits and control compound complexes with SARS-CoV-2 M pro ) on Blue Crystal, the University of Bristol's high-performance computing machine. All post MD analyses were performed using Gromacs tools and Xmgrace, molecular graphics operations and visualizations were executed using and Pymol (DeLano, 2002) and VMD-1.9.1 (Humphrey et al., 1996).

MM-PBSA binding free energy calculation
The binding free energies for the 6 hits and control compound a-ketoamide 13 b complexed with SARS-CoV-2 M pro (PDB:6Y2G) were calculated using the MM/PBSA method (Kuhn et al., 2005). The g_mmpbsa implemented by GROMACS was used to calculate the binding free energy of the 6 hits and the control a-ketoamide 13 b complexed with receptors. The snapshots were collected every 100 ps over 100 ns MD starting from 40 ns giving 600 frames. DG and per residue decomposition calculations were determined using MMPBSA.py as input file (Miller et al., 2012). The ionic strength (istrng) was 0.1 and the external and internal dielectric constants were (exdi ¼ 80, indi ¼ 1.0), respectively. The method for calculating nonpolar solvation free energies was (inp ¼ 2), cavity_surften ¼0.037 and cavity_offset ¼ À0.569, fillratio ¼ 4 and scale ¼ 2.0. the number of iterations was 1000 to perform of the linear PB equation (linit). A 1.4 Å probe radius (prbrad) was used for simulations carried out with water as the solvent to model the size of a typical water molecule. The solvation free energy was calculated using the default solver pbsa (Luo et al., 2002).

Virtual screening procedure
In order to identify hits as promising inhibitors of SARS-CoV-2 M pro target, virtual screening technique was applied followed by molecular dynamic simulation. Virtual screening strategy based on using molecular docking and pharmacophore-based structure, the database molecules were docked prior to screened by pharmacophore model to identify novel molecules in appropriate molecular shape and size for the target active site and hence improve the accuracy of screening procedure .

Docking-based virtual screening
Validation results of the docking protocol by re-docking native ligands of several SARS-CoV-2 M pro crystal structure (PDB: 6LU7, 6Y2G, 6LZE, 6XQS, 6XQT, 6XQU, 7ABU, 7AKU, 7AQI, 7AWS, 7AXM and 7L0D) showed RMSD value between the native ligands and the re-docked poses < 1.0 Å, which indicates the reliability of the docking protocol. Next, around 400,000 commercially compounds downloaded from ZINC15 database  were docked in the N3 ligand active site of SARS-CoV-2 M pro (PDB:6LU7) using BUDE algorithm (Martineau et al., 2016;McIntosh-Smith et al., 2015) and ranked according their docking score. Around 170,000 compounds that showed binding score lower than N3 docked pose were selected for next step. The 170,000 compounds were screened against 3D pharmacophore. Around 1000 molecules were obtained and then were docked against SARS-CoV-2 M pro complexed with a-ketoamide 13 b (PDB:6Y2G).
The main reason for performing docking against 6LU7 and 6Y2G is that although the two structures are well aligned and both N3 and a-ketoamide 13 b ligands showed that the same binding mode, the peptidomimetic inhibitor N3 ligand extends in the back hydrophobic pocket showing close contact to T190, A191 and A193 while tert-butoxyethenamine group of a-ketoamide 13 b showed close contact to L167, P168 and D187 ( Figure 1). In our screening, we took the advantage of binding mode of both ligands N3 and a-ketoamide 13 b. Also it could be observed in Figure 1, APO SARS-CoV-2 M pro (PDB: 6M03) aligned perfectly with SARS-CoV-2 complexed with N3 and a-ketoamide 13 b (PDB: 6LU7 and 6Y2G, respectively) but it was observed that there a little difference in the position of some residues in the putative active site; T26, M49, M165, P168, Q189 and T190.

Pharmacophore-based virtual screening
3D structure-based pharmacophore consists of six features was generated up on SARS-CoV-2 M pro and a-ketoamide 13 b interaction (PDB:6Y2G). Pharmacophore features including two H-bond centers; F1 and F2 corresponding to interaction with H41, H164 and E166, respectively. Two H-bonds acceptors; F3 and F4 corresponding to interaction with G143, C145, Q189 and Q192, two hydrophobic centers F5 and F6 corresponding to M165, L167 and P168 (Figure 2(A)). To validate the 3D constructed pharmacophore, a set of decoy compounds and active SARS-CoV-2 M pro inhibitors were screening with the model, the sensitivity and specificity were calculated (Table 1), the ROC curve were plotted (Figure 2(B)) and area under the curve (AUC) was calculated. It is obvious from the good sensitivity and specificity values and an excellent AUC value of 0.9 indicating that the 3D pharmacophore model has the best ability to distinguish successfully active molecules from decoys. Subsequently, the pharmacophore model was used to screen the 170,000 hits, a total of 1000 molecules that best fit the pharmacophore model were obtained, to further narrow the compound, the duplicated structures were removed afforded 818 molecules. The remaining 818 compounds and a-ketoamide 13 b ligand were docked within binding pocket of M pro (PDB:6Y2G) using BUDE and MOE. The binding scores of re-docked a-ketoamide 13 b with the BUDE and MOE (À65.32 kJ/mol and À10.35 kcal/mol), respectively, were used as a cut-off for selecting the promising hits. 100 promising hits showed (Supplementary Data, File S1) docking score better than the score of the re-docked a-ketoamide 13 b conformation. The 100 molecules were inspected visually following the criteria; molecules that showed chemical structures diversity, followed rule of five, formed H-bond with one of Key residues H41, E166, C145, Q189 and Q192 and showed good fitting within M pro active site. The visual inspection step gave shortlist of 6 compounds ( Figure 3) namely, ZINC58717986, ZINC60399606, ZINC58662884, ZINC45988635, ZINC54706757 and ZINC17320595 and their binding score by BUDE and MOE represented in Table 2. The virtual screening process was summarized in supplementary Figure S1.

Drug likeness
Physicochemical properties of the 6 molecules and control compound a-ketoamide 13 b were evaluated using SwissADME web tool (Table 3). The molecules showed good physical parameters which indicate a good oral bioavailability, all the 6 hits follow Lipinski rule, the H-bond donor 5, H bonds acceptors 10, Molecular weight 500 and log p < 5.6, the 6 hits showed log P range (1.02-3.09) which gave an indication the molecules have a good permeability across the cell membrane. Log S of 6 molecules < À5 which suggested they have good solubility and absorption. Furthermore the 6 molecules showed lower TPSA 140 A 2  which suggested that they have better cellular internalization properties expect, the slight increase in TPSA of compound 5 (152 A 2 ) which do not expect to have a significant effect on the transportation of the molecules and transportation. The 6 hits showed an inhibitory effect on cytochrome CYP1A2, indicating that they have a prolonged action and good therapeutic effect. Also, the 6 molecules do not pass the blood brain barrier (Table 3). Control compound a-ketoamide 13 b follows Lipinski rule expect its molecular weight >500 which violated the role of five (Table 3). It could be concluded that the 6 molecules have a good pharmacokinetics and follow Lipinski rule, based on these results all 6 compounds are predicted to have a good absorption and higher bioavailability (Lipinski, 2004). In this study, the toxicity profile of 6 promising hits and control compound a-ketoamide 13 b were predicted using pkCSM tool, several predictors including mutagenicity (AMES test), human maximum tolerated dose, oral rat acute toxicity (LD50), oral rat chronic toxicity, hepatotoxicity, human ethera-go-go related gene (hERG) inhibitors and skin sensitization these parameters are presented in Table 4. The toxicity profile of the 6 hits is well-tolerated by rats and human. The AMES mutagenicity prediction revealed absence of induced mutagenic activity by the hits and the control compound 13 b, as well as absence of skin sensitization. The hERG encoded voltage-dependent ion channel, blockage this ion channel may affect the cardiac repolarization, which is a critical side-effect of non-cardiovascular therapeutic agents (Vandenberg et al., 2012). The pkCSM prediction of hERG of our compounds and the control is negative which indicates the 6 compounds have a safe profile. The pkCSM hepatotoxicity prediction of the 6 hits and control 13 b were positive, but the hepatotoxicity prediction by pkCSM is based on the human liver-related side effects of 531 compounds, if the compound is related with at least one liver pathological or physiological event, the prediction come out positive result (Fourches et al., 2010;Pires et al., 2017). It could be concluded the 6 hits showed good toxicity profile except the hepatotoxicity still needs an experimental investigation. Acute toxicity describes the immediate adverse side effects occurring in short times after administration a chemical, calculating oral acute toxicity in vivo is very complicated, time consuming and costly (Li et al., 2014). In this study, the oral acute toxicity and chronic toxicity were predicted using pKCSM and the results are represented in Table 4 .

System stability
To explore the binding stability of the selected 6 hits-M pro complexes, the docked conformations of the 6 hits obtained from the docking, and control compound a-ketoamide 13 b complexed with M pro protein (PDB:6Y2G) and APO protein (PDB:6M03) were applied for 100 ns MD simulations. Firstly, we investigated the conformations of APO SARS-CoV-2 M pro and the docked a-ketoamide 13 b-M pro complex during MD simulation. It showed that at 20 ns the overall structure of both proteins is aligned perfectly expect the C-terminal part ( Figure 5(A)), the binding site residues of APO and docked 13 b-complex well aligned expect the residues T25, M49, N142, G143, M165, P168 and Q198 ( Figure 5(A)). At 50 ns, the conformations of both proteins kept stable with slight changes in parts of b-sheets and the residues of binding site of APO and docked a-ketoamide 13 b-complex (PDB:6Y2G) showed the same confirmations as in 20 ns expect N142, M165, P168 and Q189 showed slightly conformational changes ( Figure 5(B)). Both APO and docked a-ketoamide 13 b-complex kept the same conformations until almost end of the simulation. It could be observed at 80 ns ( Figure 5(C)) both proteins kept the same conformations and biding site residues of both proteins kept aligned expect the same residues which identified in first 20 ns of the simulations ( Figure  5(D-F)). It also could be observed that the stability of control compound a-ketoamide 13 b within the binding site and the stability of 13 b-complex model with minor conformational changes which indicates the reliability of the model and stability of the binding site. The root mean square deviation (RMSD) values of the 6 hits and control compound 13 b complexed with M pro protein were calculated and plotted ( Figure 6). RMSD plot for APO vs. HOLO (13 b-M pro complex) showed that APO system reached equilibrium at 20 ns while 13 b-M pro reached the equilibrium at 40 ns, after that both systems showed stable trajectories over the simulation (Figure 6(A)). ZINC58717986-M pro showed stable RMSD pattern similar to 13 b complex (Figure 6(B)) throughout the simulation. ZINC58717986, ZINC60399606 and ZINC17320595 complexed with M pro reached the equilibrium at 40 ns and remain steady stable over the remining simulation (Figure 6(B,C,G)), while ZINC58662884 and ZINC54706757 complexed to M pro reached the equilibrium at 25 ns ( Figure 6(D,F)), compound ZINC54706757-M pro (Figure 6(F)) remained stable until the   Table S1) while RMSD value for the control 13 b-M pro is 0.21 nm . RMSD values revealed that all 6 hits and the control 13 b M pro complexes in the simulation were well-equilibrated and remained stable throughout 100 ns simulation. It was observed that the RMSD of 6 hit-complexes slightly higher than the control compound 13 b-complex, this may be attributed to the size of the hits which is smaller than the 13 b and due to better conformational optimization of the ligands inside the binding site the RMSD is a little higher (Wang, 2020). To confirm the stability of 6 hits within the binding pocket of SARS-CoV-2 M pro , the distances between the 6 hits and the residues of the binding site were calculated as shown in (Figure 7(A-F)). ZINC58717986, ZINC58662884 and ZINC45988635 showed the most stable distance over the 100 ns simulation with average 0.2 ns, while compound ZINC60399606 showed great deviation from the binding pocket at 38 ns and return to the smooth pattern at around 58 ns which gave indications the ligand did not fit well within the binding pocket. Compound ZINC54706757 showed small distance from binding site around 0.2 nm but it adopted two conformations from 50 ns and until the end of the simulation. Compound ZINC17320595 showed stable conformation from beginning of the simulation with small distance around 0.2 nm then from 30 ns it adopted another conformation with distance 0.15 nm from the binding site, in the last 20 ns of the simulation ZINC17320595 kept moving between the two conformation while still stable within the pocket (Figure 7(A-F)).
To evaluate and compare the flexibility of 6 hits-M pro complexes, the root-mean-square fluctuations (RMSF) were calculated for each system (Figure 8). The RMSF analyses The higher RMSD values of 6 hit-complexes than 13 b-M pro complex this may be due to that protein and ligands adapt themselves to the better complexation.
highlighted similar fluctuations pattern in the 6 hits-M pro system, compared to the control compound 13 b complexed to M pro . From the RMSF plots it was observed that binding of 6 hits to the protein made SARS-CoV-2 M pro protein less flexible as compared to 13 b-M pro (Figure 8). The greatest observed fluctuation occurred in domain III, between residues 220 and 224, may be the reason related to involved domain III in the formation homodimer, second amplitude in 189-192 which form the loop region. It could be observed that the presence of hits caused a decrease in the residue's amplitude of movement, in domain I the residues T25, T26, the residues within the range 41-46 and residue M49 showed a decrease in the movement with the 6 hits complexes in comparison to 13 b-M pro . In domain II there are a decrease in residues fluctuation in the range between 141-146 and 161-166. This may be attributed to the interactions of the hits with the residues of the M pro catalytic site (T25, T26, H41, S46, M49, N142, G143, S144, C145, H163, H164, E166) mean RMSF value for a-ketoamide 13 b and ZINC17320595 were 0.18 and 0.16 nm, respectively, and for compounds ZINC58717986, ZINC60399606, ZINC45988635 and ZINC54706757 is 0.14 nm, compound ZINC58662884 showed the least fluctuation value 0.123 nm (Supplementary  Table S1). MD simulation results indicate that the 6 hits-M pro complexes were stable over 100 ns simulation and the binding pocket residues undergo little conformational changes.
3D RMSF tool was used to compare the complexes fluctuations (Supplementary Figure S2), the most stable region was in active site with residues (H41, C44, M49, C145) while the loop region the highest fluctuated region. These data are in consistent with SARS-CoV-2 M pro crystallographic structure .
Moreover, the stability of hits-M pro complexes was evaluated by calculating the radius of gyration Rg (Figure 9(A)). All the 6 hits-M pro systems showed protein compactness, Rg results indicate that all 6 systems are stable during MD simulations.

H-bonds analysis
To get insight of hydrogen bonds between the 6 hits and binding site of M pro , the number of hydrogen bonds was monitored over the 100 ns simulation, it was found ZINC58717986 displayed an average 3 H-bonds, while compounds ZINC60399606, ZINC45988635 and ZINC54706757 displayed an average two H-bonds and compounds ZINC58662884 and ZINC17320595 displayed four H-bonds (Figure 9(B)).

Free energy binding for complexes
To provide insight into the binding mechanisms between the 6 hits and M pro , binding free energies were calculated by MM-PBSA (Table 6). The free binding energy values obtained for the 6 systems were as follows: 13 b, compound ZINC58717986, ZINC60399606, ZINC58662884, ZINC45988635, ZINC54706757 and ZINC17320595 are À76.67 ± 0.5, À98.41 ± 0.7, À59.40 ± 1.0, À50.12 ± 0.5, À58.06 ± 1.0, À83.4 ± 0.6 and À78.85 kJ/mol, respectively. It could be seen those compounds ZINC58717986, ZINC54706757 and ZINC17320595 showed free binding energy lower than 13 b which reflected the strong binding of these three molecules to M pro and suggested that the three hits have a promising potential activity against SARS-CoV-2. Interestingly, ZINC58717986 showed the lowest free binding energy (À98.41 kJ/mol) among the 6 hits and the a-ketoamide 13 b  which gave an indication ZINC58717986 may have a high potential activity against SARS-CoV-2. For all the 6 complexes, the greatest contribution for binding to the protein active site occurred due to the contributions of van der Waals(DEvdw) ( Table 6), this is due to the noncovalent interactions carried out by the inhibitors. The polar free solvation energy (DG pol ) was unfavorable contributes to the free binding energy of the systems. The contributions of non-polar (DEnonpol) and gas phase electrostatic interactions (DEele) were favorable to the systems, but due to their low values, van der Waals interactions were the most important for inhibitor free binding affinity.

Analysis protein structure inhibitors affinity relationship
To identify the key residues contribution in the binding pocket, the binding free energies were decomposed into residues using the mmpbsa.py. The importance of identification the hot spot residues is to help in rationally design potent and selective inhibitors of M pro . The key residues for the 6 hits and 13 b are summarized in Table 7. The most significant key residues for the 6 molecules are L27, H41, S46, M49, G143, C145, H164, M165, E166, L167, P168, D187, Q189, T190 and Q192. These results are in consistent with MD results of interaction analysis.  The identified hot spot residues play an important role in SARS-CoV-2 M pro activity and in shaping the substrate-binding site, for example, E166 is curial for dimerization of SARS-CoV-2 M pro which is necessary for the catalytic activity, as NH 2 terminal of each of the two protomers interacts with E166 of the other protomer and this helps in the shaping of the S1 substrate-binding site . H41 and C145 are the catalytic dyad, N3 inhibitor formed a covalent bond with C145 while a-keto group of a-ketoamide (13 b) inhibitor nucleophilic attack by the catalytic C145 and formed a thiohemiketal and the oxyanion group of the thiohemiketal is stabilized by a hydrogen bond with His41 . Residues H41, M49, M165 and D187 formed S2 subsite while residues M165, L167, P168 and Q192 formed small hydrophobic pocket (Jin et al., 2020). The control inhibitor showed H-bonds with G143, H163 and E166 (Mittal et al., 2020). interactions the promising hits with these residues suggest their potential promising activity.

Trajectory analysis of ZINC58717986
ZINC58717986 showed the best binding free energy comparing with the control compound a-ketoamide 13 b, showed stable distance pattern with the binding pocket, and formed H-bonds with the key residues. These good results encourage us to study the molecular contact of the ligand in the last poses of the simulation and compare the interactions and residence of the ligand of last poses with the first pose of the molecular dynamics. It was showed at first 40 ns, ZINC58717986 showed H-bonds with H41, E166 and Q189, hydrophobic contact with M49, M165, E166, P168 and A191, and close contact to G143, C145, T190 and Q192 (Figure 11(A,D)). At 60 ns the ligand showed the same binding mode and formed H-bonds and showed hydrophobic interactions similar to the pose at 40 ns ( Figure 11(B,E)). ZINC58717986 continuing showing the same binding mode and H-bonds profile at 80 ns similar to the poses at 40 and 60 ns, expect T190 showed slightly movement (Figure11(C,F)). These results confirmed that ZINC58717986 has stable binding mode within the docked pocket and kept stable interactions with the key residues.

Conclusion
Regarding the importance of M pro for SARS_CoV-2 survival in the host, inhibition of M pro represents an effective approach for the treatment of COVID-19. In this study, with the aid of the released crystallographic structure of M pro in complex with N3 and a-ketoamide 13 b ligands, structure-based 3D Pharmacophore based virtual screening were performed to identify novel SARS-CoV-2 inhibitors. ZINC database libraries consisting of diverse drug-like small molecules subjected to a molecular docking screening followed by filtration by the 3D pharmacophore, next was the visual examination. Finally, 6 hit compounds were selected namely, ZINC58717986, ZINC60399606, ZINC58662884, ZINC45988635, ZINC54706757 and ZINC17320595. Molecular dynamics simulation for 100 ns were performed for the selected hits. The hydrogen bonding scanning reveals that residues H41, E166, Q189, T190 and Q192 are the important residues for the binding of 6 hits to the binding pocket of M pro . M49, M165, L167, P168 involved in a hydrophobic interaction while the residues T24, T25, C145, H164, E166, D187 and R188 involved in salt bridge Poisson-Boltzmann surface area (MM-PBSA) binding energy calculations were performed. ZINC58717986, ZINC54706757 and ZINC17320595 showed free binding energy (98.41 ± 0.7, À83.4 ± 0.6, À78.85 ± 0.5 kJ/mol) lower than 13 b (À76.67 ± 0.5 kJ/mol). ZINC58717986 showed the lowest free binding energy which suggested the potential activity of this molecule against SARS-CoV-2. The per-residue free energy decomposition analysis suggested that the hot spot residues that contributed to the hits binding are L27, H41, S46, M49, H164, M165, E166, L167, P168, D187, Q189 and Q192. The pharmacokinetics of the selected hits indicated that those molecules possess drug like properties. Finally, 6 hits were selected as the candidates for SARS-CoV-2 inhibitors. ZINC58717986 is the most potential candidate as SARS-CoV-2 inhibitor. The promising 3 hits with lowest free binding energy and good stability within the binding pocket are available and ready to purchase which facilitates further investigation of their inhibitory activity in the future and carrying out further lead optimization and developing selective and potent derivatives as COVID-19 inhibitors.