Proposing lead compounds for the development of SARS-CoV-2 receptor-binding inhibitors

Abstract The COVID-19 pandemic has had deleterious effects on the world and demands urgent measures to find therapeutic agents to combat the current and related future outbreaks. The entry of SARS-CoV-2 into the host’s cell is facilitated by the interaction between the viral spike receptor-binding domain (sRBD) and the human angiotensin-converting enzyme 2 (hACE2). Although the interface of sRBD involved in the sRBD-hACE2 interaction has been projected as a primary vaccine and drug target, currently no small-molecule drugs have been approved for covid-19 treatment targeting sRBD. Herein structure-based virtual screening and molecular dynamics (MD) simulation strategies were applied to identify novel potential small-molecule binders of the SARS-CoV-2 sRBD from an sRBD-targeted compound library as leads for the development of anti-COVID-19 drugs. The library was initially screened against sRBD by using the GOLD docking program whereby 19 compounds were shortlisted based on docking scores after using a control compound to set the selection cutoff. The stability of each compound in MD simulations was used as a further standard to select four hits namely T4S1820, T4589, E634-1449, and K784-7078. Analyses of simulations data showed that the four compounds remained stably bound to sRBD for ≥ 80 ns with reasonable affinities and interacted with pharmacologically important amino acid residues. The compounds exhibited fair solubility, lipophilicity, and toxicity-propensity characteristics that could be improved through lead optimization regimes. The overall results suggest that the scaffolds of T4S1820, E634-1449, and K784-7078 could serve as seeds for developing potent small-molecule inhibitors of SARS-CoV-2 receptor binding and cell entry. Communicated by Ramaswamy H. Sarma

The spike protein of SAR-CoV-2 consists of the S1 and S2 subunits which are involved in receptor recognition and cell membrane fusion processes, respectively.The S1 subunit of the spike protein contains the spike receptor-binding domain (sRBD) which facilitates the entry of the virus into the cell of the human host by interacting with the human angiotensinconverting enzyme 2 (hACE2) (Jackson et al., 2022;Lan et al., 2020;Shang et al., 2020;Wang et al., 2020;Wrapp et al., 2020).Considering the crucial role of the sRBD-hACE2 interaction (Figure 1) in the initiation of the viral infection, the interface of sRBD involved in the interaction has been designated as a primary target for vaccine and drug development (Huang et al., 2020;Papageorgiou & Mohsin, 2020).Thus compounds that could interact strongly with the sRBD interface could serve as potential inhibitors by competing with hACE2 leading to the disruption or weakening of the sRBD-hACE2 interaction.Notwithstanding that the sRBD interface of interest is a typical protein-protein interface and could be a challenging target for the discovery of small-molecule drugs because these interfaces are usually large, flat, and lack the traditional drug pockets (Hopkins & Groom, 2002;Jones & Thornton, 1996;Lo Conte et al., 1999), potential small-molecule inhibitors targeting sRBD have been reported (Bojadzic et al., 2021;David et al., 2021;Muhseen et al., 2020).Small-molecule drugs could offer the benefits of being less sensitive to mutation-driven resistance, less CONTACT Elvis Awuni elvis.awuni@ucc.edu.ghSupplemental data for this article can be accessed online at https://doi.org/10.1080/07391102.2023.2204505.immunogenic, and appropriate for oral administration compared to vaccines and antibodies (Downing et al., 2017).
In this study, structure-based virtual screening (SBVS) and molecular dynamics (MD) simulation strategies were applied to identify novel potential small-molecule binders of the SARS-CoV-2 sRBD from an sRBD-targeted library as lead compounds for the development of anti-COVID-19 drugs.The targeted library was initially screened against sRBD by using the GOLD docking program (Verdonk et al., 2003) whereby 19 compounds were shortlisted based on docking scores after using a control compound, NPACT01552 (Figure S1), predicted in a previous in silico study as a promising sRBDtargeting inhibitor of the sRBD-hACE2 interaction (Muhseen et al., 2020), to set the selection cutoff.The stability of each compound in MD simulations was used as a further standard to select four hit compounds namely T4S1820, T4589, E634-1449, and K784-7078.Analyses of MD simulations data showed that the four compounds remained stably bound to sRBD for � 80 ns with reasonable affinities and interacted with pharmacologically important amino acid residues.The compounds exhibited fair solubility, lipophilicity, and toxicitypropensity characteristics that could be improved through lead optimization strategies.The overall results suggest that the scaffolds of T4S1820, which emerged as the most promising candidate, E634-1449, and K784-7078 could serve as seeds for developing potent small-molecule inhibitors of SARS-CoV-2 receptor binding and cell entry.

Receptor and ligand library retrieval and preparation
The three-dimensional (3D) crystal structure and FASTA sequence of the SARS-CoV-2 sRBD in complex with hACE2 (PDB ID: 6LZG (Wang et al., 2020)) were downloaded from the protein databank (PDB).All water and N-acetylglucosamine (NAG) molecules were deleted, after which the complex was split into the separate hACE2 (Chain A) and sRBD (Chain B) 3D components.The NAG atoms were deleted because the four surface-bound NAG molecules found on the complex (3 on hACE2 and 1 on sRBD) were located outside the interacting interface and thus may not have any effect on the sRBD-hACE2 interaction (Rowland & Brandariz-Nuñez, 2021) and the binding of inhibitors to the targeted sRBD interface.The SWISS-MODEL server (Guex et al., 2009;Waterhouse et al., 2018) was used to replace all missing atoms and residues in sRBD and hACE2 by specifying the corresponding FASTA sequences as the target sequences.As the receptor for SBVS in this study, the sRBD structure was prepared by using the automatic protein preparation tool in BIOVIA Discovery Studio 2016 during which it was fully protonated, energy minimized, and saved as a mol2 file for the GOLD docking program (Verdonk et al., 2003).The hACE2 structure, on the other hand, was purposely used for the construction of complexes involving sRBD and hit compounds.
A SARS-CoV-2 sRBD-targeted compound library (Catalog No. L1714) consisting of 206 compounds was obtained from TargetMol (targetmol.com).A targeted library is a collection of compounds with special features that could enhance their ability to interact with a specific target (Harris et al., 2011).The automatic ligand preparation tool in BIOVIA Discovery Studio 2016 was used for the ligand library preparation that involved protonation, energy minimization, removal of duplicate entries, and fixing of bad valences in the constituent compounds.Four entries in the library were rejected by the tool, and thus 202 compounds were prepared and saved as a mol2 library file.Three compounds, NPACT01552, NPACT01557, and NPACT00631 (Figure S1), proposed in a previous computational study (Muhseen et al., 2020) as possible seeds for the development of sRBD-targeted SARS-CoV-2 inhibitors, were also prepared and added to the library to serve as control compounds to facilitate the hits selection process.

Structure-based virtual screening and hits selection
The SBVS process was carried out by using the GOLD program (Jones et al., 1995;1997;Verdonk et al., 2003), which docks flexible ligands into protein binding sites.The GOLD program uses the genetic algorithm (GA) as the search engine to predict the binding modes during which dihedrals of ligand rotatable bonds, ligand ring geometries, dihedrals of receptor OH and NH 3 þ groups, and the mappings of the fitting points are optimized.To place a ligand in the binding site, GOLD uses a fitting-points-based method in which it maps hydrogen-bond (H-bond) acceptor points on the ligand to H-bond donor points in the receptor and vice versa by first of all adding fitting points to H-bonding groups on the receptor and ligand.Also, GOLD maps ligand CH groups onto hydrophobic fitting points it generates in the receptor cavity.Finally, predicted binding modes are ranked by CHEMPLP, GoldScore, ChemScore, ASP, or a user-defined scoring function.The GoldScore function adopted in this study consists of a fitness score made up of the protein-ligand H-bond score (S hb_ext ), the protein-ligand van der Waals score (S vdw_ext ), the contribution to the fitness due to intramolecular H-bonds in the ligand (S hb_int ), which is switched off by default, and the contribution due to intramolecular strain in the ligand (S vdw_int ) as illustrated by the following equation: The GoldScore scoring function and the default GA parameters of the GOLD docking software (Verdonk et al., 2003) were applied to screen the prepared library of small molecules against the targeted interface of sRBD defined by the following amino acid residues that play important roles in the sRBD-hACE2 interaction (Wang et al., 2020): Arg403, Glu406, Lys417, Ile418, Tyr421, Gly446, Gly447, Tyr449, Tyr453, Leu455, Phe456, Ala475, Glu484, Gly485, Phe486, Asn487, Tyr489, Pro491, Gln493, Ser494, Tyr495, Gly496, Phe497, Gln498, Thr500, Asn501, Gly502, and Tyr505.After the screening process, hits were selected based on the GOLD fitness scores by which the bestranking.lstfile, which contains the best score for each ligand, was sorted in decreasing order.A fitness score of 60.7 obtained by NPACT01552 which emerged as the best-scoring control molecule was then used to set the selection cutoff such that all compounds with fitness scores � 60.7 were selected and those with scores < 60.7 were rejected.Following this selection criterion, 19 compounds were shortlisted for further evaluation.A complex involving the receptor and the best-scoring pose of each of the 19 compounds was saved as a PDB file and used as the initial structure in preliminary 20 ns MD simulations.After the simulations, the RMSDs of the compounds and the final structures of the sRBD-compound complexes (Figures S2 and  S3) were inspected to select the final hits based on the ability of a compound to remain stably bound to the targeted interface of sRBD.Five compounds with IDs T4S1820, T4589, T4495, E634-1449, and K784-7078 (Figure 2) and their corresponding complexes with sRBD were finally selected for further studies.

Molecular dynamics simulations
In each case, the traditional MD simulations protocol involving the GROMACS 5.1.1 simulation suite (Van Der Spoel et al., 2005) was applied whereby the Amber 99SB force field (Cornell et al., 1995) and the TIP3P water model (Jorgensen et al., 1983) were used to build the topology of the protein.
Alternatively, the topology of the ligand was generated for the general amber force field (Junmei Wang et al., 2004) by using the antechamber tool in the AMBER 20 suite (Case et al., 2005;Pearlman et al., 1995) during which the appropriate atom types and RESP charges (Bayly et al., 1993) were assigned.The system was centered in a cubic box with 10 Å defined as the minimum distance of separation between the surface of the protein and the edges of the box.The box was then solvated with an explicit TIP3P water model (Jorgensen et al., 1983), after which the required number of Na þ and Cl -ions were added to neutralize the system and adjust the ionic strength to 0.1 M. Long-range electrostatic interactions were treated by using the Particle Mesh Ewald method (Darden et al., 1993).The LINCS (Hess et al., 1997) and the steepest descent algorithms were then used to restrain all bonds involving hydrogen atoms at their constant equilibrium lengths and energy-minimize the system, respectively.By applying position restraints on the heavy atoms of the protein and ligand and assigning initial velocities to the atoms in the system, a 5 ns equilibration was performed at 1 bar and 300 K in the NPT ensemble using the Berendsen barostat for pressure and the V-rescale thermostat (Bussi et al., 2007) for temperature coupling.A production run was continued in the NPT ensemble to 100 ns and 50 ns for the sRBD-compound and sRBD-hACE2-compound complexes, respectively, with data collected every 1 ps.Three replicates of 100 ns simulations were performed for each complex involving sRBD and each of the five selected compounds.

Selected compounds and docking scores
Five compounds (Figure 2) including T4S1820 (Parishin A), a derivative of Parishin, isolated from the rhizomes of Gastrodiaelata (Lin et al., 2016); T4589 (9 0 ''-Methyl lithospermate B) isolated from the roots of Salvia yunnanensis (Zhang et al., 2007); T4495 (dBET1), a hybrid compound of (þ)-JQ1 and thalidomide (Winter et al., 2015); E634-1449; and K784-7078 emerged as the selected hits for further studies.The predominance of rings in the chemical structures of the compounds is not surprising because the target interface of sRBD is a typical protein-protein interface characterized by aromatic and hydrophobic amino acids to facilitate the hydrophobic effect phenomenon normally required to drive protein-protein interactions (Chanphai et al., 2015;Southall et al., 2002;Tsai et al., 1997).Thus the rings in the selected compounds could expedite hydrophobic interactions with sRBD.Also, there are H-bond donor and acceptor groups distributed within the compounds that could enhance H-bonding with sRBD.
Fitness scores of 84.1, 68.3, 64.9, 62.9, and 60.7 were obtained by T4S1820, T4589, T4495, E634-1449, and K784-7078, respectively.The control compounds NPACT01552, NPACT01557, and NPACT00631 on the other hand obtained fitness scores of 60.7, 59.3, and 52.8, respectively.Largely, the scores obtained by the compounds are reasonable considering that targeting protein-protein interfaces with small molecules is a daunting task due to the wider and flat search areas without clearly-defined binding cavities (Hopkins & Groom, 2002;Jones & Thornton, 1996;Lo Conte et al., 1999).

Interactions between the selected compounds and sRBD
The complex of sRBD and each selected compound was analyzed by using PyMol (Schr€ odinger, 2014) and LigPlot (Wallace et al., 1995) software to ascertain the ligand binding mode and interactions.Figures 3 and 4 illustrate the poses and interactions made by the selected compounds on the targeted interface of sRBD. Figure 3A shows that T4S1820 makes H-bonds with the amino acid residues Arg403, Glu406, Gln409, Tyr453, Glu484, Phe490, Gln493, Gly496, Thr500, and Gly502.T4S1820 also makes p-cation and p-p stacking interactions with Arg403 and Tyr505, respectively.Other residues that make contacts with T4S1820 include Gly416, Lys417, Leu455, Phe456, Tyr489, Ser494, Tyr495, and Asn501.As illustrated in Figure 3B, T4589, on the other hand, makes H-bonds with residues Arg403, Ser494, and Asn501.The amino acid residues Arg403 and Tyr505 are involved in p-cation and p-p interactions with T4589, respectively.Nonbonding contacts are also made between T4589 and Glu406, Lys417, Ile418, Tyr453, Gln493, Tyr495, and Gly496 of sRBD.

RMSD analyses
To assess the stability of each of the selected compounds regarding its ability to remain bound to the targeted interface of sRBD, the gmx rms tool in GROMACS was used to calculate the root mean square deviation (RMSD) of the compound as a function of simulation time by least-square fitting the compound to the protein.This strategy of calculating the RMSD of the ligand allows one to assess how far the ligand moves from the binding site and the surface of the receptor during the simulations.A stable ligand undergoes a minimal conformational change and maintains proximity to the receptor depicted by a small and stable RMSD, while an unstable ligand records a significant conformational change and/or distance of separation from the receptor characterized by large and unstable RMSD as a function of simulation time.
The RMSDs of T4S1820, T4589, T4495, E634-1449, and K784-7078, in three simulations, are shown in Figure S5A, S5B, S5C, S5D, and S5E, respectively.Figure S5 suggests that T4495 is the least stable and could remain bound to sRBD for not more than 40 ns (Figure S5C).Based on this observation, T4495 was rejected from the hit list.For the remaining four compounds which could each maintain stability up to the range of 80-100 ns, the most stable trajectories (Figure 5A) were selected for further analyses.Figure 5A indicates that the RMSDs of T4S1820, T4589, and K784-7078 shown as blue, red, and yellow curves, respectively, remained stable up to 100 ns whereas the RMSD of E634-1449 (green curve) could remain stable up to 80 ns.Since the results suggest that the RMSD curves of T4S1820, T4589, E634-1449, and K784-7078 all converged around 40 ns, it is worth noting that whenever appropriate, the 40-100 ns simulation trajectory of each compound was used for conformational analyses except 40-80 ns in the case of E634-1449.
To find out if there is any major global backbone conformational change in sRBD induced by T4S1820, T4589, E634-1449, and K784-7078, which in some cases could explain the mechanism of a ligand at the molecular level (Awuni & Mu, 2019;Rumpf et al., 2015;Yuan et al., 2015), the backbone conformations of the ligand-bound forms and apo form of sRBD were monitored by calculating the RMSD of the backbone atoms as a function of time by least-square fitting to the same atoms.The apo form of sRBD was added for comparative purposes.The results obtained are illustrated in Figure 5B where the blue, red, green, yellow, and cyan curves represent the backbone RMSDs of the T4S1820, T4589, E634-1449, K784-7078, and apo forms of sRBD, respectively.Figure 5B shows that the backbone RMSDs of the ligand-bound forms of sRBD compare well with the apo form.Overall, minimal RMSD fluctuations between 1.0-2.5 Å are observed; suggesting that there is no significant global conformational change induced in the backbone of sRBD following the binding of T4S1820, T4589, E634-1449, and K784-7078.Nonetheless, it is worth mentioning that there could be minor or local conformational changes in some secondary structure elements that could contribute to a possible mechanism of inhibition.

RMSF analyses
To determine if there are any possible atomic fluctuations within the backbone atoms of sRBD following the interaction with T4S1820, T4589, E634-1449, and K784-7078, the gmx rmsf utility in GROMACS was used to calculate the root mean square fluctuations (RMSF) of the backbone atoms of the ligand-bound forms and apo form of sRBD by using the last 60 ns of each trajectory (except 40-80 ns trajectory for E634-1449) and the corresponding initial equilibrated structure as the reference.As shown in Figure 5C, where the blue, red, green, yellow, and cyan curves depict the backbone RMSFs of sRBD in the T4S1820, T4589, E634-1449, K784-7078, and apo forms, respectively, the results propose that overall there is no significant difference in the atomic fluctuations observed in the ligand-bound forms in comparison with the apo form except the intermittent spikes observed in the T4589-bound form.

Hydrogen bond analyses
H-bonds may contribute to ligand specificity, affinity, and stability (Bitencourt-Ferreira et al., 2019;Kretschmer et al., 2012;Wade & Goodford, 1989); and their presence is a function of the existence of H-bond donors (mainly OH and NH groups) and H-bond acceptors (mainly O and N atoms) in the ligand and the protein target site.By using the gmx hbond tool in GROMACS, the number of intermolecular H-bonds between sRBD and each of T4S1820, T4589, E634-1449, and K784-7078 was calculated as a function of simulation time by using the default values of a donor-acceptor distance of 3.5 Å and a donor-acceptor angle of 30 � for defining an H-bond.The results as presented in Figure 5D, where the blue, red, green, and yellow, curves illustrate the number of intermolecular Hbonds between sRBD and T4S1820, T4589, E634-1449, and K784-7078, respectively, indicate that T4S1820 relatively achieves the best H-bond counts with maximum values above 10 at some points.Figure 5D also shows that T4589 (after 40 ns) and K784-7078 obtained comparable H-bond counts albeit the H-bonds in K784-7078 appear to be more stable.E634-1449, however, recorded the worse H-bond counts (<4) and thus hints that other interactions could be the major forces responsible for E634-1449 binding.

Free energy landscape and conformational analyses
Free energy landscape (FEL) analyses were performed firstly on the four selected compounds and on the backbone of sRBD to determine the most stable poses and the stable ligand-induced receptor conformations, respectively, during the simulations by using the 40-100 ns (except 40-80 ns for E634-1449) trajectories.Regarding the FEL analyses on the compounds, in each case, the gmx covar utility in GROMACS was used to produce eigenvectors by least-square fitting the compound to sRBD and performing covariance analysis on the compound.By using the corresponding eigenvectors as input, the gmx anaeig tool incorporated in GROMACS was used to create a projection of principal component 1 (PC1) and principal component 2 (PC2) for each compound for the generation 2D FEL diagrams shown in Figure 6.The results show that apart from T4S1820 (Figure 6A) which produced two lower energy basins (populated bin1 and less populated bin2), T4589 (Figure 6B), E634-1449 (Figure 6C), and K784-7078 (Figure 6D) each generated one main lower energy basin in the FEL diagram.The results suggest that two predominant poses were explored by T4S1820 during the simulations in contrast to a single pose by each of the remaining compounds.This phenomenon of two poses being explored by T4S1820 could explain the pattern of H-bond counts observed in Figure 5D (blue curve) in that the intermittent H-bond spikes originated from the less populated pose in bin2 while the relatively stable and dominant counts originated from the most populated pose in bin1.
To compare the most stable binding modes of T4S1820, T4589, E634-1449, and K784-7078 explored in the simulations with the initial docking poses, the ligand atoms in a complex structure of sRBD and each compound extracted from the lower energy basin(s) of the appropriate FEL diagram in Figure 6 were superimposed on the corresponding atoms in the initial docking structure.The results are shown in Figure 7 where the carbon atoms of the initial docking poses are illustrated as magenta sticks and those extracted from the lower energy bins are shown as blue sticks.Figure 7A shows the atoms of the simulated poses of T4S1820 extracted from bin1 (deep blue sticks) and bin2 (light blue sticks) of the FEL diagram (Figure 6A) superimposed on the initial docking pose (magenta) with global RMSDs of 3.24 Å and 3.07 Å, respectively.Figure 7A indicates that during the simulations T4S1820 adopted poses that are close to the initial docking pose.The main difference between the poses extracted from bin1 and bin2 of T4S1820 emanates from the rotation of rings though both structures sit in the same cavity.Regardless, the pose in the most populated lower-energy minimum (bin1) of T4S1820 represents the most energetically stable binding mode.Aside from rotation and ring flipping of some terminal rings, the simulated structure extracted from the lower energy bin of the FEL diagram of T4589 (Figure 6B) to a larger extent maintains the pose of the initial docking structure with an all-atom RMSD of 3.68 Å as shown in Figure 7B.As shown in Figure 7C, the simulated pose of E634-1449 aligns closely with the initial docking pose with a global RMSD of 0.87 Å although this configuration could only be maintained up to 80 ns.The simulated pose of K784-7078 also maintains the initial docking orientation as shown in Figure 7D with a resultant all-atom RMSD of 2.14 Å.In general, the results recommend that T4S1820, E634-1449, and K784-7078 are stable ligands and promising leads.
Concerning the FEL analysis on the backbone of sRBD, the method was the same as indicated above except that the calculations were performed on the backbone atoms of the T4S1820, T4589, E634-1449, and K784-7078 forms of sRBD plus the apo form by least-square fitting to the same atoms.The results obtained are presented in Figure S6.The FEL plots in Figure S6 show that apart from the T4S1820-bound form of sRBD that again produced an FEL diagram characterized by two distinct lower energy basins (Figure S6A), all the other forms of sRBD including the apo form recorded one distinct lower energy basin (Figure S6B-E).To find out if there is a significant backbone conformational change in sRBD induced by the bound compounds, the backbone structure of the ligandbound form of sRBD was extracted from the lower energy basin(s) of each FEL diagram in Figure S6 and superimposed on the backbone atoms of the most stable apo conformation extracted from the FEL diagram of the apo form of sRBD (Figure S6E).The global RMSD was estimated in each case.As illustrated in Figure S6F, the superimposition of the backbone atoms of the simulated T4S1820-bound (bin1), T4S1820-bound (bin2), T4589-bound, E634-1449-bound, and K784-7078-bound sRBD structures (colored deep blue, light blue, red, green, and yellow, respectively) on the apo form (cyan) generated allatom RMSDs of 1.93 Å, 2.10 Å, 2.38 Å, 1.83 Å, and 2.71 Å, respectively.The results suggest that there is no significant backbone conformational change apart from minor loop displacements.In particular, the targeted interface of sRBD, marked by the rectangle in Figure S6F, shows no major backbone conformational change.This observation is not surprising because as a protein-protein interface, the ligands interact with the surface amino acid residues of sRBD and may not cause major backbone conformational changes.

Binding energy analyses
The g_mmpbsa tool (Kumari et al., 2014) was used to estimate the binding energies of T4S1820, T4589, E634-1449, and K784-7078 to sRBD as well as the per residue contribution to the corresponding energies by using a solute dielectric constant of 1 and the default parameters of the tool.In each case, 1000 frames of the protein-ligand complex were extracted from stable regions of the RMSD curve (Figure 5A) of the compound for the calculations.Binding energies of À 188.63 ± 31 kJ/mol, À 105.23 ± 24 kJ/mol, À 98.46 ± 22 kJ/mol, and À 90.16 ± 19 kJ/mol were obtained for T4S1820, T4589, E634-1449, and K784-7078, respectively, indicating a decreasing affinity for sRBD in the same order.The energy decomposition plots shown in Figure 8, where only residues whose contributions are � 2.5 kJ/mol are labeled, indicate the specific amino acid residues in the targeted interface of sRBD that contributed to the binding energies obtained.As shown in Figure 8A, Ile402, Arg403, Arg408, Lys417, Glu484, Tyr495, and Tyr505 are the main amino acid residues of sRBD that make favorable contributions while Glu406 and Ser494 render unfavorable contributions to the binding energy of T4S1820. Figure 8B shows that residues Arg403, Lys417, Tyr453, Tyr495, T500, and Tyr505 contribute favorably to the binding energy of T4589 while Glu406 and Gly496 offer unfavorable contributions.Residues Phe497, Asn501, Gly502, and Tyr505 mainly make favorable contributions to the binding energy of E634-1449 as illustrated in Figure 8C.In Figure 8D, Arg403, Phe497, Asn501, and Tyr505 are shown to make favorable contributions while Asp405, Glu406, and Tyr495 make unfavorable contributions to the binding energy of K784-7078.Intriguingly, most of the amino acid residues found to make favorable contributions to the binding energies are important pharmacological targets because they have been identified as relevant for the interaction between sRBD and hACE2 (Andersen et al., 2020;Muhseen et al., 2020;Wang et al., 2020).The binding energy data could be useful during lead optimization processes involving the selected compounds.

Solubility assessment
The solubility of a drug candidate is an important property worth considering during the drug discovery and development process because it could greatly influence the ADMET properties, bioavailability, and efficacy of a drug (Guo & Shen, 2004;Wang & Skolnik, 2009;Zhong, 2017).Solubility is particularly an indispensable property to consider if the drug is meant for oral administration, which undoubtedly is the most common, convenient, and cost-effective means of administering drugs.Insoluble drugs are poorly absorbed leading to reduced bioavailability when administered orally.Nonetheless, depending on the properties of the drug, site of absorption, and the required dosage, there are reports of various physical modifications (e.g.particle size reduction, modification of the crystal habit, amorphous form and cocrystallization, drug dispersion in carriers, solid dispersions, solid solutions, and cryogenic techniques); chemical modifications (e.g.change of pH, use of buffer, derivatization, complexation, salt formation); and miscellaneous methods (e.g.supercritical fluid process, use of adjuvants like surfactant, solubilizers, cosolvency, hydrotrophy, and novel excipients) for enhancing the solubility of poorly soluble drugs or drug candidates (Sareen et al., 2012;Savjani et al., 2012).
The SwissADME web-based tool (Daina et al., 2017) was used to assess the solubility of the four selected compounds.The log S solubility of each compound was determined by using the ESOL (Delaney, 2004), Ali et al. (2012), and SILICOS-IT models.Overall, the results as shown in Table S1 flag T4S1820 as soluble, moderately soluble, and soluble according to the ESOL, Ali, and SILICOS-IT models, respectively, whereas T4589, E634-1449, and K784-7078 are predicted as being either moderately or poorly soluble.

Lipophilicity assessment
Lipophilicity is another useful property of a drug candidate that should be assessed during the drug discovery and design process as it can also immensely affect the ADMET properties, bioavailability, and efficacy of a drug (Arnott & Planey, 2012;Wang & Skolnik, 2009;Waring, 2010;Zhong, 2017).Lipophilicity, measured as Log P, affects the solubility, uptake/permeability, metabolism, and toxicity of a drug (Arnott & Planey, 2012).Low lipophilicity values may affect the ability of a drug to penetrate the phospholipid bilayer of the plasma membrane of cells to reach the intended target (Liu et al., 2011) while high lipophilicity could promote promiscuity and off-target binding (Tarcsay & Keser} u, 2013).Concerning the low lipophilic (hydrophilic) compounds that cannot efficiently traverse the plasma membrane, methods involving the use of carriers such as liposomes, nanoparticles, nanobbubles, and hydrogels have been developed to facilitate their transportation to the intended targets (Arpicco et al., 2016;Larrañeta et al., 2018;Mirchandani et al., 2021).For oral drugs, it is generally suggested that candidates with a log P > 1 or < 4 are more likely to possess optimal physicochemical and ADME properties (Gao et al., 2017;Gleeson, 2008;Lipinski, 2004;Waring et al., 2010).
By using the SwissADME web-based tool (Daina et al., 2017) for the lipophilicity predictions, a consensus partition coefficient between n-octanol and water (log P o/w ) was determined for each compound as an average of values from five log P o/w models including iLOGP (Daina et al., 2014), XLOGP3 (Cheng et al., 2007), WLOGP (Wildman & Crippen, 1999), MLOGP (Moriguchi et al., 1992;1994), and SILICOS-IT.The results as shown in Table S2 suggest that T4S1820 (consensus log P o/w ¼ À 2.82) is hydrophilic while the remaining three compounds are lipophilic with increasing order of lipophilicity as T4589 (consensus log P o/w ¼ 2.60), K784-7078 (consensus log P o/w ¼ 3.01), and E634-1449 (consensus log P o/w ¼ 4.74).The log P values of T4589 and K784-7078 (2.60 and 3.01, respectively) meet the log P requirement of > 1 or < 4 for an oral drug while the log P of E634-1449 (4.74) marginally fall outside this range.

Toxicity assessment
The hepatotoxicity, carcinogenicity, immunotoxicity, mutagenicity, and cytotoxicity potentials of T4S1820, T4589, E634-1449, and K784-7078 were predicted by using the ProTox-II web server (Banerjee et al., 2018) which integrates molecular similarity, pharmacophores, fragment propensities and machine-learning models for the prediction of various toxicity endpoints.Table 1 shows the results obtained in which the prediction as 'active' (toxic) or 'inactive' (non-toxic) is made with the corresponding probability of the prediction indicated.The deep green, light green, magenta, and red colors in Table 1 connote inactive, moderately inactive, moderately active, and active predictions, respectively.Generally, T4S1820 was predicted not to possess hepatotoxic, carcinogenic, immunotoxic, mutagenic, and cytotoxic properties with high certainties.The remaining three compounds also exhibited some reasonable levels of non-toxic potentials except the prediction of T4589, E634-1449, and K784-7078 as possible active agents of immunotoxicity and mutagenicity, hepatotoxicity, and immunotoxicity, respectively.

Modeling the mechanism of the selected compounds
To enter the human-host cell, the sRBD of the SAR-SCOV-2 interacts with the hACE2 resulting in a series of events leading to the fusing of the virus into the cell followed by the multiplication of the viral particles (Jackson et al., 2022;Lan et al., 2020;Shang et al., 2020;Wang et al., 2020;Wrapp et al., 2020).As such, small molecules that could bind to the interface of sRBD involved in the interaction could stop the virus from entering the host's cell by either impeding, disrupting, or weakening the sRBD-hACE2 complex and/or interfering with the flow of the information needed to trigger the cascading events that lead to infection.Backbone conformational analyses (Figures 5B and S6) suggest that the hit compounds, T4S1820, T4589, E634-1449, and K784-707, may not interfere with the sRBD-hACE2 interaction by inducing conformational changes in sRBD.Therefore, to investigate the possible effect that each of these compounds could have on the complex en bloc, sRBD-hACE2 complexes were constructed in the presence of T4S1820, T4589, E634-1449, and K784-7078 as shown in Figure 9A-D, respectively, by using the corresponding docking structures and hACE2 from the PDB structure 6LZG.Each complex was subjected to a 50 ns MD simulation after which the intramolecular H-bonds between sRBD and hACE2 were quantified as a function of simulation time by using the gmx hbond tool in GROMACS.The results were then compared with the intramolecular H-bonds between sRBD and hACE2 of the apo form (sRBD-hACE2 structure without a bound compound) of the complex.The observed number of H-bonds between sRBD and hACE2 in the T4S1820, T4589, E634-1449, K784-7078, and apo forms are shown as blue, red, green, yellow, and cyan curves in Figure 9E, respectively.Figure 9E hints that the number of H-bonds between sRBD and hACE2 reduces in the presence of T4S1820, E634-1449, and K784-7078 at the interacting interface.To determine the effect of this observation on the conformation and stability of the complex, the RMSD of the backbone of each complex was calculated by least-square fitting to the same backbone atoms.Figure 9F shows the results of the backbone RMSDs of the T4S1820, E634-1449, and K784-7078 forms of the sRBD-hACE2 complex in blue, red, green, and yellow curves, respectively.Figure 9F shows that the backbone RMSDs of the ligand-bound complexes increase, suggesting that the corresponding holo forms are relatively unstable compared with the apo form (cyan curve).Summing up, the results thus suggest that T4S1820, E634-1449, and K784-7078 can shield the interactions between sRBD and hACE2 leading to complex instability and/or interference with the flow of signaling information required for the entry of the virus into the host's cell.

Discussion
Advancements in structural biology and combinatory chemistry have made available structures of many drug targets and broadened the chemical space leading to the evolution of SBVS (Blundell, 1996;Jhoti et al., 2013;Lavecchia & Di Giovanni, 2013).Notwithstanding, the number of commercial drugs availed to the market has been on a continuous decline over the last several years (Paul et al., 2010).This phenomenon could be attributed partly to an artificial gap created between virtual screening (VS) and rational drug design as a result of researchers essentially focusing on only the hits with the most effective pharmacological or biological activity.To bridge this gap, there is the need to include molecules that can at least illustrate good target-binding characteristics to the list of promising candidate drugs because such molecules could serve as useful leads for optimization.The interface of the sRBD of the SARS-CoV-2 that directly interacts with the hACE2 to facilitate the entry of the virus into the host's cell was targeted with an sRBD-focused small molecule database to find promising binders that could exemplify the possibility of interfering with the sRBD-hACE2 interaction and the consequential information flow leading to infection.Such molecules could serve as candidate inhibitors or leads for optimization in a rational drug design process.
The novel compounds T4S1820, T4589, E634-1449, and K784-707 were identified herein as hits that demonstrated  Comparing the initial docking poses of T4S1820, E634-1449, and K784-7078 (Figures 3 and 4) suggest that the last two are mainly stabilized by hydrophobic interactions as one H-bond and no H-bonds, respectively, were identified compared with nine H-bonds formed by T4S1820.It is suggested that the stabilities of E634-1449 and K784-7078 could be improved further by carefully introducing H-bond donor and acceptor groups into the parent structures.Although the simulated pose of T4589, on the other hand, revealed some ring flipping compared with the docking pose as shown in Figure 7B, the fact that this chemical entity can stably bind to the targeted interface of sRBD signals that the unstable groups could be rationally modified to improve the overall stability of the compound.Also, T4S1820, T4589, E634-1449, and K784-707 interacted with pharmacologically important amino acid residues in   (Muhseen et al., 2020), further emphasizes their prospects as lead compounds.
Overall, the compounds were projected as inactive agents of hepatotoxicity, carcinogenicity, immunotoxicity, mutagenicity, and cytotoxicity (Table 1) except T4589, E634-1449, and K784-7078 which were flagged as active agents of immunotoxicity and mutagenicity, hepatotoxicity, and immunotoxicity, respectively.As potential lead compounds, it is possible to apply several strategies (Nassar et al., 2004) to alter the isolated toxicity predicted.For the compounds that were suggested as being moderately and poorly soluble (Table S1) including T4589, E634-1449, and K784-7078, there are also reported physical and chemical modifications as well as miscellaneous methods (Sareen et al., 2012;Savjani et al., 2012) that could be adopted to improve the solubility.Although the plasma membrane may act as a barrier to the hydrophilic T4S1820 which recorded a negative log P of À 2.82 (Table S2), carriers could be used to deliver the compound across the plasma membrane (Arpicco et al., 2016;Larrañeta et al., 2018;Mirchandani et al., 2021).The lipophilicity of T4S1820 could also be improved through structureactivity relationship (SAR) studies.
Interestingly, the results of the simulations of the sRBD-hACE2 complex in the presence of the four compounds at the interacting interface revealed a reduction in the intramolecular H-bonds between sRBD and hACE2 (Figure 9E) as well as complex stability (Figure 9F) in each case except the T4589 complex.T4589 was ineffective in blocking the intermolecular H-bonds between sRBD and hACE2 and decreasing the stability of the sRBD-hACE2 complex because it occupies a relatively smaller area in the interface (Figure 9B) and is characterized by ring moieties that were found to be relatively unstable in the simulations (Figure 7B).Altogether, the results thus suggest that T4S1820, E634-1449, and K784-707 could perturb and weaken the intermolecular interactions between sRBD and hACE2 by shielding the interactions leading to complex instability and interference with the flow of the appropriate signals to trigger the processes required for the entry of the virus into the host's cell.Nonetheless that the overall results project T4S1820 as the most promising lead compound following its demonstration of competitive stability, superior binding energy, and non-toxic credentials, E634-1449 and K784-7078 also produced encouraging results.Thus the features of these proposed binders of sRBD could be improved through appropriate lead optimization strategies to develop potent SARS-CoV-2 receptor-binding inhibitors.

Conclusions
SBVS and MD simulation techniques were applied to identify four novel chemical entities that could bind to the interface of the sRBD of SARS-CoV-2 that interacts with the host's cell-surface receptor, hACE2, to facilitate the entry of the virus into the host's cell.The hits including T4S1820, T4589, E634-1449, and K784-7078 were shown to make stable interactions with important amino acids of sRBD with estimated MMPBSA-based binding energies of À 188.628 ± 31 kJ/mol, À 105.230 ± 24 kJ/mol, À 98.459 ± 22 kJ/mol, and À 90.161 ± 19 kJ/mol, respectively.These small molecules also displayed reasonable solubility, lipophilicity, and toxicity-propensity characteristics that could be improved by applying the appropriate lead optimization techniques.T4S1820, E634-1449, and K784-7078 also demonstrated the ability to weaken the sRBD-hACE2 interaction by shielding H-bond formation between the two proteins.All-inclusive, the results suggest that the scaffolds of T4S1820, which emerged as the most promising candidate, E634-1449, and K784-7078 could serve as seeds for the development of potent inhibitors of SARS-CoV-2 targeting the sRBD.We recommend in vitro and in vivo studies to further assess the sRBD-binding and anti-SARS-CoV-2 properties of the proposed compounds.

Figure 1 .
Figure 1.Cartoon illustration of the sRBD-hACE2 interaction (PDB ID: 6LZG).Carbon atoms of sRBD and hACE2 are colored gray and orange and the carbon atoms of the corresponding residues involved in hydrogen bonding are colored yellow and green, respectively.Hydrogen bonds are shown as black dashes.

Figure 3 .
Figure 3. Interactions between sRBD and the selected compounds.(A) Interactions between sRBD and T4S1820.(B) Interactions between sRBD and T4589.The balls colored in black, blue, and red represent carbon, nitrogen, and oxygen atoms, respectively.Bonds connecting the atoms of T4S1820 and T4589 are colored magenta.Bonds connecting the atoms of amino acid residues involved in hydrogen bonding are colored gray.The red 'eyelashes' represent nonbonded contacts.Hydrogen bonds are shown in green dashes.

Figure 4 .
Figure 4. Interactions between sRBD and the selected compounds.(A) Interactions between sRBD and E634-1449.(B) Interactions between sRBD and K784-7078.The balls colored in black, blue, red, yellow, and green represent carbon, nitrogen, oxygen, sulfur, and chlorine atoms, respectively.Bonds connecting the atoms of E634-1449 and K784-7078 are colored in magenta.Bonds connecting the atoms of amino acid residues involved in hydrogen bonding are colored gray.The red 'eyelashes' represent nonbonded contacts.Hydrogen bonds are shown in green dashes.
). Conformational analyses further revealed that the scaffolds of T4S1820, E634-1449, and K784-7078 could maintain the initial docking poses during the simulations (Figure7A, C, and D).