Remdesivir analogs against SARS-CoV-2 RNA-dependent RNA polymerase

Abstract The COVID-19 pandemic has already taken many lives but is still continuing its spread and exerting jeopardizing effects. This study is aimed to find the most potent ligands from 703 analogs of remdesivir against RNA-dependent RNA polymerase (RdRp) protein of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) virus . RdRp is a major part of a multi-subunit transcription complex of the virus, which is essential for viral replication. In clinical trials, it has been found that remdesivir is effective to inhibit viral replication in Ebola and in primary human lung cell cultures; it effectively impedes replication of a broad-spectrum pre-pandemic bat coronaviruses and epidemic human coronaviruses. After virtual screening, 30 most potent ligands and remdesivir were modified with triphosphate. Quantum mechanics-based quantitative structure–activity relationship envisages the binding energy for ligands applying partial least square (PLS) regression. PLS regression remarkably predicts the binding energy of the effective ligands with an accuracy of 80% compared to the value attained from molecular docking. Two ligands (L4:58059550 and L28:126719083), which have more interactions with the target protein than the other ligands including standard remdesivir triphosphate, were selected for further analysis. Molecular dynamics simulation is done to assess the stability and dynamic nature of the drug–protein complex. Binding-free energy results via PRODIGY server and molecular mechanics/Poisson-Boltzmann surface area method depict that the potential and solvation energies play a crucial role. Considering all computational analysis, we recommend the best remdesivir analogs can be utilized for efficacy test through in vitro and in vivo trials against SARS-CoV-2. Communicated by Ramaswamy H. Sarma


Introduction
The COVID-19 pandemic continuously manifests devastating consequences for human life. In over 210 countries (Chan et al., 2020;Lu et al., 2020;Shanmugam et al., 2020;Zhang et al., 2020) the outbreak has resulted in more than 176 million cases of infections and more than 3.8 million deaths till 15th June. Thus, the spreading of the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) virus has been a prevalent enigma in this year. The higher incidence of human-to-human transmission probably due to stronger binding affinity of the virus spike protein for the host receptor (Lan et al., 2020;Shang et al., 2020;Walls et al., 2020;Wrapp et al., 2020;Yan et al., 2020) results in an rapid infection.
The replication of coronaviruses is mediated by a multisubunit replication/transcription complex. The RNA-dependent RNA polymerase (RdRp) is the core component (nsp12) of this complex and thus plays a central role to facilitate viral RNA synthesis. The accessory cofactors nsp7 and nsp8 enhance RdRp template binding and processivity Subissi et al., 2014;Yin et al., 2021) . The nsp12 comprises an N-terminal nidovirus RdRp-associated nucleotidyltransferase (NiRAN) domain, an interface domain and a Cterminal RdRp domain. The RdRp domain is similar to a right hand and encompasses the fingers, palm and thumb subdomains. The thumb domain is occupied by nsp7 subunit, while the fingers and thumb domains are bound by two copies of nsp8. The catalytic center is assembled by conserved motifs A to G in the palm domain Hillen et al., 2020;Kirchdoerfer & Ward, 2019). Besides, a N terminal b-hairpin in RdRp domain is clamped by the NiRAN domain and palm subdomain and thus stabilizes the overall structure . Moreover, the crystal structures of RdRps among different RNA viruses have exposed a common architecture and mechanism of catalysis (Kamer & Argos, 1984;Zhang et al., 2020) . Subsequently, RdRp has been recognized as promising druggable target in diverse RNA viruses, such as hepatitis C virus (HCV), Zika virus (ZIKV) and CoVs (Mercorelli et al., 2018;Wang et al., 2020).
In addition, worldwide rapid emergence of SARS-CoV-2 has emphasized the requirement to develop more effective vaccines and therapeutics immediately. So far, nucleoside analogs (NAs) are the most promising broad-spectrum viral RdRp inhibitors, among which above 25 got approval against several viral diseases (Shannon et al., 2020). In the host cell, nucleoside/nucleotide prodrugs are metabolized into an active 5 0 -triphosphate form (5 0 -TP) to compete with endogenous nucleotides as substrates for the viral RdRp. The NAs are successively integrated with the emergent viral RNA by the RdRp and exert the antiviral effect via several mechanisms of action (MoA). Conversely, CoVs manifest an exclusive exonuclease activity in their nsp14, which is liable for rendering a challenge via resistance to several frequently used NAs. However, remdesivir (RDV) that is an adenosine analogue can inhibit RdRp protein of many RNA viruses rendering itself as a potent antiviral drug (Uzunova et al., 2020;Yesudhas et al., 2021). RDV prodrug has proven effective to inhibit the viral replication in clinical trials against Ebola and COVID-19 infection Eastman et al., 2017). In laboratory, RDV tested on cultured cells and non-human primate models manifests its effectiveness against SARS-CoV-2 viruses (Bimonte et al., 2020). In primary human lung cell cultures, Remdesivir (RDV) effectively impedes replication of a broad-spectrum pre-pandemic bat CoVs and epidemic human CoVs (Agostini et al., 2018;Brown et al., 2019;Pruijssers et al., 2020;Sheahan et al., 2017). Accordingly, the delayed chain termination and altered excision capacity due to the ribose 1 0 -CN of the RDV-TP have considerably increased antiviral effect against SARS-CoV-2 RdRp complexes (Hashemian et al., 2020;Shannon et al., 2020;Wang et al., 2020). An in silico study also suggests the effectiveness of RDV against RdRp with nsp7, nsp8 and nsp12 of SARS-CoV-2 virus (Skariyachan et al., 2020). However, clinical trials suggest that RDV is a functional drug, but it has a poor efficacy as a repurposed drug against SARS-CoV-2. Moreover, some mutations made SARS-CoV-2 resistant against RDV. All these show the urgency of finding more potent SARS-CoV-2 antivirals (Martinez, 2020). Since the NAs render structural similarity, other analogous drugs may exert quite similar binding mode and inhibition mechanism to perturb viral replication cycle .
Hence, in this study, the effort is to search a large chemical database with ultimate interest to recommend the best drug candidate among the RDV analogs. The probable fastest way to achieve the goal can be in silico virtual screening. This critical approach can facilitate early identification of promising drug candidate in a high-throughput manner (Ahmed et al., 2021;Hernandez et al., 2019;Onawole et al., 2020). Mostly, various types of computational tools are employed for structure-based virtual screening that followed druggability and toxicity testing. The least toxic and non-carcinogenic ligands with a greater affinity toward the targeted site were reported as probable drug candidates. Besides, partial least square (PLS) regression of quantitative structural-activity relationship (QSAR) of drug is applied to predict the binding energy. Principal component analysis (PCA) is employed further to recognize QSAR pattern. Molecular dynamics (MD) simulation and binding-free energy calculation have been executed to assess their binding capability compared to RDV. Consequently, this study would illustrate the strategy to screen broad-spectrum effective RDV analogs against SARS-CoV-2 and CoVs evolving in near future.

Protein preparation
We took the available crystal structure of SARS-CoV-2 RdRp (PDB ID: 6M71) (Burley et al., 2019), which was released on 1 April 2020 (resolution 2.90 Å). Later, the protein structure was renovated by the SWISS-MODEL modeling pipeline (Waterhouse et al., 2018) to recover missing amino acids. The FASTA sequences of the RdRp protein were collected from UniProtKB P0DTD1 (R1AB_SARS2) to conduct template-based modeling. The template (PDB ID: 7BTF) representing highest quality was utilized for protein modeling. The actual model coordinates were constructed based on the target-template alignment employing ProMod3 modeling engine (Studer et al., 2021). ProMod3 extracts preliminary structural records from the selected template. The conserved regions between the target and the template had been replicated from the template to the model. Insertions (missing loops) and deletions were remodeled by a template-based fragment library employing the cyclic coordinate descent (CCD) method. Then, side chains were reassembled from 2010 backbonedependent rotamer library by Dunbrack lab (Krivov et al., 2009). The OpenMM library (Eastman et al., 2017) was applied to perform the computations and the CHARMM22/ CMAP force field (Mackerell et al., 2004) for parameterization. In final stage of modeling process, structural distortions, unfavorable interactions or clashes were fixed by energy minimization. The energy minimization was accomplished by the OpenMM (Eastman et al., 2017) molecular mechanics library. The expected quality of the resulting model was estimated by GMQE (Biasini et al., 2014) and QSQE (Bertoni et al., 2017) at the tertiary and quaternary structure level. The evaluation of global and per-residue model quality had been performed by QMEAN scoring function (Benkert et al., 2011).

Ligand preparation
For this study, initial structure of 703 ligands (RDV analogs) was taken from PubChem database. These ligands were virtually screened against the prepared RdRp protein by AutoDock Vina protocol (Trott & Olson, 2010) and GOLD 5.7 (Genetic Optimisation for Ligand Docking). Based on the retrieved results, 30 best ligands with the highest binding affinity and similar backbone of RDV were selected. Usually, nucleoside/nucleotide prodrugs are metabolized into an active 5 0 -TP form to exert the antiviral effect. Hence, these ligands along with RDV were modified with triphosphate. Quantum mechanics (QM) calculations were used to optimize the modified ligands. Gaussian 09 program package was used for all quantum calculations (Frisch et al., 2009). The modified ligands were optimized using semi-empirical PM6 method (Bikadi & Hazai, 2009). Successively, the calculation for vibrational frequencies (all values are in harmonic approximation) was performed as well as the absence of imaginary frequencies was assured the stationary points corresponding to minima on the potential energy surface.

Virtual screening and docking
All collected ligands (703 RDV analogs) were energy minimized by using mmff94 force field with the steepest descent optimization algorithm and a total 2000 number of minimization steps. After ligand and protein preparation, targeting viral RNA entry site Yin et al., 2021), a grid box was generated around the polymerase active site of the protein (the (A-G) conserved motifs) where the center was X: 119.3520, Y: 111.7097 and Z: 128.1657 and the dimensions were X: 42.3242, Y: 37.1526 and Z: 42.1612 in the search space. Later, the modified triphosphate ligands were docked against the prepared RdRp protein by AutoDock Vina protocol (Trott & Olson, 2010). Docking of ligands with the prepared RdRp protein was also done by GOLD 5.7. The molecular docking approach using AutoDock Vina protocol predicted the binding affinity and the interaction of the selected ligands with RdRp. The binding affinities of the ligands were estimated as the negative scores in kcal/mol unit. The greater negative scores resemble better binding affinities. Along with docking by AutoDock Vina, GOLD Suite was applied to measure the fitting score and the binding interaction of selected ligands against the protein. To perform docking in GOLD Suite, the drugs and SARS-CoV-2 RdRp protein were prepared using Hermes visualizer in the GOLD. All other parameters were set as default and CHEMPLP was chosen for the fitness function. BIOVIA Discovery Studio version 4.5 was utilized to visualize and detect the non-covalent interactions among ligands and receptor protein.

QM-based QSAR for antiviral drugs
A number of quantum chemical parameters were utilized to build a firmly predictable QSAR applying B3LYP/6-31G (d, p) level of Gaussian 09 package. To make a good QSAR, we have randomly taken 20 small molecules as a training set, and the other 10 is considered as the test set. Both training set and test set are very important to make the QSAR model. To find out the correlation for structural-activity relationship, PLS method is used with various chemometric descriptors such as TPSA (topological polar surface area, Å 2 ), molecular weight, XLogPs-AA, HBD, ROTB, H atom, C atom, O atom, Cl atom, N atom, F atom, single bonds (SB), double bonds (DB), benzene rings of the drug candidate's, the highest occupied molecular orbital energy E HOMO (in eV), the lowest unoccupied molecular orbital energy E LUMO (in eV), total energy E T (in eV), dipole moment (DM in Debye), absolute hardness g (in eV), absolute electronegativity v (in eV), softness S, electrophilicity x (in eV) and energy gap DE (in eV) (Adams, 2004;Ahmed et al., 2020;De Jong, 1990;Fakayode et al., 2009;Pack, 1991;Yu, 2009). PLS is performed using XLSTAT version 2013.
The g, v, x and S were determined to exhaust Equations (1-4):

Pharmacokinetic properties
Computational validation of the selected two ligands (L4 and L28) along with RDV-TP is extensively performed to generate potential lead substances with efficient drug-likeliness and pharmacokinetics. Online ADMET structure-activity database (admetSAR) (Cheng et al., 2012) was used to compute the ADMET profiles that include absorption, distribution, metabolism, excretion and toxicity properties, as it is a crucial step for determining their potential success. Besides admetSAR, we utilized in silico tools such as SwissADME (Daina et al., 2017) to predict important pharmacokinetic properties with minimized potential safety issues of screened ligands. All the aforesaid analyses were done by providing the ligands' SMILE file format as retrieved from the PubChem database (www.pubchem.ncbi.nlm.nih.gov). Overall, it predicts the physically significant and pharmaceutically relevant properties of organic molecules. This includes various properties, such as logPo/w, the human Ether-a-go-go-Related Gene (hERG), Caco-2 permeability, blood-brain barrier (BBB) penetration, cytochrome P450 2D6 (Susnow & Dixon, 2003), enzyme inhibition level, human intestinal absorption (HIA), P glycoprotein inhibitor (PGI) and carcinogenicity of compounds.

MD simulations
In this study, we employed MD simulation for RdRp protein in apo-form (without ligand) and holo-form (ligand-protein). A 100-ns MD simulation was done to understand their conformational dynamics and interaction. All simulations were performed utilizing YASARA dynamics (Krieger et al., 2004) program, and the applied force field was AMBER14 (Dickson et al., 2014). As solvent water molecules were added and total system was made with the addition of 0.9% NaCl at 298 K temperature. A cutoff radius of 8.0 Å was set for shortrange van der Waals and Coulomb interactions. Long-range electrostatic interactions were calculated using particle mesh Ewald method, and the temperature was fixed using Berendsen thermostat. Periodic boundary condition was used in the simulation and a cell was formed, which was 20 Å larger than the protein in all sides of the system. The equilibration simulation continued for 250 ps to stabilize the protein. 1.25 fs was maintained as a time step for the overall simulations. The snapshots were saved at every 100 ps, while the 100-ns MD simulation was performed. When the analysis was performed time, energy, bond distance, bond angle, dihedral angles, solvent-accessible surface area (SASA), molecular surface area (MolSA), columbic and van der Waals interactions, root-mean-square fluctuation (RMSF) and rootmean-square deviation (RMSD) values for backbone, alpha carbon, and heavy atoms were collected and recorded from MD simulations using modified macro-files.

PCA
MD trajectory data were employed to investigate the structural and energy variances among the apo-protein and complexes. In the analysis, reflected variables for structural and energy elements were bond distances, bond angles, dihedral angles, planarity, van der Waals energies and electrostatic energies (Ahmed et al., 2020). The diverse multivariate energy factors were applied to reveal the existing variability among them (De Jong, 1990;Wold et al., 1987). The retrieved MD data were pre-processed through centering and scaling. The variations among apo-RdRp and ligand-protein complexes were revealed considering first 50-ns MD trajectory due to their better stability during that time scale. The following equation reflects PCA model: represents multivariate factors as the product of two new matrices, i.e. T k and P k ; T k manifests matrix of scores relating to the samples; P k matrix of loadings where the variables correlate with each other; k is the number of factors existing in the model and E specifies the matrix of residuals. The assessment of trajectory was achieved through R (Peng, 2015), RStudio (Rstudio Team, 2019) and basic codes. Finally, the R package ggplot2 was utilized for PCA plots generation (Wickham, 2009).

Binding-free energy calculations
The binding-free energy calculations were implemented using both PRODIGY server and molecular mechanics/Poisson-Boltzmann surface area method (MM/PB-SA). At 0.5-ns time interval, a total of 200 MD snapshots from each complex were selected for the binding-free energy calculations by PRODIGY server (Xue et al., 2016). The server measures the free energy based on intermolecular contacts and properties derived from non-interface surface. Usually, PRODIGY-LIG predicts binding affinity in protein-small ligand complexes employing the Python Framework Flask (version 0.12.2). The prediction model was constructed by using structural terms (i.e. ACs), within the distance threshold of 10.5Ð. The ACs include atoms in the interaction (C ¼ Carbon, O ¼ Oxygen, N ¼ Nitrogen, X ¼ all other atoms) (Vangone et al., 2019). A multiple linear regression model with R was trained, by executing fourfold cross-validation. Akaike's information criterion (AIC) stepwise selection method was applied to avoid overfitting and detect the essential topographies (Kurkcuoglu et al., 2018). For ranking ligands and predicting the affinity, the subsequent binding affinity predictor model DG noelec is estimated by the following equation: where DG noelec denotes binding affinity for 'No Electrostatics Prediction' protocol, and AC NN , AC XX and AC CN are the ACs between Nitrogen-Nitrogen, between all other atoms and polar hydrogens, and Carbon-Nitrogen, respectively. DG noelec binding affinity has been successfully applied by Drug Design Data Resource (D3R) on PRODIGY-LIG method to improve ligand's docking and scoring (Vangone et al., 2019).
In MM/PB-SA method, all calculations were done by using YASARA dynamics. AMBER14 force field with single trajectory approach was employed (Kollman et al., 2000). The snapshots collected from the 100-ns MD simulation were used for all three protein-ligand complexes. Following equation was utilized to calculate binding-free energy for the protein and ligand.
Here, DG complex ¼ the total free energy of the protein-ligand complex in solvent, DG ligand ¼ total energy of separated ligand in solvent and DG protein ¼ the total energy of protein in solvent.

Analysis of results from virtual screening
Structure-based virtual screening is performed to screen 703 ligands based on their binding affinities. Binding affinities are distributed within the range of 0 to À6 kcal/mol, À6.1 to À7 kcal/mol and so on up to À10 kcal/mol (Figure 1(A)). GOLD suite is also employed for more accurate analysis/ docking trials (Figure 1(B)). The selected top 30 ligands, showing good binding affinities, are presented in Table S1. RDV, as a standard drug, is also docked to compare the results with the ligands. Later, the top 30 ligands rendering good binding affinities and best interacting pattern are modified with triphosphate and optimized via PM6 method, and are shown in Table S2.

Analysis of QM-based QSAR
QSAR is a very promising method for modeling and predictive pattern analysis, extensively, used in various drug discovery and bioinformatics for the pharmaceutical industry, agrochemicals and petrochemical sectors (Alam & Khan, 2017;Berhanu et al., 2012;Choong et al., 2013;Fakayode et al., 2014;Funar-Timofei et al., 2017). Moreover, QM-based QSAR consistently assist the drug discovery approaches. The biochemical information delivered by QM is more accurate, and thus, more appropriate QSAR models are expected with QM descriptors (De Benedetti & Fanelli, 2010;Mazanetz, 2013).
For instance, TPSA (Å2), MW, #H-bond donor and #DM of the ligands are the most significant descriptors on QSAR contributors to the PLS regression (Table S1). Previously, our group has shown that PCA can be used for pattern recognition of potential ligands (Ahmed et al., 2020(Ahmed et al., , 2021. Here, Figure 2 shows the PCA score plots. However, the first principal component (PC1) shows 31% of the variability and the second principal component (PC2) shows 21% of the variation of ligand binding energy. A closer examination of the scores plot suggests that the ligands with a similar functional group are gathered together side by side on the scores plot.
The ligands (L4, L28, L10, L16 and L17) containing single OH group, pyrrole, triazine are attached to the furan ring that grouped together on the first and second quadrant of the scores plot. The observed grouping of ligands on the scores plot is highly significant. Entertainingly, glycine is found at L17. In contrast, ligands containing common furan ring (L1, L3, L5, L6, L9, L12, L14, L15, L23, L26, and L29) are clearly placed on the third and fourth quadrant of the score plots.
In addition, this group of ligands also contains a higher number of highly electronegative (particularly oxygen) atoms, which may have a significant influence on chemical behavior and on therapeutic effect and potency. The PLS model is developed to predict the binding energy of the test set for validation of drug candidates. The principal goal of any PLS model is to predict upcoming drug candidates, mainly from its SAR. Hence, the binding energies of the drug candidates were supposed to predict via developed PLS regression. Binding energy is obtained from PLS regression equation as follows: The binding energy obtained from molecular docking and the predicted binding energy by PLS regression for each ligand of validation ligands are exhibited in Table 1. Due to the capability of PLS regression to predict accurately, the binding energy of the validated ligands is estimated by root-mean-square-relative-percent-errors (RMS%RE). The PLS regression revealed a high accuracy of 80% for the prediction of the binding energy of ligands compared with the binding energy obtained from molecular docking.

Analysis of results from molecular docking
The top selected ligands (L4 and L28) from QSAR list along with RDV-TP are again docked with RdRp (Table 2). These ligands are predicted to have promising results as potential inhibitor, clearly indicating that they may show better inhibitory effects than RDV (Figure 3). The binding affinity, as well as fitness score, obtained for both ligands is comparable. These ligands are also docked with two other crystal structures (PDB ID-7BV2, resolution 2.50 Å and 7BTF, resolution 2.95 Å) of RdRp except 6M71. 7BV2 has nsp12-nsp7-nsp8 complex bound to the template-primer RNA and antiviral drug RDV in its structure, whereas 7BTF is RNA polymerase in complex with nsp7-nsp8 in reduced condition. The obtained docking results have further supported them as a   potential antagonist of RdRp (Table 2). The RdRp-RNA (PDB ID-7BV2) with interacting anti-RdRp ligands are shown in Figure S1.

Molecular interaction of the best ligands with RdRp
A number of non-covalent interactions have been detected between RdRp and the top selected ligands when the best poses were predicted specially for interaction within RNA entry site by AutoDock Vina. The interactions are hydrogen bond, halogen bond, electrostatic bond and hydrophobic bond. L4 displays the highest binding affinity (-8.7 kcal/mol) and fitting score (58.46). It has formed 17 hydrogen bonds and 3 hydrophobic bonds with the RdRp. In L28-RdRp complex, the ligand also interacts with the RdRp by 11 hydrogen bonds and 1 hydrophobic bond, whereas in RDV-TP-RdRp complex, the drug is stabilized by 14 hydrogen bonds and 2 hydrophobic bonds. Details molecular interactions of the best ligands and RDV-TP with RdRp protein are exhibited in Table 3. Non-covalent interactions predicted by the GOLD suite are presented in Figure 4 and Table S3.

Pharmacokinetic properties
Pharmacokinetic properties of the two selected ligands are listed in Table 4. According to admetSAR calculation, selected two ligands are found to be non-carcinogenic. Phosphorylated glycoprotein (P-gp) is a drug transporter, which influences the pharmacokinetics and pharmacodynamics profile of a drug molecule significantly. Additionally, human colon adenocarcinoma (Caco-2) cell permeability assay is used as a gold standard to calculate human intestinal drug permeability. Figure 5 depicts the correlation between HIA, CaCo-2 permeability and P-gp inhibition correlation of selected ligands. On the other hand, LogP (n-octanol water partitioning) can affect BBB penetration capacity. The calculation shows that the selected ligands have relatively similar BBB penetration probability. However, they have shown weak inhibition on hERG but more than the control RDV-TP.

Insights from MD simulation
MD simulation is performed for 100 ns for the top two selected ligands from our QSAR list (L4 and L28) and RDV as a control drug, and all of them are complexed with RdRp. We also conducted a MD run for RdRp apo-form. The RMSD value is (0.6-3.4 Å) for a-carbon atoms in the apo-RdRp, which is higher than both L4 and L28, and suggests that in physiological situation apo-form is unstable. Though in L28, the RMSD value is fluctuated slightly from 20 to 30 ns and after 80 ns but the generated trajectories remained stable for the whole run. So the structure of L28-RdRp complex is highly stable. L4 shows high fluctuations from 40 to 70 ns, but remains stable in the later phase (average 2.3 Å). RDV also shows comparative instability for RMSD value during the whole run than L28. The average RMSD for L28 is 2.16 Å, whereas for RDV, it is determined around 2.5 Å (Figure 6(A)).
To observe the structural compactness and solvent accessibility of all complexes, the radius of gyration and SASA are also determined. The radius of gyration can be defined as the distance of a mass point from the axis. It provides the information about the compactness of protein. RDV-TP shows the higher radius of gyration than others and thus suggests loose packing of the RDV-RdRp complex during the whole run and L28 shows lowest radius of gyration and thus displays the highest compactness. The average radius of gyration for RDV-TP, L28 and L4 are 32.28, 29.28 and 30.28 Å, respectively (Figure 6(B)). RDV-TP complex shows the highest value of SASA (average 38,903 Å 2 ). L28 shows slightly lower value (average 35,753 Å 2 ) than L4 (average 35,953 Å 2 ) but slightly higher than apo-form (average 35,687 Å 2 ) ( Figure 6(C)).
To ascertain the dynamics behavior of the protein, RMSF value is also observed. During the protein-ligand interaction, a significant fluctuation of RMSF values of residues (69-118), (328-434) and (822-917) has been observed. In the RdRp-L28 complex, most of the residues show a low RMSF (average 1.36 Å), but a high fluctuation of 5.46 Å has been observed to the residue 102. A maximum fluctuation of 6.65 Å has been detected to the residue of 69 of the RDV-TP-RdRp complex. The RMSF value of residues that form the finger domain Arg553, Arg555, Val557 and Asp618 (palm domain) is lower for the both ligands than standard or the apo-form. Since the less fluctuation indicates higher protein structural stability, the highest degree of flexibility indicates both ligands are stable.
Stability of the protein molecule largely depends on hydrogen bonds. The intermolecular hydrogen bond that is formed during the whole 100-ns run has been observed and a maximum number of hydrogen bonds are predicted for L28 (average 1474), whereas L4 and RDV-TP have the same average (average 1471) with RdRp ( Figure 6(F)).
Moreover, the snapshot of conformers displays that both ligands most of the time remained in the binding sites (RNA entry site) of RdRp throughout the entire simulation process (Figure 7(A,B)).

PCA on simulation data
The PCA model has revealed the variations among structural and energy profile from MD simulation. The total 79.9% of the variance has been unveiled by PC1 and PC2, where PC1 expresses 64.6% and PC2 expresses 15.3% of the variance. The score plot displays distinct cluster patterns for RdRp and ligand-protein complexes (Figure 9(A)). Here, the RdRp protein complexes with L4 (green) and L28 (purple) are considerably concurred with apo-RdRp (pink). Therefore, these ligand and protein clusters manifest quite similarity in their structural and energy profiles. Thus, the analysis indicates probable similarity among RDV analogs as well as nucleotide analogs. However, RdRp-RDV (blue) manifests some variations located adjacent to the coincided clusters. The loading plot (Figure 9(B)) displays, bond, bond angles, dihedral angles, van der Waals energies, which are closely distributed among the variables. Moreover, in this study, RDV analogs (L4 and L28) exhibit quite better binding pattern with RdRp compared to RDV.

Discussion
To guide and speed up the early-stage development of new active compounds, computer-aided drug design (CADD) techniques are used for the rapid assessment of chemical libraries. The drug discovery process has accelerated immensely, and the cost has been minimized. Hence, a vast number of computational methodologies such as virtual screening, QSAR, ADMET, molecular docking and MD have become integral parts of the modern drug discovery process (Bibi & Sakata, 2016).
RdRp is one of the main drug targets for SARS-CoV-2 as it is the key element of viral replication. It has a shape of the right hand (residues 367-920), with fingers subdomain (residues 366-581 and 621-679), a palm subdomain (residues 582-620 and 680-815) and a thumb subdomain (residues 816-920). The active site of the RdRp consists of the conserved polymerase motifs A-G in the palm domain. Nucleoside triphosphate level in cells is identified as a good signal for rendering antiviral activity (Geraghty et al., 2021). Some NAs with direct antiviral activity target the viral polymerase to inhibit the genome replication. Along with RDV and favipiravir, several NAs, such as sofosbuvir, Ribavirin and Tenofovir, were demonstrated to be potential inhibitors of RdRp, for the probable treatment of COVID-19 (Khan et al., 2021). Besides, 6-azauridine, Gilead Science (GS)-441524 and mizoribine proved effective to improve in COVID-19 infection. Methotrexate, b-D-N4-hydroxycytidine, molnupiravir and galidesivir also showed potentiality against SARS-CoV-2 (Borbone et al., 2021). However, combinational administration of anti-inflammatory and antiviral therapy is likely to be an effective, safer and more potential against COVID-19 (Ye et al., 2021). In a recent study, low-dose administration of GS441524 with GC376 had a synergistic effect to protect mice against SARS-CoV-2 virus . Methylprednisolone had a beneficial effect in combination with RDV in hamster model infected with SARS-CoV-2 (Ye et al., 2021). Oxypurinol in combination with pentoxifylline had been proposed for anti-SARS treatment. Also, gemcitabine with oxysophoridine exerted an improved antiviral effect for the treatment of COVID-19 (Borbone et al., 2021).
In this study, we have applied a unique approach through virtually screening of 703 RDV analogs to identify potential drugs against COVID-19.
Docking studies, employing two docking tools, have been used to find out the best drug candidates against RdRp in SARS-CoV-2. The docking results show that most of the screened ligands, especially the selected 30 ligands, interact with the active pocket residues of the RdRp protein.
Moreover, the top-ranked two ligands (L4 and L28) also show higher binding affinity and interaction compared to recently approved drug, RDV.
The analysis from QM-based QSAR illustrates that Å2, MW, #H-bond donor and #DM of the ligands are the most significant descriptors to the PLS regression. QSAR pattern exhibits that ligands containing single OH group, pyrrole, triazine attached to the furan ring are assembled together on the first and second quadrant of the scores plot. Also, the ligands containing common furan ring are clearly located at the third and fourth quadrant of the score-plots. The grouping of desired ligands (L4, L28) is highly significant on the scores plot. The presence of electronegative atoms may have a substantial impact on chemical behavior for the therapeutic effect and potency.
The binding affinities of the top-ranked two ligands with RdRp are compared with RDV. The binding affinities of À8.7 kcal/mol and À8.5 kcal/mol have been detected for L4 and L28, respectively, while the binding affinity of À7.2 kcal/ mol is obtained for RDV. A similar fitting score is observed with the GOLD docking for both ligands (Table 2). When docking results of both methods are compared with the RDV, the top ligands show very strong binding interactions including hydrogen bond, hydrophobic, halogen bond and electrostatic interactions with the important amino acids of RdRp. Compared to the RDV, both two ligands display better and more appropriate interactions with many important residues through hydrogen bond, Pi-alkyl and Pi-Pi interactions. In addition, our top-ranked ligands can promote more interactions than RDV, contributing to a higher binding affinity and fitting score. Amino acid residues, Arg553, Arg555, Val557, Asp618, Cys622, Asp623 and Ser682, in the binding pocket (RNA entry site) of RdRp are found to participate in non-covalent interactions with both ligands (Figures 3 and 4, Table 3 and Table S3). It can be demonstrated from ADMET analysis that the two ligands are non-carcinogenic with comparatively good HIA, BBB penetration probability and better inhibition on hERG than RDV ( Figure 5).
Based on molecular docking result, the two ligands along with RDV-TP are chosen for analysis with MD. It is shown  from MD simulation that the average RMSD values of L4 and L28 are lower compared to apo-form and RDV-TP, indicating more structural stability. It is revealed from the Rg and SASA that complexes with both ligands are stable similar to RDV-TP ( Figure 6(B,C)). Furthermore, a similar average MolSA is detected for both ligands, whereas it is the highest for the RDV-TP-RdRp (Figure 6(D)). It is observed from the RMSF values that L28 and RDV-TP have more reasonable binding stability with RdRp than L4. Moreover, a nearly similar RMSF value of residues is predicted for all the complexes. All the complexes demonstrate significant interactions with important residues, indicating their compactness in a complex form with RdRp ( Figure 6(E)). Furthermore, the binding-free energy analysis conducted by PRODIGY server (Figure 8) and MM/ PB-SA both support the MD results, which conveys a crucial quantity to realize whether an interaction will actually occur or not in the cell. PCA on MD simulation data highlights that the screened RDV analogs (L4, L28) can provide better binding conformation with RdRp, especially for their consistent  MFmolecular formula; hERGthe human ether-a-go-go-related Gene; BBBblood-brain barrier; HIAhuman intestinal absorption; PGI -P-glycoprotein inhibitor. interactions during simulation (Figure 9). Even though we have screened the best RDV analogs and observed better outcomes than RDV, most of the existing studies disclose preliminary outcomes (Parvathaneni & Gupta, 2020). The actual efficacy of the analogs can be distinguishable through substantial clinical evidence (Khadka et al., 2020;Parvathaneni & Gupta, 2020). Previously, among the selected analogs, L10, and L20 showed antiviral activity against HCV genotype 1b (Cho et al., 2014) and Ebolavirus (Siegel et al., 2017) accordingly. The findings also highlight the efficacy of RDV analogs against viral infection.
Overall, considering promising binding mode and probable inhibitory effect we firmly suggest that our top-ranked   RDV analogs might perturb viral replication cycle in a single or combinational therapeutic dose. This study might be helpful for clinical testing to design fruitful therapeutic strategies mitigating the adverse effects of SARS-CoV-2 and new CoVs evolution.

Conclusion
Among multiple proteins of SAR-CoV-2 involved in viral RNA replication, targeting RdRp plays a significant role in combating viral replication. Hence, to accelerate the drug designing and optimizing new drug candidates, herein, we employed a comprehensive approach for identifying potential anti-RdRp ligands and their binding mechanism using integrated computational methods. Among 703 PubChem ligands (RDV analogs), we identified two potential antiviral ligands (L4:58059550 and L28:126719083) as lead ligands against SAR-CoV-2-RdRp. A strong binding affinity is obtained for both ligands during molecular docking, promoting an increasing number of non-covalent interactions such as hydrogen bonding, hydrophobic and electrostatic interactions. QM-based descriptors offer chemically accurate descriptors for the molecules. The PLS regression predicts the binding energy of ligands with high accuracy of 80% comparing with the value acquired from molecular docking. Further, the ADME and toxicity analysis suggests the favorable pharmacokinetic properties for these ligands. Both ligands found to be more stable, making more hydrogen bond interactions with RdRp, than the standard drug in MD simulation. Binding-free energy demonstrates the higher binding affinity of (L4:58059550, L28:126719083) compared to RDV. Overall, our study highlights biologically active RDV analogs can be effective anti-SAR-CoV-2 therapeutic agents targeting viral RdRp protein to inhibit SAR-CoV-2 propagation within the host. This study can be easily coupled with in vitro and in vivo experiments to confirm their effectiveness against SAR-CoV-2.