In-silico investigation of some recent natural compounds for their potential use against SARS-CoV-2: a DFT, molecular docking and molecular dynamics study

Abstract Since its first appearance in December 2019, SARS-CoV-2 has infected many people all over the world, causing serious health problems in many people and causing many deaths, but no specific drug has been developed yet. SARS-CoV-2 main protease (SARS-CoV-2 Mpro) has an important role in viral replication and transcription, so inhibition of this enzyme is proposed to be an attractive route for the treatment of COVID-19. Natural compounds have been used in the treatment of many diseases throughout the history. In this study, it was aimed to investigate SARS-CoV-2 Mpro inhibition abilities, thus the therapeutic potentials of some novel phytochemicals which have recently been entered the literature. For this purpose, eleven novel phytochemicals obtained from various natural resources have been investigated for their potential antiviral activity against SARS-CoV-2 with the use of in silico methods. In the first part of the study, DFT (density functional theory) calculations were performed on the investigated compounds. In this part, geometry optimizations, vibrational analyses, and MEP (molecular electrostatic potential) map calculations were performed. In the second part, molecular docking calculations and then molecular dynamics (MD) simulations were performed to investigate how these natural compounds interact with SARS-CoV-2 Mpro which is a promising target for COVID-19 treatments. In this part, MM-PBSA (molecular mechanics with Poisson-Boltzmann surface area) calculations were also performed to determine the binding free energies of the investigated compounds. Results showed that most of the investigated compounds interacted with SARS-CoV-2 Mpro effectively and can be promising structures for drug development studies for COVID-19. Communicated by Ramaswamy H. Sarma


Introduction
The incidence of SARS-CoV-2 has created a medical emergency all over the world, and it continues to spread all over the world rapidly, but a drug treatment could not be developed for this disease yet. Since SARS-CoV-2 M pro has an important role in viral replication and transcription, it is considered to be an important target for the studies aiming COVID-19 treatment. There are various studies which targeted SARS-CoV-2 M pro in the extant literature (Avti et al., 2021;Choudhary et al., 2020;Erdogan, 2021;Kapusta et al., 2021;Kumar, Faheem et al., 2020;Kumar, Singh et al., 2020). These studies can be classified in two groups: (1) studies which focus on approved drugs currently in the market and (2) studies which focus on synthetic and/or natural compounds in the literature. Natural compounds from various sources have been used in the treatment of many diseases throughout the history. Therefore, natural compounds continue to be among the first compounds to come to mind in drug development studies.
In this study, eleven natural compounds which have recently been added to the literature by Ma et al. (2021), Mukaide et al. (2021), Xie et al. (2021), Popwo Tameye et al. (2021 and Peng et al. (2021) have been investigated with the use of DFT, molecular docking and MD simulation approaches for their possible antiviral activity against SARS-CoV-2. The study consists of three parts, in the first part of the study, DFT calculations were performed on the investigated natural compounds. In this part of the study, geometry optimizations, frequency analyses and MEP map calculations were performed. In the second part of the study, molecular docking and MD simulation studies were performed on the investigated compounds. In this part of the study, MM-PBSA calculations were performed on the investigated compounds to determine the binding free energies. Results showed that investigated compounds, especially compounds 3, 9 and 2 showed considerably high binding affinity to SARS-CoV-2 M pro and can be promising structures against SARS-CoV-2.

Materials and methods
In the present study, DFT, molecular docking and MD simulations were performed on the selected eleven phytochemicals. Additionally, binding free energy calculations, drug-likeness and ADME analyses were carried out. In Figure 1, chemical structures of the investigated natural compounds are given (Ma et al., 2021;Mukaide et al., 2021;Peng et al., 2021;Popwo Tameye et al., 2021;Xie et al., 2021).

Drug-likeness and ADME analyses
Drug-likeness analyses and ADME predictions were performed with the use of SwissADME web server (Daina et al., 2017). In this part of the study, physicochemical properties, lipophilicity and water solubility were predicted for each compound. Drug-likeness analyses were performed on the investigated natural compounds with the use of Lipinski   (Veber et al., 2002), Egan (Egan et al., 2000) and Muegge (Muegge et al., 2001) filters.

DFT calculations
Geometry optimized structures of the investigated compounds are given in Figure 2. A conformational search was carried out prior to each geometry optimization, and a vibrational analysis was carried out to confirm that each optimized geometry corresponds to a global minimum. The absence of imaginary frequency proved that each optimized geometry corresponds to a global minimum. Geometric parameters and the vibrational spectra of the investigated compounds obtained from DFT calculations are given in Supplementary File (Tables S1-S11 and Figures S1-S11). MEP maps of the investigated compounds were obtained to get information about the electron rich and electron deficient parts of the compounds under investigation. MEP maps of the compounds are given in Figure 3.
It was observed that negative charges were located on the oxygen atoms while positive charges were located on the hydrogen atoms bonded to oxygen or nitrogen atoms ( Figure 3). Results showed that these negative and positive centers took part in the formation of non-bonded interactions between ligands and receptor during molecular docking and MD simulations.

Molecular docking and MD simulation studies
In molecular docking calculations nine top-scoring docked poses were obtained for each enzyme-ligand complex. AutoDock Vina binding scores for top-scoring enzyme -ligand complexes are given in Table 1. Molecular docking results showed that compound 3 has the highest binding affinity among all compounds.
After molecular docking calculations, 100 ns MD simulations were performed on the top-scoring enzymeligand complexes. RMSDs (root mean square deviations) of compounds 1-11 after least square fit to protein are illustrated in Figure 4. Results showed that all compounds, except 7, remained in the binding pocket of the enzyme throughout the entire MD simulation. Compounds 3, 4 and 8 immediately reached their equilibrium positions in the binding pocket and held their positions in the remaining time of the simulation. In enzyme -1 complex, it was observed that there was a slight change in the position of the ligand (1) between 45 th ns and 65 th ns but it kept its position during the simulation. In enzyme -2 complex, there was a change in the position of the ligand (2) about 45 th ns but then it reached its equilibrium position. In enzyme -5 complex, there were remarkable changes in the position of the ligand (5) between 35 th ns and 40 th ns, then it reached its equilibrium position. Although there were some changes in the position of the ligand, it stayed bound in the binding pocket in the remaining time of the simulation. In enzyme -6 complex, the ligand (6) could not keep its position and a remarkable change in the position of the ligand was observed nearly at the end of simulation and the ligand moved away from the enzyme. In enzyme -9 complex, the ligand (9) reached its equilibrium position in the binding pocket after 15 th ns. Although a slight change was observed between 55 th ns and 75 th ns, it can be said that the ligand kept its equilibrium position during the simulation. In enzyme -10 complex, it was observed that the ligand (10) could not show a stable behavior until 70 th ns. After 70 th ns, it reached its equilibrium position and kept its position in the remaining time of the simulation. In enzyme -11 complex, the ligand (11) reached its equilibrium position after 30 th ns and kept its position in the remaining time of the simulation.
Since RMSD and RG (radius of gyration) of protein is a good indicator of protein stability, they were monitored during the MD simulations and are given in Figures 5 and 6. Results showed that generally, RMSDs of the enzyme varied within a narrow range in almost all complexes during the simulation. Thus, it can be concluded that the enzyme remained stable throughout the MD simulations. It can be seen from Figure 5, in enzyme -1 complex, probably due to the change in the relative position of the ligand (Figure 4), there was a remarkable change in RMSD of the enzyme between 45 th ns and 65 th ns, but then it was equilibrated. In enzyme -4 complex, although there were some changes in RMSDs of enzyme in the last part of the simulation, this does not mean that the enzyme is unstable. In enzyme -6 complex, there were also some considerable changes in RMSD of the enzyme as expected because the ligand (6) stayed bound but could not keep its position throughout the MD simulation. In other complexes, no considerable change in RMSD of backbone after least square fit to backbone was observed. RG of proteins in all enzyme-ligand complexes were also monitored during the MD simulation and no considerable change was observed for any of the complexes ( Figure 6). Average RG of proteins in the enzyme complexes of compounds 1-11 and standard deviations were found to be 2.212 ( Root mean square fluctuations (RMSF) of active site residues were analyzed to determine the effects of the ligands on the active site residues of SARS-CoV-2 M pro and results are given in Figure 7. Average RMSFs of the active site residues in SARS-CoV-2 M procompound 1-11 complexes were found to be 0. 1538, 0.1346, 0.1322, 0.1374, 0.1307, 0.1715, 0.1739, 0.1363, 0.1431, 0.1448 and 0.1544 nm, respectively. Average RMSF of the active site residues of unliganded enzyme was found to be 0.1411 nm.
In MD simulations, number of hydrogen bonds between ligands and the enzyme were also monitored during the simulation and are given in Figure 8. Results showed that during the MD simulations, the highest number of hydrogen bonds between ligands and the enzyme was found for enzyme -3 complex while the least number of hydrogen bonds was found for enzyme -9 complex. Average number of hydrogen bonds formed between the enzyme and compounds 1-11 during the MD simulations were found to be 1.896, 1.711, 2.306, 1.816, 1.384, 0.416, 1.444, 1.042, 0.300, 1.402 and 0.878, respectively.
Results of the MD simulations for enzymereference drug complexes are illustrated in Figure 9. Results showed that, however there was a considerable change in the position of remdesivir between 70 th ns and 90 th ns, remdesivir reached its equilibrium position in the binding pocket after 90 th ns. In enzymeritonavir complex, there was a considerable change in the relative position of ritonavir after 55 th ns but it reached its equilibrium position in the binding pocket after 90 th ns, too. On the other hand, results showed that favipiravir reached its equilibrium position immediately and kept its position throughout the MD simulation but favipiravir interacted only with the residues located in the S2 subsite of SARS-CoV-2 M pro (Figure 13m). Results also showed that the change in the RMSD of backbone in enzymeremdesivir complex was higher than the changes in the RMSDs of backbones in both enzymefavipiravir and enzymeritonavir complexes.
Binding positions of the investigated compounds (except compounds 6 and 7) and the reference drugs, at the end of MD simulations are given in Figure 10. Results showed that compounds 6 and 7 could not maintain its position during the simulation, but it was observed that all other compounds remained in the binding pocket of the enzyme.
Interactions between ligands and the enzyme are illustrated in Figures 11 and 12. Results showed that CYS145, MET165, PRO168, GLN189 and especially HIS41 amino acids were the ones mostly took part in the ligand-receptor interactions. Results also showed that this observation was also valid for the reference drugs.
On the other hand, in enzymefavipiravir complex, HIS41, MET49, CYS145, HIS164, ASP187, ARG188 and GLN189 residues of the enzyme took part in the interactions between the ligand and the enzyme. In enzymeritonavir complex, LEU27, HIS41, MET49, ASN142, CYS145 and HIS164 residues interacted with ritonavir through hydrogen bonds, p-sulfur, p-p T-shaped, alkyl and p-alkyl interactions. Finally, in enzymeremdesivir complex, THR25, LEU27, HIS41, MET49, HIS164, MET165 and GLN189 residues took part in the interactions between the enzyme and remdesivir, and hydrogen bonds, p-sulfur, p-p stacked, alkyl and p-alkyl interactions were formed.
Binding sites of SARS-CoV-2 M pro , and binding sites preferred by the investigated compounds and the reference drugs are shown in Figure 13.
At the end of MD simulations, although there are more or less conformational changes in the binding region of the enzyme, it has been observed that compounds 8, 9, 10, remdesivir and ritonavir bound to the S2, S3 and S4 subsites (Figure 13h-j, l and n). In enzyme-8 and enzyme-9 complexes, dimethoxynaphthalene moieties of compounds 8 and 9 were directed to the S4 subsite while methylchromanol moieties were directed to the S2 and S3 subsites. In enzyme-10 complex, chromene moiety of compound 10 was directed to the S4 subsite and acrylamide moiety was directed to the S2 and S3 subsites. It was also observed that compound 1 took place in the S1, S3 and S4 subsites ( Figure  13b) while compound 2 interacted with the S1, S2 and S3 subsites (Figure 13c). In enzyme-1 complex, chromanone moiety of compound 1 was directed to the S1 subsite while substituted phenyl group was directed to the S3 subsite and 3-methylbut-2-en-1-yl group on the substituted phenyl was Figure 11. 2D representations of the interactions between the enzyme and compounds 1-5, 8. directed to the S4 subsite. Moreover, results showed that compound 4 interacted with S1 and S2 (Figure 13e), and compound 11 interacted with S3 and S4 subsites ( Figure  13k). In enzyme-4 complex, compound 4 was directed to the S2 subsite and interacted with the S1 subsite through its oxoacetate moiety. In enzyme-11 complex, compound 11 was directed to the S4 subsite and interacted with the S3 subsite via the methyl group at the 2-position of chromene. Similar to favipiravir, compound 5 interacted only with the residues located in the S2 subsite (Figure 13m and f). Compound 6 was not able to form interaction with any of the subsites (Figure 13g). On the other hand, compound 3 which has the highest binding affinity according to MM-PBSA calculations (Figure 14), took place in the S1, S2, S3 and S4 subsites (Figure 13d). In enzyme-3 complex, chromanone moiety was directed to the S4 subsite and 3-methylbut-2-en-1-yl group on the chromanone interacted with the S3 subsite. Additionally, 3-methylbut-2-en-1-yl group on the substituted phenyl interacted with the S2 subsite, and methoxy group of the phenyl group took place in the S1 subsite.
In the study, average binding free energies were also determined with the use of MM-PBSA method and obtained results are given in Figure 14. Results showed that binding affinities of the investigated natural compounds, especially 3, 9 and 2 to SARS-CoV-2 M pro were considerably high. It was found that most of the natural products have a higher binding affinity than all of the reference drugs. In the study, most of the investigated compounds (1-3, 8-11) were selected from natural compounds containing chromane based moieties since chromane derivatives are important natural compounds and exhibit various biological activities. The highest binding affinities were obtained for compounds 3, 9 and 2 which are containing chromane and chromanone moieties. Among compounds 1-3, the highest and the second highest binding affinities were obtained for compounds 3 and 2, respectively. The presence or absence of a 3-methylbut-2-en-1-yl group at the 8-position may be effective in observing this result. Compounds 8 and 9 which contain chromane moiety are stereoisomers, and results showed that (R)-isomer has higher binding affinity to SARS-CoV-2 M pro more than (S)-isomer. On the other hand, it was observed that the compounds which don't have a chromane derived moiety (4-7) and compounds which bear an amide group (10, 11) did not exhibit as high binding affinity as the other compounds to SARS-CoV-2 M pro . AutoDock Vina binding scores obtained from molecular docking calculations showed that compound 1 has the second highest binding affinity score among all compounds but MM-PBSA calculations performed after MD simulations showed that compound 1 has the fourth highest binding affinity after compounds 3, 9 and 2. This is probably because of the remarkable change in the position of compound 1 during the MD simulation. Namely, molecular docking calculation results showed that, compound 1 interacted with the S1, S2, S3 and S4 subsites of the enzyme, but during the MD simulation, compound 1 could not hold its initial position in the binding pocket, and the interactions between compound 1 and S2 subsite of the enzyme disappeared.

Drug-likeness and ADME analyses
Results of drug-likeness and ADME analyses of compounds 1-4 and 8-11 are given in Tables 2 and 3. Since compounds 5, 6 and 7 were not able to form stable complexes with SARS-CoV-2 M pro , they were not subjected to drug-likeness and ADME analyses. In drug-likeness and ADME analyses, no violation was observed for the investigated natural compounds in terms of Lipinksi, Ghose, Veber and Egan rules. On the other hand, since their XLOGP3 (XLOGP Program Version: 3.2.2) values are higher than 5.00, compounds 3, 8 and 9 do not meet the criteria of Muegge rules. Since the investigated compounds 1-4 and 8-11 have no more than ten hydrogen bond acceptors and five hydrogen bond donors, they meet the criteria of Lipinski and Muegge rules. Molecular weights of these compounds do not exceed the limits stated by Lipinski, Ghose and Muegge rules. Fraction C sp 3 values of the investigated compounds were found to be in the range of 0.27-0.73. Increasing fraction C sp 3 values correspond to  higher degrees of saturation and it is proposed that increasing degrees of saturation increases the clinical success rate of a molecule (Lovering et al., 2009;Wei et al., 2020). Ghose rule states that the molar refractivity of a given molecule should be between 40 and 130. In the study, molar refractivities of the investigated compounds were found to be in the range of 71.9-126.7, thus, all molecules meet the criterion. Another available criterion in drug-likeness analyses is topological polar surface area (TPSA). TPSAs of the investigated compounds were found to be in the range of 47.9-116.5 Å 2 (Ertl et al., 2000). Results showed that all molecules meet the criterion stated by Veber, Muegge and Egan rules. Average partition coefficients (average of iLOGP (Daina et al., 2014), XLOGP (XLOGP Program Version: 3.2.2), WLOGP (Wildman & Crippen, 1999), MLOGP (Moriguchi et al., 1992) and SILICOS-IT (FILTER-IT Program Version: 1.0.2)) of the molecules were found to be in the range of 1.37-4.20. These results showed that all molecules meet the criterion stated by Lipinski, Ghose, Muegge and Egan rules. Water solubilities (ESOL (Delaney, 2004), ALI (Ali et al., 2012) and SILICOS-IT (FILTER-IT Program Version: 1.0.2)) of the investigated compounds were estimated to be varied from poorly soluble to soluble. Gastrointestinal absorptions of all molecules were predicted to be high, but it was predicted that only compounds 4, 8 and 9 can penetrate blood-brain barrier. Additionally, skin permeations were estimated to be varied from À7.22 to À4.88 cm/s (Potts & Guy, 1992). Results showed that all molecules obey Lipinski, Ghose, Veber and Egan rules. Bioavailability score was found to be 0.55 while synthetic accessibility scores were found to be in the range of 3.77-4.80. In PAINS analysis (Baell & Holloway, 2010) no alert was received for all compounds, and in Brenk analysis (Brenk et al., 2008) no alert was received for compounds 8 and 9. Compounds 1, 2 and 3 have an isolated alkene moiety, compound 4 has a diketo group, and compounds 10 and 11 have a Michael acceptor, thus, for compounds 1, 2, 3, 4, 10 and 11, one alert was received in Brenk analysis. In leadlikeness analyses (Teague et al., 1999), one violation was observed for compound 1 because of having a molecular weight higher than 350 g/mol, for compounds 2, 3, 8, 9, two violations were observed because of having a molecular weight higher than 350 g/mol and a XLOGP3 value higher than 3.5, and no violations was observed for compounds 4, 10 and 11.

Conclusion
In this study, eleven natural compounds which have entered the literature recently, have been investigated computationally for their possible antiviral activity against SARS-CoV-2. In the study, SARS-CoV-2 M pro inhibition potentials of the natural compounds were investigated with the assistance of DFT calculations, molecular docking and MD simulation studies. Results showed that most of the investigated compounds, especially compounds 3, 9 and 2 have considerably high binding affinity to SARS-CoV-2 M pro . It was concluded that these natural compounds can be promising structures in the treatment of COVID-19, and worth for further studies.

Disclosure statement
The authors declare that there is no conflict of interest.

Funding
The computational studies reported in this paper were partially performed at TUBITAK ULAKBIM, High Performance and Grid Computing Center (TRUBA resources) and partially at Kocaeli University.