Structural and energetic properties of the potential HIV-1 reverse transcriptase inhibitors d4A and d4G: a comprehensive theoretical investigation

A comprehensive quantum-chemical investigation of the conformational landscapes of two nucleoside HIV-1 reverse transcriptase inhibitors, 2′,3′-didehydro-2′,3′-dideoxyadenosine (d4A), and 2′,3′-didehydro-2′,3′-dideoxyguanosine (d4G), has been performed at the MP2/6-311++G(d,p)//B3LYP/6-31G(d,p) level of theory. It was found that d4A can adopt 21 conformers within a 5.17 kcal/mol Gibbs free energy range, whereas d4G has 20 conformers within 6.23 kcal/mol at T = 298.15 K. Both nucleosides are shaped by a sophisticated network of specific noncovalent interactions, including conventional (OHO, NHO) and weak (CHO, CHN) hydrogen bonds, as well as dihydrogen (CHHC) contacts. For the OHO, NHO, and CHO hydrogen bonds, natural bond orbital analysis revealed hyperconjugative interactions between the oxygen lone pairs and the antibonding orbital of the donor group. For the CHHC contacts, the electron density migrates from the antibonding orbital, corresponding to the CH group of the sugar residue, to the bonding orbital relative to the same group in the nucleobase. The results confirm the current belief that the biological activity of d4A and d4G is connected with the termination of the DNA chain synthesis in the 5′–3′ direction. Thus, these nucleosides act as competitive HIV-1 reverse transcriptase inhibitors.


Introduction
The advent of computer technology coupled with improvements in computational chemistry algorithms has given impetus to in silico design of new efficient drugs against acquired immunodeficiency syndrome. HIV-1 reverse transcriptase (hereafter referred to as RT) is a multifunctional enzyme that catalyzes two different processes: (i) DNA synthesis using RNA or DNA as the template; (ii) endonucleolytic ribonuclease H (RNase H) activity that specifically degrades the RNA strand of RNA:DNA duplexes (see e.g. Sluis-Cremer, Temiz, & Bahar, 2004 and refs. therein). Due to its paramount role in the HIV-1 life cycle, RT has been the main target in the fight against this virus (Skowron & Ogden, 2006). In general, RT inhibitors can be classified into two groups: nucleoside RT inhibitors (NRTIs), which affect the DNA synthesis by blocking the growth of a polynucleotide chain (Jeang, 2000), and non-nucleoside RT inhibitors (NNRTIs), which are alloteric and noncompetitive inhibitors (Sahlberg & Zhou, 2008).
The conformational properties of NRTIs play a key role in their binding to target enzymes (see e.g. Herdewijn, 2008 and refs. therein). When introduced into a cell, a nucleoside passes through several phases of biomolecular recognition by nucleoside transporters (King, Ackley, Cass, Young, & Baldwin, 2006), kinases/5′-nucleotidases (Van Rompay, Johansson, & Karlsson, 2003) and, finally, polymerases (Alber & Carloni, 2000). At each phase, the nucleoside can adopt a unique threedimensional structure or conformation. The structural properties of NRTIs, particularly, those governed by intramolecular interactions, determine to a great extent their potential therapeutic efficiency. For instance, efficient phosphorylation depends largely on the spatial structure of the nucleoside (el Kouni, 2002). Therefore, an extensive conformational analysis identifying all minima on the potential energy surface can be considered as a first step in the design of nucleoside analogs as anti-HIV drugs.
The 2′,3′-didehydro-2′,3′-dideoxynucleosides are promising drug candidates for anti-retroviral treatment. The structural features that distinguish them from the canonical nucleosides are the presence of a double bond in the sugar residue, making the sugar almost planar, and the lack of a hydroxyl group at the 3′-position of the base. It is thought that, after activation by cellular kinases and interaction with the RT nucleoside triphosphate binding site, the 2′,3′-didehydro-2′,3′-dideoxy nucleosides act as terminators of viral replication (Everaert et al., 1993). So far, only 2′,3′-didehydro-2′,3′-dideoxythymidine (d4T, stavudine) has been widely used as part of combined antiretroviral therapy (Hitchcock, 1991), due to its favorable risk/benefit profile (Russell, Whiterock, Marrero, & Klunk, 1989). The other 2′,3′-didehydro-2′,3′-dideoxy nucleosides containing canonical DNA and RNA bases are also considered as potential RT inhibitors (Everaert et al., 1993). However, the full mechanisms of their biological activity are still unknown.
X-ray structural analysis of d4A (Hutcheon & James, 1974) and d4G (van Roey & Chu, 1992) attested that both nucleosides adopt high-anti conformations in the crystal state, which differ by the orientation of the γ (C4′-C5′) torsion angle. These high-anti conformations are energetically unfavorable in the gas phase. Therefore, a theoretical investigation of the d4A and d4G conformational landscapes is essential for determining the full set of biologically active conformations.
Several earlier works were devoted to the conformational analysis of 2′,3′-didehydro-2′,3′-dideoxynucleoside analogs. Conformational energy calculations of d4A (Rao, 1998a) and d4G (Rao, 1998b) were performed by Rao employing different force fields. It was established that the preferred combination for the glycosidic ( χ) and C4′-C5′ bond (γ) conformations are ( χ 2 anti, γ 2 gauche À ) and ( χ 2 anti, γ 2 trans) for d4A and d4G, respectively. However, the consistency of the calculated preferred conformations with the X-ray data (Hutcheon & James, 1974;van Roey & Chu, 1992) appeared to be very sensitive to the force field employed. In a recent study, Palafox, Iza, de la Fuente, and Navarro (2009) computed structures and energies of selected conformers of stavudine (d4T), in the isolated phase as well as in the presence of solvent, to analyze the effect of hydration on the d4T molecular structure and energetics.
In the current work, we apply this methodology to the d4A and d4G nucleosides, since, to the best of our knowledge, their conformational space has not yet been comprehensively explored. As the biological activity of nucleosides is structure-dependent, the results will be useful, for example, for molecular docking investigations of enzyme-substrate systems.

Computational methodology
The chemical structures of d4A and d4G are presented in Figure 1, together with the atom numbering and definition of the dihedral angles (Saenger, 1984). The sugar ring pseudorotational concept by Altona and Sundaralingam (1972) was employed in this work. A conformational analysis was performed by density functional theory (DFT) using the B3LYP nonlocal hybrid exchange-correlation functional (Becke, 1993;Lee, Yang, & Parr, 1988) and the 6-31G(d,p) basis set. Single-point calculations were performed for all conformers using the MP2 (second-order Møller-Plesset perturbation theory) method and the 6-311++G(d,p) basis set. Closedshell interactions were identified and analyzed using the same methodology as in our previous work (Ponomareva et al., 2012(Ponomareva et al., , 2013. Further details are provided in the Supplementary Materials. All quantum-chemical calculations were performed using Gaussian 03 (Frisch et al., 2003) for the Win32 platform.
In earlier work (Yurenko et al., 2007(Yurenko et al., , 2008, a theoretical reconstruction of gas-phase IR spectra of thymidine and 2′-deoxyuridine using the same level of theory as applied in the current work, MP2/6-311++G(d,p)// B3LYP/6-31G(d,p), led to excellent agreement with experimental IR spectra recorded in low-temperature argon matrices (Ivanov, Krasnokutski, Sheina, & Blagoi, 2003;. Since IR spectra are very sensitive to conformation and energetic populations of the conformers as well as to the nature and strength of non-covalent intramolecular interactions, particularly hydrogen bonds (H-bonds), a good agreement between the theoretical and experimental spectra is a good indication of the reliability of the applied theoretical method. In addition, Zhou and Qiu (2009) assessed the ability of 42 different basis sets in connection with the B3LYP functional in reproducing the experimental intermolecular distances in the adenine-thymine base pair, and established 6-31G(d,p) as the most reliable basis set. Thus, the MP2/6-311++G(d,p)//B3LYP/6-31G (d,p) level of theory applied here is expected to accurately reproduce the energetic, conformational, and vibrational properties of the DNA nucleosides studied in this work and to accurately characterize their intramolecular interactions.

Conformational characteristics
Tables 1 and 2 list the energetic and structural properties of the d4A and d4G conformers located, respectively. The most energetically favorable four conformers of d4A and d4G are displayed in Figures 2 and 3, respectively. The d4A and d4G global minima are both syn/O4'-endo is anti/O4′-endo, whereas the most stable d4G conformer is syn/O4′-endo. The d4A global minimum is shaped by a single O5′HÁ Á ÁN3 H-bond, whereas the d4G global minimum contains two rather strong O5′HÁ Á ÁN3 and N2H1Á Á ÁO5 H-bonds. X-ray diffraction data (Hutcheon & James, 1974;van Roey & Chu, 1992) (Table 3), shows that in the crystal state d4A and d4G adopt structures that are high-energy conformers in the gas phase, with very different β, γ, and χ torsion angles than in DNA (Hartmann & Lavery, 1996). Some differences between the crystal structures and isolated conformers, especially in the χ angle values, are due to strong intermolecular H-bonds and crystal packing forces.
The full conformational families of d4A and d4G contain only one DNA-like conformer (conformer 5 for d4A and conformer 4 for d4G, χ 2 anti, γ 2 g + , β 2 t; see Tables 1 and 2). As the sugar in d4A and d4G lacks the O3′H hydroxyl group occurring in DNA, these nucleosides do not have the δ and ɛ torsion angles associated with this group. Both d4A conformer 5 and d4G conformer 4 are governed by C8HÁ Á ÁO5′ H-bonds.
The full ground-state conformational families of the 2′,3′-didehydro-2′,3′-dideoxy analogs of the natural purine nucleosides are relatively small: At 298.15 K 21 d4A conformers lie within about 5 kcal/mol Gibbs free energy of each other (Table 1), whereas 20 d4G conformers lie within about 6 kcal/mol of each other ( Table 2). The number of conformers in the d4A and d4G conformational families is much smaller than the number of starting structures (36). We think this is due to the existence of relatively strong H-bonds of the O5′HÁ Á ÁN3, N2H1Á Á ÁO5′ and С6HÁ Á ÁO5′ types, which have a stabilizing effect, as well as to other types of intramolecular interactions (such as Van der Waals forces and anomeric effects (Thibaudeau, Acharya, & Chattopadhyaya, 2005)). The combination of the preference for formation of these interactions with steric hindrances prevents the formation of some conformers.
The χ angle of the syn conformers of d4A and d4G varies from 55°to 69°for d4A and from 30°to 72°for d4G, the second interval being much larger. The χ angle of the anti conformers ranges from 168°to 179°for d4A and from À178°to À116°for d4G (see Tables 1 and 2). The corresponding ranges for the high-anti conformers are 134°-103°for d4A and À116°to À99°for d4G. Thus, like d4T, d4C, and d4U (Yurenko et al., 2007(Yurenko et al., , 2011, but in contrast to the canonical nucleosides, d4A and d4G can adopt the high-anti base orientation. All nine possible combinations of the γ and β torsion angles, which determine the orientation of the methoxy group on the 5′-end of the sugar residue are observed in the conformational families of the d4A and d4G nucleosides. Both angles present a trimodal distribution (Figure 4), falling into three sectors (g + , t and g À ).
In addition to the distribution of torsion angles, we also examined the structural changes induced by nucleoside formation in the d4A and d4G conformers, including changes in the bond lengths and angles and changes in the structure of the base. The adenine and guanine heterocycles in the minimum-energy structures of d4A and d4G are, strictly speaking, (slightly) nonplanar: the values of the intracycle torsion angles range from .5°to 3°. It was earlier demonstrated that isolated DNA bases are rather flexible molecules (Hovorun, Gorb, & Leszczynski, 1999;Nikolaienko, Bulavin, & Hovorun, 2011), with the result that non-planar deformations of the heterocycle of the bases occur upon nucleoside formation.
Another structural peculiarity of the d4A and d4G conformers is that the glycosidic bond (C1′N9) deviates only slightly (up to 4°) from the mean plane of the imidazole ring. The mean base plane was calculated for each conformer by a least-squares fit algorithm. In addition, the values of the angles between the C2N2 bond in d4A or the C6N6 bond in d4G and the respective base planes varies from 13°to 22°for d4A and from 34°to 40°for d4G. These data indicate that the exocyclic fragments in d4A and d4G are essentially pyramidalized, as was also observed earlier for d4C (Ponomareva et al., 2012).
Correlation analysis confirmed that the conformational variables γ, β, and χ do not intercorrelate (the absolute values of the corresponding correlation coefficients do not exceed .26). This is expected, as these conformational variables are basically independent.  χ, β, and γ are the backbone torsion angles (see Figure 1); b Calculated data; c X-ray crystallographic data (Hutcheon & James, 1974;van Roey & Chu, 1992).
According to the Boltzmann distribution, the conformational equilibrium of d4A and d4G at 298.15 K is shifted toward syn conformers, with syn:anti ratios of 88:12% (d4A) and 94:6% (d4G). In this respect, d4A and d4G differ drastically from stavudine (d4T), which favors the anti-base orientation in the free state (Ponomareva et al., 2013). The distributions of the d4A and d4G sugar conformational subfamilies differ strongly from those of the corresponding canonical nucleosides (Yurenko et al., 2007(Yurenko et al., , 2011. The γ angle preferentially adopts the g + conformation: γ 2 g + in 94% of d4A and 98% of d4G conformers (whereas less than 1% of d4A and d4G conformers contain the g À conformation for the γ angle). A similar distribution is observed for the β angle: β 2 g + in 93% d4A and 97% d4G conformers, whereas only $1% of d4A and d4G conformers adopt the t conformation of the β angle.
The deoxyribose unit was shown to be the major source of the conformational flexibility of DNA (Foloppe & MacKerell, 1999;Poltev et al., 2010) and, thus, the overall DNA structure is to a large extent determined by the conformational properties of 2′-deoxyribonucleosides. Due to the tightness of the active sites of polymerases (Kim, Delaney, Essigmann, & Kool, 2005), the nucleoside triphosphate should possess a DNA-like conformation to allow incorporation into DNA. To investigate possible conformational effects on the biological activity of d4A and d4G, we optimized two structures for each nucleoside with the χ angle fixed at the "ideal" (Foloppe & MacKerell, 1999) A-and B-DNA forms, À158.9°and À101.9°, respectively. The results, shown in Table 4, indicate that, like d4T (Ponomareva et al., 2013), d4U and d4C (Ponomareva et al., 2012), d4A and d4G can adopt A as well as B conformations. This is an important requirement to allow their incorporation into double-helical DNA. All four "ideal" DNA-like conformations of d4A and d4G are stabilized by a C8HÁ Á ÁO5′ H-bond, which plays an important role in maintaining the mutual orientation of the base and sugar residues in DNA (Wahl & Sundaralingam, 1997). These results show that there are no energetic or conformational barriers for the incorporation of d4A and d4G into double-helical DNA. This confirms the earlier suggestion that the biological activity of the 2′,3′-didehydro-2′,3′-dideoxy analogs of the natural nucleosides is directly related to the lack of a hydroxyl group at the С3′-position of the sugar moiety (Hamamoto et al., 1987). As a result, they function as competitive RT inhibitors through termination of the DNA biosynthesis in the 5′-3′ direction. We are currently performing calculations to investigate the interactions of the triphosphates of the 2′,3′-didehydro-2′,3′-dideoxy nucleosides with HIV-1 RT, with the aim to provide a more detailed understanding of the biological activity of d4A and d4G.

Specific intramolecular interactions.
In general, 21 specific intramolecular contacts involving a hydrogen atom (hereafter called H-contacts), were identified in d4A and 23 in d4G by means of QTAIM (quantum theory of atoms in molecules) electron density topological analysis (Bader, 1990), see Tables SI and SII (Supplementary Materials). Most conformers (19 d4A and 14 d4G conformers) are shaped by only one H-contact, whereas one d4A and four d4G conformers are stabilized by two H-contacts.
The magnitude of the electron density ρ at the bond critical points of the noncovalent interactions in d4A and d4G (Tables SI and SII, Supplementary Materials) are in good agreement with the H-bonding criteria of (Koch & Popelier, 1995). All H-contacts, except the CHÁ Á ÁHC and O5′HÁ Á ÁC8 interactions, meet Bondi's geometric requirements for H-bonding (Bondi, 1964), in particular the requirement that the distance between the donor and acceptor atoms should be smaller than the sum of the van der Waals atomic radii, Σr vdW (2.72 Å for HÁ Á ÁO and HÁ Á ÁN and 2.2 Å for HÁ Á ÁH). In general, there is a good correlation between the Σr vdW values and the HÁ Á ÁB distance (where B is the acceptor atom), with a correlation coefficient of .988.
In this work, we used Grunenberg's compliance constant theory (Brandhorst & Grunenberg, 2008Grunenberg, 2004) to evaluate the individual strengths of the intramolecular interactions. Compliance constants (C ij ) are elements of the inverse force constant matrix. Unlike the elements of the regular force constant (Hessian) matrix, they depend only on the definition of the two internal coordinates pertaining to it and are independent of the definition of all other coordinates. Tables SI and SII, and Figure S1 (Supplementary Materials) clearly show that smaller values of C ij correspond to stronger contacts, as quantified by the energies of the specific interactions E HB calculated by the Espinosa-Molins-Lecomte formula (Equation (1) in Supplementary Materials). The correlation is evidently not linear, as was assumed in our previous work (Ponomareva et al., 2012(Ponomareva et al., , 2013. The solid line in Figure S1 corresponds to a least-squares power fit (C ij = 53.365 Â E HB À1.2285 ; R = À.93). Some of the CHÁ Á ÁHC contacts are characterized by very high С ij values (> 50 Å/mDyn), indicating the relatively weak nature of these contacts.
The C8HÁ Á ÁH2C5′/H1C5′ hydrogen-hydrogen contacts in d4A and d4G deserve special attention. These Hbonds were earlier detected in canonical nucleosides (Yurenko et al., 2007) and some amino acids (Matta & Bader, 2000) by means of Bader's topological analysis (Bader, 1990). To find the H-bond donating (donor) and accepting (acceptor) groups in these interactions (assuming these are genuine H-bonds (Matta, 2006;Matta et al., 2003)), an integration (in natural coordinates) over Bader atomic basins was performed for each hydrogen atom (Tables SV and SVI in Supplementary Materials). The hydrogen on the base moiety bears a partial positive charge and has less negative atomic energy than the hydrogen of the sugar residue, which is negatively charged and more energetically stable. (Note that the charges are calculated with the QTAIM method.) Thus, the H-bond donating group is located on the sugar residue, whereas the acceptor is on the base unit. This is in agreement with the results for d4T (Ponomareva et al., 2013), d4U, and d4C (Ponomareva et al., 2012).
The results indicate that there are no energetic or conformational obstacles to the incorporation of d4A and d4G into double-helical DNA. This provides further support for the suggestion that the biological activity of 2′,3′-didehydro-2′,3′-dideoxy analogs is directly related to the lack of a hydroxyl group at the С3′-position of the sugar moiety (Hamamoto et al., 1987). As a result, these nucleosides act as competitive RT inhibitors through termination of the DNA biosynthesis in the 5′-3′ direction.
The biological relevance of the results presented in this work lies in the considerable flexibility of the 2′,3′-didehydro-2′,3′-didehydro-nucleosides; as biological function of these nucleosides is structure-dependent, a conformational analysis is the initial, but very important, step in elucidating their biological activity. A detailed knowledge of the full conformational families of these nucleosides is essential for establishing the biologically relevant conformations that are formed in the course of biomolecular recognition by different enzymes (transporters, kinases/5′-nucleotidases, polymerases). In addition, we have identified and characterized among all the conformers the one that can be introduced in the active site and potentially block the DNA biosynthesis. A study of the interactions of the triphosphates of these nucleosides with HIV-1 RT could provide a more detailed understanding of the biological activity of d4A and d4G. Such calculations are currently being performed in our research groups.