Understanding the co-evolutionary molecular mechanisms of resistance in the HIV-1 Gag and protease

Abstract Due to high human immunodeficiency virus type 1 (HIV-1) subtype C infections coupled with increasing antiretroviral treatment failure, the elucidation of complex drug resistance mutational patterns arising through protein co-evolution is required. Despite the inclusion of potent protease inhibitors Lopinavir (LPV) and Darunavir (DRV) in second- and third-line therapies, many patients still fail treatment due to the accumulation of mutations in protease (PR) and recently, Gag. To understand the co-evolutionary molecular mechanisms of resistance in the HIV-1 PR and Gag, we performed 100 ns molecular dynamic simulations on multidrug resistant PR’s when bound to LPV, DRV or a mutated A431V NC|p1 Gag cleavage site (CS). Here we showed that distinct changes in PR’s active site, flap and elbow regions due to several PR resistance mutations (L10F, M46I, I54V, L76V, V82A) were found to alter LPV and DRV drug binding. However, binding was significantly exacerbated when the mutant PRs were bound to the NC|p1 Gag CS. Although A431V was shown to coordinate several residues in PR, the L76V PR mutation was found to have a significant role in substrate recognition. Consequently, a greater binding affinity was observed when the mutated substrate was bound to an L76V-inclusive PR mutant (Gbind: −62.46 ± 5.75 kcal/mol) than without (Gbind: −50.34 ± 6.28 kcal/mol). These data showed that the co-selection of resistance mutations in the enzyme and substrate can simultaneously constrict regions in PR’s active site whilst flexing the flaps to allow flexible movement of the substrate and multiple, complex mechanisms of resistance to occur. Communicated by Ramaswamy H. Sarma


Introduction
Under drug selection pressure pathogens can effectively and rapidly accumulate mutations in various drug targets (Kennedy & Read, 2017;Nguyen et al., 2018). The protease inhibitors (PIs) were developed to mimic the substrate transition state of the Gag (Lv et al., 2015). Gag is the main structural polyprotein for HIV-1 and all other retroviruses alike. Apart from structural integrity, its cleavage ensures infectious virions are produced at maturation (Ning et al., 2016). The polyprotein itself comprises four major structural domains (i.e. matrix: p17, capsid: p24, nucleocapsid: p7, p6) (Bharat et al., 2014) and two small spacer peptides (p2 and p1) (Sundquist & Kr€ ausslich, 2012). During maturation, the proteolytic cleavage of Gag occurs in a precise and ordered manner (Fun et al., 2012) by the viral protease (PR) as shown in Figure 1. Therefore, as the PIs are structural analogues of the Gag cleavage sites (CSs), their primary role is to inhibit the viral maturation process. The human immunodeficiency virus type 1 (HIV-1) PR can accumulate mutations close to the active site (King et al., 2004) and in distal locations, such as the flexible flaps and substrate cleft to evade drug binding (Ragland et al., 2014). Although these resistant PR's are functionally weaker, they are still able to cleave the substrate and thereby outcompete the inhibitor for hydrolysis (Kantor et al., 2002;Rhee et al., 2003). Consequently, amino acid (AA) variation at positions L10, M46, I54, V82, I84 and L90 comprise some of the most commonly associated protease resistance mutations (PRMs) (Wensing et al., 2016). Mutations at these positions may occur as combinations in a single variant or as single mutations in multiple variants.
Using Bayesian network learning, our previous study found strong associations between mutations L10F, M46I, I54V, L76V and V82A in the HIV-1 subtype C PR (Marie & Gordon, 2019). These mutations (L10F, M46I, I54V, V82A) have serious implications on Lopinavir (LPV) (Paredes et al., 2017) and Darunavir (DRV) drug binding, particularly if substrate cleft mutation L76V is co-selected, as it can enhance the level of drug resistance (Boender et al., 2016;Louis et al., 2011;Young et al., 2010). Although phenotypic and genotypic tests can be employed to evaluate resistance mutations, these assays only provide information on the functional aspects of the virus (Hu et al., 2011). Consequently, information evaluating the structural impact of such a complex combination of mutations are not clearly understood. Furthermore, as inter-subtype differences in Gag-PR replication are correlated with disease progression (Kiguoya et al., 2017); and that certain AAs are selected more frequently than others (Supporting information; Figure S1), the question arises as to why and how are mutations structurally selected?
Incidentally, mutations arising in Gag are frequently being reported (Dam et al., 2009;Feh er et al., 2002;Su et al., 2018;Teto et al., 2017;Williams et al., 2019), particularly at the NCjp1 and p1jp6 Gag CSs (Borman et al., 1996; Dam et al., Figure 1. Structural representation of the HIV-1 PR enzyme (A) and the ordered proteolytic cleavage of the Gag polyprotein (B). In (B) cleavage begins at the p2 and p7 junction. This is followed by cleavage at the p17jp24 and p1jp6 sites. Finally, the small spacer peptides are cleaved from the p24 and p7 domains. Note: p17 ¼ matrix; p24 ¼ capsid and p7 ¼ nucleocapsid.
2009; Nijhuis et al., 1999;Prado et al., 2002). While most mutations in Gag arise to restore substrate processing (Ho et al., 2008;Parry et al., 2009), some mutations have been shown to confer drug resistance (Gupta et al., 2010;Nijhuis et al., 2007). The NCjp1 A431V Gag CS mutation is of particular interest, since it has been associated with patients failing PI therapy (Giandhari et al., 2015), even in the absence of mutations in the viral PR (Nijhuis et al., 2007). Furthermore, in agreement with other studies (Malet et al., 2007;Verheyen et al., 2006), our previous analyses revealed direct associations between major LPV PRMs and A431V in Gag (Marie & Gordon, 2019).
However, the mechanism by which hydrolysis is restored and the molecular basis in which Gag CS mutations are coevolutionarily selected is unclear. Therefore, we used extensive molecular dynamic (MD) simulations to investigate the structural implications of a complex combination of PRMs on LPV and DRV drug binding in the HIV-1 subtype C PR. Additionally, the mechanism by which the A431V Gag CS mutation structurally interacts with the mutant PR to effectively maintain enzymatic cleavage under drug selection pressure was also investigated. To the best of our knowledge, this is the first study to model, dock and investigate mutant PR-Gag CS complexes using theoretical simulations.

Multi-drug resistant PR sequences
Molecular simulations were carried out on viral PR sequences derived from patients previously failing PI-inclusive treatment regimens as described in Marie and Gordon (2019). The PCS124 and PCS069 PR mutants were selected due to their specific resistance profiles which comprised commonly associated LPV and DRV PRMs as shown in Table 1. Importantly, these viral mutants co-selected the NCjp1 Gag CS mutation, A431V.
Theoretical models of the PR and Gag mutants were generated, structurally optimized and molecularly docked prior to full MD production simulations (Supporting information, SI Text).

MD production simulations
Molecular simulations were performed using the GPU accelerated PMEMD CUDA module in Amber 14 (Case et al., 2014). Parametrization of the ligands with AM1-BCC charges was performed using the Antechamber module in the AmberTools program. The General Amber force field (GAFF) was applied to the receptor-ligand system. Topology and coordinate files were generated using the LeaP program. All ionisable residues were protonated and counter ions were added to the system where necessary to maintain charge neutrality. The ff03.r1 Amber force field for proteins was applied to the system. The system was explicitly solvated by a rectilinear TIP3P periodic box with a margin of 12.0 Å. The SHAKE algorithm was used to restrain the covalent hydrogen bonds (Ryckaert et al., 1977) and the particle mesh ewald (PME) method with a 10.0 Å cut-off was used to calculate the Coulomb (electrostatic) interactions in the system (Darden et al., 1993). Prior to equilibration, a two-step minimization was performed as follows: (i) minimization of the waters with 3,000 steps of steepest-descent and 2,000 steps of conjugated gradient and (ii) minimization of the entire system with 7,000 steps of steepest-descent and 3,000 steps of conjugated gradient minimizations. Following minimization, the entire system was gradually heated from 0 K to 300 K over 50 ps using the constant volume and normal temperature (NVT) mechanics. Thereafter, the system was switched to the isothermal isobaric (constant pressure and normal temperature; NPT) mechanics for a further 50 ps. The temperature of the system was monitored using the Langevin dynamics thermostat. Additionally, when force constraints were applied, the atomic residues were restrained at 2.0 kcal/mol. Final simulations were carried out over 100 ns (10,000,000 cycles). Snapshots were taken every 5,000 th generation to ensure that an ensemble of uncorrelated frames were obtained. Two-dimensional AA interaction maps for the receptor-ligand complexes were then generated using the Discovery Studio Visualizer 4.1 program (Dassault Syst emes, San Diego, CA, USA).

Binding-free energy calculations
The binding-free energies of the receptor-ligand complexes were calculated using the Molecular Mechanics-Generalized Borne (MM-GBSA) method (Kollman et al., 2000). Free-energies were calculated over 2,000 snapshots from the last 80-100 ns production trajectories. The binding-free energies for each snapshot were calculated according to Equation (1), where G complex , G protein and G ligand represent free-energies of the complex, protein and ligand, respectively. Each of the molecular components were calculated according to Equations (2)-(5), where DG gas is the molecular mechanics free-energy in the gaseous phase, including the van der Waals (DE vdw ) and electrostatic (DE ele ) contributions according to Equation (3). The solvation free energy (DG solv ), including the non-polar (DG SA ) and polar (DG GB ) contributions are solved using Equation (4). SASA is the solvent accessible surface area and was estimated using a probe radius of 1.4 Å (5). T and S represent the overall entropy of the temperature and solvent, respectively.

Predicted binding-free energies
The predicted binding-free energies and its individual energy contributions are shown in Table 2 below. In general, the LPV-bound PR models displayed lower binding energies and thus a stronger structural stability in comparison to the DRVbound mutants. Interestingly, while the PCS069-LPV mutant only stabilized around 65 ns into the simulation, the PCS124-LPV mutant displayed greater structural stability throughout the 100 ns. Contrastingly, the DRV-bound PCS124 mutant only reached a relatively stable equilibrium around 80-100 ns. In the DRV-bound PCS069 mutant a plateau was observed between 35 and 55 ns as well as in the last 90-100 ns of the simulation indicating multiple folding conformations (Supporting information; Figure S2).
The PCS124-A431V model maintained a relatively stable complex from 50-100 ns whilst the PCS069-A431V model showed greatest structural equilibrium at approximately 48-77 ns and then again from 88-100 ns. Although definitive equilibrium was reached at specific intervals, the overall A431V-bound PCS124 and PCS069 models depicted very similar RMSD values over the entirety of the production simulations (Supporting information; Figure S3). Expectedly, the NCjp1 Gag CSs are stereo-chemically larger than their PI antagonists. As such, the CSs encompass a greater surface area within PR's substrate cavity. Although, the NCjp1 CSs comprise several bulky side chains, such as phenylalanine's aromatic benzene ring at position 433, these side chains are rotatable and can fold into alternative conformations as shown in Figure 2. The A431V NCjp1 mutants exhibited significantly stronger binding as compared to the LPV-and DRV-bound PR mutants. Specifically, when comparing the most energetically-favourable binding mode, there was a three-fold energy difference between the A431V mutant (-62.46 ± 5.75 kcal/mol) and the drug-bound mutant (PCS069-LPV: À23.41 ± 6.71 kcal/mol). Interestingly, the L76V-inclusive PCS069 model exhibited a greater binding affinity (-62.46 ± 5.75 kcal/mol) for the NCjp1 CS, as a À12 kcal/mol energy difference was observed in comparison to the PCS124 model (-50.34 ± 6.28 kcal/mol).
Of note, theoretical predictions revealed that vdW forces were significantly involved in ligand binding as opposed to the electrostatic energy contributions (Table 2). However, several pertinent hydrogen bonds were noted in the enzyme-ligand complexes, particularly where active site and/ or substrate recognition PR residues were involved. Bond interactions are discussed extensively in the PR-ligand interactions networks below.

PR-drug interactions
Conformational superimposition of interactions between the mutant PRs and the LPV and DRV PIs are shown in Figures 3  and 4, respectively. Detailed cluster amino acid interaction maps are provided in the Supporting information ( Figures S6  and S7). Interestingly, in the PCS124-LPV mutant (Figure 3), residues I50, G49 and catalytic triad AA G27 hydrogen (H) bonded with LPV's oxygen (O) atoms. While V82A formed an alkyl bond with LPV, V82A' (V181A) 1 formed a ℼ-alkyl bond with LPV's methylbenzene ring ( Figure S6A). In the PCS069-LPV mutant ( Figure S6B), conventional H-bonds were seen with D25 and G48 whilst codon R8' (R107) formed a double H-bond with two O-atoms in LPV. Notably, in addition to V82A/V82A', minor PRM L10F' (L109F) formed a ℼ-alkyl interaction with the drug's methylbenzene ring. Interestingly, L10F in chain A was not involved in ligand binding (Figure 3).
In the PCS124-DRV mutant (Figure 4), R8 and G48' (G147) formed conventional H-bonds with the 4-aminobenzenesulfonamide molecule and the nitrogen (N) atom of the bis-furyl acetate. Furthermore, R8 also formed a ℼ-cation interaction with the benzene ring of 4-aminobenzenesulfonamide. Residue I50 formed a ℼ-donor H-bond with DRV's aromatic benzene. While catalytic residue D25 formed a vdW interaction with DRV, only L10F and V82A formed vdW and ℼ-alkyl interactions with DRV, respectively ( Figure S7A). The PCS069-DRV mutant (Figure 4) revealed carbon H-bonds with residues D30' (D129) and G48' (G147) as well as the heterocyclic furan of DRV's bis-furyl acetate. Similarly, to the PCS124-DRV mutant, D25 only retained vdW interactions with DRV. Interestingly, the PCS069-DRV mutant was the only complex to have lost its interactions with codon 82 in PR. Contrastingly, it was also the only mutant to gain a vdW interaction with I54V ( Figure S7B).

Mutant PR-NCjp1 interactions
To evaluate the specific interactions between the NCjp1 Gag CS and the PR mutants, we constructed Gag-PR interaction maps as shown in Figures 4 and 5. Accordingly, the PCS124-A431V mutant ( Figure 5) revealed H-bonds between residues A28_PR and K436_Gag as well as between G48_PR and the N-atom connecting residues L434 and G435 in Gag. A double H-bond was also observed between PR's R8' (R107) and the O-atoms of Gag residues E428 and Q430. Additionally, a Hbond between D25_PR and the carbon (C) atom of I437 in Gag was also observed. Moreover, residue D29 in PR was attractively charged to R429 in Gag. Of note, whilst L10F_PR formed an alkyl bond with A431V in Gag, G27 and I47 in PR also formed close vdW interactions with Val431. In addition, major PRMs V82A' (V181A) and M46I formed vdW interactions with residues R429 and F433 in Gag, respectively. Furthermore, a ℼ-ℼ T-shaped stacking was formed with F53 in PR and F433 in Gag. Interestingly, in addition to substrate cleft residue V32 in PR, L76_PR formed vdW interactions with K436_Gag whilst both I50/I50' (I149) formed close vdW interactions with each other, G48_PR, L76_PR and the NCjp1 substrate. When compared to the PCS124-A431V model, it was evident that the PCS069 PR mutant (Figure 6) formed significantly more interactions with the substrate. In particular, 12 H-bonds were formed between the PCS069 residues and the NCjp1 Gag CS. Of these, PR residue R8' formed three H-bonds with the O-atoms in residues E428, R429 and Q430 in Gag. Interestingly, while D29_PR formed two H-bonds with both O-atoms in Q430_Gag, D30_PR also formed an additional H-bond with the second O-atom in Q430_Gag. Moreover, AA G48 in PR H-bonded with the O-atom of the A431V mutation. Finally, residues G49' (G148), I50' (I149) and P79' (P180) also H-bonded with the mutated NCjp1 substrate. Furthermore, whilst V82A also formed a vdW interaction with the substrate, V82A' (V181A) was proximal to residues R429 and Q430 in Gag and was closely positioned to minor PRM L10F' (L109F). Interestingly, the switch from  leucine to valine in residue 76 of PR positioned the mutant valine closer to Q430_Gag instead of K436_Gag as seen in PCS124-A431V mutant. Additionally, several interactions were observed with A431V, namely, multiple alkyl bonds with residues V32 and I47 in PR. Finally, whilst D25 did not H-bond with the mutated NCjp1 substrate, it formed close vdW interactions to I47_PR which in turn was closely bonded to A431V in Gag.

Discussion
Here, we predicted molecular models of multidrug resistant PR mutants bound to two PIs and a mutated NCjp1 Gag CS to understand the structural co-evolutionary molecular mechanisms of resistance. Our analyses indicated greater conformational stability was observed when the mutant PR was bound to LPV rather than DRV. Interestingly, while increased particle flexibility was observed in the DRV-bound mutants, the same particle trends were also seen in the PCS-NCjp1 models (Supporting information; Figures S4 and S5). Proteins are dynamically flexible macromolecules that can exhibit large structural changes over time (Haspel et al., 2010). Goodchild et al. (2011) theorized that proteins can alter its folding in response to its environment whilst regulating protein functionality. Furthermore, a study conducted by Porter and Looger (2018) identified several proteins in the Protein Databank that rapidly switch folds, thus having more than one conformation. In this context, this suggests that apart from drastic changes that would render the enzyme inactive, PR can alter its conformation to adapt to its mutated substrate as well as the drug-pressured environment through particle flexibility. € Ozen et al. (2011) hypothesized that the ability of the Gag CSs to fit within the substrate envelope is also determined by substrate size and dynamics. This suggests that PR's capability to incorporate Gag depends on the peptide's rotamers and ultimately its flexibility. This flexibility may also contribute to the stronger binding observed in the PCS-A431V NCjp1 mutants as compared to the drug-bound PR models.
Detailed AA interaction maps revealed several associations between the PRMs and its respective ligands. Our analyses indicated that alkyl/ℼ-alkyl interactions between V82A/V82A' (V181) and LPV in the PCS124 and PCS069 mutants suggests a strong association with the mutant alanine at position 82. Other bond types including alkyl and ℼ -alkyl bonds can improve the hydrophobic bonding between a receptor and ligand (Arthur & Uzairu, 2019). Weber and Agniswamy (2009) demonstrated that in the presence of PIs Saquinavir, Indinavir and DRV, the alanine substitution shifted residues 81 and 82 to compensate for the loss of interactions induced by the smaller side chain. In addition, a study conducted by Liu et al. (2008) revealed that in a PR comprising a single I54V mutation, significant conformational changes observed in the flap region were coupled with structural changes in the 80's loop. In a similar manner, Nakashima et al. (2016) revealed that double PR mutant I47V/I50V induced structural flexibility to the enzyme. Since M46I is found in PR's flaps, it is sufficient to assume that this mutation may also influence PR flexibility. Therefore, in this study, the stronger bond coupled with the shift in main chain residues suggests that partial enzymatic rigidity was induced. Furthermore, this bonding effect could also be correlated with L10F which formed a ℼ -alkyl bond with LPV in the PCS069 mutant. Agniswamy et al. (2013) showed that loss of contact between PR and novel PIs was due to the displacement of a water molecule by L10F. Since only one L10F residue was involved in ligand binding, this asymmetry could aid in maintaining PR's flexibility by not restricting the enzyme with phenylalanine's large bulky aromatic rings in both chains.
In DRV, the bis-furyl (or bis-THF) moiety was designed to produce a network of H-bonds with the main chain atoms of the viral PR (Agniswamy et al., 2015). Interestingly, in a study evaluating the binding differences between DRV and its structural analog Amprenavir, Hou et al. (2009) found that stable interactions were observed between specific AAs (A29, D30, G48) and the bis-furyl and 4-aminobenzenesulfonamide moieties of DRV. Our analyses revealed more vdW interactions instead, particularly in the L76V-PCS069 mutant ( Figure  S7B). Therefore, a loss of contact between PR and these chemical moieties (bis-furyl and 4-aminobenzenesulfonamide) is the main reason for DRV resistance (Raugi et al., 2016). Finally, vdW interactions between I54V and the PCS069-DRV mutant suggests that the flaps were pulled down closer to the drug. This agreed with a study conducted Our analyses revealed that key H-bonds involving several Gag residues across the NCjp1 substrate was observed in the PCS-A431V mutants. Generally, H-bonds facilitate protein-ligand binding through the displacement of receptor water molecules (Chen et al., 2016). These water molecules are displaced by the ligand or are subtly shifted (Huggins & Tidor, 2011). While water molecules are important mediators in protein-ligand binding (Brenk et al., 2006), Chen et al. (1998) showed that in an enzyme-inhibitor complex, the ligand displaced an active site water molecule which created favourable inhibitor orientation. In our study, the extensive formation of H-bonds suggests that protein water displacement may have occurred to properly orient the ligand within the substrate cavity. Furthermore, the proximity of M46I in PR and the phenylalanine of F433 in Gag suggests that coordination between these two residues can regulate movement within the flaps. Contrastingly, the coordination between L10F_PR and A431V_Gag, suggests that these two residues are important for substrate recognition and linkage.
These data also revealed the structural importance of the L76V PRM in NCjp1 substrate recognition. In this study, conformational changes in other regions of PR were observed by interactions between PRMs L10F, V82A and L76V. This suggests that when L76V is co-selected with A431V in Gag, the mutational dynamics of the L10F þ M46I þ I54V þ V82A resistance combination changes to constrict these regions in PR and allow for efficient substrate processing in favour of drug binding. The vdW interactions between D25_PR and the substrate indicates an indirect mechanism of cleavage due to a loss of contact. This concept is supported by Wong-Sam et al. (2018) which showed that L76V promoted reduced hydrophobic contacts and greater flexibility as an alternative mechanism for drug resistance. While vdW interactions are considered weak, these forces are often important in the interaction and shape of molecules (Atkins & de Paula, 2006). Therefore, in this instance, cleavage is coordinated by the flaps and strong vdW interactions rather than direct active site dynamics.
This study showed that distinct patterns of PRMs can act synergistically with each other; and through the 'intelligible' coselection of specific mutations in Gag to restore cleavage in the presence of PIs. It is therefore important that the structural coevolutionary mechanisms of resistance are taken into account when evaluating and/or developing new drug combinations.

Study limitations
This study identified important co-evolutionary molecular mechanisms of resistance between multi-drug resistant PRs, two third-line PIs and a mutated NCjp1 Gag CS. While these data provide strong evidence of complex resistance mechanisms to inhibitory drugs and substrate recognition, some limitations were noted. Although, rapid conformational switching was proposed as a mechanism of resistance for the DRV-bound PR models, accelerated molecular dynamics should be performed on a millisecond scale to confirm if conformational switching is actually occurring. Additionally, per-residue energy decomposition analyses should be applied to evaluate the energy contributions of each AA and their overall correlation to drug binding and/or substrate recognition. Finally, to fully understand the impact of resistance mutations on substrate recognition and/or drug binding, comparative conformational assessment of wild-type PRs and its respective ligands are required. Note 1. In nature, the HIV-1 PR enzyme consists of two identical chains that comprise 99 AAs and runs anti-parallel to one another. While chain A is numbered from 1-99, chain B is numbered from 100-198. In practice, chain B residues may also be referred to as prime, for example, residue ALA 82 (V82A) in chain A is represented as ALA 181 (V82A') in chain B.