Insights on the structural perturbations in human MTHFR Ala222Val mutant by protein modeling and molecular dynamics

Methylenetetrahydrofolate reductase (MTHFR) protein catalyzes the only biochemical reaction which produces methyltetrahydrofolate, the active form of folic acid essential for several molecular functions. The Ala222Val polymorphism of human MTHFR encodes a thermolabile protein associated with increased risk of neural tube defects and cardiovascular disease. Experimental studies have shown that the mutation does not affect the kinetic properties of MTHFR, but inactivates the protein by increasing flavin adenine dinucleotide (FAD) loss. The lack of completely solved crystal structure of MTHFR is an impediment in understanding the structural perturbations caused by the Ala222Val mutation; computational modeling provides a suitable alternative. The three-dimensional structure of human MTHFR protein was obtained through homology modeling, by taking the MTHFR structures from Escherichia coli and Thermus thermophilus as templates. Subsequently, the modeled structure was docked with FAD using Glide, which revealed a very good binding affinity, authenticated by a Glide XP score of −10.3983 (kcal mol−1). The MTHFR was mutated by changing Alanine 222 to Valine. The wild-type MTHFR-FAD complex and the Ala222Val mutant MTHFR-FAD complex were subjected to molecular dynamics simulation over 50 ns period. The average difference in backbone root mean square deviation (RMSD) between wild and mutant variant was found to be ~.11 Å. The greater degree of fluctuations in the mutant protein translates to increased conformational stability as a result of mutation. The FAD-binding ability of the mutant MTHFR was also found to be significantly lowered as a result of decreased protein grip caused by increased conformational flexibility. The study provides insights into the Ala222Val mutation of human MTHFR that induces major conformational changes in the tertiary structure, causing a significant reduction in the FAD-binding affinity.


Introduction
Folate (vitamin B9) is an essential nutrient required for the de novo synthesis of deoxythymidine monophosphate (dTMP) and s-adenosylmethionine (AdoMet) through one-carbon metabolism and methionine cycle, respectively (Casero & Marton, 2007;Kim, 2005). Methylenetetrahydrofolate reductase (MTHFR) is a pivotal enzyme in folate metabolism that catalyzes the reduction of 5, 10-methylenetetrahydrofolate into 5-methyltetrahydrofolate, which is the circulating form of folate and also the carbon donor for the methylation of homocysteine to methionine (Nijhout et al., 2004).
Of the many known mutations existing in the MTHFR gene, the most well studied one is the C677T leading to alanine to valine amino acid change at the 222nd position (A222V; rs1801133). This results in the synthesis of a thermolabile form of MTHFR, whose activity is 50-60% lower at 37°C and approximately 65% lower at 46°C for C677T homozygotes than in similarly treated controls (Frosst et al., 1995;Goyette et al., 1994). Allele frequency of the C677T polymorphism is about 33-37% in European Whites. About 10% of this population is homozygous for the mutation (having TT genotype) (Goyette et al., 1998). Allele frequency of the C677T polymorphism in East Asians is similar to that in European Whites, while the frequency in sub-Saharan Africans is 6%. People who are homozygous for the C677T allele tend to have mildly increased blood homocysteine levels, which can undergo auto-oxidation in plasma, producing free oxygen radicals, and thereby enhancing endothelial tissue damage and inflammation (Kang, Lee, & Chi, 1992;Ravi Kanth et al., 2011). In addition, excess homocysteine can directly impair DNA methylation, resulting in altered gene expression. Elevated level of serum homocysteine is also an established independent risk factor for cardiovascular disease and is associated with neural tube defects in the fetus as well (Morrison et al., 1991).
Mammalian MTHFRs are dimers of 70 kDa subunits; each subunit contains non-covalently bound flavin adenine dinucleotide (FAD). MTHFR is FAD-dependent flavoenzymes. FAD acts as a redox cofactor in several vital metabolic reactions. It exists in two different redox states, acting as both electron donor and acceptor. The polypeptide contains an N-terminal catalytic domain, and a C-terminal regulatory domain, where the allosteric inhibitor AdoMet binds and regulates enzyme activity in response to the cellular level of methionine (Morrison et al., 1991). The N-terminal catalytic domain is homologous to the smaller MTHFR in Escherichia coli, which lacks the regulatory domain. Human MTHFR is expressed at very low levels in E. coli and in yeast; thus, biochemical characterization of the phenotype associated with the C677T polymorphism in the purified enzyme has not been possible (Oikawa, Murakami, & Kawanishi, 2003).
The potential medical importance of this polymorphism, and the emerging recognition of the role of dietary folate supplementation in lowering mild hyperhomocysteinemia in humans, has in turn led to studies of MTHFR from E. coli, and construction of a homologue of the C677T mutation in the E. coli enzyme (Li, 2002). The lack of experimentally established tertiary structures of human MTHFR protein makes computational methods such as homology modeling and threading as the only methods capable of predicting its structure. The study was carried out with an aim to gain insight into the conformational changes in the tertiary structure of human MTHFR induced by Ala222Val mutation, by employing molecular modeling and dynamics studies.

Materials and methods
While there have been several attempts to model the human MTHFR structure, none have been entirely successful in unraveling its complex 656 amino acid-long structure. Moreover, the perturbations induced in the protein structure caused by the clinically vital MTHFR C677T mutation needs to be analyzed.

Sequence retrieval and template identification
A three-dimensional model of a protein with unknown structure (target) can be built using sequence alignment to a protein of known structure (template) if the sequence identity is high enough (Moult, 2005).
Six hundred and fifty-six amino acid-long sequence of the human MTHFR protein was retrieved from the UniProt database (P42898). The preliminary step in building a protein model by comparison method is to identify the suitable template that shares a high degree of sequence similarity with the human MTHFR.
A position-specific iterated (PSI) protein BLAST was performed against the protein databank website to identify proteins with already established 3D structures possessing a high degree of sequence similarity with the human MTHFR protein. The expected threshold was set to 10 and the word size as 3. BLOSUM62 was chosen as the substitution matrix. A gap existence penalty was set as 11 with an extension penalty of 1.
The protein structures taken as templates shared a high degree of sequence similarity with Human MTHFR in the range 42-338 with very low e-values. All the 5 template proteins were methylenetetrahydrofolate reductase proteins of bacterial origins.

Protein modeling
For building the protein structure, multiple alignments between the chosen templates and query sequence were carried out by employing global dynamic programming utilizing linear gap penalty option of Modeller9v7. Model building was carried out by satisfaction of spatial restraints, employing 'auto model' class. Python script was used for generating the model. Steric clashes, which arise as result of Pauli repulsion, were removed by employing steepest decent energy minimization ((Modeller) http://www.salilab.org/modeller).

Model validation
Discrete optimized protein energy (DOPE) was employed for refining the loops of models constructed by Modeller9v7 (Shen & Sali, 2006). Psi/Phi Ramachandran plot for the protein was generated through PROCHECK to gauge the goodness of the modeled structure (Ramachandran, Ramakrishnan, & Sasisekharan, 1963). Model accuracy was evaluated using Rosa and ERRAT2. Rosa employs knowledge-based potentials of mean force while ERRAT2 works by analyzing the statistics of nonbonded interactions between different atom types (Wiederstein & Sippl, 2007). Modeler objective function, in conjunction with DOPE scores and ERRAT2 scores, was used to recognize the best model. The overall quality of the modeled protein was evaluated using VERIFY-3D server, which facilitates a visual analysis of the modeled protein structure (Bowie, Luthy, & Eisenberg, 1991;Colovos & Yeates, 1993).

Binding site prediction
Multiple sequence alignment of the template proteins -1B5T, 3FST, 3FSU, and 1V93with human MTHFR sequence was carried out with CLUSTALW2. The template structures were superimposed upon the modeled MTHFR to recognize the location of conserved FAD-binding sites. The human MTHFR structure was submitted to CASTp server for confirming the binding site as detected by multiple sequence alignment and superimposition.
Molecular docking analysis FAD structure was downloaded from the PubChem database (CID 643975). Docking of the modeled MTHFR against its natural cofactor, FAD was performed using Glide docking program present in Maestro 9.0. A series of hierarchical filters were employed using Glide to find the optimal ligand binding locations in a previously built receptor grid space (Friesner et al., 2006). The filters included a systematic search approach, which samples the positional, conformational, and orientational space of the ligand before evaluating the energy interactions between the ligand and the protein. The standard precision module of Glide was employed. A grid box of 30 Å × 30 Å × 30 Å was first centered on the residue Ala222. Default docking parameters were used. Next, minimization of the ligand in the field of the receptor was carried out using the OPLS-AA force field with a distance-dependent dielectric of 2.0 (Kaminski et al., 2001). Afterward, the lowest energy poses were subjected to a Monte Carlo (MC) procedure that samples the nearby torsional minima. The best pose of FAD ligand was determined by the Emodel score and Glide score (Friesner et al., 2006).

Molecular dynamics simulation
Gaining a detailed insight on any biological phenomenon is dependent on analyzing biological systems at molecular and atomistic level (Pikkemaat et al., 2002). GPU version of PMEMD engine provided with Amber12 was employed for carrying out all-atom molecular dynamics simulations of the modeled wild-type MTHFR-FAD complex obtained from the aforementioned docking studies (Salomon-Ferrer, Götz, Poole, Le Grand, & Walker, 2013). After the docking of the wild-type MTHFR-FAD complex, the Alanine 222 residue was mutated to valine using the graphical user interface (GUI) of Chimera (Pettersen et al., 2004), in order to generate the mutated variant. Both the wild and mutated (Ala222Val) MTHFR-FAD complexes were subjected to an all-atom molecular dynamics simulation using the previously reported parameters provided by Bhakat, Martin, and Soliman (2014) and Karubiu, Bhakat, and Soliman (2014). During molecular dynamics simulation, the effect of in silico mutations was monitored by performing different postdynamics analyses as described by Bhakat et al. (2014). In brief, the ligands (FADs) were extracted from their bound conformation and parameterized using GAFF force field integrated with Amber 12, whereas the protein structure was defined by FF99SB variant of amber force field (Wang, Wolf, Caldwell, Kollman, & Case, 2004). The Leap module integrated with Amber 12 was used to create a complex, immersed in a water box of 10 Å surrounded by TIP3P water molecules. The missing residues and hydrogen atoms were added and counter ions were placed to neutralize the systems. All systems were then subjected to a rigorous process of minimization, heating, and equilibration as described by Bhakat et al. (2014). The systems were subjected to a 50 ns production run in a NPT ensemble with a target pressure set at 1 bar and coupling constant of 2 fs. The trajectories were saved in every 1 ps and analyzed using PTRAJ and CPPTRAJ modules integrated with Amber 12 (Roe & Cheatham, 2013). All visualizations were carried out using visual molecular dynamics (VMD) (Humphrey, Dalke, & Schulten, 1996), Chimera (Pettersen et al., 2004), and Maestro's GUI interface.
Root mean square deviation (RMSD) for the modeled protein-FAD complex and the docked ligands within the binding pocket of protein was calculated for the entire simulation trajectory with reference to their respective first frames. H-bond analyses were carried out for all the frames of 50 ns MD simulation of wild and mutant variant of MTHFR. The hydrophobic interactions and H-bonds were calculated using ligand interaction option integrated with Maestro GUI, where H-bonds were defined as acceptor-donor atom distances of less than 3.3 Å, hydrogen-acceptor atom distance of maximum 2.7 Å, and acceptor-H-donor angle greater than 90°.
Docking studies with Glide were carried out in a Linux Centos 6.2 OS platform. A Dell High Performance Computing system with an Intel Xeon E5 with 48 cores at 1.8 GHz with 96 GB DDR RAM and Nvidia Quadro 4000 GPU was employed for the purpose.

MM/GBSA binding free energy calculation
Post-molecular dynamics, molecular mechanics-/generalized Born surface area (MM/GBSA)-based relative binding free energy quantification has emerged as an effective tool to understand mutational effect in large biomolecular systems Tsui & Case, 2000). MM/GBSA calculations were averaged over 1000 snapshots with an interval of 50 ps through the 50 ns production run. MM-/GBSAbased calculation is a thermodynamic endpoint calculation which gives valuable insights into association and also highly corresponds to its biological activity. The following set of equations give a detailed explanation of the calculation of binding free energy: The detailed explanation of these parameters was reported by Bhakat et al. (2014) in a recent report.

Results and discussion
Human MTHFR protein structure modeling and active site prediction Homology modeling of human MTHFR was accomplished by taking already available crystal structures of methylenetetrahydrofolate reductase from E. coli (3FSU, 3FST, and 1B5T) and Thermus thermophilus Hb8 (1V93, 3APT) as template. All template proteins showed an identity of~40% each and with a query coverage of >90% in the amino acid region 42-338.
Multiple sequence alignment revealed a high degree of conservation of the amino acid residues involved in FAD binding in all the 5 template proteins from Thermus thermophilus and E. coli used for modeling the human MTHFR. Figure 1 depicts the multiple sequence alignment of the template proteins with human MTHFR. Sequence conservation within the catalytic domains suggests that the mechanism of transmission of structural distortions arising from the Val222 mutation to the FAD in humans might be similar to the mechanism existing in E. coli (Pejchal et al., 2006). The predicted protein model contained 97% of all residues in the allowed regions. Ramachandran plot for the modeled MTHFR protein structure revealed that of the 568 non-glycine and non-proline residues, 78.2% (444 amino acid residues) were present in the most favored regions, 101 residues (17.8%) in additionally allowed regions, 12 amino acid residues (2.1%) in the generously allowed region, and 11 amino acids were present in disallowed regions, which constituted a mere 1.9% (Figure 2(A)). ERRAT2 score of 58 was obtained for the modeled wild-type MTHFR. The ERRAT2 score can be attributed to the slightly low resolutions of template structures used for modeling. Verify 3D was used to assess the compatibility of a molecular model with its primary structure. Existence of a conserved fingerprint in the FADbinding region and the high degree of sequence similarity in the corresponding amino acids in human MTHFR revealed that the same set of residues need to be targeted in human MTHFR to carry out molecular docking with FAD. The template structures were superimposed upon the modeled MTHFR to recognize the location of conserved FAD-binding sites. The FAD-binding site residues, as identified by employing CASTp, were in concurrence with the results of superimposition. Try174, Ala175, and Lys178 were recognized as plausible amino acid residues that constitute the FAD-binding pocket in human MTHFR.

Docking of FAD and human MTHFR
The prepared structure of FAD was docked with the stabilized structure of MTHFR and the results showed good binding affinity as expected. The best docking pose for FAD ligand was identified by examining their relative total energy score, and the energetically most favorable conformation was selected as the best pose. FAD was bound to the modeled MTHFR structure with a Glide XP score of −10.3983 (kcal mol −1 ). FAD interacted with the amino acids Gly221, Ser219, Phe207, Asp291, and Thr297 via H-bonds; while Ala220, Ile224, Ile284, Leu305, Leu229, Ile225, Ile226, Ala222, Ala292, Phe224, and Val218 interacted with FAD forming hydrophobic interactions ( Figure 3). The Glide score was employed as a function of fitness and the best-fit pose for a given ligand was determined by the Emodel score. Figure 4 highlights the secondary structure representation of both wild and mutant variants of MTHFR as well as the superimposed view of mutant over wild. This gives an overall static view of difference in secondary structure as well as the structural difference caused by the mutation.

Structure perturbations analysis: molecular dynamic simulation result
Moreover, RMSD of backbone C-α atoms was calculated to analyze the conformational stability of the wild and mutant variant of MTHFR and to observe the alignment of all the protein frames with that of the reference frame backbone. RMSD is a measure of the average change in displacement of a selection of atoms for a particular frame in relation to a reference frame. This can yield information on the RMSD evolution of the protein and offer insights on the structural conformation throughout the simulation. RMSDs of all Cα atoms from their initial configuration as a function of simulation time (50 ns of MD simulation) for the investigated systems (wild-type and mutant MTHFRs) was plotted and illustrated in Figure 5.
The backbone RMSD converged well in both cases and an average difference of~.11 Å was found between the wild and the mutant variants. The average RMSD of WT-MTHFR was found to be 5.92 Å and that of MT-MTHFR was 6.03 Å. The RMSD fluctuation highly corresponded with the increased root mean square fluctuation (RMSF) and Rg in case of MT-MTHFR and thus validates the fact that the mutation leads to an increased flexibility resulting in loss of protein grip on ligand.
The RMSFs of FAD were broken atom-by-atom and were plotted corresponding to the 2D representation of FAD ( Figure 6).
The ligand RMSF may yield insights on how FAD interacts with the MTHFR. These RMSF values reflect the internal atom fluctuations of the ligand. Further, per-residue C-α fluctuation was calculated for both wild and mutant variants of MTHFR in order to gain an insight into the effect of mutation on conformational flexibility of overall residues. Often mutations led to higher fluctuations of overall residues which contributed to the decreased receptor grip on ligand (Chetty, Bhakat, Martin, & Soliman, 2015). In case of wild and mutant variants of MTHFR, a higher fluctuation was observed at residue 222 as a result of mutation which confirms that Ala222val mutation led to an increased conformation motion in case of MT-MTHFR leading to a decreased receptor grip on FAD, ultimately resulting in reduced functionality (Table 1). Average RMSF for all the amino acid residues in the WT-MTHFR was found to be 2.20 Å, in comparison with 2.34 Å in case of MT-MTHFR, which further confirmed the conformational instability as a result of mutation (Figure 7). Higher average fluctuation at the point of mutation in MT-MTHFR suggests a possible mechanism by which distortion in the tertiary conformation of the protein occurs, as well as, loss of average residueresidue H-bond connections leading to loss of overall receptor grip on FAD.

Radius of gyration (Rg)
Radius of gyration (Rg) is a measure of the compactness of the protein structure. Rg of native WT-MTHFR and the MT-MTHFR variant complexes was calculated at 300 K. In the conformational analysis, radius of gyration was defined as the moment of inertia of the group of atoms from their center of mass (Hong & Lei, 2009). Figure 7 highlights the comparative radius of gyration of Cα atoms between the wild and the mutant variants. This clearly reflects the highly unstable nature of the mutant complex with an average Rg of 26.41 Å, as compared to the wild type with an average Rg 26.26 Å. This is in accordance with our assumption that mutation decreases the interaction among neighboring amino acids, which leads to unstable moment of inertia. The fluctuation of Rg clearly corresponds with the RMSF fluctuation and validates the fact that Ala222Val mutation leads to an increase in conformational flexibility and a decrease in receptor grip on ligand as evident from a decreased binding free energy.

MM-/GBSA-based binding free energy
Post-dynamics MM-/GBSA-based techniques have emerged as any effective tool to identify binding affinity of the docked complexes according to their relative binding free energy. The difference in binding free energy between WT-MTHFR and MT-MTHFR was found to be 4 (kcal mol −1 ), which corresponds to the fact that enzyme loses its binding grip to the FAD as a result of mutation (Figure 8), Table 2).

FAD interactive profile with wild-type and mutant MTHFR
After molecular dynamics simulation, H-bond with Gly221, Phe207, Asp291, and Thr297 disappeared while Ser219 was still in contact with FAD via H-bond [depicted in Figure 9]. Glu383 and Lys288 were found to form new H-bonds after MD simulations. While amino acid residues Ala220, Ile284, Leu305, Leu229, Ile225, Ile226, Ala222, Ile225, Ala292, and Phe224 of the wild-type MTHFR formed hydrophobic interactions with FAD after the docking, residues Phe207, Ala292, Tyr297, and Ile225 were found in contact with FAD post-MD simulations (Figure 10(A)). In the Ala222Val mutant MTHFR, the amino acids involved in hydrophobic interactions with FAD included Ala220 and Pro198, while Gly298, Glu383, and Ser219 were involved in hydrogen bonds. Ser219 is the only residue which interacted with the phosphate group of FAD. Out of all the   residues which constituted protein-ligand contact in wild-type MTHFR, only Glu383 interacted with FAD in the mutant protein (Figure 10(B)).

H-bond monitoring
Further, the number of bonding contacts including hydrogen bonds, hydrophobic interactions, and dispersion forces between FAD and MTHFR was monitored through the 50 ns simulation. Average residue-residue hydrogen bond count for wild and mutant variant was found to be 255 and 247, respectively ( Figure 11). The deletion of residue-residue hydrogen bonds as a result of mutation led to an increase in flexibility. The lower number of hydrogen bond interactions in case of the Ala222-Val mutant when compared to the wild-type revealed a distinct reduction in the FAD-binding affinity in the mutant, as a result of increased conformational flexibility which resulted in decrease of receptor grip on the ligand.
In order to determine the stability of hydrogen bonds in the FAD-binding pocket of MTHFR, the stability of MTHFR-FAD complex was monitored over the course of the molecular dynamics simulations. Hydrogen bond profiles of FAD with both the wild-type and Ala222Val mutant MTHFR were monitored. The analysis revealed that FAD comprises 4 (highest) average hydrogen bonds during the course of simulation period sharing H-bonds with Glu383, Asp382, Glu285, and Lys288, while the C677T mutant showed poor hydrogen bond interactions with 1-2 H-bonds on an average during the trajectory period ( Figure 10).
High degree of differences in the amino acid residues which constitute the bonding interactions between the wild-type and mutant MTHFR indicates a major conformational shift in the protein structure due to the single amino acid change. Reduction in total number of interactions in the mutant MTHFR indicates a reduction of its FAD-binding affinity and also explains the results from previous biochemical studies, indicating an enhanced rate of FAD loss as the possible mechanism by which Ala222Val mutation leads to reduced catalytic functionality (Yamada, Strahler, Andrews, & Matthews, 2005). Findings of the study were found to be in accordance with studies by Guenther et al. (1999) who explored the possible structural perturbations caused by the Ala177Val mutation in E. coli corresponding to the Ala222Val polymorphism in humans. The study revealed that the mutation does not have any impact on the kinetic properties of MTHFR but induces increased loss of FAD cofactor. The enhanced FAD cofactor loss as an effect of the Ala222Val mutation is further validated by experiments on extracts of mutant and wild-type human MTHFR (Yamada, Chen, Rozen, & Matthews, 2001).

Conclusion
The MTHFR enzyme requires a FAD cofactor as an essential requirement for the catalytic activity of human MTHFR, which is critical for the conversion of homocysteine to methionine. The current study employed a computational approach involving homology modeling, in silico mutagenesis, and MD simulation helps to gain insight on MTHFR Ala222Val mutation's impact on the protein tertiary structure. The study revealed local structural perturbations that accounts for weaker binding of FAD by the mutant MTHFR enzyme thereby enhancing the FAD dissociation rate leading to decreased enzyme activity. Study provides clarity on how local structural distortions in the mutant MTHFR caused by a single amino acid change leads to elevated hyperhomocysteine levels and hypomethylation, which in turn addresses the C677T polymorphism's association with several clinical features. The study has opened up avenue for carrying out similar in silico studies on clinically significant natural variants of other proteins involved in the folic acid metabolism