Towards the identification of the allosteric Phe-binding site in phenylalanine hydroxylase

The enzyme phenylalanine hydroxylase (PAH) is defective in the inherited disorder phenylketonuria. PAH, a tetrameric enzyme, is highly regulated and displays positive cooperativity for its substrate, Phe. Whether Phe binds to an allosteric site is a matter of debate, despite several studies worldwide. To address this issue, we generated a dimeric model for Phe–PAH interactions, by performing molecular docking combined with molecular dynamics simulations on human and rat wild-type sequences and also on a human G46S mutant. Our results suggest that the allosteric Phe-binding site lies at the dimeric interface between the regulatory and the catalytic domains of two adjacent subunits. The structural and dynamical features of the site were characterized in depth and described. Interestingly, our findings provide evidence for lower allosteric Phe-binding ability of the G46S mutant than the human wild-type enzyme. This also explains the disease-causing nature of this mutant.


Introduction
The hyperphenylalaninemias (HPAs) are autosomal recessively inherited disorders that, if left untreated, can lead to severe mental retardation in humans. Due to their relevance, many countries include a test for HPAs in neonatal screening programmes. In this classical inborn error of metabolism, the gene primarily affected is phenylalanine hydroxylase (PAH). Mutations in the PAH gene cause impaired function of the enzyme that results in the accumulation of high levels of blood plasma Phe and toxic concentrations in the brain (Erlandsen & Stevens, 1999). Generally, mutations in human PAH (hPAH) result in reduced enzyme activity (Cerreto et al., 2011;Daniele et al., 2008Daniele et al., , 2009, decreased stability (Gersting et al., 2008;Knappskog, Flatmark, Aarden, Haavik, & Martinez, 1996) and misfolding of the enzyme (Pey, Stricher, Serrano, & Martinez, 2007).
PAH catalyzes the conversion of L-Phe (referred as to Phe) into L-Tyr utilizing (6R)-L-erythro-5,6,7,8tetrahydrobiopterin cofactor (BH4) and molecular oxygen. PAH is a tetrameric enzyme in equilibrium with a dimeric form (Bjørgo, de Carvalho, & Flatmark, 2001;Martínez et al., 1995). Each monomer adopts an α/β structure and is built from an N-terminal regulatory domain (RD, residues 1-117), a catalytic domain (CD,, which includes binding sites for iron, substrate and cofactor, and a tetramerization domain (residues 411-452). The available 3D structures of the human enzyme lack the RD. However, a full-length composite tetrameric model of hPAH is commonly built utilizing the structure of rat PAH (rPAH) (Kobe et al., 1999), which shares a high sequence identity with hPAH and includes almost the entire RD. The RD of PAH contains an ACT domain that is a structural module found in a group of allosteric enzymes regulated by the binding of a specific amino acid (Aravind & Koonin, 1999). In the proteins containing ACT domains, ligands tend to bind at the interfaces between ACT domains and in proximity of a conserved Gly (Gly46 in hPAH) (Grant, 2006). In the commonly accepted tetrameric full-length PAH structure, the four ACT domains do not associate with each other, but are in contact with the CD of the adjacent chain in the dimer.
PAH is tightly regulated in order to maintain the correct levels of Phe in the blood. The three main regulatory mechanisms of hPAH are: (i) allosteric activation by the substrate Phe (Shiman & Gray, 1980;Shiman, Jones, & Gray, 1990), (ii) inhibition by BH4 (Døskeland, Martínez, Knappskog, & Flatmark, 1996) and (iii) additional activation by phosphorylation (Shiman & Gray, 1980). In particular, the allosteric Phe binding site is still unknown, hence the regulatory mechanism of the activation by the substrate is controversial. It was postulated that the Phe-binding site lies within the RD (Fitzpatrick, 2000;Kaufman, 1993;Li, Dangott, & Fitzpatrick, 2010;Li, Ilangovan, Daubner, Hinck, & Fitzpatrick, 2011;Shiman & Gray, 1980). In contrast, it was also postulated that Phe-activation involves binding only in the active site (Thórólfsson et al., 2002). However, very recently it was definitely established that activation of PAH by Phe does not require binding of the amino acid in the active site (Roberts, Khan, Hinck, & Fitzpatrick, 2014), supporting the existence of a different allosteric site, besides the active site.
Recently, Jaffe and co-workers proposed a new morpheein model for allosteric regulation of PAH (Jaffe, Stith, Lawrence, Andrake, & Dunbrack, 2013). In this model, ACT domains associate in groups of two ACT, forming dimers with an ACT:ACT interface within the tetramer (Jaffe et al., 2013). The latter authors postulate the presence of the allosteric Phe binding site at the intersubunit interface of ACT domains (Jaffe et al., 2013), as occurs in ACT-containing proteins. Accordingly, it was recently shown that RD dimerization is linked to allosteric activation of PAH (Zhang, Roberts, & Fitzpatrick, 2014). This new model expands the debate on the Phe-binding allosteric site and deserves further investigations.
In a previous molecular dynamics (MD) study on the isolated RD, we found a correlation between the dynamics of the enzyme and Phe-binding to a hydrophobic region corresponding to the ligand binding site of the ACT-protein family (Carluccio, Fraternali, Salvatore, Fornili, & Zagari, 2013). This hydrophobic region, which contains the sequence motifs GA(S)L/IESRP, is located at the dimeric interface between the interacting RD and CD of two distinct subunits. Moreover, it has been demonstrated that the GA(S)L/IESRP motifs are involved in the binding of Phe to the ACT-containing protein prephenate dehydratase (PDT), which also binds Phe (Tan et al., 2008), and to the isolated RD of hPAH (Gjetting, Petersen, Guldberg, & Güttler, 2001). Accordingly, we proposed (Carluccio et al., 2013) that the putative hydrophobic region for Phe-binding lies at the dimeric interface between the RD and the CD (lined by the conserved ACT-domain motifs GAL/ESRP) of two distinct subunits. Therefore, here, we investigated dimeric PAH forms. It is noteworthy that the dimeric form of PAH, which lacks the C-terminal 24 residues, is enzymatically active and allosterically activated by the Phe substrate (Hufton, Jennings, & Cotton, 1998).
The aim of this study was to shed light on the open questions related to the regulatory role of the PAH enzyme by identifying, in silico, the putative allosteric Phe-binding site in the commonly accepted PAH dimers. Thus, using molecular docking combined with dynamics simulations, we identified a Phe binding site at the interface between the RD and catalytic domains. This allowed us to describe the Phe-PAH interactions and also to elucidate the structural effects of the mutation on the Phe binding site in the disease-causing G46S mutant.

Building of dimeric models
Models of the dimer (residues 33-424 of two chains A and B) of the wt-hPAH enzyme and the G46S mutant were generated in silico with the Modeller 9v8 programme (Šali & Blundell, 1993). To build reliable models, we used the available dimeric crystallographic structures of rat and human PAH. Indeed, a composite dimeric model, built by superimposing the secondary structure elements of the corresponding catalytic domains of the dimeric rPAH (PDB code 1PHZ) on the dimeric hPAH (PDB codes 1PAH) structures, was used as template. The dimeric rPAH structure was used for the N-terminal RD (residues 33-116), the dimeric hPAH structure for the catalytic domain (residues 117-410) as well as for the dimerization motif (residues 411-424). The template and the target sequences were aligned using Bodil (Lehtonen et al., 2004). Subsequently, the alignment files were used as input in Modeller, and 200 models of each studied forms were generated. The model that presented the best Modeller DOPE score was selected for the MD simulations.

MD simulations of the unbound and Phe-bound PAH dimeric forms
MD simulations were performed with the GROMACS 4.5.5 (Van Der Spoel et al., 2005) package using the GROMOS 43a1 force field (van Gunsteren et al., 1996) on the generated unbound dimeric models and on the rat dimeric crystal structure. These dimers (each consisting of 784 residues) were solvated in dodecahedral boxes containing simple point charge water molecules (Berendsen, Postma, van Gunsteren, & Hermans, 1981). To neutralize the systems, 6Na + ions in the human and mutant systems and 10Na + ions in the rat system were randomly placed far away from the surface of the enzymes. Simulations were carried out for 30 ns utilizing the procedure described elsewhere (Carluccio et al., 2013). The same procedure was used for 10-ns long MD simulations of selected Phe-bound hPAH, rPAH and G46S dimers (see Section 2.3). Two replicas were run for each system. MD trajectories were analysed using GROMACS analysis tools. MD simulations were run on 128 processors of the HECToR (High End Computing Terascale Resources) supercomputer at EPCC at the University of Edinburgh.

Molecular docking and energetic analysis
Phe docking into the dimeric PAH enzymes was performed with the Molecular Operating Environment (Vallian, Barahimi, & Moeini, 2003) software (Chemical Computing Group Inc, 2012). Phe was docked into one putative allosteric site (at the interface between the RD of chain A and the CD of chain B) of the unbound dimeric enzymes. The Phe-bound structure of the PDT (PDB code: 2MQX) was used as template to identify the Phe-binding site. Thus, the binding site was defined by The free energy of binding for each simulated complex was calculated with the MOE using the Pose Rescoring method of the Dock module. The calculations were carried out on representative structures extracted by cluster analysis from the MD simulations of the bound forms. The representative structures were previously validated with the MOE Structure Preparation module. The GBVI/WSA dG scoring function (Corbeil, Williams, & Labute, 2012) with the Amber 12EHT force field was used. The average values calculated over the representative structures of the three largest clusters are reported.

Mobility of the unbound dimeric PAH forms
In silico models were generated for the unbound dimer (residues 33-424 of chain A and B) of the wt-hPAH enzyme and of the G46S hPAH mutant using the dimeric wt-rPAH and wt-hPAH (RD missing) structures as templates. Figure 1 shows the subunit A 3D structure of the wt-hPAH model with the loop-containing regions analysed in this study, namely L1 (42-45), L2 (59-64), L3 (70-75) and L4 (82-90) in the RD, the hinge (111-117), and L5 (135-151), L6 (336-339), L7 (360-371) and L8 (376-384) in the CD. Two replicas of MD simulations were performed on the generated models and the dimeric wt-rPAH template. There were no significant differences among the replicas. Overall, during the simulations, the systems showed deviations from the initial structures and, at the end of the simulations, all trajectories reached a plateau within 4-5 Å in the RMSD from the initial structures calculated over C α atoms.
The dynamics of wt-hPAH is mostly characterized by the large motions of the loop-containing regions of the RD (L1-L4), the hinge region between the RD and the CD (111-117), and four segments in the CD (L5-L8) ( Figure S1). As shown by essential dynamics analyses, the movements of RD, of the hinge 111-117 segment and of the above-mentioned CD regions are represented by the first PC of the motion in the human wild-type enzyme (Figure 2 During all simulations, a high RD mobility was observed in at least one chain, thereby confirming the high flexibility found in our previous MD simulations on the isolated RD (Figure 2 . In particular, in chain A of wt-hPAH, the two α-helices and the loop-containing regions (from L1 to L4) have a higher mobility than the β-sheet (Figure 2(A) and Figure S1). A large rearrangement is also observed for the hinge 111-117 segment (Figure 2(B) and Figure S1) connecting the RD to the CD. The principal movement of the RD in chain A and the hinge region results in an RD shift with respect to the CD of the adjacent subunit within the dimer (Figure 2(A)), accompanied by changes in the pattern of interactions and in the distances between residues at the dimeric interface. Moreover, in the CD dynamics, segments L5, L6, L7 and L8 show large amplitude motions (Figure 2(C) and Figure S1). All segments are surface loops and L8 is located over the active site cavity (Figure 1).
Overall, all these loop-containing regions (in RD, in the hinge and in CD) were the most flexible ones also in the rat and in the G46S mutant dimers ( Figure S1). However, in these systems, in particular in the rat system, the RDs did not undergo a significant shift with respect to the adjacent CD, even though their movements lead to differences from the initial structures in the positions and orientations of some residues at the dimeric interface.
In summary, the regions involved in the RD/CD interface and close to the putative Phe-binding site differed substantially among the three systems (see Section 3.2).

Docking poses of the Phe-bound dimeric PAH forms
Taking into account (i) experimental information about the residues in the RD that are affected by Phe-binding (Gjetting et al., 2001), (ii) the data generated by our previous MD simulations (Carluccio et al., 2013) and (iii) the known position of the allosteric Phe-binding site in the ACT-containing PDT protein (Tan et al., 2008), we generated models of the PAH/Phe complex by molecular docking.
The docking procedure was performed with the MOE software, using the minimum energy conformation extracted from the MD simulations of the unbound hPAH, rPAH and G46S mutant dimers. Phe was docked into a putative allosteric site between the RD of chain A and the CD of chain B. Although the Phe-bound structure of the PDT was used as template for the Phe position in the PAH dimers, in all docking simulations Phe was in a position slightly different from that of the template. However, the binding site of all the generated poses is located in the proximity of the L1 loop and the Rα1 helix in the RD at the interface with the Cα4 helix (residues 204-216) in the CD (Figure 3). Figure 3(A)-(C) shows the binding site residues for each PAH form. In the three unbound forms, because of the differences in the motion of regions involved in the RD/CD interface, the residues that define the binding site in the docking simulations had different relative spatial positions. In the wt-hPAH, as a result of an RD shift, residues of the N-terminal part of the Rα1 helix (46-49) were interfaced to residues of the N-terminal part of the Cα4 helix (204-209) and were close to the 295-297 segment (Figure 3(A)−(D)). All the above-mentioned residues were arranged, so that there was enough space between them to accommodate the Phe. In the other two systems, residues of the N-terminal part of the Rα1 helix Figure 2. Porcupine plot illustrating the motions represented by the first PC (PC1) from the simulation of the wt-hPAH. Each Cα atom has a cone attached pointing in the direction of motion described by PC1. Chain A is drawn in magenta, chain B in green. Letters R and C associated to the secondary structure nomenclature refer to regulatory and catalytic domains, according to the secondary structure notation by (Flatmark & Stevens, 1999). The picture shows the movements of the RD, of the hinge region 111-117 and of the surface loop-containing regions L5-L8 in the CD. Red arrows indicate the motions. Colour code is the same as in Figure 1.
were close to residues of the central part of the Cα4 helix (207-211) (Figure 3(B)−(D)). In particular, in the mutant, such residues, including Ser46, were arranged in such a way as to limit the access of the Phe to protein buried areas. Thus, the residues that form the binding site, and also their conformations, differ among the structures of the three systems selected in the docking procedure. Consequently, the position of Phe differs between the systems. In fact, Phe is more exposed to the solvent in the rat and the mutant systems than in the wt-hPAH system (Figure 3(D)). In each system, Phe had a diverse interaction pattern with the residues lining the effective binding site (Figure 3(A)−(C)). In particular, in the G46S complex, Phe interacted with the side chain of Ser46 and lost contact with more buried residues, e.g. Tyr 77. For each complex, 30 poses were generated and only the pose with the lowest MOE score was subsequently subjected to MD simulations.

Mobility and structural features of Phe-bound complexes
The structure of the selected complexes was refined by performing 10-ns long MD simulations. During MD simulations, the trajectories of the selected Phe-bound wild-type forms were stable throughout the simulations and reached a plateau within an RMSD of 1.5-2.0 Å from the initial structures calculated over all C α atoms, which indicates their stability and reliability.
In the wt-hPAH complex, the RD of chain A and the L6-L7 loops of chain B in the CD have a lower mobility than in the unbound enzyme (Figure 4(A)), indicating that the presence of Phe in the identified site reduces the movements of those domains that are involved in Phe-binding. Moreover, also the fluctuations of hinge regions 111-117 in both chains are lower with respect to the unbound enzyme. The differences in the mobility of the RD (chain A) in the wt-rPAH are less relevant, but also in this case, the mobility of the hinge regions and of the L6-L7 loops (chain B) is reduced (Figure 4(B)).
Interestingly, in the wt-hPAH, the L5 and L8 loops, which contribute to accessibility to the active site, show a slightly increased flexibility in the presence of Phe (Figure 4(A)). In the wt-rPAH, the L5 loop shows a decreased flexibility in the presence of Phe, while the L8 loop shows a meaningfully increased flexibility in the presence of Phe (Figure 4(B)). Binding site residues are drawn as sticks in different colours: cyan in hPAH (A), green in rPAH (B) and magenta in G46S (C). Helices are drawn in red, β-strands in yellow. Letters R and C refer to regulatory and catalytic domains, according to the secondary structure notation by (Flatmark & Stevens, 1999). (D) Superposition of the selected docked complexes: hPAH complex in cyan, rPAH complex in green and G46S complex in magenta.
To understand better the structural differences between the residues of the Phe-binding site, we analysed the RMSD of these residues in the complexes at 10 ns from the starting unbound systems. In the human wt complex, the RMSD values of the C α atoms of thebinding site residues did not exceed 1.22 Å (Figure 5(A)). Instead, in the mutant complex, the RMSD values of some residues reached as high as 2.5 Å (Figure 5(B)). This finding indicates that, in the presence of Phe, the conformational changes of these residues were greater in the mutant than in the wt enzyme.
We analysed the interactions of Phe with PAH binding-site residues in the three systems (Figures 6(A), (B) and 7(A)−(C)). In the wt-hPAH complex, during simulations, the position of Phe did not change substantially. The distances of the Phe from the interacting residues (residues 46, 47, 48, 49, 74, 75, 77 of chain A and residues 204, 205, 208, 209, 296, 297 of chain B) at 0 ns remained roughly constant during the simulation (Figure 6(A)). In the wt-hPAH complex during the simulation, the carboxylic group of Phe formed stable H-bonds with residues Ala47, Leu48, Ala49, Tyr77 from chain A (Figure 7(A)) and a fluctuating hydrogen interaction with residue Tyr204.
In the wt-rPAH complex, during the dynamics, Phe slightly shifted towards the N-terminal part of the Cα4 helix and its aromatic ring pointed towards a more buried area of the CD. Also in this case, the distances of the Phe from the interacting residues (residues 43, 46, 47, 48, 49, 52, 77 of chain A and residues 204, 207, 208, 211 of chain B) at 0 ns did not change greatly during the simulation, although new interactions with other residues were formed. Moreover, at the end of the simulation, the amino group of Phe formed an H-bond with residue Lys199 and the carboxylic group formed an H-bond with residue Asn207 from chain B (Figure 7(B)).
In the G46S complex, the behaviour of Phe differed greatly from that of the wt systems. In fact, during the simulations, Phe moved from the initial binding site to the surface of the mutant in the direction of the L6 loop, thus becoming more exposed to the solvent (Figure 8). The distances of Phe from the interacting residues (residues 43, 45, 46, 49, 53 of chain A and residues 207, 211 of chain B) at 0 ns increased during the simulation until contacts were lost (Figure 6(B)). In particular, Phe, during the simulation, lost hydrogen interactions with Ser 46 and formed new H-bonds with residues Glu205 and Lys335 (Figure 7(C)).  Finally, we calculated the free energy of binding (ΔG) for each complex with MOE. The ΔG values (Table 1)

Discussion
We used molecular Phe docking simulations coupled with MD simulations to identify the allosteric Phe-binding site in the PAH enzyme. To investigate the structural and dynamical features of the Phe-bound models, we analysed the MD trajectories of both unbound and Phe-bound dimeric PAH systems.
An analysis of the RMSF of the hPAH complex revealed a decrease in the mobility of the flexible loopcontaining regions in RD (chain A) and the L6-L7 loops in CD (chain B) where Phe is bound. In addition, the movements of the hinge regions 111-117 in the two chains of both wild-type enzymes were lower than those in the unbound enzymes. On the contrary, the mobility of loops L5 and L8 was slightly increased. These two loops close over the active site cavity when both BH4 and the substrate are bound. Thus, it is plausible that movements of these loops can regulate substrate access to the active site when the enzyme is activated by Phe-binding to the allosteric site.
Our findings indicate that the presence of Phe in the identified allosteric site influences the flexibility of the above-mentioned regions and well agree with experimental data on allosteric regulation of PAH (Li et al., 2010). In fact, based on an increase in the kinetics of hydrogen/ deuterium exchange, Li and co-workers (Li et al., 2010) reported that these loop-containing regions are highly flexible, solvent-accessible and, interestingly, that they are also influenced by the presence of Phe.
In our computations of the wt-hPAH complex, Phe remained in the initial binding site during the whole 10ns long MD simulation, and the position and orientation of the binding site residues did not significantly deviate when Phe was bound. These results support our starting hypothesis that the allosteric Phe-binding site is located at the dimeric interface between the RD and the CD. Our data are in line with experimental data on the in vitro Phe binding to recombinant wt and some mutant forms of the isolated RD (Gjetting et al., 2001).
In rPAH, although there are no differences in the primary structure of the residues lining the binding site, Phe was placed in a region slightly different from that in hPAH, as result of docking simulations. This is due to the differences in the conformation of the binding site residues during the dynamics of the two unbound enzymes, which, as we found in previous computations on the isolated ACT domain (Carluccio et al., 2013), show different dynamical behaviour. The differences in the two binding sites led to a different interaction pattern between the docked Phe and the binding site residues in the two wt enzymes and hence to different free energies of Phe binding to rPAH with respect to hPAH. Interestingly, most of the residues that differ in the two wt sequences are located in the loop-containing regions (in RD, in the hinge, and in CD) that we found to be the most flexible regions. Thus, the mobility of these loopcontaining regions could be influenced by the different interactions of such residues within the enzyme. Moreover, in the presence of Phe, the flexibility of loops L5 and L8 differed from that of the human enzyme. Experimental data have shown that the effect of Phe-preincubation on rPAH activity differs from that on hPAH activity, but the reason for this difference is still unknown (Fitzpatrick, 2012). In line with the latter data, all our results support the hypothesis, also postulated in our previous MD work, that the activation process in the two enzymes could occur through different conformational changes (Carluccio et al., 2013).
In the G46S mutant, Phe was placed by docking in a protein region more external than in the wild-type forms. In particular, during the 10-ns long MD simulations, Phe moved from the initial binding site towards the surface, and the position and orientation of residues in the binding site changed in the complex. These results suggest a rearrangement of the Phe-binding pocket in the mutant that could compromise effective Phe binding. Gly46 adopts a conformation energetically disallowed for all L-amino acids, even serine. It is conceivable that the substitution of Gly in the conserved GAL sequence motif with Ser, which having a polar side chain that protrudes into the allosteric site, leads to weaker Phe binding. Consistently, the binding free energy showed that the mutant was less able than the wild-type enzyme to bind Phe. These results strongly suggest that the mutation Figure 8. Superposition of the structure at 0 ns (magenta) and 10 ns (yellow) of the G46S complex. Phe is shown as stick. markedly affects the correct binding of Phe, as previously predicted (Carluccio et al., 2013), and are compatible with the reduced Phe activation of the mutant. In fact, both in vitro and/or in vivo, the G46S mutant has reduced Phe activation and stability and may easily misfold (Leandro, Simonsen, Saraste, Leandro, & Flatmark, 2011).

Conclusion
This study shows that the presence of an allosteric binding site for Phe in the ACT domain of PAH is reliable. We identified residues in the allosteric Phe binding site and analysed their interactions with Phe. In the G46S mutant, defects in allosteric Phe binding can give rise to defects in the regulation of the mutant, which being associated with a severe form of PKU, may explain its disease-causing nature. Our work sets the basis for further experimental work on a large number of site-directed mutants, in which residues interacting with the allosteric Phe could be replaced. The altered features of these mutants could be characterized by means of a variety of techniques, and hence the Phe-induced activation of PAH could be clarified.

Supplementary material
The supplementary material for this paper is available online at http://dx.doi.org/10.1080/07391102.2015.1052016.