A comparative molecular dynamics study on BACE1 and BACE2 flap flexibility.

Abstract Beta-amyloid precursor protein cleavage enzyme1 (BACE1) and beta-amyloid precursor protein cleavage enzyme2 (BACE2), members of aspartyl protease family, are close homologs and have high similarity in their protein crystal structures. However, their enzymatic properties are different, which leads to different clinical outcomes. In this study, we performed sequence analysis and all-atom molecular dynamic (MD) simulations for both enzymes in their ligand-free states in order to compare their dynamical flap behaviors. This is to enhance our understanding of the relationship between sequence, structure and the dynamics of this protein family. Sequence analysis shows that in BACE1 and BACE2, most of the ligand-binding sites are conserved, indicative of their enzymatic property as aspartyl protease members. The other conserved residues are more or less unsystematically localized throughout the structure. Herein, we proposed and applied different combined parameters to define the asymmetric flap motion; the distance, d1, between the flap tip and the flexible region; the dihedral angle, φ, to account for the twisting motion and the TriCα angle, θ2 and θ1. All four combined parameters were found to appropriately define the observed “twisting” motion during the flaps different conformational states. Additional analysis of the parameters indicated that the flaps can exist in an ensemble of conformations, i.e. closed, semi-open and open conformations for both systems. However, the behavior of the flap tips during simulations is different between BACE1 and BACE2. The BACE1 active site cavity is more spacious as compared to that of BACE2. The analysis of 10S loop and 113S loop showed a similar trend to that of flaps, with the BACE1 loops being more flexible and less stable than those of BACE2. We believe that the results, methods and perspectives highlighted in this report would assist researchers in the discovery of BACE inhibitors as potential Alzheimer’s disease therapies.


Introduction
Alzheimer's disease (AD) is a degenerative disease of the central nervous system, which is responsible for most cases of dementia in the elderly (1)(2)(3)(4). The pathogenesis of AD is characterized by an accumulation of the 40-or 42-residue b-amyloid (Ab) peptide. These peptides are produced by the sequential cleavage of the membrane bound amyloid precursor protein (APP) by band g-secretase. The latter cleaves APP in the transmembrane region and b-secretase (beta-amyloid precursor protein cleavage enzyme, BACE) catalyzes the rate-limiting step of the Ab generation by cleaving the Met671_Asp672 peptide bond of APP (isoform 770, identifier P050677) in the extracellular space (5)(6)(7)(8). On the basis of a number of seminal in vivo and in vitro studies, BACE1 has been proposed as a promising target for the treatment of AD (9)(10)(11). Its inhibition would halt the formation of Ab at the very beginning of APP processing; excess Ab can be harmful to cognitive function even before neuronal death by limiting synaptic function. Thus, slowing the progression of AD by inhibiting Ab formation at an early stage is essential. The true function of BACE1 is still unclear; however, it is believed to be involved in myelin sheath formation and other biological processes associated with the processing of neuregulin-1. A tremendous amount of research has been focused on finding potent, selective and bioavailable inhibitors (peptidomimetic and nonpeptidic). However, none have made it through clinical trials to FDA approval, even though many have shown clinical potential. The activity of BACE1 inhibitors is hindered by their large molecular size, the presence of numerous peptide bonds, peptidic features and poor selectivity over other aspartic proteases. The latter is very important since aspartic acid proteases (e.g. Betaamyloid precursor protein cleavage enzyme2, BACE2) are found virtually in all body tissues. An unselective BACE1 inhibitor may, also inhibit BACE2 and others, causing off target side effects ( Figure 1). BACE2, also called Asp1 and memapsin1, a member of the transmembrane aspartic proteases family, which is a homolog (64% identical) to BACE1 (Figure 2), more sequence analysis to follow (12)(13)(14)(15)(16). Contrasting to BACE1, BACE2 is expressed only marginally in the brain and is found in higher concentrations in the colon, kidney, pancreas, placenta, prostate, stomach and trachea (9,17,18). Studies have shown that BACE2 can cleave APP differently from BACE1, thus it does not produce Ab plaques. In fact, BACE2 cleaves more efficiently within the Ab region and has been suggested to degrade C99 as well. Ultimately, BACE2 has been ruled out as a player in b-secretase activity in AD. Still, selectivity of BACE1 inhibitors over BACE2 is important for the therapeutic utility of BACE1 inhibitors. The specificity of the BACE2 binding pocket is very similar to that of BACE1 with slight differences. The binding cavity of BACE adapts according to the size and shape of different inhibitors (3,(19)(20)(21)(22).
In the aspartic proteases, mechanisms of flap(s) closing and dynamics are of great significance, due to their involvement in ligand binding. Thus, defining the appropriate parameters for flap motions will assist in a better understanding of the binding landscape of novel inhibitors in the active site, as well as establish their effect on the flap motions. Flap dynamics is an important feature in understanding the ligand binding landscape, as well as the overall conformational flexibility of enzymes (23). In aspartic proteases, flap(s) covering the active site has been reported to play a major role in ligand binding due to conformational changes that regulate access of incoming substrates or ligands (24). Different conformations (closed/open) have been observed in the crystal structures of different protease.
The most commonly used metric to measure various flap(s) confirmations is based on the distance between the Ca atoms of the tip of the flap and one of the Asp residues in the catalytic dyad (19,22,24,25). This parameter alone does not adequately describe the curling or twisting motion observed in proteases, which has led to the introduction of additional parameters to more accurately describe flap motions and dynamics (22,26,27). Parameters to accurately describe flap opening and closing of BACE are not well defined in the literature, neither experimentally nor computationally. Therefore, we previously proposed additional parameters to measure flap dynamics, which we believe best describes not only flap and flexible region motions, but also the characteristic ''twisting'' motion during the opening and closing of unbound BACE. In this study, we expanded on the findings and included a homology of memapsin-2 (BACE2) utilizing these parameters and other post-molecular dynamic (MD) analyses are utilized to compare the dynamic behavior and essential flap-structure features of BACE1 and BACE2 in order to provide a comprehensive picture of the two proteases regarding similarities and differences. The key differences highlighted between these two enzymes may allow for the design of selective BACE1 or BACE2 inhibitors.

System preparation
The X-ray crystal structures of BACE1 and BACE2 were obtained from the Protein Data Bank (PDB) (28). A free unliganded structure of BACE2 was generated by manually deleting bound inhibitors from the crystal structure complex. Figure 2. Schematic representation of the flap-structure and corresponding sequences for BACE1 (blue) and BACE2 (brown), the green amino acids highlight similar sequences between the enzymes and red highlights the differences in the sequence. The missing residues were added using the graphical user interface of the molecular modeling tool Chimera (29), and MMV molecular modeling suites were used to prepare the protein structures. Table 1 shows all the simulated systems in this study, the PDB codes and corresponding abbreviations.

MD simulations
All MD simulations were performed using the GPU version of the PMEMD engine provided with the Amber 14 software package (30,31). Gaussian 09 at HF/6-31G* level was used to perform geometry optimization for the drugs and the peptide substrate (32). The ANTECHAMBER module was used to generate atomic partial charges for the drugs and peptide substrates by employing the restrained electrostatic potential and the general amber force field procedures (33). The protein system parameters were defined with the FF99SB force field in the Amber 14 suit (34). The catalytic zinc atom was assigned parameters using the nonbonded parameter method. A van der Waals radius of 1.10 Å and a formal charge of +2.0 were assigned to the zinc based on previously derived parameters (35). The LEAP module of AMBER 14 was used to add missing hydrogen and heavy atoms. The addition of sodium counter ions then neutralized the systems. The entire system was then immersed within a box of TIP3P (24) water molecules, such that any solute atoms were within 10 Å of any box edge during the MD simulations. Particle mesh Ewald method (36) was used to treat long-range electrostatic interactions with a direct space and van der Waals cutoff of 12 Å . An initial minimization of 2000 steps was carried out with an applied restraint potential of 500 kcal/mol Å 2 for both solutes, and was performed for 1000 steps using the steepest descend method followed by a 1000 steps of conjugate gradients. Additional full minimization of 1000 steps was further carried out by conjugate gradients algorithm without restrain. A gradual heating of the systems from 0 to 300 K with a 5 kcal/mol Å harmonic restrain potential and a Langevin thermostat of collision frequency of 1/ps using a canonical ensemble (NVT) molecular dynamic simulation was then carried out. All the systems were then equilibrated at 300 K in a NPT ensemble for 500 ps without restrain and the pressure of the systems were maintained at 1 bar pressure using Berendsen barostat. All hydrogen bonds were constrained using the SHAKE algorithm in a time step of 2 fs and all MD runs were done using SPFP precision model (37). A 100 ns production run was performed without any restrain on the systems in an NPT ensemble at a temperature of 300 K with target coupling constant of 2 ps and pressure at 1 bar. Trajectories from all system simulations were then saved and analyzed in every 1 ps. Post-molecular dynamic analysis such as RMSF, RMSD, PCA, Rg, SASA and distances was performed using the CPPTRAJ and PTRAJJ modules (38) of the Amber 14 suit. All plots were carried out using Origin data analysis tool (http://www.originlab.com/). Molecular modeling suite Chimera (29) was used for visualizations.

Principal component analysis
Principal component analysis (PCA) also known as essential dynamics (ED) is one of the most advanced methods for trajectories analysis in identifying prominent conformational changes through the extraction of conformational modes of the protein during MD simulation, constructing a covariance matrix of the C-a atom displacement performed by the PCA. PCA describes the eigenvectors and eigenvalues, which represent the direction of motions and the amplitudes in those directions of the protein, respectively (39). PCA was performed on C-a atoms over 1000 snapshots at a time interval of 100 ps using the CPPTRAJ module in AMBER 14 in computing the first two principal components (PCA1 and PC2), after stripping the 100 ns MD trajectories of ions and solvent. The corresponding PCA scatter plots were generated using Origin software (http://www.originlab.com/).

Sequence analysis and experimentally defined parameters
The full sequence analysis of BACE1 and BACE2 has previously been reported (20,40). The present study provides the first in-depth analysis of the regions that could impact flap dynamics and motions of BACE, such as the 10S loop, 113S loop and the flexible loop opposite the flap including the hinge residue. From the sequence analysis it was observed that BACE2 occupies a lysine residue (Lys86) in the flap region that is occupied by a proline residue (Pro70) in BACE1 (Figures 2 and S1). Further analysis indicated in the 10S loop within the S3 pocket, arginine (Arg28) in BACE2 was replaced by a glutamine residue (Gln12) in BACE1 (Figure 2 and S1). The S2 pocket of BACE2 is composed of residues such as Gln89 in the flap, Arg248, Leu246 and Ser337. Leu246 of BACE2 is substituted with an asparagine residue (Asn233) in BACE1, making the S2 pocket of BACE2 more hydrophobic compared to that of BACE1 (Figure 2 and SI). The flexible loop opposite the flap consists of different residues in the two enzymes, with Thr340, Asn341, Ala342 and Leu343 in BACE2 versus those of Ser328, Thr329, Gly330 and Thr331 in BACE1, respectively (Figure 2 and SI). The differences observed in S2, between BACE2 and BACE1, are restricted to Leu142 and Lys144 in BACE2, which are replaced in BACE1 by Ile126 and Arg128, respectively. The changes observed in the active site sequence would indirectly or directly affect the motion of the flap-structure and the binding landscape of the substrate or inhibitor.

System stability
Prior to post-MD analysis, the root mean square deviation (RMSD) of the C-a backbone and potential energy fluctuations (PE) of the structures was monitored throughout the 100 ns simulations to ensure stability within the systems. The RMSD of both systems was within 1.0-2.3 Å , confirming stable trajectories ( Figure 3). Furthermore, throughout the entire simulation, there is negligible variation in the potential energy for all the systems (S2). This leads to the conclusion that the simulations produced stable trajectories, thus providing a suitable basis for further analyses.

Root mean square fluctuation
In order to analyze the detailed residual atomic fluctuations, the root mean square fluctuation (RMSF) of the Ca atoms has been analyzed for BACE1 and BACE2 ( Figure 4). The reason, we align both system coordinates, is to allow us not only to see the fluctuations in the flap region, but also to observe if there are any fluctuations in any other parts of the protein as a result of flap movement. It is well known that flap motion in HIV-PR has an impact on the overall flexibility of the protein (23).
Overall, both systems show a high conformational flexibility with a similar trend in fluctuations ( Figure 4). The flexibility of the flap region is crucial to the activity of the protease. Analysis of the fluctuations revealed that the greatest degree of flexibility is contained in the flap region for both enzymes. BACE1 fluctuated more in the flap region compared to BACE2 (0.98 Å ±0.202 versus 0.93 Å ± 0.208), confirming that the substitution of Pro70 with Lys86 caused the flap to be slightly less flexible. It is interesting to note that there are several other short loopy segments that frame the active site cavity of BACE1 and BACE2 that also exhibit high fluctuation during the simulation, as evident from the RMSF profile. The C-terminal residues and the regions around it (4250) exhibited the major fluctuations for BACE2 as compared to BACE1. These regions are reported to facilitate the entry and binding of a substrate/inhibitor in the active site through their movements (41). Even though this is not within the scope of our study, their fluctuations can indirectly lead to significant conformational changes in the active site cavity and affect the dynamics of the flap. We have also examined several other regions, the RMSF values for the flexible region, 10S loop and 113S loop showed a similar trend to that of flaps with BACE1 being more flexible and less stable than that of BACE2 ( Figure 4). The B-factor images further verified the RMSF findings when the enzymes were compared.

Radius of gyration
The radius of gyration (Rg) is a parameter linked to the tertiary structural volume of a protein and has been calculated to obtain insight into the stability of molecules in a biological system along the MD simulation. Overall stable Rg profile observed for both BACE1 and BACE2 simulations indicates no structural instability or uncoiling.
Evidently in Figure 5 that both enzymes show much variation during the simulation time, which indicates that the proteins are much flexible. Our results clearly match the experimental evidence that the protein conformational changes are marginal for both systems. In addition, previous reports suggested that during MD simulations radical conformational changes occur upon ligand binding. Still, the Rg for the BACE2 is significantly higher in comparison to that of BACE1 throughout the simulation period. This indicates that BACE1 exhibits an overall more stable conformation than BACE2. However, when comparing the stability of the flap, 113S loop, 10S loop and opposite flexible (S4) it was observed that those of BACE1 were less compact that of BACE 2. This is in agreement with the RMSF and the B-factor studies.

Conformational motions revealed by PCA
The application of PCA or ED provides insight into the nature of conformational changes in protein structures by identifying large as well as concerted conformational motions and underlying fluctuations (42). The major constituents of correlated motions of the protein structures were obtained from the covariance matrix diagonalization for the C-a atoms. In order to gain insight into the significance of such conformational motions, the conformational behavior of BACE1 and BACE2 systems was projected along the first two principal components or eigenvectors. Figure 6 is a representative of the ED emanating from the BACE1 and BACE2 during the entire 100 ns MD simulation. PC1 and PC2 of the X-and Y-axis, respectively, are constructs of a covariance matrix after the removal of rotational motions, commonly termed eigenvectors.
Each eigenvector is paradigmatic of a single direction of motion of the a-carbon backbone, in a multidimensional space with a corresponding eigenvalue of quintessential amplitude. Collation of systems BACE1 and BACE2 ( Figure 6) associates system BACE1 emerging as a structure of restricted motion in relation to its a-carbon backbone. Each point in the essential sub-space defines a unique conformation of the simulated system such that structurally similar conformations overlap in the essential sub-space. The displacements, as compared to that of BACE2, are reduced, thus the spatial occupancy of BACE1 is minimized at any one particular moment, the implication advocates BACE2 having greater flexibility in accordance with its framework as opposed to BACE1.    Comparative molecular dynamics study on BACE1 and BACE2 509

Flap conformations observed during the trajectories
With respect to protease dynamics, the flap(s) represent functionally important segments that allow substrate access to the active site and interact with the ligand (19)(20)(21)(22)(23)(24)(25)27,41). In apo aspartic proteases, the flaps are rather flexible and have been shown to move from closed to open conformations, and vice versa. When a ligand is bound, the flaps of the bound protease close to allow for substrate cleavage. Therefore, the dynamics of the flap(s) play a critical role in the proteases ability to bind and process their substrates or inhibitors.
In our study, we analyzed the dynamics of the protease flaps by measuring the flap-Asp distance over the course of the 100 ns simulation. This was to confirm that the results obtained in this study were in agreement with previous findings before we introduce the parameters. Therefore, we calculated the interatomic distances between the Ca atom of the catalytic aspartate D32 in BACE1 and D48 in BACE2 and the Ca atom of the flap tip residue (Thr72 in BACE1 and Thr88 in BACE2) proteases, this distance has been considered to be the key parameter in the substrate recognition (41). It is worth mentioning that we have previously reported that 100 ns is a long enough simulation time to visit all conformations consequently it was used in the study. To facilitate further discussion, we defined that distances lower than 9.00 Å correspond to the closed flap confirmation, while distances higher than 13.00 Å indicate that the flap is open (20,21,41,43). To characterize the distances between these two values, we classified these structures  (20). Flap distances between 6 and 8 Å for both systems were detected throughout most of the simulation time, indicating that the systems adopted more closed conformations. This shows that the flaps of both systems sporadically adopt a closed conformation during the simulation. For BACE2, distances fluctuate around mean values of 11.15 ± 0.48 Å (distance values in the starting structure is 12.3 Å ). The mean distance is slightly higher in BACE1, 11.4 ± 0.9 Å (distances in the starting structure: 9.1 Å ). The standard deviation of flap distances for BACE1 is higher than that of BACE2, indicating that the flap of the latter enzyme is less flexible (in accordance with the RMSF values). The larger active siteflap distance for BACE1 makes the active site cavity more spacious compared to that of BACE2.
Snapshots (visual analysis) of the MD trajectories of both systems were analyzed throughout the 100 ns simulation (Figure 8). From visual analysis of the snapshots, it can be seen that BACE1 and BACE 2 adopt an open, semi-open and closed conformation throughout the simulation. It was also observed that the flap and flexible loop region of both enzymes recoils inwards toward the active site as the enzymes closes, this is consistent with observations reported in the literature (15,20). In addition, the conformation of the 113S loop of BACE2 adopted a more closed conformation (near residue Ser113) when compared to BACE1 (S5). For BACE1 the 10S loop was in an up (open) conformation and in a ''down'' (closed: near residue Ser10) for BACE2. This implies that the BACE1 active site could be slightly more spacious than that of BACE2.
Previous experimental and computational studies have outlined open/close conformations of the flap for BACE. However, they only used one parameter to define this conformational change, which is measuring the distances between the tip of the flap and the one of the aspartic acids in the catalytic dyad. The snapshots of the MD trajectories of BACE1 and BACE2 throughout the 100 ns simulation (Figure 8) were compared with crystals structures of known flap confirmations to validate that our studies are in agreement with previously reported studies. After confirming that our results are consistent with the available experimental and computational data, we took one step further and went on to introduce parameters that can be used to more accurately define the open/closure of the system as one parameter is insufficient to describe the complex motion of the flap.
These parameters are the distance between the flap tip and the hinge residue of the flexible loop d 1 , in relation to the dihedral angle , as well as the TriCa angles, 1 and 2 ( Figure 9) . Herein, these parameters, in addition to other post-MD analyses, were used to compare the active site motions and dynamics between BACE1 and BACE2. Flap distances of 6-9 Å are detected over the simulation time, indicating that a more closed conformation is temporarily adopted in both enzymes. The mean distances for BACE1 were slightly higher 12.25 ± 1.72 Å , compared to BACE2 11.94 ± 0.951 Å , these results are consistent with the flap-Asp distance findings  According to literature, the curling of the tips of the active site flaps triggers the opening/closing of the flap(s) (21,23,25,41). To investigate this flap behavior, the variability in dihedral angles (i.e. the angle between four adjacent carbons) was measured ( Figure 10B). It is possible to use other methods to measure the curling and twisting behavior of the flap tips, any curling of the flap tips that do occur should involve changes in the dihedral angles of the residues at or near the tip. Herein, the dihedral angle () was calculated in the post-MD analysis and gives insight into the twisting motion of the Ca atoms between the flap tip, active site and hinge residue in the flexible loop region. Overall, BACE1 displays a slight twisting motion, which is reflected by a small change in d 1 . A slight flap twisting (lateral twisting) was observed for BACE1 at 18-20 and 60-70 ns this coincides with the opening of flap, as reflected by an increase in d 1 .
Conversely, BACE2 showed less twisting in the flapstructure; the twisting observed is more gradual and less erratic ( Figure 10B). The mean value of BACE2 is smaller than that of BACE1, which implies more curling in of the flap tips in BACE1 compared to BACE2. In both cases, only small changes were observed, which confirmed that parameterizing flap twisting played a lesser role in opening and closing the flap.
Calculation of TriC angles, y 1 and y 2 In addition to calculating the distance and the dihedral angle fluctuations in the flap-structure, TriCa angles 1 and 2 were also calculated. Previously we reported the use of TriCa angles 1 (Tip-Asp1-flexible loop) and 2 (Tip-Asp2-flexible loop) to better explain the flap dynamics of complexed BACE1 and in this study we expanded on the findings and included a homology of memapsin-2 (BACE2) (44).
In both systems investigated in the present study, 1  The evolution of the flap confirmation from closed to open is a complex phenomenal, therefore the change of the flap conformation is highlighted in Figure 12. It is interesting to note that from the graph, BACE1 flap is more flexible than that of BACE2 which is in agreement with the RMSF, Rg and PCA. In summary, as the flap-structure moves toward a more open conformation (increase in d1) the gap between the tip of the flap region increases, both 1 and 2 increase. As the flapstructure moves toward a more closed conformation (decrease in d 1 ), 1 and 2 decrease accordingly. Both systems display a twisting motion (change in : from negative to positive, and/or positive to negative), which is reflected by a change in d, 1 and 2 .

Conclusion
In the present study, we comparatively show the differences in flap motion and dynamics between BACE1 and BACE2. We also show how the flap opening and closing of BACE1 and 2 can be more accurately described by combining parameters (d 1 ), the dihedral angle () and TriCa angle ( 1 and 2 ). These four parameters more accurately account for the twisting motions responsible for opening the binding cavity (increase in d 1 , as flap moves away), and closing of the active site (decrease in d 1 , as flap moves toward active site) of which reported parameters do not do. This provides information about protein dynamics and more importantly for structurebased drug design, documents the conformational space available to the target. Analysis of 2 , d 1 , 1 and confirmed that the BACE1 and BACE2 predominantly adopted semiopen conformations, with occasionally adopting open and closed conformations, with slight twisting during flap opening. Examination of the fluctuations revealed that of all the BACE systems investigated in the present study, BACE2 display the highest conformational flexibility and is a more dynamic structure versus BACE1. However, the flap, flexible region, 10S loop and 113S loop of BACE1 is less stable than that of BACE2. It was noted that the conformation of 10S loop in BACE2 is distinct from that of the BACE1 conformation. In addition, the 113S loop was in an up (open) BACE1 conformation and ''down'' (closed) in BACE2. This conformational difference between BACE2 and BACE1 is the closest to the active site and offers an opportunity to design BACE2/BACE1 selectivity into low-molecular-weight inhibitors. The strategy applied in this work is applicable not only to aspartyl proteinases, but also to flexible proteins in general. Thereby improving our understanding of the effect of the flap motion in ligand uptake and binding. However, additional studies of BACE1 and BACE2 that also take into account the dynamics of the whole active site of dynamics should therefore provide a promising basis for future drug development.