Evidence for a high pK a of an aspartic acid residue in the active site of CALB by a fully atomistic multiscale approach

Abstract Candida antarctica Lipase B (CALB) is a paradigm for the family of lipases. At pH 7, the optimal pH for catalysis, the protonation state of an aspartic acid of the active site (Asp134) could not be conclusively assigned. In fact, the pK a estimate provided by a widely used computational tool, namely PropKa, that predicts pKa values of ionizable groups in proteins based on the crystallographic structure, is only slightly above 7 (pKa = 7.25). This, along with the lack of an experimental evaluation, makes the assignment of its protonation state at neutral pH challenging. Here, we calculate the pK a of Asp134 by means of a fully atomistic multiscale computational approach based on classical molecular dynamics (MD) simulation and the perturbed matrix method (PMM), namely the MD-PMM approach. MD-PMM is able to take into account the dynamics of the system and, at the same time, to treat the deprotonation step at the quantum level. The calculations provide a pK a value of 8.9 ± 1.1, hence suggesting that Asp134 in CALB should be protonated at neutral, and even at slightly basic, pH. Communicated by Ramaswamy H. Sarma

exposure and the local protein environment. Knowledge of the protonation state of these residues is of great importance to gain molecular level insight into many relevant functional processes. For example, protonation/deprotonation of a specific glutamic acid/glutamate was suggested to be involved in the photocycle branching in channelrhodopsin (Kuhne et al., 2019). Another notable example is the catalytic Cys-His dyad in SARS-CoV-2 main protease that was shown to be neutral in the enzyme resting state, differently from many other cysteine proteases (Pavlova et al., 2021), guiding the search of inhibitors. The proper assignment of protonation states is also crucial in molecular dynamics simulations, in which wrong protonation states can lead to unreliable results. The experimental determination of pK a values provided so far evidences for many non-obvious protonation states inside proteins (Grimsley et al., 2009;Pahari et al., 2019), yet the data set is far to be exhaustive.
A number of computational tools have been developed to estimate pK a s inside proteins, the most widely used being purely empirical methods (Cvitkovic et al., 2019;Milletti et al., 2009;Olsson et al., 2011;Tan et al., 2013). In these approaches a single structure (usually the X-ray crystallographic structure) is typically used to estimate the pK a values. The most widely used empirical tool for pK a prediction is the PropKa web-server , that is trained on a large set of experimental pK a values to empirically relate desolvation effects and intra-protein interactions to the positions and chemical nature of the groups near the ionizable site. The greatest advantage of these tools is that a good agreement with the experimental results can be obtained at a low computational cost. However, the prediction of the pK a of residues embedded in protein environments relevantly differing from the ones used in the training set is still challenging . Additionally, conformational changes, which have been shown to relevantly affect the accuracy of pK a calculations (Witham et al., 2011), are not taken into account.
An example of a critical assignment of the protonation state of a protein side chain is the aspartic acid residue 134 (Asp134) in Candida antarctica lipase B (CALB), a globular protein with the typical a/b hydrolase fold (Uppenberg et al., 1994). This enzyme is able to catalyze a large number of diverse reactions (e.g. hydrolysis, transesterification, polymer production and degradation). In addition, it shows broad substrate specificity and is active under unnatural conditions. Therefore, CALB is suitable for a wide range of industrial applications (Anderson et al., 1998;Galm es et al., 2020;Ortiz et al., 2019).
Asp134 is located in the acyl-binding pocket of the active site that hosts the catalytic triad. It has been shown by means of molecular dynamics simulations that, if protonated, Asp134 is involved in a hydrogen bond (HB) network with Thr40 and Gln157 (Wedberg et al., 2012) which is essential for the solvent accessibility of the active site. Moreover, observation of the crystallographic structure suggests that Asp134 is important also for substrate binding (Nian et al., 2020). An experimental value of the pKa of Asp134 is not available. The estimate provided by PropKa is 7.25, a value very close to the neutral pH which makes the assignment of the protonation state at pH ¼ 7 ambiguous. The assignment is even less straightforward if a phosphate buffer is used in the experiments, with which the pH is commonly buffered at values slightly above the neutral pH, such as in (Dutta Banik et al., 2016;Zhang et al., 2017) (pH ¼ 7.2) and in (Fan et al., 2011;Mangiagalli et al., 2020) (pH ¼ 7.5).
Several approaches based on hybrid quantum/classical methodologies have been proposed for the computation of deprotonation free energies and pK a values (Casasnovas et al., 2014;Ivanov et al., 2006;Jensen et al., 2005;Kamerlin et al., 2009;Park et al., 2006). Herein we use an approach based on molecular dynamics (MD) simulations for the computation of the atomistic interactions on a large set of configurations and on the perturbed matrix method, PMM (Amadei et al., 2007(Amadei et al., , 2010Zanetti-Polzi et al., 2016, 2018. This approach allows the treatment of the deprotonation process at the quantum level and the inclusion of the dynamical perturbation provided by the environment. The MD-PMM has been already successfully applied for the calculation of reduction/oxidation potentials (Daidone et al., 2014;Zanetti-Polzi et al., 2015) and for the investigation of electron transfer processes Zanetti-Polzi et al., 2017). The pK a value of Asp134 within the protein is calculated and compared to that evaluated for the single amino acid in water, for which the experimental value is available.

The MD-PMM approach
The MD-PMM, that is used in the present work to compute the deprotonation free energy for cap-Asp and for Asp134 in CALB, has been described in previous works (Zanetti-Polzi et al., 2016. In our previous work (Zanetti-Polzi et al., 2020), we addressed the computation of deprotonation free energies and pK a values for three different chemical groups (corresponding to three different aminoacidic side chains, i.e. aspartic/glutamic acid, lysine and cysteine). For aspartic/glutamic acid we also considered different protein environments, for a total of 9 systems. For all these systems, we benchmarked our computed pK a values against the corresponding experimental estimates. In the present work, we use the same approach to predict the pK a value for Asp134 in CALB, for which an experimental measurement is not available.
Below, we provide a brief overview of the basic concepts of the MD-PMM approach and of its application to compute pK a values.
The MD-PMM is a computational approach based on the use of classical MD simulation, quantum chemical calculations and a perturbative methodology, namely the perturbed matrix method (PMM) (Amadei et al., 2007(Amadei et al., , 2010Zanetti-Polzi et al., 2018. In the PMM the entire system is partitioned in two groups: the quantum center (QC), that is the region of the system in which the quantum processes of interest take place and that is treated at the quantum level, and the remaining part of the system (that we call the environment), that exerts a classical electrostatic perturbation on the QC electronic properties. The electrostatic contribution of the environment is calculated using the atomistic configurations of the system as generated by classical MD simulations.
The quantum properties of the isolated QC (unperturbed properties) are obtained in the gas phase using quantum mechanical calculations (see section 2.2) providing the unperturbed electronic Hamiltonian b H 0 : The electrostatic effect of the environment is included 'a posteriori' as a perturbation operator b V for each configuration (i.e. frame) sampled along the classical MD simulations. Therefore, the Hamiltonian operator b H of the QC embedded in the perturbing environment is: The perturbation operator b V can be expressed by using a multipolar expansion either centered in the QC center of mass or centered around each atom of the QC. In the first case, the QC-based expansion, the electric field provided by the environment atomic charges is used to calculate In Eq. 2, j runs over all QC nuclei and electrons, q j is the charge of the jth particle and r j the corresponding coordinates, V is the electrostatic potential exerted by the environment, E is the corresponding electric field (E ¼ ÀoV=or) and r 0 is the center of mass of the QC.
In the other case, i.e. the atom-based expansion, the perturbation operator b V is expanded around each atomic center R N (i.e., the position of the nucleus of the corresponding Nth atom of the QC): In Eq. 3, j runs over all QC nuclei and electrons, N runs over all QC atoms and X N is a step function being zero outside and unity inside the Nth atomic region. In the present work, the atom-based expansion is utilized for the diagonal elements of the Hamiltonian matrix, while the off-diagonal elements are calculated using the QC-based operator expansion up to the dipolar term (Eq. 2).
Each frame of the MD simulation provides a different perturbation operator b V : Therefore, a perturbed electronic Hamiltonian matrix b H is constructed and diagonalized at each frame of the MD simulation, providing a trajectory of perturbed eigenvalues and eigenvectors that are used to compute the QC instantaneous perturbed quantum observable of interest. The opportunity of calculating perturbed quantum properties using a very large number of configurations, as obtained from extended classical MD simulations of the system, is one of the main advantages of the MD-PMM, along with the fact that the computationally time-consuming quantum mechanical calculations on the QC have to be performed only in the gas phase and only for a limited number of configurations of the QC (or even only one configuration for rigid QCs, as in the present case). In addition, the MD-PMM is a very general and versatile approach, that can be easily used to compute different observable. For example, it has been applied to compute infrared spectra (Zanetti-Polzi et al., 2012 and to model electron transfer thermodynamics and kinetics (Daidone et al., 2014;Paltrinieri et al., 2017;Zanetti-Polzi et al., 2017). More recently, it has been applied to compute the pK a of titratable residues, also inside proteins (Zanetti-Polzi et al., 2020).
In order to calculate pK a values with the MD-PMM, the titratable group of interest is chosen as the QC and gasphase quantum mechanical calculations are performed in both the protonated and deprotonated forms of the QC. In the present case, acetic acid/acetate are used as the acid/ base QC. With the MD-PMM the perturbed ground state energy is then computed in the protonated and deprotonated states of the QC and the corresponding perturbed ground state energy change upon deprotonation (DU e ¼ U dep e ÀU prot e ) is calculated. This calculation is done twice: once using the system configurations obtained by the MD simulation with the side chain in the protonated state and the other time using the system configurations obtained by the MD simulation with the side chain in the deprotonated state. For the capped aspartic acid in solution and Asp134 in CALB, the side chain group is chosen as the QC and is perturbed by the solvent and the rest of the residue for cap-Asp and by the solvent and the rest of the protein for Asp134. The Gibbs free energy change DG upon deprotonation is then calculated as: In the above equation, the angle brackets subscripts dep/ prot indicate that both the energy change and the averaging are obtained in the deprotonated/protonated MD ensemble. The free energy change is calculated for both cap-Asp and Asp134 in CALB. The pK a for Asp134 in CALB is evaluated from the difference between the deprotonation free energy of Asp134, DG Asp134 , and the one of cap-Asp, DG ref , and using as reference the experimental estimate of the pK a value of cap-Asp (pK ref a , i.e. 4.0 (Lide, 1991)): As previously reported (Zanetti-Polzi et al., 2020), the computed absolute pK a value for cap-Asp is estimated making use of the calculated deprotonation free energy for cap-Asp and of the experimental estimate of the solvation free energy of the proton, the experimental estimate of the gas phase standard reaction free energy (DG gas AH!AþH þ ) and the calculated QC unperturbed (gas phase) electronic ground state energy difference upon deprotonation (see section 2.2). The error associated to the absolute pK a value for cap-Asp we report in Table 1 includes the noise of these experimental estimates and is therefore relatively large (i.e., 1.8 pK a units). In order to reduce the error associated to the computed pK a of Asp134 in CALB, we evaluate the latter from the comparison with the computed pK a for cap-Asp, as reported in Eq. 5. This estimate for Asp134 pK a is in fact only affected by the error on the deprotonation free energy (i.e., 6 kJ/mol % 1.1 pK a units).

Quantum mechanical calculations
Quantum chemical calculations are performed on the gasphase optimized geometry of acetic acid/acetate to obtain the unperturbed electronic properties to be used in the MD-PMM approach. The calculations are performed at the Density Functional Theory (DFT) level (Parr & Yang, 1995) with the 6-31 þ G(d) basis set (Krishnan et al., 1980) in conjunction with the B3LYP functional (Becke, 1993). Timedependent DFT (Adamo & Jacquemin, 2013) (TD-DFT) is used to compute the properties of the excited states. For each protonation state, a 6 Â 6-dimensional Hamiltonian matrix is evaluated and diagonalized at each MD simulation step, according to the MD-PMM procedure (see section 2.1). Charges are computed at the QM level with the electrostatic potential method (ESP charges) (Besler et al., 1990).

Molecular dynamics simulations
To evaluate deprotonation free energies, MD simulation of the system of interest needs to be performed with both the deprotonated and protonated target titratable group. Hence, MD simulations of cap-Asp with either protonated or deprotonated side chain and MD simulations of CALB with either protonated or deprotonated Asp134 side chain are performed. The protonation state of the other titratable side chains in CALB was chosen according to the estimated pK a at pH ¼ 7 as provided by PropKa . Counterions are added in order to neutralize the total charge of the system. All MD simulations are performed using the GROMACS package (Van Der Spoel et al., 2005) and the CHARMM36 force field (Huang et al., 2017) in the NPT (constant temperature, pressure and number of molecules) ensemble at 300 K using the velocity rescaling temperature coupling (Bussi et al., 2007). The starting configurations of cap-Asp and CALB (PDB ID 1TCA (Uppenberg et al., 1994)) are solvated in a dodecahedral TIP3P (Jorgensen et al., 1983) water box large enough to ensure a minimum distance between the solute atoms and the box edges of %1 nm. Periodic boundary conditions (PBC) are employed and the long-range electrostatic interactions are treated with the particle mesh Ewald method (Darden et al., 1993). The LINCS algorithm is used to constrain bond lengths (Hess et al., 1997). After a solute optimization and a subsequent solvent relaxation, each system is gradually heated from 50 K to 300 K using short MD simulations. Four 100 ns productive runs are then performed for CALB: two starting from the crystal structure with Asp134 in either its protonated or deprotonated form, one with Asp134 in its protonated form starting from the final structure of the previous MD with Asp134 deprotonated and one with Asp134 in its deprotonated form starting from the final structure of the previous MD with Asp134 protonated. For cap-Asp, the MD trajectory is propagated for 20 ns. For all trajectories, coordinates are saved every 1 ps. For the calculation of the pK a , the first 20 ns in CALB and first 5 ns in cap-Asp are neglected.
The hydrogen bonds are defined using a geometrical criterion based on the donor-HÁ Á Á acceptor distance and angle (donor-acceptor distance 0.35 nm and H-donorÁ Á Á acceptor angle 30 degree).

Results
To compute the deprotonation free energy of the side chain of Asp134 we performed four molecular dynamics simulations of CALB in water: two in which Asp134 was protonated and the other two in which it was deprotonated. The MD-PMM approach requires in fact the system to be simulated with the ionizable group in both the protonated and the deprotonated state in order to obtain a more reliable estimate of the pK a value (see the Methods section, Eq. 4). Two MD simulations were started from the crystal structure (one with protonated and one with deprotonated Asp134). The other two were started from the final structure of the first two simulation changing the protonation state of Asp134 (see section 2.3). This strategy is adopted to favor the sampling of diverse phase space regions, with possible fallout on deprotonation free energy calculations. In both simulations with protonated Asp134, the structure of the catalytic triad (Ser105, Asp187 and His224) is essentially maintained, even though we observe that the interaction between His224 and Ser105 can be either direct or mediated by a water molecule (see Figures 1A and S1 in the Supporting Information, SI). The extended HB network in the polar region surrounding the catalytic serine (involving the side chains of Thr40, Gln106, Gln157 and Asp134, see Figures 1A and S2 in the SI) is also well conserved. On the contrary, the analysis of the structural properties of the active site provides different results between the two MD simulations in which Asp134 is deprotonated. In the first one (see Figures 1B and S1 in the SI) the structure of the catalytic triad is not preserved: the interaction between the catalytic serine and the catalytic histidine is frequently completely lost, with the two side chains reaching very large distances. In addition, the HB between His224 and Asp187 is not maintained. In the second MD simulation, the catalytic triad undergoes only minor rearrangements with respect to the simulations in which Asp134 is protonated (see Figure S1 in the SI). Yet, in both simulation in which Asp134 is deprotonated, the above mentioned HB network is not maintained. As a matter of fact, rotation of the side chain of Gln157 is necessary to accommodate the deprotonated side chain of Asp134, and this rotation is not compatible with the HB network observed in the crystal structure (see Figure S2 in the SI and Figure 1B and C). Disruption of this HB network leads to a partial destabilization of the oxyanion hole (Thr40 and Gln106, see Figure S2). All together, these results suggest that deprotonated Asp134 leads to conformations that are not optimal for catalysis. Table 1. Calculated deprotonation free energies (DG) in kJ/mol and pKa values for the investigated systems. The standard errors reported are obtained using three subtrajectories for each MD simulation. For the calculated pKa values in cap-Asp the standard error includes the noise of the experimental measures as estimated in Ref. (Zanetti-Polzi et al., 2020). The experimental pKa value (Lide, D. R., 1991) is also reported for cap-Asp.

Asp134
cap-Asp DG 1183 ± 6 1155 ± 6 exp. pK a -4.0 calc. pK a 8.9 ± 1.1 3.9 ± 1.8 We then turn to the calculation of the pK a of Asp134 with the MD-PMM approach. The free energy change upon deprotonation DG and the corresponding pK a value (see Eq. 5) are reported in Tab. 1. Before commenting on the obtained results, we compare the pK a of Asp134 with that of a single aspartic acid residue in water in order to benchmark our procedure against an experimental result. In particular, a capped aspartic acid (cap-Asp) is used as model compound for the single residue (see Figure 2). The experimental pK a value of cap-Asp, reported in Tab.1, is 4.0. To calculate the pK a of cap-Asp, two MD simulations of cap-Asp in water were performed, one with protonated and the other with deprotonated side chain, and the deprotonation DG calculated through the MD-PMM approach. The corresponding pK a value is reported in Tab.1.
The calculated pK a of cap-Asp (pK a ¼3.9) is in very good agreement with the experimental value (pK a ¼4.0), providing support to the accuracy of the methodology employed. The calculated pK a of Asp134 in CALB is 8.9 ± 1.1, 5 pK a units above the value in water. This value is higher than the neutral pH, thus indicating that Asp134 should be protonated at pH ¼ 7. The order of magnitude of our estimate is consistent with that provided by PropKa (pK a ¼ 7.25), although slightly higher indicating that Asp134 should be protonated even at slightly basic pH. In Figure S3 in the SI we report the deprotonation free energies obtained from each of the four MD simulations, together with their average, at increasing time lengths. This Figure shows that the simulation time is sufficient to provide convergence for free energy calculations in CALB. It can be also observed from the Figure, that when Asp134 is protonated the two simulations converge to the same deprotonation free energy. On the contrary, when Asp134 is deprotonated there is a slight difference between the deprotonation free energy obtained from the two MDs. More specifically, the simulation in which the active site structure is lost provides a slightly lower deprotonation free energy (by 10 kJ/mol corresponding to % 1.5 pK a units). This result shows that the conformations in which deprotonated Asp134 is more favored are not compatible with the maintenance of the crystal structure, further supporting the above conclusion that deprotonated Asp134 leads to conformations that are not optimal for catalysis.
As previously shown (Zanetti-Polzi et al., 2016), the most important contribution to the fluctuations of DU e around the unperturbed ground state energy difference along the trajectory is given by the electrostatic potential felt by the QC. To gain insights into the molecular mechanism that determines the high pK a of Asp134, the mean electrostatic potential exerted on the QC (i.e. on the side chain of Asp134) by each residue of the protein has been analyzed. Such an analysis accounts indeed for site-specific contributions to the energy change upon deprotonation thus identifying the residues that most contribute to determine the pK a value. It has to be pointed out that Asp134 in CALB is embedded in the protein matrix and thus desolvation is expected to play a relevant role in determining the increased pK a of Asp134 with respect to cap-Asp. As a matter of fact, we observe that the contribution of the solvent in stabilizing the deprotonated form is reduced by %60 kJ/mol in Asp134 in CALB with respect to cap-Asp. Nonetheless, the protein environment also exerts relevant electrostatic effects that can either stabilize or The catalytic triad (Ser105-His224-Asp187) and the residues belonging to the acyl-binding pocket (Thr40-Gln106-Ser153-Gln157-Asp134), which contains the oxyanion hole (Thr40-Gln106), are highlighted in licorice. In panels B and C the change in the HB network in the polar region surrounding the catalytic serine can be observed. In panels A and C the direct and water mediated interaction between Ser105 and His224 can be observed. destabilize the protonated form and the balance between the protein and solvent effects determines the final pK a of Asp134. In what follows, we focus on the protein contribution, inspecting the residues able to tune the deprotonation free energy change of Asp134.
The electrostatic potential V exerted by each protein residue on the side chain of Asp134 center of mass is reported in Figure 3. The reported values are obtained by averaging the mean electrostatic potential in both the protonated and deprotonated ensembles. The residues with a negative V tend to destabilize the deprotonated state and thus to increase the pK a . On the contrary, the residues with a positive V tend to stabilize the deprotonated state and thus to decrease the pK a .
The most negative and positive contributions, highlighted in the figure with blue and red dots, respectively, are those arising from negatively and positively charged residues. Negatively charged Asp and Glu residues clearly contribute to the stabilization of the protonated form (and hence to an increase of the corresponding pK a ). Besides these expected contributions, our analysis shows that there are less trivial negative contributions of non-charged residues. The contributions with an absolute value > 12 kJ/mol are reported as blue dots in Figure 4A. Such a threshold value is chosen as it is twice the error on the pK a shift (see Table 1), ensuring that the variations taken into account are actually relevant. In particular, three residues have the highest contribution, namely Gln106, Ala132 and Gln191. In Figure 4B these residues are highlighted along with Asp134. Gln106 belongs to the oxyanion hole, as already mentioned above, and is involved with Asp134 in forming a hydrogen bond network. Gln191 is located on the opposite side with its backbone being closer in space to the Asp134 side chain. Concerning Ala132, given that its side chain is apolar (hence exerting a very low, if not null, contribution), the main contribution to V arises from the backbone. While it is widely recognized that charged and polar side chains are able to modulate the electrostatic landscape in their neighborhoods, a similar possible role of the backbone is seldom considered. Yet, the present results highlighting the contribution of Ala132 provide evidence that also the polar groups of the more rigid and structurally important backbone can be important in defining the pK a values of titratable groups.  Contributions with an absolute value > 12 kJ/mol are reported as blue dots. Such a threshold value is chosen as it is twice the error on the pK a shift (see Table 1), ensuring that the variations taken into account are relevant. Panel B: representative configuration of the active site with residues Gln106, Ala132 and Gln191 highlighted along with the side chain of Asp134.

Conclusions
In the present work a multiscale computational procedure is used for the calculation of deprotonation free energies, and corresponding pK a values, of the aspartic acid side chain both in water and within the Candida antarctica lipase B, CALB. The approach used is based on a hybrid quantum/classical method, the MD-PMM, in which the titratable group is treated at the quantum level and the whole system configurational sampling is obtained from classical MD simulations in both the protonated and the deprotonated ensemble for the target ionizable group.
We focus here on an aspartate of the active site of CALB, namely Asp134, for which the experimental pK a is not available. According to our calculations, that are shown to be able to match very well the pK a of a single aspartate residue in water (3.9 versus the experimental 4.0), the pK a of Asp134 is 8.9 ± 1.1. The high pK a value found in the present work is consistent with other catalytic systems containing Asp/Glu residues with very high pK a values experimentally determined. Notable examples are Glu325 in LacY (pK a ¼10.5) (Grytsyk et al., 2017) and Asp309 in Human Aromatase (pK a ¼10.5) (Nardo et al., 2015).
A pK a of $9 indicates that the side chain of Asp134 is protonated at neutral pH, the common pH for catalysis. By means of MD simulations we also show that protonation of Asp134 plays a crucial structural role in maintaining intact the catalytic triad and the acyl-binding pocket of the active site of CALB.

Disclosure statement
No potential conflict of interest was reported by the authors.