Molecular dynamics simulations reveal the impact of NUDT15 R139C and R139H variants in structural conformation and dynamics

Abstract NUDT15, also known as MTH2, is a member of the NUDIX protein family that catalyzes the hydrolysis of nucleotides and deoxynucleotides, as well as thioguanine analogues. NUDT15 has been reported as a DNA sanitizer in humans, and more recent studies have shown that some genetic variants are related to a poor prognosis in neoplastic and immunologic diseases treated with thioguanine drugs. Despite this, the role of NUDT15 in physiology and molecular biology is quite unclear, as is the mechanism of action of this enzyme. The existence of clinically relevant variants has prompted the study of these enzymes, whose capacity to bind and hydrolyze thioguanine nucleotides is still poorly understood. By using a combination of biomolecular modeling techniques and molecular dynamics, we have studied the monomeric wild type NUDT15 as well as two important variants, R139C and R139H. Our findings reveal not only how nucleotide binding stabilizes the enzyme but also how two loops are responsible for keeping the enzyme in a packed, close conformation. Mutations in α2 helix affect a network of hydrophobic and π-interactions that enclose the active site. This knowledge contributes to the understanding of NUDT15 structural dynamics and will be valuable for the design of new chemical probes and drugs targeting this protein. Communicated by Ramaswamy H. Sarma


Introduction
In living organisms, nucleic acids are macromolecules whose fundamental building blocks, known as nucleotides, are susceptible to a variety of chemical modifications.Some of these transformations are due to physiological circumstances such as post-translational or epigenetic modifications, which regulate and tune the cell cycle (Liyanage et al., 2014).However, chemical modifications of DNA can also be due to non-enzymatic processes including oxidative stress, ionizing radiation or radiotherapy (Belli & Tabocchini, 2020;Lomax et al., 2013;Swenberg et al., 2011).In this regard, the human body must have some kind of mechanism to repair nucleobase alterations either, when they are forming part of DNA double strand or as unprotected nucleotides in solution, which are more susceptible to chemical degradation (Topal & Baker, 1982).
The NUDT15, also known as the nudix hydrolase 15 or MTH2, is an enzyme that belongs to the NUDIX (nucleoside diphosphates linked to a moiety x) superfamily of proteins (Maki & Sekiguchi, 1992).It was first identified in Escherichia coli as a member of the sanitizer pool of enzymes responsible of metabolic detoxification due to oxidized nucleotides accumulation.This protein catalyzes the hydrolysis of canonical nucleotides, as well as oxidized deoxynucleotide triphosphates (dNTPs), including substrates like 8-oxo-dGTP, whose incorporation into DNA can lead to mispairing among base pairs, and thus to insidious consequences.
NUDIXs substrate specificity has been the object of extensive research, and recent studies have reported that NUDT15 is able to hydrolyze metabolites of thiopurines drugs like 6thio-deoxyguanosine triphosphate (Rehling et al., 2021;Valerie et al., 2016).Azathioprine, mercaptopurine and thioguanine belong to the thiopurine antimetabolites family, a kind of drugs that, after their bioactivation within the cell into thioguanosine triphosphate (TGTP), inhibit several enzymes which participate in the purine biosynthesis and also produce misincorporation into DNA and RNA (Figure 1) (Karran & Attard, 2008).Thiopurine drugs are used as anticancer and immunosuppressive agents in the treatment of different clinical malignances, including lymphoblastic leukemia, acute myeloid leukemia, and inflammatory bowel disease (Cargnin et al., 2018;Pui et al., 2008;Terdiman et al. 2013).
Several clinical investigations, including a genome-wide association study, have recently described a missense variant in the NUDT15 gene rs116855232 (Arg139 substitution by Cys), that causes the accumulation of 6-mercaptopurine metabolites and cellular toxicity (originating thiopurineinduced leukopenia) during thiopurine treatments (Singh et al., 2017;Yang et al., 2014).
In addition, NUDT15 deficiency is the primary genetic cause of thiopurine intolerance in patients with acute lymphoblastic leukemia, particularly those of Asian or Hispanic descent (Suiter et al., 2020).In this context, the interest of the scientific community in the research of NUDT15, its role in purine metabolism and biology has increased during the last few years.Recently, an active site targeting chemical probe was developed to study the protein as a potential drug for thiopurine combination therapy (Zhang et al., 2020).
However, there are still a lot of questions that remain elusive for NUDT15 such as how it recognizes substrates, the catalytic mechanism, or the impact of point mutations on the activity of clinically relevant variants.To answer some of these questions, a detailed view of the dynamic behavior of the protein at the atomic level is needed.To address some of those unknowns concerning NUDT15 clinical variants, we have performed exhaustive molecular dynamics simulations (MD) in order to shed some light on the conformational changes that the protein undergoes.We compare the wild type (WT) NUDT15 with some of their most relevant variants in the bound and unbound state.This knowledge will facilitate the understanding of NUDT15 genetic variants, as well as the design of more effective drugs or even the optimization of this enzyme as a biocatalyst.

Protein preparation
A model of NUDT15 with the whole sequence (residues 1-164) was built using the Robetta web server (https:// robetta.bakerlab.org/).For this purpose, the RosettaCM protocol was selected, choosing a single template method and considering PDB ID: 6T5J (chain A) as a starting coordinates, to generate a single model (Song et al., 2013).The RosettaCM protocol enables the construction of protein structures by homology modeling in those regions for which there is sequence alignment, while those for which there is no alignment are built using a de novo method.In the case of NUDT15, it has applied homology modeling for residues 10 to 163, and it was generated de novo the fragment corresponding to residues 1 to 9 and residue 164.The quality of the model was characterized using the QMEAN web server, applying the QMEANDisCo scoring function (Studer et al., 2020).The evaluation of the model resulted in a Qmean score of 0.84 ± 0.07, indicating good quality.
Protonation states of NUDT15 titratable residues were assigned using the ProteinPrepare module available at the https://www.playmolecule.comwebsite (Mart� ınez-Rosell et al., 2017).All histidines were considered in their delta tautomeric form except His25, His49 and His91 which were protonated at the epsilon nitrogen atom.To generate TGTP complexes, 6-thio-guanosine triphosphate was manually built up starting from PDB ID: 5LPG, which contains a molecule of 6-Thio-guanosine monophosphate in the active site (Valerie et al., 2016).The molecular graphics program PyMOL 2.1 (The PyMOL Molecular Graphics System, Version 2.0 Schr€ odinger, LLC.) was employed for building the missing b and c phosphate groups of the nucleotide.Regarding NUDT15 point mutations, the superimposition of chain A backbone atoms from crystallographic structures (PDB IDs: 6B65 and 6B66 for R139C and R139H respectively) with the original template and homology model proved to be lower than 0.5 Å.Consequently, all protein mutations were generated in silico by means of CHARMM-GUI, during the protein preparation step, using the mutation tool (Jo et al., 2008) in order to facilitate reproducibility and standardize the protocol.

Molecular dynamics simulation
Systems were built using the Solution Builder tool from the CHARMM-GUI server (https://www.charmm-gui.org/).Each protein or protein-ligand complex was solvated into a cubic box extended 9 Å from solute (� 9220 molecules), then Na þ and Cl -(� 37 and 42 respectively) ions were added to neutralize the systems to a concentration of 0.15 M. Water molecules were described with the TIP3P model (Jorgensen et al., 1983), AMBER ff14SB force field was applied for protein (Maier et al., 2015) and TGTP parameters were generated using the GAFF2 general force field (Wang et al., 2004).
Energy minimization was performed via the steepest-descent method during 5000 steps, with a tolerance of 1000 kJ mol À 1 nm À 1 .The van der Waals forces were defined with a cutoff of 8 Å, while the cutoff for the particle mesh Ewald (PME) method applied to consider the long-range electrostatic interactions was set on 8 Å (Essmann et al., 1995).Long range dispersion corrections were applied for energy and pressure.For all hydrogen-containing bonds, harmonic constraints were applied using the LINCS algorithm.For the backbone of the protein and the protein side chains, harmonic potential restraints of 400 kJ mol À 1 nm À 1 and 40 kJ mol À 1 nm À 1 were applied, respectively.The equilibration phase was performed at the canonical ensemble (NVT) and was carried out during 125 ps.In this stage, for temperature coupling, a Noose-Hover extended ensemble was used, with a bath temperature of 298 K and a time constant of 1 fs.The Verlet pair list was updated every 20 steps.A Maxwell distribution, starting from a pseudo random seed, was used to generate the initial velocities at 298 K. Finally, the production phase was carried out using a time step of 2 fs, at the NTP ensemble controlling the temperature with the Nose-Hoover thermostat to maintain the system at 298 K and maintaining a constant pressure of 1 bar using the Parrinello-Rahman barostat (Vanommeslaeghe et al., 2010).All simulations were conducted using Gromacs 2020.21 on a Nvidia GeForce GTX 980, resulting in an aggregate simulation time of 3 ms (Bauer et al., 2022).For analysis purposes, MD snapshots were recorded every 100 ps yielding 5000 frames per trajectory.Binding free energy was calculated in the last 2500 frames with the ultrafast software MM-ISMSA compiled for a Windows 64-bits machine from source code (https://github.com/accsc/cMMISMSA) (Klett et al., 2012).

Analysis
Prior analysis, all trajectories were centered on the protein atoms and aligned to the first frame using Gromacs 2021.1.RMSD, RMSF and distances were measured using MDtraj 1.9.3 (McGibbon et al., 2015) and VMD 1.9.4 (Humphrey et al., 1996) softwares.After production runs, the RMSD of each independent simulation was used to discard the data in the equilibration phase, so mean RMSD quantities were computed after discarding the initial 75 ns of production run.Protein-ligand contacts were calculated on the whole trajectory, setting a threshold of 4 Å.Volumetric occupancy maps were generated via VolMap plugin implemented in VMD.Hydrogen bonds were defined by acceptor-donor atom distances shorter than 3.0 Å. Statistical analysis between data sets was carried out by means of t-test via SciPy library implementation in Python 3.7 (Virtanen et al., 2020), considering a significance threshold p < 0.05.The visualization tool VMD was used to visualize trajectories and, together with ChimeraX 1.1.2(Pettersen et al., 2021), to render all pictures, while graphics were generated with the Matplotlib 3.1.0library (Hunter, 2007).

Protein stability and dynamics
Among the different NUDT15 variants identified up to date, we selected for our study two well characterized mutants, both with substitutions in the position of Arg139 by a cysteine (R139C) and a histidine (R139H).This decision was based in the clinical relevance of these mutants regarding thiopurine therapy and the availability of experimental data to validate and compare our computational results (Table 1) (Cargnin et al., 2018;Rehling et al., 2021;Yang et al., 2014).A preliminary sequence conservation analysis, run with ConSurf-DB webserver (https://consurfdb.tau.ac.il/), predicted low conservation score for this position (1/10) by analyzing 300 high sequence identity homologues (supporting info.Figure S1) (Goldenberg et al., 2009).This is interesting since, normally conserved residues are critical for protein function, however for NUDT15 a non-well-conserved position seems to have a high impact in enzyme activity when is mutated by cysteine or histidine.
NUDT15 has been crystallized in form of homodimeric complexes with or without inhibitors or nucleotides (Carreras-Puigvert et al., 2017;Carter et al., 2015;Rehling et al., 2021;Valerie et al., 2016;Zhang et al., 2020).Due to the total symmetry of the crystallographic assembly, we decided to perform MD simulations using the monomeric form of NUDT15.Since their N-and C-termini motifs present a high level of disorder and they have not been solved, homology modeling techniques were used to construct a whole structure of NUDT15 encompassing residues 1-164 (Uniprot code: Q9NV35).
The model was generated by RosettaCM presenting a high overlap with the crystallographic structure of NUDT15 used as a template (PDB ID: 6T5J) (Song et al., 2013), with an RMSD of 0.48 Å calculated on the alpha carbon trace.The unstructured regions corresponding to residues Val101-Pro109 and Pro148-Leu156 have the largest RMSD differences, at 0.78 and 1.99 Å, respectively.Regarding the active site, minimum changes are observed, bending approximately 90 � with respect to the initial crystal coordinates in the orientation of the side chains of Gln44, His49, Tyr90 and Tyr92.
Once a plausible model of NUDT15 was generated, to assess its feasibility, we simulated apo and holo (in complex with TGTP) NUDT15 structures over 500 ns of unrestricted classical MD (supporting info.Figure S2).Simulations of the explicitly solvated system revealed different outcomes when comparing apo and holo proteins.Trajectories converged after approximately 100 ns of production dynamics, either for apo and holo complexes (supporting info.Figure S2).These findings demonstrate that the monomeric form of NUDT15 is stable when simulated in the presence of explicit water and/ or in the presence of this nucleotide, and also suggest that this could be extended to other cognate ligands.When no TGTP molecule is bound to the R139C mutant, some noticeable fluctuations are observed in the protein N-tail, corresponding to an outward-inward movement of this motif from its initial position (supporting info.Figure S3).This segment of the protein would be in close proximity to a second monomer of the NUDT15 homodimeric assembly (PDB ID: 6T5J) (Zhang et al., 2020), used as a template for our homology models.However, after these fluctuations, the N-terminus returned to its initial position.Since crystal structures did not solve this protein segment due to its high mobility and disorder structure, our observations are consistent with experimental findings (Carreras-Puigvert et al., 2017;Carter et al., 2015;Rehling et al., 2021).Additionally, this observation proves that the monomeric enzyme can exist in a similar three-dimensional configuration to the crystalized homodimeric assemblies, which is also observed in vivo (Rehling et al., 2021).
To further investigate protein dynamics and obtain deeper insights into the behavior of NUDT15, we also calculated the mean RMSD value across the simulation (Figure 2).Some qualitative correlations could be found between apo values and protein folding temperature (T m , see Table 1).While apo WT displayed a mean RMSD of 2.27 ± 0.13 Å, the substitution of Arg139 with a cysteine results in a significant increase of 2.49 ± 0.26 Å (p < 0.005), according to a less stabilized system during our MD simulations.This is in agreement with experimental stability data measured in terms of T m (see Table 1), which accounts for the protein stability with a decrease in T m for this variant.Interestingly, this tendency is not observed in the R149H mutant, whose T m is lower compared with the wild type (WT).Regarding protein-ligand complexes, it was found that the presence of TGTP molecule in the catalytic site produces a slight stabilization of NUDT15, as evidenced by mean RMSD values lower in all cases compared to apo states (Figure 1A).This effect has been observed previously in computational studies employing this metric to examine the effect of direct ligand binding when a molecule binds to the enzyme's active site (Garc� ıa-Mar� ın et al., 2022;Ge et al., 2017;Makurat et al., 2022).The tendency is also in accordance with T m values when the inhibitor is present in the assay, confirming the ability of the substrate to stabilize protein structure (Man et al., 2019;Rehling et al., 2021).Although this situation is also observed in the R139H variant, the subtle change does not appear to be statistically significant (p ¼ 0.66) for this catalytically inactive mutant.Overall, these data confirm the suitability of our computational biomolecular models to study NUDT15 systems and also confirm that TGTP can stabilize not only the surroundings of the active site, as previously reported, but the whole protein (Man et al., 2019).
The lack of structural information regarding protein-substrate dynamics and interactions prompted us to conduct MD studies, including the nucleotide 6-thio-guanosine triphosphate (TGTP) bound in WT, R139C, and R139H NUDT15.First, a thorough analysis of the cognate nucleotide inside the active site pocket was carried out in order to identify potential differences in ligand dynamics and binding mode (Figures 2 and 3 and supporting info.Figure S2).In holo complexes, only minor conformational changes were observed, indicating that the ligand binding pose is extremely stable, with RMSD values below 3 Å (Figure 1B).In general, only a single metastable binding mode was observed during our simulations, coincident we that observed in the crystallographic structure.
The TGTP molecule does not undergo significant shifts or movements during our simulations in either WT or mutant variants.In fact, due to two hydrogen bonds between the purine amino group and Leu45 and Val16 carbonyls, the ligand is firmly anchored to the active site (Figure 3).This binding mode is reinforced by an extra hydrogen bond between the pyrimidine N1 nitrogen and carboxamide -NH 2 from Gln44.Sulfur atom is deep buried into a small subpacket where interacts with -NH from Leu45 and -Leu138.A strong electrostatic interaction is provided by Arg34 which binds the a phosphate from TGTP thanks to its negatively charged oxygen atom.The main interactions are also complemented with CH-p interactions from Phe135 and Trp136 towards the purine ring.Finally, the ribose ring maintains a stable conformation during our simulations thanks to a hydrogen bond between C3 ' -OH group and His49 side chain.This analysis proves that TGTP is firmly bound to the protein, thanks to a high number of intermolecular interactions.Considering RMSD values (Figure 2B) calculated for TGTP, most residue contacts are shared among variants, with the exception of R139C and residues Val14, Gly15, and Val16.These provide hydrophobic interactions that might be important for ligand affinity.Taking into consideration these outcomes, it seems that both the WT and the allelic variants can bind TGTP similarly, and the differences observed in catalytic activity are not due to NUDT15 affinity towards the ligand per se.Because of the stability observed for the ligand during our MD simulations, it is expected that other nucleotides bind and behave similarly.
After establishing the molecular recognition of TGTP by NUDT15, we turned our attention to differences in protein motions and protein intrinsic flexibility.To accomplish this, we first investigated the root mean square fluctuation during equilibrated dynamics (Figure 4 and supporting info.Figure S4).
Both C-and N-termini, presented high mobility, as previously mentioned, which is expected since they lack the stabilization of the complementary NUDT15 protein of the homodimeric assembly.Additionally, this is also supported by the high temperature factors reported in the crystal structure used as a template (PDB ID: 6T5J) (Song et al., 2013).Furthermore, relatively high fluctuations were also observed in several enzyme loops.Nonetheless, interesting breakthroughs arose from RMSF calculations when comparing holo and apo states.These findings are related to two loops that span residues Gly37-Ser42 (loop 1) and Ile85-Tyr90 (loop 2), responsible for enclosing the TGTP molecule in the catalytic throat (Figure 5A).In the initial modeled structure, loop 2 is arranged in the form of a 3 10 -helix, also known as the g3 helix.During our simulation, this portion loses partially its helicity of 3 10 -helix, resulting in an uncoiled loop (loop 2) which defines the boundaries of the catalytic active site.As consequence, this motif oscillates actively and participates in TGTP occluding.Binding of the TGTP molecule leads to a partial blockage of loop 1 (residues Gly37-Ser42), which could be evidenced by the decrement of RMSF in this region (Figure 4 and supporting info Figure S4).In contrast, loop 2 remains partially immobile, with subtle fluctuations, especially in the R139C variant.This observation suggests that loop 1 plays a key role in anchoring the substrate inside the shallow active site and partially closing the catalytic throat, while loop 2 must preserve some mobility during enzyme catalysis.

Essential dynamics of NUDT15 loops
After discovering the distinct responses of these loops to the presence of TGTP into the active site, we decided to get a deeper insight into their structural role.To fully characterize the dynamic behaviour of these two partners, a Principal Component Analysis (PCA) was performed by using the cartesian coordinates of backbone atoms forming both loops (Figure 5B and C and supporting info.Figure S5 and S6).This type of analysis has been successfully employed to describe the essential dynamics of biomolecules and decipher the main motions of protein atoms (David & Jacobs, 2014).By calculating the first five principal components, we were able to capture subtle opening-closing movements from both loops among the first and second principal components, which account for the highest contributions to the system total variance (� 35-75%) and motions (Figure 5).This analysis corroborates the hypothesis that both loops participate actively in the structural configuration of the active site, and therefore can be part of a putative ligand/substrate entrance pathway.
As above mentioned, loop 1 reduces its mobility upon TGTP binding, and this tendency is also well illustrated in the projection of the first two components, which reveals that this protein segment explores a more restricted conformational space (Figure 5 and supporting info.Figure S4 and  S5).This behavior is especially notorious for the case of WT NUDT15, whose loops explore different energy minima in the configurational landscape when no nucleotide is present in the active site (Figure 5B and S6).After ligand binding, the loop is almost confined in an energy minimum, which corresponds to a close state of the loop observed during classical MD simulations.A detailed visual inspection of loop 1 closed conformation identified by MD simulations showed that it is quite similar to that observed in crystallographic structure (PDB ID: 5LPG) (Valerie et al., 2016), but even narrower, likely due to the simulation conditions mimicking a physiologicallike environment.
Regarding the other two variants, it was observed that R139C mutant locates its loop 1 in a similar conformation as the WT; however, this was not observed in the case of R139H, whose loop oscillates between two different conformations according to our PCA analysis (see supporting info.Figure S4 and S5).Considering the enzymatic activity data reported in the literature (Table 1), this loop, despite being far away from phosphate reactive groups, should be quite immobile during enzyme catalysis.This scenario was not observed in the R139H model, which may be related to its lowest enzymatic activity.To further characterize the overture degree of the enzyme, we traced the distance between alpha carbons of Gly38 and Glu87, present in loop 1 and loop 2 respectively (Figure 6 and supporting info.Figure S7).Using bidimensional distribution plots we were able to determine that apo structures was tended to exhibit greater distribution of distances, with values near 15 Å compared to TGTP bound complexes (Figure 6A).The ligand bound complexes presented thinner peaks of distance distribution near 10 Å.These findings support our hypothesis that nucleotide binding leads to a closer conformation of the enzyme, which is mediated by these two loops.
Based on these findings, it seems reasonable to hypothesize that NUDT15 must be able to close, or at least delimit, its active site cavity with these two loops in order to be a catalytically efficient enzyme.Clinically relevant variants R139C and R139H, undergo point mutations in a residue which is located in the a2 helix.According to crystallographic reports, mutations in this solvent-exposed region produce instability and distortions in the helix (Rehling et al., 2021).A more exhaustive analysis of this region lets us observe that RMSFs values are slightly increased when comparing variants with WT (Figure 6B and supporting info.Figure S3).Furthermore, this phenomenon is even more pronounced in holo structures, when the TGTP is bound in the protein active site.
A mutation of Arg139 by Cys or His leads to the breakdown of the ionic interaction with present in the a2 helix, which seems to play a key role in the stability of this helix where Trp136 is located (Valerie et al., 2016).However, when we monitored the hydrogen bond for the interaction between Arg139 and Asp132 side chains, we just observed a  15% and 1.84% occupancy for apo and holo systems, respectively.This seems to indicate that, despite the fact that this interaction affects the whole stability of NUDT15, its final contribution is moderate.It is also noteworthy that the salt bridge observed in the crystallographic structures of WT NUDT15 between Arg139 and Glu143 was not well maintained during our simulation (supporting info.Figure S8).Indeed, we observed an occurrence of 30.63% and 7.13% for apo and holo WT MD simulations.According to our findings, this interaction seems to be promoted during protein crystallization rather than in solution.
A closer inspection of movements and residues present in this a2 helix revealed a network of p-stacking interactions that keep NUDT15 closed across loops 1 and 2. The key partner in this hub of hydrophobic interactions is the Trp136, which constitutes an anchoring point between both loops (Figure 6C).This amino acid interacts with loop 1 via CH-p interactions with Val38 -CH 3 groups.On the other hand, Glu88 from loop 2 is occasionally hydrogen-bonded through its carboxylate group with the indole -NH group from Trp136.Furthermore, Phe135 also present in a2 helix, is sandwiched by Trp136 and Arg34 from loop 1 which in turn, maintains interconnected the a2 helix and loop 1 thanks to a parallel p-cation stacking interaction, as well as hydrophobic contacts with Val38.Additionally, Trp136 closes NUDT15 via CH-p interactions with Ile85 methyl groups and the aliphatic portion of the Glu86 sidechain.During MD simulations, it was also discovered that Tyr90 and Trp136 can establish parallel p-stacking interactions.In these cases, Tyr90 displaced Ile85 and Glu88, which resulted in a less compact but close protein packing.These observations are well illustrated when the percentage of contact occupancy for Trp136 and surrounding residues are studied (Table 2).
The greatest change in residue contacts corresponds to Val38, which increases its contacts with Trp136 upon ligand binding, being especially notorious for WT and R139H.This is the main connector between loop 1 and a2 helix thanks to the hydrophobic and CH-p interactions with the Trp136 indole ring.Those contacts observed on Glu88 are mainly due to a hydrogen bond with -NH from Trp136, as previously stated, when the nucleotide is present in the active site.This interaction represents a direct path to enclose loop 2 towards loop 1.Therefore, occasional hydrophobic interactions could be found between the aliphatic portion of the Glu88 side chain Ile85 and Trp136.Finally, Phe135, contacts are the most prevalent due to its close proximity to Trp136 in order to establish parallel the p-stacking.The stabile contact between Phe135 and Trp136 correspond to high prevalent parallel p-stacking, which transmits movements in the a2 helix.
Based on these findings, it appears that situations that cause Trp136 fluctuation may disrupt this cluster of residues and their interactions, altering the functional structure of the enzyme.Indeed, RMSF values for Trp136 in R139C (0.88 Å) and R139H (1.21 Å) mutants with TGTP bound, are higher when comparting with WT (0.5 Å) NUDT15.Thus, instability in this molecular anchor leads to fluctuations in the whole architecture of the active site region, especially in loops 1 and 2. This is in agreement with the relative activities reported in the literature (Table 1) for in vitro assays, supporting our hypothesis (Rehling et al., 2021).Moreover, the importance of Trp136 as a structural node in NUDT15 is also supported by the small molecule inhibitor TH7755 whose binding mode is stabilized by hydrophobic interactions with Val38 (Rehling et al., 2021).Consequently, the design of chemical probes targeting this residue could yield NUDT15 inhibitors with a mechanism independent of substrate active site competition.
Finally, we decided to gain additional insights into the TGTP binding thermodynamics towards NUDT15 by using end-point free energy calculations (supporting info.Table S1).The implicit solvent model approach MM-ISMSA was selected to calculate ligand binding free energy and per-residue decomposition (Klett et al., 2012).This technique has been used recently into several researches, providing analogue results comparing with more computing-demanding methods such as MM-GBSA or MM-PBSA ( � Alvarez-Fern� andez et al. 2022;Garc� ıa-Mar� ın et al., 2021, 2022;Perona et al., 2020;Singer et al., 2022).All variants displayed affinity towards the substrate, as shown in favorable binding energies.This was somewhat expected since the architecture of the active site is similar in all cases.However, total binding energy was greater for NUDT15 WT (-39.57kcal/mol), followed by the R139C and R139H variants (-35.9 and À 33.90 kcal/mol respectively).These slight differences are due to the grade of TGTP enclosing by enzyme loops 1 and 2.Moreover, these findings can also be related to experimental observations regarding catalytic activity, in which the R139H variant was found to have lower activity (Rehling et al., 2021).Additional information was extracted when perresidue dissection was carried out.
It is interesting to highlight the contribution of HIS49, responsible for forming a hydrogen bond, in its e protomer, with the C3' hydroxyl group of TGTP and whose contribution to the total binding free energy is similar for WT and the R139C variant (-4.82 and À 5.33 kcal/mol respectively), but whose contribution decreases considerably in the case of the R139H variant (-2.55 kcal/mol) because of the breaking of the hydrogen bond during dynamics.Finally, two aromatic residues Phe136 and Trp136 which are responsible of NUDT15 packaging as previously mentioned, also contributes to total binding free energy with TGTP thanks to the anti-parallel p-stacking with the purine ring.Positively charged residues Arg34 and Lys116 interacts with phosphate groups through ionic bonds, so they also provide remarkable contributions to the total affinity.Finally, further contributions are provided by hydrophobic and aromatic residues (Gly48, Tyr92, Val16, etc.).

Conclusions
The investigation of the dynamic behavior of NUDT15 and some important clinical variants with atomic detail, allowed us to further understand some structural features and protein functions not previously observed or described.
Our study highlights the strong contribution of the TGTP nucleotide to the conformational stability of the whole protein once bound, which is consistent with experimental evidences for WT and R139C proteins.In addition, the conformational landscape of two loops has been investigated thoroughly.In the absence of any bound nucleotide, loop 1 (Gly37-Ser42) and loop 2 (Ile85-Tyr90) display moderate fluctuations.Upon ligation with a TGTP molecule, loop 2 must be partially blocked, while loop 1 tends to retain some mobility, probably to promote NUDT15 catalysis.As such, it seems that these loops can serve as gatekeepers harboring for the nucleotide.PCA analysis revealed that loop 1 spans a larger conformational landscape during MD simulations for the R139H point variant.Furthermore, thanks to a network of hydrophobic contacts mediated by Trp136, these protein segments are important for maintaining NUDT15 in a compact conformation, possibly critical for its catalytic mechanism of action.This residue present substantial fluctuations in R139C and R139H variants, destabilizing the network of hydrophobic and p-stacking interactions, which in turn, might account for the diminished NUDT15 enzymatic activity.As such, it appears that mutation of residues present in a2 helix can affect to the architecture of this protein by perturbing mobility of these loops and the catalytic throat.
Our investigation not only let us to gain insights into the experimental data for this hydrolase, but they also illustrate how point mutations affect the whole conformational diversity of the protein, affecting its enzymatic activity.This knowledge may be useful for the understanding of new point mutation variants in a pharmacogenomic context, as well as for the design of new inhibitors targeting a possibly noncompetitive mechanism of action that leads to the blockage of NUDT15 loops.

Figure 1 .
Figure 1.Chemical structures of thiopurine drugs approved for the treatment of immunologic and neoplastic disorders.Highlighted in aquamarine blue, the pharmacophoric scaffold of thiopurine.

Figure 2 .
Figure 2. (A) Barplot of mean RMSD values ± Standard Deviation (SD) for backbone atoms in all apo and holo MD simulations.(B) Heatmap plot of the RMSD values for TGTP heavy atoms over time.

Figure 3 .
Figure 3. (A) Detailed representation of TGTP (thick sticks) and neighboring residues (< 4 Å) of NUDT15 the active site extracted from a representative snapshot of MD trajectory.Hydrogen bonds between TGTP and protein residues are depicted as dashed lines.(B) Protein residues and their corresponding contact frequency between TGTP and NUDT15 during whole dynamics (only residues with contact frequencies above 30% are shown).

Figure 4 .
Figure 4. (A) Representative snapshots from MD simulations of NUDT15 apo proteins superimposed and colored according to the RMSF.(B) Representative snapshots from MD simulation of NUDT15 holo complexes superimposed and colored according to RMSF.

Figure 5 .
Figure 5. (A) Representative structure from MD simulations showing loop 1 (residues Gly37-Ser42) and loop 2 (residues Ile85-Tyr90) movements identified through PCA analysis.(B) On the left, a projection of the first two principal components for apo NUDT15 WT loop 1, along with their contribution percentage to the total variance.Inside the plot there are a representative structures from the MD trajectory of loop 1 in open (green) and closed (red) conformation.On the right, landscapes of free energy derived from PCA projections for apo NUDT15 WT.C) On the left, a projection of the first two principal components for NUDT15 WT loop 1 bound to TGTP, along with their contribution percentage to the total variance.Inside there is a representative structure of loop 1 in closed conformation (red).On the right, free energy landscapes derived from PCA projections of NUDT15 WT bound to TGTP.

Figure 6 .
Figure 6.A kernel density estimate (KDE) map showing the Ca distances between Gly38 (loop 1) and Glu87 (loop 2) residues.The RMSF values for the NUDT15 a2 helix residues.B) Representative snapshots from MD simulation of NUDT15 WT in complex with TGTP showing the cluster of p-stacking interactions.Residues are embedded in a light blue isosurface showing their occupancy at iso level of 0.65.

Table 2 .
Percentage of Trp136 contacts along MD simulations with residues within 4 Å of the cutoff distance.