Rad5 HIRAN domain: Structural insights into its interaction with ssDNA through molecular modeling approaches

Abstract The Rad5 protein is an SWI/SNF family ubiquitin ligase that contains an N-terminal HIRAN domain and a RING C3HC4 motif. The HIRAN domain is critical for recognition of the stalled replication fork during the replication process and acts as a sensor to initiate the damaged DNA checkpoint. It is a conserved domain widely distributed in eukaryotic organisms and is present in several DNA-binding proteins from all kingdoms. Here we showed that distant species have important differences in key residues that affect affinity for ssDNA. Based on these findings, we hypothesized that different HIRAN domains might affect fork reversal and translesion synthesis through different metabolic processes. To address this question, we predicted the tertiary structure of both yeast and human HIRAN domains using molecular modeling. Structural dynamics experiments showed that the yeast HIRAN domain exhibited higher structural denaturation than its human homolog, although both domains became stable in the presence of ssDNA. Analysis of atomic contacts revealed that a greater number of interactions between the ssDNA nucleotides and the Rad5 domain are electrostatic. Taken together, these results provide new insights into the molecular mechanism of the HIRAN domain of Rad5 and may guide us to further elucidate differences in the ancient eukaryotes HIRAN sequences and their DNA affinity. Communicated by Ramaswamy H. Sarma


Introduction
The Rad5 is a member of the switching defective/non-fermentable sucrose family (SWI/SNF), an ATP-dependent chromatin remodelers with DNA translocase and binding functions formed by a large subunit complex called SWI2/ SNF2 with a RING C3HC4 motif, characteristic of ubiquitin ligases (Blasty ak et al., 2007(Blasty ak et al., , 2010Hayashi et al., 2018;Xu et al., 2016). During the replication process, the Rad5 protein complexes with other proteins that recognize the stalled replication fork, caused by DNA errors, and start two signaling pathways: the Translesion Synthesis (TLS) or the Fork Reversal (FR) (Blasty ak et al., 2007;Elserafy et al., 2018;Hargreaves & Crabtree, 2011;Shin et al., 2018). The presence of the HIP116 Rad5p N-terminal (HIRAN) domain in the Rad5 structure is crucial to the initial formation of the four-way junction in the FR DNA repair pathway (Achar et al., 2015;Chavez et al., 2018;Kile et al., 2015;Korzhnev et al., 2016).
The HIRAN domain is commonly found in the N-terminal regions of the proteins that contain the motifs of the SWI2/ SNF2 ATPases subunit (Iyer et al., 2006). The NMR structure of the HIRAN domain from the HLTF protein shows 6 betasheets (b1, b2, b3, b4, b5, b6) and two alpha-helices (a1, a2) organized as a barrel, diagonally twisted, with the two alpha helices in an Oligonucleotide/Oligosaccharide-Binding (OBlike) structure (Korzhnev et al., 2016) (Figure 1A). Residues Y72 and Y93 handle crucial interactions within the DNA interaction pocket through p-p stacking interactions with two nucleotides (Kile et al., 2015;Korzhnev et al., 2016). Residues N91, R71, H110 and K113 stabilize ssDNA through electrostatic interactions, and the residue F142 interacts with a third nucleotide, also with p-p stacking interactions outside the pocket ( Figure 1B). The HLTF's HIRAN domain interacts with the ssDNA independent of the nucleotides' composition (Hishiki et al., 2020).
The HIRAN domain is described in several DNA-binding proteins from all kingdoms, being a conserved domain and well distributed in eukaryotic organisms (Iyer et al., 2006). The elucidation of its molecular interaction of ancient tertiary structure functions is an initial step to comprehend the evolution of complex protein functions involved in both FR and TLS DNA repair pathways.
In this work, we built a multiple sequence alignment (MSA) with 48 sequences presenting a better evolutionary conserved residues profile of the HIRAN domain. Because of the lack of experimental structures in databases, we built the putative tertiary structure of the Rad5's HIRAN domain by comparative modeling and used it to get insights into the molecular mechanism involving its ssDNA interaction. Studies of structural variations by using Molecular Dynamics (MD) and Molecular Docking allowed us to infer the structural behavior of yeast and human HIRAN domains and to understand their affinity to ssDNA. Here, we report the structural dynamics of the yeast HIRAN domain, which is related to the HLTF's HIRAN domain, and the role of key residues in ligand stabilization.

Primary structure analysis
The Rad5's HIRAN domain sequence (GenBank ID: CAA97556.1) was obtained by extending the N and C-terminal from the primary structure located at Pfam (Finn et al., 2014) database (access code PF08797) and performing MAFFT (Katoh, 2002) global alignment for a maximum score. This sequence was also predicted by Korzhnev and colleagues (Korzhnev et al., 2016). The HLTF's HIRAN sequence (GenBank ID: XP_011511393.1) was taken from the tertiary structure calculated by nuclear magnetic resonance (NMR) data (PDB ID 5K5F) (Korzhnev et al., 2016). BLASTp (Altschul et al., 1990) was used to acquire all the 48 HIRAN sequences from other species using 10 À6 e-value cutoff with Rad5's and HLTF's HIRAN domains as a reference. The resulting MSA was visually inspected using MEGAx v10.1.8 software (Kumar et al., 2018). The logo representation and calculation were performed with ggseqlogo R package (Wagih, 2017) using the fasta alignment as input. All domain family regions from Rad5 and HLTF were extracted from the Pfam database.
Comparative modeling MODELLER v9.22 (Sali & Blundell, 1993) program was used to build the tertiary structure of the Rad5's HIRAN domain of Saccharomyces Cerevisiae (scRad5) by comparative modeling. The BLASTp (Altschul et al., 1990) program was used to acquire homologous sequences used for global alignment within the PDB database. The alignment of the scRad5 sequence with the HLTF's HIRAN domain model (PDB ID: 5K5F); (Korzhnev et al., 2016), was performed with salign function and the calculations with the automodel function, as described in the software's user manual. 25,000 structures were calculated using the 5K5F structure as a template, and the structure with the lowest DOPE score (Shen & Sali, 2006) was selected. This structure had its global energy minimized using the CHIMERA v1.13.1 (Pettersen et al., 2004) program with the AMBER ff99SB force field with 100 steepest descent steps with a 0.02 (Å) step and ten conjugate gradient steps with a 0.02 (Å) step size. Subsequently, the structure was submitted to the SAVES v5.0 platform for structural validation analysis with PROCHECK (Laskowski et al., 1993) and ERRAT (Colovos & Yeates, 1993), and selected to be used in the molecular docking and dynamics steps.
Molecular docking DOCK v6.9 (Allen et al., 2015) software was chosen to perform molecular docking using a non-commercial academic license. For the grid calculations, a cubic box with 4 nm edges was created around the DNA interaction site. The molecular surface of the receptors was generated using the CHIMERA Write DMS (Sanner et al., 1996) tool from the structure with no hydrogen atoms. Then, the spheres were generated by the sphgen tool of the DOCK6 package using the DMS file and a maximum distance of 4 Å and minimum of 1.4 Å for each sphere was established as the cutoff for a contact. The center of the box has the same vector position as the center of the selected sphere group. Later, the grid calculation was performed using the grid tool from the DOCK6 package (Allen et al., 2015), with default parameters. Structural pose calculations were performed using the DOCK6 tool, considering a flexible ligand, a rigid receptor, resulting in 100 conformations of the sDNA. However, only the top 10 poses with the lowest overall energy were evaluated. The RNA ribbon of the HIRAN domain molecular docking assays with ssDNA was performed using the chain G from PDB ID 4S0N as a template. This structure has four nucleotides of thymines (Poly-T), and it is a stable conformation of the ssDNA within the HLTF interaction pocket.

Molecular dynamics simulations
The GROMACS v2018.3 (Abraham et al., 2015) software was used to perform the molecular dynamics simulations. The HIRAN domain structures with a poly-T ssDNA were obtained from the molecular docking runs. All systems were created and performed under the same conditions to compare the systems characteristics. Table 1 summarizes and describes all systems used in this work.
The replicates were performed with random initial velocities to search for a vast number of structural samples. The AMBER99SB-ILDN (Hornak et al., 2006) force field was used for all systems without ssDNA and the AMBER14SB force field (Maier et al., 2015) for all systems with ssDNA. AMBER force fields are commonly used in protein and DNA systems because of its accuracy in predicting these molecules' structural behaviors. AMBER14SB only differs from AMBER99SB-ILDN in having the ff99bsc0 parameters for nucleotide simulation (Galindo-Murillo et al., 2016;Petrovi c et al., 2018;Weber & Uversky, 2017). The simulation box was solvated with TIP3P (Transferable Intermolecular Potential, three-point; (Jorgensen et al., 1983)) water model, and sodium and chlorine ions to neutralize the system electrostatic charges. The box contains periodic boundary conditions (PBC) parameters in three dimensions, and the coulombic interactions were treated by the PME algorithm (Darden et al., 1993) with a cut-off of 1 nm for short-distance interactions (electrostatic and van der Waals). Minimization was performed with the steepest descent algorithm and stopped when the maximum force of the system was less than 1000.0 kJ/mol/nm. The system was equilibrated using 300 ps of NVT and NPT ensembles. The thermostat algorithm used was the V-rescale (Bussi et al., 2007), with 300 K as the reference temperature. The pressure coupling algorithm used was the Parrinello-Rahman barostat (Parrinello & Rahman, 1981), with isotropic type and isothermal water compressibility value of 1 bar À1 . The production stage used the leapfrog integrator and had a time integrator of 2 fs and recorded the atoms trajectories, speeds, and energies every 50 ps.
The trajectory analyses were done with the GROMACS package commands (rms, trjconv, gyrate, rmsf, and make_ndx), and the graphics representations plotted using ggplot2 (Wickham, 2016) R package. The Y-axis values of the RMSF plots were converted into a B-factor by GROMACS using the -oq flag in the rmsf command. These values were color rendered at the initial structure of each model using the Render by Attribute tool of CHIMERA (Pettersen et al., 2004), and the terminal residues with high RMSF values were removed from this renderization. The GROMOS algorithm (Daura et al., 1999) was used as a method for the representative structures clustering using the cluster tool. All cutoff values and their respective number of structures selected can be found in Supplementary Table S1. A specific cutoff value was used for each system to find a minimum number of structures that represent the structural conformations adopted throughout the production stage. Contact analyses were performed using the CPPTRAJ (Roe & Cheatham, 2013) tool from the AMBER package. The nativecontacts command was used to extract native and non-native contacts with the maxdist to establish a maximum distance cutoff of 4 Å. Native contacts are those initially present in the system, while non-native contacts are acquired during the simulation step (Case et al., 2018). This analysis quantifies the number of frames that the atoms distance was below the cutoff distance in relation to the total frames, which indicates that the frames containing this interaction are equivalent to a certain percentage of the total simulation time. All calculations were performed using NMRBox (Maciejewski et al., 2017).

Results and discussion
Yeast and human HIRAN domains share key residues in the ssDNA interaction HIRAN is identified as a domain rich in beta-sheets fused to other catalytic domains and in the N-terminal region of SWI2/SNF2 proteins (Iyer et al., 2006). This domain is found as a standalone protein in prokaryotes. Alignment of the HIRAN domain with sequences from a BLASTp search using the primary structure of HLTF as a reference revealed a conservation pattern and the key residues for the ssDNA interaction in previous work (Kile et al., 2015). Here, we used both HLTF and Rad5 sequences as references for BLASTp searches and multiple sequence alignments to cover species within a broader evolutionary distance range. We analyzed the primary structures of Rad5 and HLTF proteins, and their respective HIRAN domains, using the Pfam database. These analysis revealed a similar domain architecture distribution with the presence of SNF2 and RING domains in similar regions of the sequences (Figure 2A). Rad5 and HLTF proteins have a global identity of 30.20%, according to a pairwise alignment. Therefore, HLTF can be considered the closest human homolog to Rad5.
The Pfam seed sequences used to create the HIRAN HMM profile contain a 1-phosphatidylinositol 4-kinase (UniProt entry: Q5AQZ8), a protein not associated with DNA interaction or recognition. In addition, the used region ofHIRAN domain of HLTF do not include its alpha-helix region (a2) and may be the reason for the identification of primary structures from bacteria (UniProt entry: Q5LCS0 and Q47IS2) that have a few conserved residues in comparison to the eukaryotic sequences. Thisalso raises another question about the free-life HIRAN domain in prokaryotes, as this alpha-helix region is important for protein stability, as further shown in this work. This domain database, deposited by Iyer and Aravind in 2006, predicted the HIRAN domain to be an 8beta-sheet domain. This prediction was made years before the first HIRAN tertiary structure was deposited in the PDB in 2015 (PDB ID: 4XZF); (Hishiki et al., 2015), evidencing the relevance of new studies of this domain.
To better understand the HMM profile and amino acid conservation of HiRAN sequence among distant species, we performed and curated a multiple sequence alignment with 48 sequences obtained by BLASTp search. Residue conservation regarding the ssDNA and dsDNA interactions are evidenced by the LOGO plot (Supplementary Figure S1). The lysine/arginine and tyrosine residues are conserved at positions 30 and 31, respectively ( Figure 2B). These residues correspondt to R71 and Y72 from the HIRAN domain of HLTF. The asparagine and lysine residues in position 53 are located near to a tyrosine residue in position 55 (N91 and Y93, respectively, from the HIRAN domain of HLTF). These two regions represent the side chain responsible for the formation of the p-p stacking interaction with the nitrogenous base of the nucleotides in the ssDNA, which is important for the ribbon stabilization within the pocket. In fact, most HIRAN domains might pair with ssDNA due to these residues in the loop regions (Hishiki et al., 2020;Kile et al., 2015;Korzhnev et al., 2016).
The most conserved region of the primary structure of the HIRAN domain is the putative b6-region (Supplementary Figure S2). HIRAN sequences obtained from BLASTp searches using HLTF as query show a QVGHL pattern (Supplementary Figure S3), whereas sequences obtained tfrom Rad5 as query show an EIGRI pattern (Supplementary Figure S4). Despite the presence of a conserved glycine only, this region shares common physical-chemical properties of the side chain amino acids, indicating a key region in the primary structures of the HIRAN sequences. The F142 from the HIRAN domain of HLTF is a key residue in the binding site only if duplex DNA is in the minor groove of the genetic material (Hishiki et al., 2020) and the sequences acquired with the HLTF sequence as a reference have the phenylalanine/tyrosine residue at position 138 (equivalent to F142 from HIRAN domain from HLTF). These results suggest that primary structures that are conserved in relation to the HIRAN domain of Rad5 might not have this residue in this specific region. The column "Model" contains the name of the system; the column "Description" contains the elements in each system; the "Components" column lists the main components followed by how many replicates were performed, the time and the force field used.
The HIRAN domain of Arabidopsis thaliana Rad5A protein (atRad5A) interacts with branched DNA with higher affinity compared with ssDNA and dsDNA, although the conserved ssDNA interaction residues are absent (Kobbe et al., 2016). According to our multiple sequence alignment result, compared to the HIRAN domain of HLTF, the Rad5A has a serine residue in the L1 tyrosine position31, and the glycine-phenylalanine in the L2 tyrosine-aspartate position, 55 and 56 respectively. This difference in the affinity for single-stranded and double-stranded DNA may be reflected in the FR or TLS signaling pathway, as the HIRAN domain is critical for stabilizing the protein complex in the stalled fork (Chavez et al., 2018). Arabidopsis thaliana has an HLTF-like protein with a HIRAN domain closely related to human HLTF with conserved DNA interaction residues, and further studies are needed to understand the entire DNA repair pathway in plants. An evolutionary understanding of the HIRAN domain functions, along with other species distant from humans, is required to infer the impacts on the FR and TLS DNA repair signaling pathways Based on these results, we propose that the human and the yeast HIRAN domains have different affinities for DNA. To evaluate this, we focused on predicting the tertiary structure of the HIRAN domain of Rad5 and its molecular interaction with ssDNA. We used HLTF as a reference protein (Hishiki et al., 2020;Kile et al., 2015).
Modeling and assessment of the Rad5's HIRAN domain tertiary structure The tertiary structure of S. cerevisiae Rad5 and its domains cannot be found in structural databases, which may be related to the difficulty of obtaining the recombinant protein to perform experimental studies. Therefore, an alternative is to get the structure through computational methods. Other studies found in the literature used computational approaches to predict different protein structures that interact with DNA, such as the UvrB protein, involved in the nucleotide excision repair process (Bavi et al., 2016), the UvrC protein (Parulekar et al., 2013) and the NAD þ dependent DNA ligase proteins (Shrivastava et al., 2015).
First, we constructed the HIRAN domain of Rad5 using two different approaches, threading and comparative modeling, with three softwares, ROBETTA (Song et al., 2013) I-TASSER (Yang et al., 2015), and MODELLER. The procedures with ROBETTA and I-TASSER with and without a template were performed with default parameters, while for MODELLER only one procedure was performed and used a template. The quality of all models was checked by appling statistical approachs, such as Ramachandran plot and ERRAT values, before and after energy minimization using CHIMERA and 3DRefine Server (Bhattacharya et al., 2016) (Supplementary Figures S5 to S17).
The MODELLER method yielded the best models with all residues in allowed regions in the Ramachandran plot and ERRAT scores below the error lines (Supplementary Figure  S5). Without the use of a template, ROBETTA built models with poor assignments of secondary structures (Supplementary Figure S15). When the HIRAN domain of HLTF was used as a template, ROBETTA generated structures similar to those generated by MODELLER, although a residue was located in a disallowed region (Supplementary Figure  S12). The I-TASSER method predicted models with more residues in disallowed regions even after minimization steps. A) The domain architecture and motifs of the SNF2/SWI2 family proteins involved in the FR and TLS DNA repair signaling. The Rad5 protein has the HIRAN domain, domains of the SNF2 family, and a RING finger domain, a similar structure to HLTF, its human homologous. B) the regions L1, L2, b4 and F142, responsible for the ssDNA and dsDNA interaction, extracted from the logo representation. The cysteine and methionine are represented in dark yellow, the phenylalanine, tyrosine and tryptophan are in orange, the lysine, arginine, and histidine are in red, the proline is pink, the aspartic acid and glutamic acid are in blue, the glycine, alanine, valine, leucine, and isoleucine are in gray, the asparagine and glutamine are in brown, and the serine and threonine are in green. This group color representation aids the visual analysis of residues' side chains with similar physico-chemical properties.
Therefore, the models of I-TASSER were not considered for further calculations (Supplementary Figures S6 to S11).
The ROBETTA and MODELLER models were then submitted to 100 ns MD simulations after CHIMERA energy minimization to assess the stability of their tertiary structures and to ensure the selection of the best-predicted model of Rad5's HIRAN domain. The DSSP data (Kabsch & Sander, 1983) showed unfolding of the a2-helix in both models  The RMSF data extracted was converted into B-factors for movement analysis of the residues in the tertiary structure. The data were normalized from the deleted regions with high RMSF values. From A to C, the hsHLTF apo model is represented and from D to F the scRad5 model, with the replicas 1, 2 and 3, respectively. The white color represents the lower B-factor value and red the highest. The correlation between the B-factor value and the structural flexibility of the residues is positive and proportional. Figure S18). However, the MODELLER model had more structured residues and, together with statistical analysis using ERRAT and PROCHECK, was considered the best structure to represent this domain.

(Supplementary
The HIRAN domain of HLTF (PDB ID: 5K5F) (Korzhnev et al., 2016) (Figure 3A), referred to as hsHLTF apo , was selected as a template for the HIRAN domain of Rad5, named scRad5. This model is a structure in saline solution and represents the domain without ligand in its apo conformation. This structure ensures that we can comparatively analyze the structures of the domains without the influence of any molecule.
The pairwise alignment ( Figure 3B) shows a percent of identity and similarity of 12.50% and 51.47%, between HIRAN sequences of HLTF and Rad5 proteins, respectively. Despite the low identity percentage, the degree of similarity can be used to construct a tri-dimensional model once inter-residue interactions between side chains control the protein folding and stabilization (Newberry & Raines, 2019). Proteins have genetic variations acquired through evolution through natural selection. However, those that share a similar microenvironment for molecular action tend to have similar structures and functions due to environmental pressures and restrictions (Worth et al., 2009). The tertiary structure calculated for the scRad5 model ( Figure 3C) shows six beta-sheets and two alpha-helices, as well as a high number of residues with positive electrostatic potential at the putative ssDNA pocket (Supplementary Figure S19), similar to the HIRAN tertiary structure of HLTF and two small alpha-helices in the N-terminal region and L1 region. The structure also shows regions with more extended loops, suggesting that these residues may have highmobility because their side chains have no spatial restrictions with neighboring residues, and their molecular surfaces are exposed to solvent.
Dihedral angles analysis (Supplementary Figure S5) showed the presence of 99 residues (79.8%) in most favored regions, 23 residues (18.5%) in additional allowed regions, two residues (1.6%) in generously allowed regions, and no residues in unfavorable regions. The two residues found in generously allowed regions are in loop regions in the tertiary structure. The residues in these regions have more degrees of freedom, because they have less atomic interactions with neighboring residues and side chains externalized into the solvent. The scores observed in the ERRAT (Colovos & Yeates, 1993;Supplementary Figure S5) plot show that only residues in loop positions have a high score value and exceed the warning region. These results, based on statistical and comparative calculations with protein structure databases, support a structure that can represent the conformation of the HIRAN domain of Rad5.
Because we found differences in the tertiary structures between the scRad5 and hsHLTF apo models, MD simulations were performed to evaluate these structures, as even small variations in the structures can mean different structural dynamics and different residues in ssDNA interactions.

The HLTF's HIRAN domain has fewer secondary structures unfolding
Molecular dynamics simulations can contribute to the in-silico observation of biomolecules at the atomic level, improving the understanding of biological processes, such as timedependent structural and conformational changes, and the interaction of a protein with a ligand (Ganesan et al., 2017). MD simulations are a reliable method to understand molecular mechanisms involving amino acids and nucleotides, and have been used for ssDNA (Oprzeska-Zingrebe & Smiatek, 2018) and dsDNA systems (Hamed, 2018;Kamaraj & Bogaerts, 2015;Mary et al., 2017;Pradhan et al., 2018), to elucidate the role of hydrophobic interactions in protein inhibition (Verma et al., 2016) and in a protein-DNA complex system (Pitta & Krishnan, 2018).
The RMSF data converted to B-factors allow us to trace a positive fluctuation correlation: the higher the RMSF value of the residue the greater is the B-factor (Figure 4). It was observed that all loops have a greater spatial motion Figure 5. Importance of residues R71/Y72 and K194/Y195 for the ssDNA interaction pocket formation.
In A, B and C, the amino acids responsible for stabilizing the ssDNA are in red, in the hsHLTF apo , hsHLTF holo and scRad5 models, respectively. Its molecular surfaces are also shown in gray compared to the secondary structures in scRad5 and hsHLTF apo models. In hsHLTF apo , the residues with the largest fluctuation were Y72 and Y93, which are responsible for ssDNA interaction (Kile et al., 2015). In scRad5, the a2 alpha helix region has higher spatial mobility than the loop regions due to the unfolding of its secondary structure. The residues of scRad5 with high turnover were K197, Y198, G199 and Y217, which are located in loop regions. These structural behaviors, both in loop regions, may be essential for ssDNA recognition and stabilization. The RMSD and RMSF data (Supplementary Figure S20) show that the N-and C-terminal regions exhibit high motility. Because it is a possibility that neighboring domains help shape the HIRAN domain termini, this high motility seen in our MD simulations may not be present in the cellular environment.
The folding/unfolding of secondary structures and their quantification throughout the MD trajectory is shown by the DSSP plot (Supplementary Figure S21). The hsHLTF apo DSSP data show that the secondary structures remained stable, with a more homogeneous number of residues therein (Supplementary Figure S22), and that N-terminal alpha-helix and helix located at the L1 are transient, confirming statements from other studies (Kile et al., 2015;Korzhnev et al., 2016). The scRad5 model had several secondary structures unfolding in all replicated simulations, with fewer residues within them (Supplementary Figure S22), in the majority at the a2-alpha-helix and at the b1 and b2 beta-sheets, in addition to an alpha-helix structure at the L1. The number of residues within secondary structures can be correlated with the stability of the model, as the variation heterogeneity of this number suggests that the structure undergoes blunter conformational changes that alter its patterns of hydrogen and inter-residue interactions. These analyses suggest that thsHLTF apo has greater structural stability than scRad5, and this instability can be the reason for the lack of experimental structural work on the S. cerevisiae Rad5 HIRAN domain.
These data suggest that the stability of the HIRAN domain of Rad5 may be dependent on interaction with neighboring domains. A previous work of the complete Rad5 protein using molecular modeling methods of the complete Rad5 protein, supported by small-angle X-ray scattering (SAXS) showed no such denaturation of the secondary structure (Gildenberg & Washington, 2019). Gildenberg & Washington, (2019) also performed a coarse-grained MD simulation technique, over a five ms period, and reported that theHIRAN domain of Rad5 exhibits molecular interactions with helicase domains, and it is more energetically favorable near them.

The ssDNA interaction pocket of the HIRAN domain has two conformational states
Structural clustering algorithms were used to acquire conformational samples from a large number of structures generated by MD simulations and help understand the dynamics of tertiary structures over simulation time (Coskuner & Uversky, 2017). The GROMOS (Daura et al., 1999) clustering algorithm was used to obtain representative structures from trajectory data from the hsHLTF apo and scRad5 simulations (Supplementary Figure S23). Together with the full visual Figure 6. Residues of the hsHLTF holo/ssDNA model that interact with ssDNA throughout the MD simulation time.
Plot of atomic contacts generated from the information extracted from the cpptraj tool and nativecontacts command from hsHLTF holo/ssDNA model. In A, the atoms of the residues that interact with their respective nucleotide (NT) atoms, in B, the tertiary structure of the model, and in C, the representation of these residues with their molecular surface. Residues that presented contacts during a period longer than 75% of the simulation time are represented in red, and those that presented between 25% and 75% are in orange.
analysis of all frames calculated at the production stage, these structures increased our understanding about the behavior of amino acid residues in the DNA interaction pocket.
Based on the analysis of the representative structures, we found that the tertiary structure of the hsHLTF apo domain behaves more uniformly than the scRad5. In both models without the ssDNA ribbon, the interaction pocket was not fully formed at any frame during the simulation. These observations suggest that this pocket does not spontaneously acquire its functional conformation. Thus, it is structured only in the presence of a ligand. The loop regions and their residues responsible for ssDNA interaction exhibit high spatial dynamics in the simulations. For conformational induction to occur, the region requires a complementary fit between the pair and various attempts at collisions and interactions between the protein and the ligand, justifying the need for a pocket with high mobility in its free form (Du et al., 2016). Conformational induction by the ligand allows the protein to interact with different substrate conformations, allowing mutations and variations in the ligand or binding site, providing an evolutionary advantage (Du et al., 2016).
The PDB ID 4S0N (Kile et al., 2015), herein referred to as hsHLTF holo , is the structure of the HLTF's HIRAN domain at its holo conformation, complexed with ssDNA, and was used as a comparison to analyze the spatial positions of amino acids in the pocket. During model preparation, all nucleotide atoms were removed to allow simulation of a single HIRAN domain chain without the ligand. This model was used to analyze whether the pocket, once fully structured and stabilized with ssDNA, adopted the conformation found in its apo form.
Comparison between hsHLTF apo and hsHLTF holo interaction pockets revealed the importance of residues R71 and Y72 for its formation (Figure 5A and 5B). The hsHLTF apo presents the R71 internalized into the pocket, with its side chain restricting the ssDNA interaction, while the R71 of the hsHLTF holo is externalized. This outward position of R71, in turn, internalizes the Y72 and makes possible the p-p stacking with nucleotides together with the Y93. It is important to note that this transition of spatial conformation from the hsHLTF apo model to the hsHLTF holo , and vice versa, was not observed during any frame of the production stage. The putative pocket of the HIRAN domain of Rad5 has similarities to the hsHLTF apo pocket. Although the key residues that interact with ssDNA are different, residues with similar side chain physicochemical properties are found at similar positions. All pockets have high spatial dynamics and the side chain conformation of the K194, related to R71, sterically restricts the entry of ssDNA when internalized to the pocket. ( Figure 5C).
The RMSD values of the hsHLTF holo show that this structure has lower structural mobility compared to the hsHLTF apo model (supplemental Table S2). This may be because the Plot of atomic contacts generated from the cpptraj tool and nativecontacts command from hsHLTF apo/ssDNA model. In A, the atoms of the residues that interact with their respective nucleotide (NT) atoms, in B, the tertiary structure of the model and in C, the representation of these residues with their molecular surface. Residues that presented contacts during a period longerthan 75% of the simulation time are represented in red, those between 25% and 75% are in orange and those lesser than 25% are in pink.
original structure originated from the crystallization method and is therefore a situational conformation that is more rigid, or because there is greater stability of HIRAN in its holo conformation with ssDNA (Dauter & Wlodawer, 2016;Su et al., 2015). The representative structures of the hsHLTF holo model (supplemental Figure S24) show a homogeneous profile without major structural deviations. This analysis suggests that the HIRAN domain of HLTF in its holo conformation requires a longer simulation time or another MD simulation method to observe the transition state between holo and apo pocket structures.
The scRad5 ssDNA has more electrostatic side chain residues in the interaction pocket and it is stabilized by the ssDNA The molecular dynamics simulation of the hsHLTF holo model with an ssDNA, herein referred to as hsHLTF holo/ssDNA , was In A, the atoms of the residues that interact with their respective nucleotide (NT) atoms, in B, the tertiary structure of the model, and in C, the representation of these residues with their molecular surface. Residues that presented contacts during a period longer than 75% of the simulation time are represented in red, those between 25% and 75% are in orange and those lesser than 25% are in pink. In A and B, the residues responsible for the ssDNA stabilization, extracted from de native contacts data, are represented in red in the models hsHLTF apo/ssDNA and scRad5 ssDNA, respectively. C) The superposition of the structures A and B, with the hsHLTF apo/ssDNA residues represented in blue and light blue and the scRad5 ssDNA residues in red and pink. used as reference for simulations involving the HIRAN and its ligand. We performed molecular docking simulations to acquire the initial structures used in dynamics with protein-ssDNA components ( Supplementary Figures S25 and S26). The RMSD data of the hsHLTF holo/ssDNA model (Supplementary Figure S27 and Supplementary Table S2) showed a more stable structure and less motility when compared to models without ssDNA. The RMSF data revealed that the loop regions involved in the interaction with the ssDNA had less spatial mobility than hsHLTF holo , without the ligand. Together, these results suggest that the presence of the ribbon interferes with the stability and structural flexibility of the HLTF's HIRAN domain.
The analyses of native and non-native contacts between the HIRAN domain with ssDNA in our simulations agree with the results obtained by Kile and colleagues (Kile et al., 2015), except for F142, which showed an interaction with NT1 for 25% of the simulation time ( Figure 6A). This is due to the exposure of the initial nucleotides of the model to the solvent and may not occur biologically. The p-p stacking interactions of tyrosine Y72 and Y93 ( Figure 6B) with the nitrogenous base of ssDNA, as well as the electrostatic interactions of N91 and H110 ( Figure 6B) with the phosphate chain, was observed throughout the simulation time and have shown the importance of these residues in the ribbon stabilization within the pocket tertiary structure. The contacts between residues R71 and K113 and the ssDNA remained through 25% to 75% of the simulation time ( Figure 6), demonstrating that they also aided the nucleotides stabilization.
The RMSD data of scRad5 ssDNA and hsHLTF apo/ssDNA (Supplementary Figure S28 and Supplementary Table S2) showed the systems in equilibrium without high variation in the backbone atoms distances, indicating that the DNA ribbon interfered with the stabilization of both HIRAN domain tertiary structures. The RMSF data showed that residues in loops regions did not decrease its mobility, suggesting that both structures did not find a stable and a minimum energy conformation with the ssDNA, as presented in the hsHLTF holo/ssDNA model. Analysis of the hsHLTF apo/ssDNA RMSF plot allowed us to infer that loop regions still have high motility, unlike the hsHLTF holo/ssDNA system, and the formation of the interaction pocket, similar to its holo conformation, was not observed.
The atoms' contact data of the hsHLTF apo/ssDNA ( Figure  7A) showed that Y93 atoms interact on average around 30% of the simulation time with the nitrogenous base atoms of NT4 (Supplementary Figure S29). In addition, the stabilization of the ribbon close to the interaction pocket was carried out mainly by electrostatic interactions of residues R71, N91, and H110 ( Figure 7B). These results allow us to hypothesize an initial stage for the pocket conformational change that pairs the tyrosine with the nucleotides, NT4 and NT3. Unlike the HIRAN domain model in its holo conformation, R71 got a longer interaction time with the ssDNA because it was positioned in the interior of the pocket, while Y72 was externalized and had only transient interactions with NT3 (Figure 7). The drastic decrease in atomic contacts from hsHLTF holo / ssDNA to hsHLTF apo /ssDNA can be attributed to the conformation of the interaction pocket. As shown in Figure  5, the hsHLTF apo model has a different spatial conformation of R71 and Y72, making it difficult to access the DNA strand. This difference may lead to steric constraints of the internal residues at the interaction site and no p-p stacking of nucleotides with Y72 and Y93.
The visual analysis of all frames generated by the production stage of hsHLTFapo /ssDNA and native and non-native contacts data showed that the Y93 residue, when pointing to the interior of the pocket and without spatial restriction, can pair and interact with one of the ssDNA nucleotides. Y93 can be the initial residue of interaction with the ssDNA and induce the pocket formation allowing the Y72 to interact with another nucleotide, through the conformational change with the R71, and consequently the stabilization of the ribbon from the electrostatic interactions of residues N91, K113 and H110. The substitutions of residues Y72 and Y93 by an alanine residue prevent the HLTF's HIRAN domain from interacting and complexing with ssDNA (Kile et al., 2015).
In the scRad5 ssDNA model, although Y214 is exposed to the pocket, its predominant contact was with the sugar portion of NT3 (O5' and O4' atoms) (Figure 8). This residue displayed some interactions with the N1 and C2 of NT2, showing the mobility of the ssDNA ribbon ( Figure 8A). The observed interactions cannot be characterized as p-p stacking since they were contacts with specific atoms and the hydroxyl group in its side chain. Despite the high motility and the different conformation, the phosphate chain had become more exposed and more stable in the scRad5 ssDNA system. The non-internalized Y195 in the yeast model has only momentary interactions with the NT1, again showing the motility of the residues exposed to the solvent and the ssDNA ( Figure 8A). The ssDNA remained stable near HIRAN domain of Rad5 because of several electrostatic interactions in the interaction pocket region: residues S243, R241, N191, and K221 presented atoms that had interactions with nucleotides atoms for over 75% of the simulation time, R229, R219, Q248 and E244 between 25% and 75% and the K194 by approximately 10% (Figure 8).
The scRad5 model had several amino acids with positive electrostatic side chains near the DNA interaction pocket, and the simulation data suggests that the ribbon stabilization was due to coulombic interactions. The hsHLTF holo/ssDNA and hsHLTF apo/ssDNA models have only five residues that electrostatically interact with ssDNA (R71, N91, K113, H110, and Y92). In comparison, the scRad5 ssDNA presented nine residues with contacts: N191, K194, R219, K221, R229, R241, S243, E244, and Q24 (Figure 9).
The DSSP analysis (Supplementary Figure S30) suggests that the scRad5 ssDNA is more stable than the scRad5, maintaining its secondary structures and increasing its count. The spatial restriction imposed from the ssDNA ribbon increased the inter-residue interactions and prevented the a2 alphahelix unfolding. The average number of residues with secondary structures was higher than the replicas of the scRad5 model (Supplementary Figure S31 and Supplementary Table  S3). Together with the greater homogeneity of the clustering of representative structures, we can hypothesize that the Rad5 HIRAN domain has greater stability within the presence of ssDNA.

Conclusions
Our work suggests that the tertiary structure of the HIRAN domain of Rad5 has a greater degree of freedom when compared to the HIRAN domain of HLTF in its apo conformation. The three-dimensional instability with constant unfolding of secondary structures of scRad5 models suggests the need for neighboring domains for its structural stabilization.
Structural analysis of the ssDNA interaction pocket revealed an open and a closed state of the pocket, and the transitions were not observed during our independent molecular dynamics' trajectory. These results suggest a possible conformational induction mechanism by ssDNA during the transition from the closed state of the HIRAN domain pocket to its open state, as this region exhibits great structural flexibility. In the HIRAN domain of HLTF, this initial recognition can be attributed to Y93, whereas in Rad5, it was not possible to draw clear conclusions. Based on our findings, we can hypothesize that the time required to observe the full mechanism of pocket formation from apo to holo conformation may require microsecond-scale simulations or another molecular dynamics simulation method, as inferred from the RMSF and RMSD. Analysis of the atomic contacts of the interactions of the HIRAN domain with ssDNA shows that the positively charged side of the scRad5 tertiary structure is critical for stabilizing the ssDNA in the pocket and has more residues in contact with the nucleotides.
Together with the previously published data on the HIRAN domain and our analysis of the key residues, our results provide new molecular insights into the mechanism of action of the HIRAN domain in the recognition process of the stalled replication fork. Here we showed that crucial residues for p-p-stacking and electrostatic interactions with ssDNA, such as Y72, Y93, N91 and H110 are conserved among distant species. Also, our data demonstrate the importance of these residues for band stabilization within the tertiary structure of the pocket. In the HIRAN domain of Rad5, stability is greater in the presence of ssDNA and may depend on interaction with neighboring domains.