Design of a protein-targeted DNA aptamer using atomistic simulation

Abstract The concentrations of specific macromolecular species can be quantified using diagnostic tools that rely on molecular recognition by nucleic acid aptamers. One such approach involves the formation of osmium tetroxide 2,2'-bipyridine protein adducts, followed by electrochemical detection of analytes that bind specifically to electrode-tethered aptamers. In conjunction with a 27-mer DNA aptamer that binds specifically to exosite II on human alpha thrombin, this technique permits, in theory, a highly sensitive diagnostic tool for the quantification of serum thrombin levels. However, thrombin's aptamer binding site is lined by two tryptophan residues and the conjugation of bulky osmium groups to these residues weakens aptamer binding by an estimated 4 to 12 kcal/mol, undermining detection sensitivity. Therefore, we have rationally modified this DNA aptamer to strengthen its thrombin binding in the presence of conjugated osmium. Specifically, aptamers carrying long hydrophobic thymine derivatives in place of guanine 21 have binding affinities for osmium-conjugated thrombin that are enhanced by 10 to 15 kcal/mol, suggesting that these modified aptamers may be effective in a highly sensitive electrochemical sensor for the quantification of low concentrations of thrombin. Our approach of using molecular simulation to subtly re-engineer a DNA aptamer may be generally applicable for the optimization of other macromolecular binding interfaces. Communicated by Ramaswamy H. Sarma


Introduction
Early detection of many diseases requires diagnostic tools that are both accurate and sensitive. Underlying this requirement for sensitivity in molecular detection is a need to maximize productive interactions with target molecules while minimizing non-specific signals. This molecular specificity can be attained through the use of biologically relevant binding partners (Diamandis & Christopoulos, 1991), antibodies (Bauer et al., 2019; Engvall & Perlmann, 1971; Yalow & Berson, 1996), or nucleic acid aptamers (Jayasena, 1999; Nahvi et al., 2002), among other approaches (Kaur et al., 2020; Li et al., 2020). Together with a mechanism of relatively low-noise signal generation upon or after binding, a diagnostic tool is born (Kaur et al., 2018; Zhou & Rossi, 2017). Applications of aptamers are of interest in many fields, including cancer diagnostics and therapeutics (Han et al., 2020; Kumar et al., 2020; Li et al., 2020), virus detection and antiviral therapy (Zou et al., 2019), and many other diseases (Jo & Ban, 2016).
Our goal is to develop a sensitive tool for forensic analysis and molecular diagnosis based on the electrochemical detection of specific proteins. In principle, detection can be accomplished by conjugating an electrochemically active moiety to all proteins in a solution and quantifying the resulting voltammetric response at gold electrodes, with molecular specificity furnished by electrode-immobilized aptamers as capture probes for the analyte. Here, we consider an osmium tetroxide 2,2'-bipyridine complex (OsO4,bipy) as an electrochemical source that can be conjugated to proteins at near physiological conditions (Billova et al., 2002; Fojta et al., 2008). Following the work of Deetz and Behrman (Deetz & Behrman, 1980, 1981), Billová et al. showed that OsO4,bipy forms adducts with tryptophan residues but does not significantly interact with glutamine, histidine, serine, tyrosine, glycine, leucine, arginine, or proline residues (Billova et al., 2002).
We initially focus on detecting the protein thrombin, which is a widely used target in aptamer development (Deng et al., 2014) because of its physiological relevance as a central protease of hemostasis and because its available DNA (Bock et al., 1992; Macaya et al., 1993; Martino et al., 2006; Padmanabhan et al., 1993; Tasset et al., 1997) and RNA (Orava et al., 2010; White et al., 2000) aptamer sequences are comparatively well studied, given that thrombin was one of the first proteins that does not normally bind nucleic acids for which an aptamer was designed to function in vitro (Bock et al., 1992). Specifically, we focus our evaluations on the HD22-27mer DNA aptamer developed by Tasset et al. (1997), for which a crystal structure in complex with thrombin is available (Krauss et al., 2013). The dissociation constant, K_d, of thrombin and HD22-27mer is unavailable in the literature, but a 29mer that differs only by the addition of one base pair in the stem has a K_d of 0.5-4 nM (Olmsted et al., 2011; Tasset et al., 1997). Nevertheless, the binding interface between thrombin and HD22-27mer is bordered by two tryptophan residues (Krauss et al., 2013) whose conjugation to OsO4,bipy may substantially reduce the aptamer's binding affinity (Galagedera et al., 2018). Therefore, we use computer simulations to quantify the binding affinity between this DNA aptamer and either unlabeled or OsO4,bipy-labeled thrombin. Indeed, our free energy simulations indicate that osmium adducts reduce aptamer binding strength by 4 to 12 kcal/mol. To alleviate this deficiency, we repeat these computations for a variety of modified aptamers, leading us to propose novel aptamers with a predicted 10 to 15 kcal/mol enhancement in binding affinity to osmium-labeled thrombin. Future work should attempt to validate these findings in electrochemical assays.

Results
We use computer simulations to quantify the binding between the HD22-27mer DNA aptamer and thrombin before and after osmium conjugation, later optimizing the aptamer sequence to facilitate its binding to osmium-conjugated thrombin. To attain converged estimates of thrombin-aptamer binding free energies, ΔG_bind, on attainable simulation timescales, we conduct free energy simulations that employ restraints to inhibit conformational reorganization of protein and DNA backbone atoms while leaving amino acid side chains and nucleotide bases free to move.

Conformational relaxation increases binding affinities
Our initial evaluation of ΔG_bind for thrombin and the HD22-27mer DNA aptamer is conducted in three ways, which differ based on the conformations to which these macromolecules are restrained. First, we employ the crystallographic conformation of Krauss et al. (2013) (T_cryst/A_cryst simulation system, where T and A represent thrombin and aptamer, respectively); second, restrained backbone positions are defined after a preliminary 30-ns simulation in which the protein, but not the aptamer, is allowed to relax in the macromolecular complex (T_relax/A_cryst); third, we allow prior relaxation of both the protein and the aptamer (T_relax/A_relax). This approach enables quantification of the extent to which computational estimates of ΔG_bind depend on system relaxation toward the bound-state conformational basin, which may be affected by force field parameters, the removal of crystallographic symmetry, and the transition from a solid state crystal to a liquid state with increased hydration. The root mean squared deviation of backbone atoms during relaxation is 0.35 nm (Supplementary material, Figure S1 and Table S1).
After generating the three aforementioned initial states, we use umbrella sampling virtual replica exchange (US-VREX) simulations to obtain estimates of ΔG_bind. Values of ΔG_bind become more favorable with increasing preliminary relaxation, where these values are −35 ± 4, −39 ± 1, and −41 ± 1 kcal/mol for T_cryst/A_cryst, T_relax/A_cryst, and T_relax/A_relax systems, respectively (Table 1).

Osmium conjugation disfavors aptamer binding
We adopt a similar approach to study simulation systems in which OsO4,bipy is covalently attached to thrombin tryptophan residues 96 and 237, hereafter referred to as Os-thrombin. Estimates of ΔG_bind are −23 ± 1, −35 ± 1, and −33 ± 2 kcal/mol for OT_cryst/A_cryst, OT_relax/A_cryst, and OT_relax/A_relax systems, respectively, where OT denotes Os-thrombin (Table 2). Therefore, the conjugation of osmium reduces thrombin's affinity for the aptamer by 4 to 12 kcal/mol, depending on the relaxation procedure (Table 2 and Supplementary material, Figure S1).
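The 4 to 12 kcal/mol range follows directly from pairing each Os-thrombin estimate with the unlabeled estimate obtained under the same relaxation protocol. The short sketch below reproduces that arithmetic from the values quoted in the text. The quadrature error propagation is our assumption (it treats the two estimates as independent) and is not a procedure described by the authors; the function and variable names are illustrative.

```python
# Binding free energies (kcal/mol) quoted in the text, as (mean, uncertainty).
thrombin = {
    "cryst/cryst": (-35, 4),
    "relax/cryst": (-39, 1),
    "relax/relax": (-41, 1),
}
os_thrombin = {
    "cryst/cryst": (-23, 1),
    "relax/cryst": (-35, 1),
    "relax/relax": (-33, 2),
}


def binding_penalty(unlabeled, labeled):
    """Return ddG = dG(labeled) - dG(unlabeled) and a propagated uncertainty.

    The uncertainty is combined in quadrature, which assumes the two
    estimates are statistically independent (an assumption made here).
    """
    ddg = labeled[0] - unlabeled[0]
    err = (labeled[1] ** 2 + unlabeled[1] ** 2) ** 0.5
    return ddg, err


# Positive values mean that osmium conjugation weakens binding.
penalties = {key: binding_penalty(thrombin[key], os_thrombin[key]) for key in thrombin}

for key, (ddg, err) in penalties.items():
    print(f"{key}: ddG = {ddg:+d} +/- {err:.1f} kcal/mol")
```

The smallest penalty comes from the relax/cryst pair and the largest from the cryst/cryst pair, which is why the reported range depends on the relaxation procedure.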

Enhancing aptamer affinity for Os-thrombin
Computational estimates of binding affinities are consistent with the experimentally observed strong binding between this aptamer and thrombin (Tables 1 and 2). Nevertheless, osmium conjugation reduces the aptamer's thrombin binding affinity by 4 to 12 kcal/mol (Table 2), jeopardizing the utility of this aptamer as a highly sensitive diagnostic tool. Therefore, we computationally evaluate twenty-one rationally modified aptamers in search of chemical moieties that strengthen Os-thrombin/aptamer binding.
Using the crystal structure of HD22-27mer and thrombin (Krauss et al., 2013) as a guide, we aim to enhance the binding affinity by modifying aptamer nucleotides at the DNA-protein interface while eschewing modification of the G-8, G-11, G-17, and G-20 nucleotides involved in the aptamer's core G-quadruplex structure (Krauss et al., 2013). These modifications follow three main approaches. First, we attempt to strengthen interactions between thrombin and aptamer nucleotides T-9, T-18 or T-24 by replacing these nucleotides with the thymine derivative 5-(carboxylic acid benzylamide)-2'-deoxyuridine (hereafter referred to as D1). Each of these three nucleotides contacts thrombin in the crystallographic complex (Krauss et al., 2013), and we reasoned that the terminal phenyl group on D1 could make favorable interactions with nearby hydrophobic groups on thrombin (nucleotide T-18 with residues Y89, P92, W237, V241, and F245; nucleotide T-9 with residues R93, N95, W96, and R97; and …
Figure 1. … Krauss et al. (2013). Nucleotides selected for modification, but later abandoned, are shaded grey. Nucleotide G-21, which is modified by D1, D2, and D3 groups to enhance Os-thrombin binding, is shaded black. (B) Snapshot of (orange ribbons and multicolored nucleotide bases) HD22-27mer DNA aptamer interacting with (grey) thrombin, with modeled (yellow) osmium conjugated to tryptophan residues 96 and 237, taken from an OT_relax/A_relax simulation. (C) Potentials of mean force (PMFs) describing center of mass distance-dependent free energies for thrombin/aptamer association. PMFs show data for (dashed black line) thrombin and the unmodified aptamer, (solid black line) Os-thrombin and the unmodified aptamer, and Os-thrombin and a modified aptamer in which G-21 is replaced with thymine derivatives (orange line) D1, (blue line) D2, and (red line) D3, where R represents the sugar moiety. Standard deviations of the means from block averaging are shown as shaded regions.
Table notes: (T) thrombin and (A) aptamer. Subscripts cryst and relax indicate the respective presence and absence of crystallographic restraints during the preliminary 30-ns relaxation; subscripts p and q represent cryst or relax.
These first two nucleotide modification strategies were generally unsuccessful; thirteen modifications were abandoned during equilibration because the modified nucleotides did not favorably associate with the protein (Supplementary material, Table S2) and five modifications were abandoned for failing to improve the aptamer's binding to Os-thrombin during US-VREX simulations (Supplementary material, Table S3). Our third approach was more successful. Here, we replace G-21 or G-22 nucleotides with longer, distally hydrophobic thymine derivatives in an effort to form stable interactions with a concave pocket in thrombin near residues R93, R101 and R233. Initial success in improving the binding free energy with D1 in place of G-21 led us to also modify G-21 with 5-(1-phenyl-1H-1,2,3-triazol-4-yl)-2'-deoxyuridine (D2), or 5-(1-pentyl-1H-1,2,3-triazol-4-yl)-2'-deoxyuridine (D3). In addition to their hydrophobic moieties, these thymine derivatives have an amide group (D1) or a triazole ring (D2 and D3) that is intended to mimic the interaction between G-21 and thrombin residue R93 that we observe in simulations of the unmodified aptamer and thrombin. Nucleotide derivatives D1, D2, and D3 were inspired by Cornell et al. (1995), Dierckx et al. (2011) and Vaught et al. (2010) and are structurally compared to guanine in Figure 1B. Derivatives D2 and D3 are chemically stable and easy to synthesize, while derivative D1 is both unstable and difficult to synthesize (data not shown). Importantly, replacing G-21 with any of these three thymine derivatives enhances the aptamer's affinity for Os-thrombin by 10 to 15 kcal/mol (Figure 1 and Table 3). The remainder of this section focuses on aptamers with D1, D2, or D3 substituents at G-21, which we denote as A^D1-21, A^D2-21, and A^D3-21, respectively.
Conformational ensembles from US-VREX simulations of Os-thrombin associating with the unmodified and A^D1-21, A^D2-21, and A^D3-21 aptamers are shown in Figure 2. Interestingly, replacement of nucleotide G-21 with these three thymine derivatives leads to a subtle alteration of Os-thrombin's exosite II, especially where it interacts with aptamer nucleotides 21 and 24 (Figure 2). Specifically, modified nucleotides D1, D2, and D3 at position 21 interact with a protein-surface cavity on Os-thrombin, which is formed by residues H91, R93, N98, D100, R101, R175, D178, M180 and R233, and which we refer to as pocket A. Concurrently, nucleotide T-24 interacts with another cavity, this one formed by Os-thrombin residues R126, S129B, L130, Q131, E164, K169, R165, D178, P181, H230, P232 and R233, which we refer to as pocket B. Both of these pockets on the aptamer-binding surface of Os-thrombin become substantially larger when nucleotide G-21 is replaced by thymine derivatives D1, D2, or D3, though the effect is more prominent in pocket B (Figure 2).
The unmodified aptamer frequently forms hydrogen bonds between the base of nucleotide G-21 and the side chain of Os-thrombin residue R93 (Figure 2A). In comparison, the aromatic ring of modified nucleotide D1 (OT/A^D1-21 system) extends deeper into Os-thrombin pocket A between R101 and R233 (Figure 2B). Concurrently, modified nucleotide D1 drives nucleotide T-24 deeper into Os-thrombin pocket B (Figure 2B). For the OT/A^D2-21 system, modified nucleotide D2 forms stable interactions with Os-thrombin, reaching deeply into binding pocket A (Figure 2C, Supplementary material, Movie M1). Nucleotide T-24 also inserts into Os-thrombin pocket B more deeply than in the unmodified aptamer, but to a lesser extent than observed in the OT/A^D1-21 system (Figure 2C). For the OT/A^D3-21 system, the flexible carbon chain of D3 tends to pack against Os-thrombin pocket A with extensive insertion of nucleotide T-24 into pocket B (Figure 2D). As intended, we observe the formation of a hydrogen bond between the modified D3 nucleotide's triazole ring and Os-thrombin's R93 side chain (Supplementary material, Figure S3), though it is unclear why this was not also observed for D2.

Discussion
We used restrained US-VREX simulations to compute binding free energies between thrombin and the HD22-27mer DNA aptamer. Simulations confirm tight binding of this aptamer to unmodified thrombin (Table 1 and Supplementary material, Figure S2A). However, the inclusion of osmium tetroxide 2,2'-bipyridine adducts at two tryptophan residues lining the aptamer binding interface reduces the aptamer/thrombin binding free energy by 4 to 12 kcal/mol (Table 2 and Supplementary material, Figure S2). This weakened binding upon OsO4,bipy conjugation motivates our attempt to rationally modify the aptamer to enhance its binding to thrombin, especially in the presence of OsO4,bipy. To this end, we evaluated twenty-one aptamer variants and identified three thymine derivatives at nucleotide G-21 that strengthen the aptamer's binding to osmium-conjugated thrombin by 10 to 15 kcal/mol (Table 3 and Figure 1C). These three thymine derivatives project into a protein-surface cavity that is not accessible to the shorter G-21 and concurrently induce nucleotide T-24 to interact more extensively with the protein (Figure 2). Two of these derivatives, 5-(1-phenyl-1H-1,2,3-triazol-4-yl)-2'-deoxyuridine and 5-(1-pentyl-1H-1,2,3-triazol-4-yl)-2'-deoxyuridine, which we refer to as D2 and D3, respectively, are both chemically stable and accessible via simple, high-yield chemical reactions.
… shorter HD22-27mer because its crystal structure has been solved in complex with thrombin (Krauss et al., 2013). Nevertheless, the 5' and 3' ends of HD22-27mer do not abut thrombin (Krauss et al., 2013), as shown in Figure 1B, suggesting that these 27mer and 29mer aptamers interact similarly with thrombin and therefore that our G-21 HD22-27mer derivatives will also enhance the 29mer aptamer's binding to thrombin, possibly allowing thrombin detection at even lower concentrations. One reason that the 27mer has reduced thrombin affinity in comparison to the 29mer appears to involve more extensive unfolding of the 27mer's terminal stem in the absence of thrombin, with splaying of the aptamer's 5' and 3' ends (Hao & Zhao, 2015; Zhao & Cheng, 2013). This reversible partial unfolding of HD22-27mer, while not necessarily problematic in solution, may reduce the affinity of the aptamer for thrombin when the aptamer is tethered to an electrode, further motivating use of the 29mer aptamer. Nevertheless, the use of D1, D2, and D3 thymine derivatives at nucleotide G-21 may also enhance thrombin detection by the excimer- and fluorescence-based systems of Hao, Zhao, and Cheng, which are predicated on the 5' and 3' ends of HD22-27mer splaying in solution and coming together upon binding to thrombin (Mobley et al., 2007; Zhao & Cheng, 2013).
In order to make our US-VREX simulations tractable, we applied conformational restraints to limit reorganization of protein and DNA backbones. This means that the free energies that we compute are not the same as binding free energies for unrestrained macromolecules because they lack the energetic contributions from releasing these restraints in both the bound and unbound states (Mobley et al., 2007). Instead of computing these additional free energy components, we proceeded under the assumption that these components will roughly cancel out when evaluating relative free energies between different aptamer sequences, which is our metric for selecting aptamer variants with enhanced binding to osmium-conjugated thrombin. Despite the fact that thymine derivatives D1, D2, and D3 are longer and more flexible than guanosine, and hence may lose more entropy as they rigidify upon binding to thrombin, this component of the free energy is captured by our US-VREX simulations, in which protein side chains and nucleotide bases are unrestrained. Furthermore, our simulations indicate that thymine derivative D3 remains extensively disordered in the bound state (Figure 2D), whereas even the unmodified G-21 appears to become locked in a single conformational basin upon thrombin binding (Figure 2A). However, our agnostic approach to large-scale conformational change invalidates interpretations of PMFs as absolute binding free energies given that the HD22-27mer undergoes a folding transition upon binding thrombin (Mobley et al., 2007; Zhao & Cheng, 2013). Therefore, it would be inaccurate to conclude from Figure 2B that the unmodified aptamer binds strongly to osmium-labeled thrombin because we have not computed the free energy associated with aptamer folding in isolation. This latter free energy is exceptionally difficult to compute, even for substantially shorter nucleic acids (Chen & García, 2013).
Nevertheless, our focus on relative binding free energies of thrombin and a cohort of similar nucleic acids circumvents the need to fully characterize DNA folding while enabling the prediction of DNA modifications that will enhance binding of the folded aptamer to osmium-labeled thrombin.

Conclusions
We introduce a computational approach to estimate relative binding free energies of macromolecules that relies on restraints to limit large-scale conformational changes while facilitating induced-fit reorganization of protein side chains and nucleotide bases in an importance sampling procedure that permits repeated macromolecular association and dissociation while remaining at equilibrium in all orthogonal degrees of freedom. Our study reveals that specific DNA-protein binding interactions can be tuned by nucleotide modifications on the aptamer's protein recognition surface. We identified three thymine derivatives whose inclusion in place of a single guanosine nucleotide is predicted to substantially strengthen the aptamer's binding to thrombin, even in the presence of bulky osmium adducts. These results provide mechanisms to potentially overcome a limiting hurdle to the construction of an exquisitely sensitive diagnostic tool for the quantification of thrombin in human serum.

System setup
Simulation systems comprise the 259-residue heavy B-chain of human alpha thrombin and the HD22-27mer DNA aptamer (Tasset et al., 1997) in explicit water. We utilize four types of simulation systems in which thrombin is either unlabeled (T) or conjugated to OsO4,bipy (OT) and the aptamer is either unmodified (A) or rationally modified in an attempt to enhance its binding affinity to osmium-labeled thrombin (A^D), where the superscript D denotes the chemical derivative identifier. Starting configurations are from the co-complex crystal structure of Krauss et al. (2013) (PDB ID: 4I7Y), in which the aptamer is bound at thrombin exosite II. Protein termini are zwitterionic. Missing backbone atoms are modeled with the program Loopy (Soto et al., 2008; Xiang et al., 2002). Missing side chain atoms are modeled with the program SCWRL4 (Krivov et al., 2009). All titratable residues are in their standard states for pH 7. The unit cell is approximately 9.6 × 6.1 × 6.9 nm, providing a minimum macromolecular distance between bound-state periodic images of 2.0 nm at the largest aptamer-protein displacements. Macromolecules are solvated with approximately 11,800 water molecules, the system is neutralized, and excess NaCl is added at 0.38 M to model the systems of Krauss et al. (2013). Salt is never added near the protein-DNA interface. Systems are energy minimized with 5000 steps of steepest-descent.
Systems are equilibrated by 30 ns of NpT simulation in which harmonic position restraints are applied to protein and DNA backbone atoms in macromolecules designated for later simulation in their crystallographic conformation and orientation (denoted by the subscript "cryst") but not to those macromolecules that are permitted to relax into the liquid-state simulation conditions (denoted by the subscript "relax"). For example, T_cryst/A_cryst systems retain all aforementioned restraints during this 30 ns equilibration, whereas T_relax/A_cryst systems have no protein restraints but do have DNA restraints. During this equilibration, restraint force constants are 1000 kJ/mol/nm^2 in all Cartesian dimensions.
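To give a sense of scale for the stated salt concentration, the sketch below estimates how many excess NaCl ion pairs correspond to 0.38 M using two common conventions: relative to the number of water molecules, and relative to the unit-cell volume. The text does not say which convention was used, so both are shown as assumptions; the bulk-water molarity of 55.5 mol/L and the function names are ours, and neutralizing counterions are not included.

```python
AVOGADRO = 6.02214076e23   # particles per mole
WATER_MOLARITY = 55.5      # mol/L, approximate value for bulk water (assumption)


def ion_pairs_from_waters(n_waters, molar):
    """Ion pairs needed so that salt:water matches `molar` in bulk water."""
    return round(n_waters * molar / WATER_MOLARITY)


def ion_pairs_from_volume(box_nm, molar):
    """Ion pairs needed for `molar` over the full unit-cell volume."""
    volume_nm3 = box_nm[0] * box_nm[1] * box_nm[2]
    volume_litres = volume_nm3 * 1e-24  # 1 nm^3 = 1e-24 L
    return round(molar * volume_litres * AVOGADRO)


# Approximate system dimensions quoted in the text.
pairs_by_water = ion_pairs_from_waters(11_800, 0.38)
pairs_by_volume = ion_pairs_from_volume((9.6, 6.1, 6.9), 0.38)

print("NaCl pairs (water-count convention):", pairs_by_water)
print("NaCl pairs (box-volume convention):", pairs_by_volume)
```

The two conventions differ because the box volume also contains the protein and DNA; either way, the excess salt amounts to on the order of 80 to 90 ion pairs.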

Modified aptamers
The unmodified HD22-27mer sequence is GTCCGTGGTAGGGCAGGTTGGGGTGAC. Non-standard thymine-based nucleotides shown in Figure 1B are used to replace guanine at site 21 to generate three modified aptamers (A^D1-21, A^D2-21, and A^D3-21), in addition to other modifications noted in Tables S2 and S3 (Supplementary material). Hartree-Fock theory with 6-31G* basis sets (Cornell et al., 1993) is employed to obtain energy-minimized conformations for non-standard nucleotides and partial charges are obtained using the R.E.D. server (Dupradeau et al., 2010; Vanquelef et al., 2011). Lennard-Jones and bonded terms are taken from similar chemical groups in the AMBER99 force field (Cornell et al., 1995) with the modifications of Chen and García (2013).
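As a quick consistency check on the nucleotide numbering used throughout, the sketch below indexes the 27-nucleotide sequence from 1 at the 5' end and confirms that the positions named in the text carry the stated bases. The helper names and the list-of-labels representation of a modified aptamer are illustrative choices, not part of the original workflow.

```python
HD22_27MER = "GTCCGTGGTAGGGCAGGTTGGGGTGAC"


def base_at(sequence, position):
    """Return the base at a 1-indexed position counted from the 5' end."""
    return sequence[position - 1]


def substitute(sequence, position, label):
    """Return residue labels with one 1-indexed position replaced by `label`."""
    residues = list(sequence)
    residues[position - 1] = label
    return residues


# Positions named in the text and the base each is expected to carry.
expected = {8: "G", 11: "G", 17: "G", 20: "G",   # G-quadruplex core, left unmodified
            9: "T", 18: "T", 24: "T",            # first modification strategy
            21: "G", 22: "G"}                    # third modification strategy

for position, base in sorted(expected.items()):
    print(f"{base}-{position}:", base_at(HD22_27MER, position) == base)

# Representation of the aptamer with derivative D2 in place of G-21.
aptamer_d2_21 = substitute(HD22_27MER, 21, "D2")
print(len(HD22_27MER), aptamer_d2_21[19:22])
```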

Simulation protocol
MD simulations are conducted with version 4.5.5 of the GROMACS simulation package. DNA is modeled with the AMBER99 force field (Cornell et al., 1995) and the Lennard-Jones modifications of Chen and García (2013). The water model is TIP3P (Jorgensen et al., 1983). Water molecules are rigidified with SETTLE (Miyamoto & Kollman, 1992) and macromolecular bond lengths are constrained with P-LINCS (Hess, 2008). Lennard-Jones interactions are evaluated using a group-based cutoff, truncated at 1 nm without a smoothing function. Coulomb interactions are calculated using the smooth particle-mesh Ewald method (Darden et al., 1993; Essmann et al., 1995) with a Fourier grid spacing of 0.12 nm. Simulation in the NpT ensemble is achieved by isotropic coupling to a Berendsen barostat (Berendsen et al., 1984) at 1 bar with a coupling constant of 4 ps and temperature-coupling the simulation system using velocity Langevin dynamics (van Gunsteren & Berendsen, 1988) at 300 K with a coupling constant of 1 ps. The integration time step is 2 fs. The non-bonded pair-list is updated every 20 fs.
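For readers who wish to reproduce a comparable setup, the sketch below collects the stated settings into GROMACS-style run parameters and derives the pair-list update interval in steps. The mapping of the prose onto option names (for example, `integrator = sd` for velocity Langevin dynamics) is our reading of the protocol and not the authors' input file. Settings that the text does not state, such as the Coulomb cutoff, compressibility, and constraint scope, are deliberately omitted, so the output is not a complete input file.

```python
def pairlist_interval_steps(update_interval_fs, time_step_fs):
    """Convert a pair-list update interval in fs to a number of MD steps."""
    return round(update_interval_fs / time_step_fs)


# Run parameters inferred from the text (option names are our assumption).
run_parameters = {
    "integrator": "sd",             # leap-frog stochastic (Langevin) dynamics
    "dt": 0.002,                    # ps, i.e. a 2 fs time step
    "nstlist": pairlist_interval_steps(20, 2),
    "vdwtype": "cut-off",           # plain truncation, no smoothing
    "rvdw": 1.0,                    # nm
    "coulombtype": "PME",
    "fourierspacing": 0.12,         # nm
    "tau_t": 1.0,                   # ps
    "ref_t": 300,                   # K
    "pcoupl": "berendsen",
    "pcoupltype": "isotropic",
    "tau_p": 4.0,                   # ps
    "ref_p": 1.0,                   # bar
    "constraint_algorithm": "lincs",
}


def render_parameters(parameters):
    """Format parameters as 'key = value' lines, one per option."""
    return "\n".join(f"{key:<22}= {value}" for key, value in parameters.items())


mdp_text = render_parameters(run_parameters)
print(mdp_text)
```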

Umbrella sampling with virtual replica exchange
We use the virtual replica exchange (VREX) approach of Rauscher et al. (2009) in the context of umbrella sampling (US) (Torrie & Valleau, 1977), combined by Neale et al. as US-VREX (Neale et al., 2013), to flatten the free energy profile of macromolecular displacement while simultaneously allowing free diffusion along this slowly relaxing degree of freedom.
In US-VREX simulations, the displacement of the centers of mass of thrombin and the aptamer, projected onto the Cartesian x axis, d, is harmonically restrained at a specified value, d_i^0, in each umbrella i, with a force constant, k_u, of 2000 kJ/mol/nm^2. Stochastic jumps in d_i^0 are attempted every 4 ps for a total of 90 ns per replica (2.25 × 10^4 exchange attempts per replica). These US-VREX simulations are conducted for nine systems (Tables 1-3). For each system, 29 umbrellas span 2.4 ≤ d_i^0 ≤ 3.8 nm in 0.05 nm increments. No values are stored on virtual-exchange lists (Neale et al., 2013) in the first 250 simulation segments of any replicas and no exchanges are attempted for the first 300 simulation segments. Sampled values are stored every 0.2 ps.
To significantly reduce the computational investment required to attain convergence, protein and aptamer are prevented from unfolding or changing their relative orientations. To suppress global orientation changes of both macromolecules, large-scale conformational change in the protein, and many conformational changes in the DNA, we apply harmonic restraints on the absolute positions of protein Cα atoms in all Cartesian dimensions and on DNA backbone atoms in Cartesian y and z dimensions (the Cartesian x positions of DNA backbone atoms are allowed to move freely so as to allow reversible macromolecular association and dissociation during US-VREX). Concurrently, we suppress large-scale conformational changes involving changes in displacement between DNA backbone atoms along the Cartesian x dimension by harmonically restraining intra-nucleotide distances between DNA backbone atoms. All restraints use force constants of 1000 kJ/mol/nm^2. All of these restraints are enforced during US-VREX simulations but absent during the preceding 30-ns unrestrained equilibration period.
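The umbrella layout can be checked with a few lines of arithmetic. The sketch below generates the umbrella centers from the stated range and spacing, counts exchange attempts from the stated run length and attempt interval, and evaluates the harmonic bias. It assumes the bias takes the form (1/2) k_u (d − d_i^0)^2, the usual convention in GROMACS, which the text does not spell out; function names are illustrative.

```python
def umbrella_centers(d_min=2.4, d_max=3.8, step=0.05):
    """Return evenly spaced umbrella centers (nm), including both endpoints."""
    count = round((d_max - d_min) / step) + 1
    return [round(d_min + i * step, 2) for i in range(count)]


def harmonic_bias(d, d0, force_constant=2000.0):
    """Bias energy in kJ/mol, assuming the form 0.5 * k * (d - d0)**2."""
    return 0.5 * force_constant * (d - d0) ** 2


def exchange_attempts(total_ns=90, attempt_interval_ps=4):
    """Number of stochastic jump attempts per replica."""
    return round(total_ns * 1000 / attempt_interval_ps)


centers = umbrella_centers()
attempts = exchange_attempts()
# Bias felt at the center of a neighboring umbrella, one spacing away.
neighbor_bias = harmonic_bias(centers[1], centers[0])

print(len(centers), centers[0], centers[-1])
print(attempts)
print(round(neighbor_bias, 3), "kJ/mol")
```

Under this assumed functional form, a configuration sitting at the center of an adjacent umbrella carries a bias of about 2.5 kJ/mol, comparable to thermal energy at 300 K, which is consistent with frequent exchanges between neighbors.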

Free energies
The values of d sampled in US-VREX simulations are converted to potentials of mean force, PMFs, using Alan Grossfield's implementation (Grossfield) of the weighted histogram analysis method (WHAM) (Kumar et al., 1992). To this end, recorded values of protein-DNA separation in the range 2.4 ≤ d ≤ 3.8 nm are distributed among 500 histogram bins and the WHAM calculation is performed with a tolerance of 4.184 × 10^−5 kJ/mol. This procedure is repeated for each set of US-VREX simulations. Each resulting PMF describes the free energy as a function of separation distance, ΔG_d. Each PMF is then shifted such that the average value of ΔG_d in the range 3.6 ≤ d ≤ 3.8 nm, ΔG_bulk, equals zero. The standard deviation of the mean for each location along the PMF is calculated using the block averaging technique (Flyvbjerg & Petersen, 1989) with blocks of 15 ns. Finally, the binding free energy, ΔG_bind, is determined by the lowest energy of the PMF in the range 2.4 ≤ |d| ≤ 3.8 nm. After discarding initial sampling as equilibration (30 ns per umbrella), the uncertainty in ΔG_bind is estimated by its standard deviation, σ_ΔG, which is calculated after dividing the remaining trajectory into four 15-ns blocks.
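The post-processing applied to each PMF (shift to a zero bulk baseline, then take the minimum) can be sketched as follows. The PMF values below are synthetic placeholders chosen only to exercise the functions, not simulation results. The use of the sample standard deviation across per-block estimates is our assumption, since the text does not specify the estimator, and all function names are illustrative.

```python
import statistics


def shift_to_bulk(distances, pmf, bulk_lo=3.6, bulk_hi=3.8):
    """Shift a PMF so its mean over the bulk window equals zero."""
    bulk = [g for d, g in zip(distances, pmf) if bulk_lo <= d <= bulk_hi]
    offset = sum(bulk) / len(bulk)
    return [g - offset for g in pmf]


def binding_free_energy(distances, pmf, d_lo=2.4, d_hi=3.8):
    """Return the lowest value of the bulk-shifted PMF within [d_lo, d_hi]."""
    shifted = shift_to_bulk(distances, pmf)
    return min(g for d, g in zip(distances, shifted) if d_lo <= abs(d) <= d_hi)


def block_uncertainty(block_estimates):
    """Sample standard deviation across per-block estimates (assumption)."""
    return statistics.stdev(block_estimates)


# Synthetic PMF on a coarse grid (nm, kcal/mol), for illustration only.
toy_distances = [2.4, 3.0, 3.6, 3.8]
toy_pmf = [5.0, -30.0, 1.0, 3.0]
toy_dg_bind = binding_free_energy(toy_distances, toy_pmf)

# Synthetic per-block estimates standing in for four 15-ns blocks.
toy_blocks = [-32.0, -31.0, -33.0, -32.0]
toy_sigma = block_uncertainty(toy_blocks)

print(toy_dg_bind, round(toy_sigma, 3))
```

In practice the same functions would be applied to the WHAM output for the full 60 ns of retained sampling and separately to each 15-ns block.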

Disclosure statement
No potential conflict of interest was reported by the authors.

Funding
The author(s) reported there is no funding associated with the work featured in this article.