Conformational dynamics of full-length inducible human Hsp70 derived from microsecond molecular dynamics simulations in explicit solvent

Human 70 kDa heat shock protein (hHsp70) is an ATP-dependent chaperone and is currently an important target for developing new drugs in cancer therapy. Knowledge of the conformations of hHsp70 is central to understand the interactions between its nucleotide-binding domain (NBD) and substrate-binding domain (SBD) and is a prerequisite to design inhibitors. The conformations of ADP-bound (or nucleotide-free) hHsp70 and ATP-bound hHsp70 was investigated by using unbiased all-atom molecular dynamics (MD) simulations of homology models of hHsp70 in explicit solvent on a timescale of .5 and 2.7 μs, respectively. The conformational heterogeneity of hHsp70 was analyzed by computing effective free-energy landscapes (FELs) and distance distribution between selected pair of residues. These theoretical data were compared with those extracted from single-molecule Förster resonance energy transfer (FRET) experiments and to small-angle X-ray scattering (SAXS) data of Hsp70 homologs. The distance between a pair of residues in FRET is systematically larger than the distance computed in MD which is interpreted as an effect of the size and of the dynamics of the fluorescent probes. The origin of the conformational heterogeneity of hHsp70 in the ATP-bound state is due to different binding modes of the helix B of the SBD onto the NBD. In the ADP-bound (or nucleotide-free) state, it arises from the different closed conformations of the SBD and from the different positions of the SBD relative to the NBD. In each nucleotide-binding state, Hsp70 is better represented by an ensemble of conformations on a μs timescale corresponding to different local minima of the FEL. An animated interactive 3D complement (I3DC) is available in Proteopedia at http://proteopedia.org/w/Journal:JBSD:30

All Hsp70 homologs share a common architecture composed of a N-terminal nucleotide-binding domain (NBD) (Flaherty, DeLuca-Flaherty, & McKay, 1990) which catalyzes ATP hydrolysis, and of a C-terminal domain (Zhu et al., 1996) which is divided into a substrate-binding domain (SBD), which binds to nonnative or unfolded substrate proteins, and a C-unstructured terminal domain (Figure 1). The SBD is composed of a β-sandwich subdomain (SBD-β) which has a hydrophobic peptide binding pocket, and a α-helical "lid" subdomain (SBD-α) which regulates the access of the substrate to the SBD-β (Zhu et al., 1996). The NBD and the SBD are connected by a conserved hydrophobic linker (Hunt & Morimoto, 1985).
Hsp70s occur in two main conformations to perform their chaperone function, we named open state or open conformation (with ATP bound to the NBD) and closed state or closed conformation (with ADP-bound to the NBD or without nucleotide) (Figure 1). In ATP-bound Hsp70, the SBD is opened with fast binding and release of the protein substrate, and the SBD and NBD are docked, as shown by low-resolution small-angle X-ray scattering (SAXS) data (Shi, Kataka, & Fink, 1996;Wilbanks, Chen, Tsuruta, Hodgson, & McKay, 1995) and suggested by the X-ray structure of an ATP-Hsp110 homolog (Liu & Hendrickson, 2007;Schuermann et al., 2008). In the nucleotide-free Hsp70 and in ADP-bound Hsp70, the SBD is assumed to be closed with low binding and release rate of the protein substrate. However, recent experiments shown that the complete closure of the SBD is not strictly required in the chaperone protein interactions (Schlecht, Erbse, Bukau, & Mayer, 2011). In absence of ATP, the SBD and the NBD are undocked and the interdomain linker is exposed to solvent, as shown by SAXS data (Shi et al., 1996;Wilbanks et al., 1995) and by the (low-resolution) two-domain Hsp70 NMR derived structure (Bertelsen, Chang, Gestwicki, & Zuiderweg, 2009).
Although the main steps of the chaperoning cycle of Hsp70s are clearly identified as described above, the details of the conformational changes remain unclear. Recently, the conformations of three Hsp70s in low (picomolar) protein concentration, the murine endoplasmic reticulum Hsp70 (BiP) (Marcinowski et al., 2011), the yeast mitochondrial Hsp70 (Ssc1) (Mapa et al., 2010), and the bacterial (Escherichia coli [E. coli]) Hsp70 (DnaK) (Mapa et al., 2010) were probed by using single-pair Förster Resonance Energy Transfer (spFRET) to monitor conformational changes upon nucleotide binding and protein substrate interactions. In each spFRET measurement, two selected residues belonging to two different subdomains of the protein were replaced by two cysteine residues covalently linked to a donor fluorophor and an acceptor fluorophor, respectively. The distance distribution between the two dye molecules was extracted from the single-molecule histogram FRET (spFRET) efficiency (Mapa et al., 2010;Marcinowski et al., 2011). These spFRET data revealed that the isolated molecules of Hsp70s sample a relatively broad ensemble of conformations, as shown by the large width of the distance distributions that were observed (see Section 3). Because of the absence of experimental and theoretical data of the dynamics of Hsp70s at the atomic resolution, no detailed interpretation of the conformational heterogeneity observed by spFRET was possible.
As discussed above, the current knowledge of the structures and dynamics of Hsp70s is incomplete. Moreover, there is no structural data for the full-length human inducible Hsp70, which is an important target in pharmacology and has a huge interest in medicine (Broadley & Hartl, 2009;Rérole et al., 2011;Stangl et al., 2011;Wang, 2011). One may expect that both human ATPbound Hsp70 and human ADP-bound Hsp70 should be best represented by an ensemble of conformations similar to the one observed for BiP in spFRET (Marcinowski et al., 2011), because of the high homology of sequence between these two chaperones (65% of residue identity and 81% of positive residues, see Section 2). Characterization of the conformational ensemble of hHsp70 is of prime interest to design new inhibitors and to understand the interactions between its domains. To generate an ensemble of possible conformations explored by a human inducible Hsp70 chaperone (hHsp70) in solution, we used all-atom molecular dynamics (MD) simulations in explicit solvent on a microsecond timescale of hHsp70 models. Models of ATP-and ADP-bound hHsp70 were built previously by using homology modeling and all-atom MD simulations were performed in explicit solvent (Nicolai, Senet, Delarue, & Ripoll, 2010). Our previous (Nicolai et al., 2010) and present MD simulations of hHsp70 in solution amount up to 2.7 and .5 μs of total simulation time for the open and closed conformations, respectively. These simulation times are rather long for all-atom unbiased MD simulations of a multidomain protein in regard of their computational cost. Indeed, because of the size of hHsp70 (641 amino acids), the simulation box of the protein in solution contained several hundreds of thousands of atoms and the present MD simulations correspond to about 4 million hours of CPU time on a monoprocessor at the time of writing. In spite of the current limitations of MD of hHsp70 (see Section 4), the MD simulations provide the first atomistic description of the ensemble of conformations of an Hsp70 chaperone accessible on a sub-μs timescale in the absence of a protein substrate and of the co-chaperones, and permit to identify (at least qualitatively) the full-length structures corresponding to the different distances measured in the aforementioned spFRET experiments of Hsp70 homologs.
The computational methods are described in Section 2. The relaxation of the homology models and the distance distributions predicted for human Hsp70 are presented in Section 3. The latter are compared to those extracted from spFRET data of Hsp70 chaperones there. The results are discussed in Section 4. Main conclusions and perspectives are given in Section 5.

Molecular dynamics simulations
As described in details elsewhere (Nicolai et al., 2010), the initial models of human inducible Hsp70 in an open state and in a closed state were built by homology modeling based on the templates Saccharomyces cerevisiae Sse1 (Hsp110) (PDB ID: 3C7 N chain A) (Schuermann et al., 2008) and E. coli DnaK (PDB ID: 2KHO) (Bertelsen et al., 2009), respectively ( Figure 1). The models were relaxed by using all-atom MD simulations in explicit water with the GROMACS software package (Lindahl, Hess, & van der Spoel, 2001) using the GROMOS96 ffG43a1 force field (Scott et al., 1999) and the simple point-charge (SPC) water model (Berendsen, Postma, van Gunsteren, & Hermans, 1981). The time step used in all simulations was .001 ps and the list of neighbors was updated every .005 ps with the "grid" method and a cut-off radius of 1 nm. The coordinates of all the atoms in the simulation box were saved every 2 ps. The initial velocities were chosen randomly. We used the NPT ensemble with a cubic box. The temperature and pressure were kept to the desired value by using the Berendsen method (Berendsen, Postma, van Gunsteren, DiNola, & Haak, 1984) and an isotropic coupling for the pressure (T = 300 K, τ T = .1 ps; P 0 = 1 bar, coupling time τ p = 1 ps). The electrostatic term was computed by using the Particle Mesh Ewald (PME) algorithm (Darden, York, & Pedersen, 1993;Essmann et al., 1995) (with a radius of 1 nm) with the Fast Fourier Transform optimization (on 117 points for each axe for the open model and on 160 points for the closed model with an order equal to 4 for the interpolation). The "cut-off" algorithm was applied for the noncoulomb potentials with a radius of 1 nm. The system was warmed up for 40 ps and equilibrated for 600 ps with lower restraints, finishing with no restraints at 300 K.
A summary of the MD runs is shown in Table 1. Six runs were performed for hHsp70 in an open state and three runs for hHsp70 in a closed state. Runs 1-4 in the open state correspond to runs named APO-1, AOP-2, ADP-1, and ADP-2 in Ref. (Nicolai et al., 2010). In addition, we performed here two new MD runs (runs 5 and 6) of hHsp70 in the open state with the ATP bound in the NBD. Run 1 in the closed state of hHsp70 corresponds to the MD run performed in Nicolai et al., (2010) that we have extended here up to 200 ns to improve the statistics. Runs 2 and 3 in the closed state of hHsp70 are new MD runs with different initial conditions. Only the parameters of the new MD runs are given next.

Run 5 and run 6 of the open model
The initial structure of ATP-hHsp70 model in an open state was built by using the structure of the nucleotidefree hHsp70 (Nicolai et al., 2010). We added one ATP molecule with PRODRG (Schuttelkopf & van Aalten, 2004) and one Ca 2+ ion (as in the experimental structure of the human Hsp70 NBD, PDB ID: 1HJO) (Osipiuk, Walsh, Freeman, Morimoto, & Joachimiak, 1999) to this APO structure. The protein was solvated with 81,986 water molecules in a cubic box with a dimension of 13.69 nm and 13 Na + ions were added to neutralize the system. The initial random velocities were different for the two runs. The total simulation times were 900 and 570 ns for the runs 5 and 6, respectively 2.1.2. Run 2 of the closed model The initial structure of the nucleotide-free hHsp70 in a closed state was built by using the structure of DnaK (PDB code: 2KHO) as a template ). The initial structure was solvated in a cubic box with 212,632 SPC water molecules keeping a minimum distance of .9 nm between the solute and each face of the box. We used periodic boundary conditions with an initial value of the box side of 18.664 nm. The charge of the system in the closed model was neutralized by adding 11 Na + counter-ions. The energy of the system was first optimized with the "steepest descent minimization" algorithm and then by using a "conjugate gradient" algorithm. The total simulation time was 142 ns.

Run 3 of the closed model
The initial structure of ADP-hHsp70 model in a closed state was built by using the structure of the nucleotidefree Hsp70 relaxed after 3 ns in run 1 (Nicolai et al., 2010). We added one ADP molecule with PRODRG (Schuttelkopf & van Aalten, 2004) and one Ca 2+ ion (as in the experimental structure of the human HSP70 NBD, PDB ID: 1HJO) (Osipiuk et al., 1999) to this APO structure. The protein was solvated with 212,724 water molecules in a cubic box of the same dimension than as in run 1 (i.e. 18.664 nm) and 12 Na + ions were added to neutralize the system. The total simulation time period was 188 ns.
As explained in Nicolai et al. (2010) and in Section 3, the relaxation of the hHsp70 model in a closed state is faster than the one in the open state, which justifies that the duration and number of runs are smaller in the closed state compared with the open state (Table 1). In addition, as shown in Table 1, the MD simulations were performed for different time lengths in a given structural state (open or closed), because the relaxation time (defined below) towards a stable structure varied significantly depending on the initial conditions (due to the complexity of the structure and the stochastic nature of the relaxation process). Table 1 In each MD trajectory, the convergence and stability of the hHsp70 structural models were monitored by computing the radius of gyration R g , the C α root-mean-square deviation (RMSD), and the C α root-mean-square fluctuations (RMSF) in function of time as described in Nicolai et al., (2010). The end of the relaxation time correspond to convergence of the RMSD and RMSF which occurred after the final major structural events observed in the MD runs, i.e. to the binding of helix B of the SBD on the NBD in the open state and to the motion of the SBD towards the NBD in the closed state (see Section 3). Because the simulations of hHsp70 are computationally demanding and our computational resources are limited, the MD runs were continued (production time period) for about only 150-200 ns after the relaxation time period (Table 1).

Alignment of the Hsp70s sequences
In FRET experiments of Hsp70s, fluorophors derived from rhodamine 6G were attached to residues T168, T518, G583, and G636 of BiP, (Marcinowski et al., 2011) to residues D341, I448, and D590 of Ssc1 (Mapa et al., 2010), and to residues E318, V425, N458, and T563 of DnaK (Mapa et al., 2010). The distance distri- butions measured between these labeled residues were compared with those computed from MD between the C α of residues located at equivalent positions in hHsp70, and with those computed from the experimental structures of Hsp70 protein homologs; Sse1 (PDB ID: 3C7 N chain A) and DnaK (PDB ID: 2KHO). The positions of the residues in hHsp70, equivalent to the positions of the fluorophors used in spFRET were computed by aligning the sequences of human Hsp70 (unitprot: P08107), Ssc1 (unitprot ID: P12398), DnaK (unitprot ID: P0A6Y8), BiP (uniprot ID: P11021), and Sse1 (uniprot ID: P32589) using the BLAST algorithm (Altschul, Gish, Miller, Myers, & Lipman, 1990). According to these sequence alignments, the labeled residues in the FRET experiments of BiP (Marcinowski et al., 2011) correspond to residues T140, T495, K559, and Q612 of hHsp70; and those of Ssc1 and DnaK (Mapa et al., 2010) correspond to E315, I427, and D572 of hHsp70. The positions of the residues for which we compute the distance distributions (Section 2.3.) are shown in cartoon representations of hHsp70 in Figure 2.
The similarity between human Hsp70 and Hsp70s of other species was evaluated by the local alignments of the sequence of hHsp70 against the sequences of BiP, Ssc1, and DnaK by using the BLAST protein (BLASTp) program (Altschul et al., 1997). All the results are summarized in Table 2.

Distance distributions computed from MD
A quantitative prediction of the distance distributions comparable to those extracted from the spFRET efficiencies would require to simulate the protein labeled with the fluorescent molecules (Badali & Gradinaru, 2011;Corry & Jayatilaka, 2008;Hoefling et al., 2011;Merchant, Best, Louis, Gopich, & Eaton, 2007) and to model the FRET mechanism. This was recently undertaken for standard fluorophors attached to polypeptides (Hoefling et al., 2011). Such an approach is difficult to apply for a large and complicated protein like hHsp70 (Corry & Jayatilaka, 2008;Hoefling et al., 2011). In addition, to predict the protein structure modifications introduced by the cysteine mutations and the fluorescent  organic molecules, it is necessary to build and test a new force field for the fluorescent molecules covalently linked to hHsp70. This task is extremely demanding in computer time and is beyond our current computer capabilities and the scope of the present work. For these reasons, we choose to compute the distance distributions between the C α atoms of the residues of hHsp70 simulated without fluorophors and to compare these "intrinsic" distances to the distances measured between the spFRET probes attached to equivalent residues in Hsp70 homologs (Mapa et al., 2010;Marcinowski et al., 2011). The distributions P(d) were computed from MD as follows. We computed the C α -C α distance d between the two labeled residues in each frame (i.e. every 2 ps) of each MD run of hHsp70 in each state (open and closed) by excluding the relaxation time period (Table 1) /6) and we normalized it. The parameters of the Gaussians are given in Table 3. For comparison, the distance distributions were also computed by including the relaxation time period and are shown in Figures S1 and S2 in the Supporting Information (SI).

Computation of the effective free-energy landscapes
Two-dimensional (2D) free-energy landscapes (FELs) F (X,Y) of hHsp70, where X and Y are structural variables of the protein, were computed from the MD simulations by using the Boltzmann formula, i.e. F(X,Y) = -k B Tln[P(X,Y)] where P(X,Y) is the 2D probability distribution of the variables X and Y, k B is the Boltzmann constant, and T is the temperature (Figures 3 and 4). For each MD run i (i = 1-6 in the open state and i = 1-3 in the closed state, see Table 1), the variables X and Y were computed for each frame [every 2 ps], which belongs to the production time period (see Table 1) and their probability distribution P i (X, Y) was computed. The probability distribution P(X, Y) was computed as a weighted average of the 2D probability distributions of the nine MD runs as follows:

Calculation of the representative structures of hHsp70 from MD simulations
For each MD run, a stable representative structure of hHsp70 was computed using the following procedure. First, the RMSD (C α -RMSD) between the coordinates of the C α atoms of the last snapshot of the MD run considered (e.g. at time t = 190 ns in run 3 of the closed state, see Table 1) and the coordinates of the C α atoms of all the other snapshots recorded (every 2 ps) in the production time period of that run (i.e. from 30 to 190 ns in run 3 of the closed state for example) was computed. Second, the one-dimensional (1D) probability distribution P(C α -RMSD) was calculated and the coordinates of hHsp70 of all snapshots corresponding to the maximum value of P (C α -RMSD) were extracted from the trajectory. This leads to a cluster of representative (most probable) stable structures for the MD run considered. Third, the average structure of this cluster was computed and the RMSD between this average structure and all the members of the cluster was calculated. The structure within the cluster of representative structures having the minimum RMSD relative to the average structure of the cluster is chosen as the representative structure of the MD run and is shown for each MD run in Figures 3 and 4.

Relaxation of hHsp70 in the open and closed structural states
The structural changes of the hHsp70 models observed during the nine MD runs (Table 1) were best represented by the FEL F(RMSD,R g ) = -k B T Ln[P(RMSD,R g )], where k B is the Boltzmann constant, T is the temperature, and P(RMSD,R g ) is the 2D probability density (see Section 2.4) of the RMSD of the structure from the initial model built by homology and of the radius of gyration R g (Figure 3). The FEL F shown in Figure 3 is an effective FEL because the probability density P was computed on the limited duration of the MD runs (Senet, Maisuradze, Foulie, Delarue, & Scheraga, 2008).

Open model of hHsp70
Open model with empty NBD and NBD bound to ADP (Nicolai et al., 2010) and ATP were relaxed by MD (Table 1). Similar events were observed in all MD runs. Runs 1-4 were described previously (Nicolai et al., 2010). We describe here the events for the new runs with ATP bound to the NBD. The unstructured C-terminal part of the initial model folds after a few nanoseconds. This event corresponds to an increase in the RMSD up to around 10 Å in runs 5 and 6 at the beginning of the relaxation time period (Figure 3). We observed the break of the straight helix "A + B" of the SBD-α, corresponding to an increase in the RMSD at 400 ns for run 5 and at 200 ns for run 6, and the motion of the helix B towards the NBD, corresponding to a decrease in R g (Figure 3). The relaxed final structures found in runs 5 and 6 for the ATP-bound hHsp70 in an open state have a radius of gyration of 29.2 and 26.8 Å respectively, in agreement with previous MD runs 1-4 (Table 1) (Nicolai et al., 2010) and experimental SAXS data (Shi et al., 1996;Wilbanks et al., 1995).

Closed model of hHsp70
The initial model of hHsp70 in the closed state converged very quickly to a stable conformation without large structural changes in the three MD runs (Table 1). Indeed, we observed the folding of the unstructured Cterminal part of the model in the first nanoseconds followed by the motion of the whole SBD towards the NBD. This subdomain motion decreases the radius of gyration R g of the structure and increases the RMSD relative to the initial model linearly with the simulation time up to 20, 14, and 19 ns for the runs 1, 2, and 3, respectively ( Figure 3). The stable conformations of the hHsp70 model have a radius of gyration R g of 34.0 Å (run 1), 33.9 Å (run 2), and 34.9 Å (run 3), in good agreement with experimental SAXS data (Shi et al., 1996;Wilbanks et al., 1995).

Effective FEL of an isolated molecule of hHsp70
To represent the conformational heterogeneity of human Hsp70 in a more quantitative way, we choose to draw a portrait of the FEL of human Hsp70 along two coordinates: one distance d 1 characterizing the NBD/SBD interactions [NBD-I/SBD-α or NBD-II/SBD-β distances] and one distance d 2 characterizing the SBD opening [SBD-α/ SBD-β distance]. A 2D FEL F(d 1 ,d 2 ) is computed by using F(d 1 ,d 2 ) = -kTLn[P(d 1 ,d 2 )], where k is the Boltzmann constant, T is the temperature, and P(d 1 ,d 2 ) is 2D equilibrium probability distribution computed from the part of the MD trajectories after the relaxation time period when the structures of the molecules are fluctuating around their equilibrium states (see Section 2.4). The quantity F(d 1 ,d 2 ) is in fact an effective FEL computed on a μs timescale (Senet et al., 2008): only conforma-tional substates separated by low activation barriers, which can be crossed on a sub-μs timescale, are explored in the present MD simulations. We emphasize that P(d 1 , d 2 ) is not the simple product of P(d 1 ) and P(d 2 ) because the interaction between the NBD and the SBD is strongly dependent on the SBD conformation. The 1D probability distributions P(d 1 ) and P(d 2 ) can be measured by spFRET (see Section 3.3). The 2D effective FELs F (d 1 ,d 2 ) and the structures corresponding to its different local minima are shown in Figure 4. For comparison, the structures of the templates of the closed and open states of hHsp70 are also shown. Figure 4 clearly shows that the different conformational substates of hHsp70 in the closed state [left panels] correspond to different orientations of the SBD relative to the NBD. This picture agrees with the hypothesis of a SBD diffusing around a cone, as suggested from residual dipolar coupling NMR data of DnaK . In the open state, the multiple local minima of the FELs correspond to different positions of the SBD-α onto the NBD-I (Figure 4, right panels). The SBD-α bundle appears therefore as an essential dynamical ingredient of hHsp70.

Distance distributions predicted for human Hsp70 and comparison with spFRET data of Hsp70 homologs
Conformational heterogeneity of the Hsp70s can be quantified by measuring the 1D probability distribution of the distances between two flurophors as done recently for Bip (Marcinowski et al., 2011), Ssc1, and DnaK (Mapa et al., 2010). Next, we compare the 1D distance distributions predicted by MD for human Hsp70 with those measured for the Hsp70 homologs ( Figures 5 and 6). We emphasize that only a qualitative comparison is possible, in particular because the distances in MD simulations are systematically smaller than the distances measured in spFRET (see Section 4).

Conformational heterogeneity of the distance distributions within BiP and hHsp70
Distances between the NBD and the SBD. In spFRET measurements of BiP (Marcinowski et al., 2011), the mean and the width of the major NBD-I/SBD-β distance distribution are only few Å larger in the APO (nucleotide-free) and in the ADP-bound states ( Figure 5(a)) than in the nonhydrolysable ATP analog (AMP-PNP) bound state ( Figure 5(b)). In addition, the major NBD-I/SBD-α distance distribution does not depend on the nucleotidebinding status of the NBD: it is very similar in the APO and ADP-bound states ( Figure 5(c)) and in the AMP-PNP bound state ( Figure 5(d)). This result is surprising as the conformations of Hsp70s chaperones are significantly different in the ADP-and ATP-bound states, as shown for instance by SAXS (Shi et al., 1996;Wilbanks et al., 1995). In agreement with the experimental data of BiP, the means of the major NBD-I/SBD-β distance distribution ( Figure 5 In MD, the heterogeneity of the distance distribution between the NBD-I and the SBD-β ( Figure 5(b), inset) and between the NBD-I and the SBD-α ( Figure 5(d), inset) of hHsp70 in an open state arises from the variable position of the SBD-α onto the NBD. As shown in Figure 4 (right panels), the SBD lid bound at different locations onto the NBD depends in part on the interaction of the flexible C-terminal part of the chaperone with the SBD and with the NBD. Interestingly, this conformational variability of the SBD-α in the open state produces a minor population of hHsp70 molecules having a shorter NBD-I/SBD-α distance, as it was observed in spFRET data for BiP in the AMP-PNP-bound state (a very narrow minor distance distribution occurs around d = 60 Å in the experimental data in Figure 5(d)). Note that the very diffuse minor distribution of the large NBD-I/SBD-β distance (around 75 Å) of BiP in the AMP-PNP bound state (Figure 5(b)) was not found in the MD simulations of hHsp70.
The heterogeneity (multiple-peak distribution) of the distance between the NBD-I and the SBD-α of hHsp70 in a closed state (Figure 5(c)) is due to the different conformations of the kink (residues 525-531) between the helix A and the helix B of the SBD-α and to different conformations of the interdomain linker (residues 381-393).
Distances within the SBD. The BiP chaperones can be resolved into three discrete subpopulations of molecules having a long (90 Å), a medium (62 Å), and short (48 Å) mean distance between their SBD-α and their SBD-β ( Figure 5(e) and (f)) (Marcinowski et al., 2011). The mean distance of each discrete distribution of molecules did not depend on the nucleotide-binding status of the NBD. On the contrary, the size of each subpopulation varied with it: the major subpopulation corresponds to the short mean distance in the ADP-and nucleotide-free states ( Figure 5(e)), and to the long mean distance in the AMP-PNP state ( Figure 5(f)). The subpopulation of molecules with the shortest mean distance between the two SBD subdomains occurs with a relatively narrow distance distribution. On the contrary, the corresponding distance distributions of the medium and long mean distance subpopulations of molecules are rather broad.
In the MD simulations of hHsp70 in an open state, we observed three discrete subpopulations of molecules with a medium (65-70 Å) and a long (83 Å) mean distance between the SBD subdomains (the trajectories 1, 2, and 6 form two subpopulations with close distance and the trajectories 3, 4, and 5 another) ( Figure 5(f)). The medium and large distances correspond to the different binding positions of the helix A and B of the SBD-α on the NBD-I and of different conformations of the kink between these two helix, as shown in the insets of Figures 5(f) and 4 (structures shown on the right).
Compared with the spFRET data of BiP, there is no subpopulation with a short SBD-β/SBD-α distance in the present MD simulations of the open state of hHsp70. Short distances (15-30 Å) between SBD-α/SBD-β subdomains occur only in the closed state of hHsp70 ( Figure 5 (e)). According to our MD simulations of human Hsp70, we may hypothesize that the minor short distance sub-  Marcinowski et al., 2011). The distances computed for the Hsp70 homologs Sse1 (PDB ID: 3C7 N) and DnaK (PDB ID: 2KHO) are represented by a purple square and a triangle, respectively. The FRET distance distributions centered on these distances (cyan and green dashed curves) are also given in panels (a) and (b) for comparison. Each distance between a pair of residues (the numbers of which are given in the brackets in the insets) is computed between their C α atoms in the MD runs (numbered as in Table 1) and in the PDB structures. Results are shown in the left panels for the closed state of Hsp70s (nucleotide-free and ADP-bound) and in the right panel for the open state of Hsp70s (AMP-PNP-bound). Each Hsp70 structure in the insets is represented with ribbons: NBD (residues 1-380), lobe I (residues 1-188, 361-380; blue) and lobe II (residues 189-360; cyan); linker (residues 381-393; purple); SBD-β (residues 394-507; green); SBD-α (residues 512-615; red); Cterminal (residues 616-641; gray) and magenta (PDB ID: 2KHO in the left panel and PDB ID: 3C7 N chain A in the right panel). The labeled residues are represented by black spheres. The structures shown in the insets were extracted from each MD run and are representative of the structures at the maxima of the distance distributions computed from MD. The superposition of the structures was done by minimizing the RMSD of the C α of the whole NBD and all the structures are in the same view. These insets were prepared with PyMOL [http://www.pymol.org]. population observed for BiP in Figure 5(f) could be due to a minor subpopulation of BiP molecules in a closed state. Similarly, we did not observe the large SBD-α/ SBD-β distance found in spFRET in the MD simulations of hHsp70 in the closed state ( Figure 5(e)). Such large distances may correspond to a fraction of BiP molecules in an open structural state (see also Section 4) (Marcinowski et al., 2011).

Conformational heterogeneity of distance distributions within Ssc1, DnaK, and hHsp70
Distances between the NBD and the SBD. In the spFRET experiments of Ssc1 and DnaK (Mapa et al., 2010), the mean and the width of the major distance distribution between the NBD-II and the SBD domain of the chaperones decreased significantly after the exchange of ADP (Figure 6(a)) by ATP (Figure 6(b)). In addition, minor populations of molecules were observed both for ADPand ATP-bound Ssc1 and DnaK chaperones. The authors of Mapa et al. (2010) mentioned that minor species varied with sample preparation and could be due to contamination. In Figure 6 In MD simulations, we observed that the mean and the width of the distance distribution between the NBD-II and the SBD-β subdomains of hHsp70 are smaller in the open state (Figure 6(b)) compared with those computed in the closed state (Figure 6(a)), as in the FRET data of DnaK and Ssc1.
Distances within the SBD. In both ADP-and ATP-bound states of DnaK (Figure 6(c) and (d)), the major (very broad) subpopulation measured by FRET corresponds to a long distance (80-90 Å) between the SBD-α/SBD-β (Mapa et al., 2010). A minor (narrower) subpopulation of short distance (60 Å) between the subdomains appears in the ADP-bound DnaK solution (Figure 6(c)). In the MD simulations of hHsp70, a (broad, two-peak) long distance distribution occurs only in the open state (Figure 6(d)) and a (relatively narrow) short distance distribution exists only in the closed state (Figure 6(c)). These all-atom MD simulations suggest that the ADP-bound DnaK solution might be a mixture of molecules in the open and in the closed states. A similar hypothesis can be formulated for Ssc1 for which the major subpopulation of molecules has a long distance (70 Å) between the SBD subdomains independently of the status of the nucleotide-binding domain (Figure 6(c) and (d)).

Discussion
4.1. The distances computed for hHsp70 are smaller than the distances between the spFRET probes in homolog chaperones The means of the major distance distributions computed for human Hsp70 are systematically smaller than the ones measured for BiP, DnaK, and Ssc1 in the spFRET experiments, as shown in Figures 5 and 6. This might be due in part to the inaccuracies of the protein model and forcefield: the gyration radii computed for the full-length hHsp70 in its open and closed states (Nicolai et al., 2010) are 6 and 1 Å smaller than those measured by SAXS (Wilbanks et al., 1995) for the bovine Hsp70 homolog protein in ADP-bound and ATP-bound states, respectively. This small deviation cannot explain the large difference between the means of the major distance distributions computed by MD and those measured by spFRET. One should keep in mind that the distances are measured between two fluorophors attached to two residues in spFRET (Mapa et al., 2010;Marcinowski et al., 2011) and not between the C α of these two residues. The distance between the C α of two residues which is actually computed in MD (see Section 2) is expected to be smaller than the distances between the fluorescent molecules attached to these residues. Indeed, one can estimate the size of the commercial fluorophors derived from rhodamine 6G used in spFRET to about 15 Å [as computed from the structure of rhodamine 6G in PDB ID: 1JUS (Schumacher et al., 2001)]. The distance between the dye center of a fluorophor and the position of the C α of the residue to which it is linked was estimated to be about 10 Å (Choi et al., 2010). Based on this estimation, the distance between two fluorophors can be theoretically up to 20 Å larger than the distance between the C α of the two residues to which they are covalently linked.
The distances measured in spFRET ( Figures 5 and 6) are also larger than the distances between the C α of the residues computed from high-resolution Hsp70 structures. Indeed, the distances between the fluorophors derived from the spFRET data of ADP-DnaK (Mapa et al., 2010) are 20 Å larger than the distances between the C α of the residues to which they are linked, which we computed from the NMR-derived structure of ADPbound DnaK (PDB ID: 2KHO) (see Figure 6(a) and (c)) . The distances between the fluorophors in ATP-Hsp70s (Figures 5 and 6, right panels) are also systematically larger than the distances between the C α of the residues to which they are linked, which we evaluated from the X-ray structure of the homolog protein Sse1 (PDB ID: 3C7 N) (see Figures 5(b) and 6 (b) and (d)). A difference as large as 20 Å between the distance extracted from X-ray structures between the atoms of two residues and the distance extracted from FRET between the fluorophors attached to them was already observed for other proteins (Dos Remedios & Moens, 1995).

Difficulties in interpreting experimental distance distributions quantitatively
A quantitative comparison between the distances computed in MD of proteins (without extrinsic probes) with those derived from the spFRET efficiencies is limited by several factors related to the fluorescent probes. Indeed, the fluorophors are large molecules which often occur in different conformations (Corry & Jayatilaka, 2008;Hoefling et al., 2011) corresponding to different distances between the probes for the same protein structure, i.e. to the same distance between the C α of the residues to which they are linked (Corry & Jayatilaka, 2008;Hoefling et al., 2011). Moreover, the spFRET efficiency depends both on the orientation of the probes and of their distances (Latt, Cheung, & Blout, 1965). The orientation is quantified by a dipole orientation factor κ 2 , which measures the relative orientation of the donor and acceptor dipoles (Latt et al., 1965). The factor κ 2 is fluctuating over time (due to fluorophors motions and fluorophors-protein interactions), and its value may vary theoretically between 0 and 4 (Badali & Gradinaru, 2011;Hoefling et al., 2011). These variations are neglected by assuming an orientational averaging of the dipoles of the probes (i.e. κ 2 = 2/3) in extracting the distance distributions from the FRET efficiencies (Marcinowski et al., 2011;Mapa et al., 2010).
The size of the fluorophors (15 Å) is rather large compared to the distance between the SBD-α and the SBD-β of the chaperones in the closed state [the distance computed for Dnak (PDB ID: 2KHO) is 40 Å, Figure 6 (c)]. For this reason, they may perturb the equilibrium between the different substates of the SBD observed in Figures 5(e) and 6(c). An independent measurement of the distance between the SBD-α and the SBD-β, between two labeled side chains of these subdomains, was recently carried on by using the electron paramagnetic resonance (EPR) spectroscopy, for DnaK in the nucleotide-free and in the ATP-bound states (Schlecht et al., 2011). As the spFRET distance distributions of BiP and the MD distance distributions of hHsp70, the EPR results of DnaK reveal multiple substates of the SBD in the closed state (Schlecht et al., 2011). However, in agreement with the MD of hHsp70, no major population with large (80 Å) SBD-α/SBD-β distance was observed in EPR data (Schlecht et al., 2011) of APO-bound DnaK, on the contrary of the spFRET data shown in Figure 6 (c). It would be interesting to compare the EPR and spFRET distance distributions with those simulated by unbiased all-atom MD for DnaK in solution in different nucleotide-binding states. Such a work is currently under investigation in our group and will be presented elsewhere.

Choice of the distances monitoring the conformational changes
The MD simulations show that a narrow distribution of one distance selected between two domains may hide the actual conformational heterogeneity of the protein. For example, the multiple conformations of hHsp70 computed in an open state (insets of Figures 5 and 6, right panels) correspond to different NBD-I/SBD-β distance distributions ( Figure 5(b)) but to the same NBD-II/SBDβ distance distribution (Figure 6(b)). In the closed state, the different conformations adopted by the SBD of hHsp70 lead to a narrow NBD-I/SBD-β distance distri-bution ( Figure 5(a)) but to a quite broad NBD-II/SBD-β distance distribution (Figure 6(a)).
In agreement with SAXS data (Shi et al., 1996;Wilbanks et al., 1995), the NBD and SBD domains of hHsp70 are docked in the open state (insets of Figures 5  and 6, right panels) and undocked in the closed state (insets of Figures 5 and 6, left panels). This large conformational change is better mirrored by the accompanying change of the distance between the subdomains NBD-II and the SBD-β (Figure 6(a) and (b)) than by the accompanying change of the distance between the subdomains NBD-I and the SBD (Figure 5(a)-(d)).

Conclusion
We explored the conformational heterogeneity of human Hsp70 which is an important target for new drugs in cancer therapies. We found that the distance distributions between the NBD and the SBD computed from MD for human Hsp70 and measured by spFRET for Hsp70 homologs followed the same trends when ADP (closed state in MD) is exchanged by ATP (open state in MD). The unbiased all-atom MD simulations of hHsp70 may provide an explanation of the trends observed as follows. The mean and the width of the distance distribution between the NBD-I and the SBD-α (Figure 5(c) and (d)) does not change significantly when ADP is exchanged by ATP in spite of the large differences between the ADP-and ATP-bound conformations of the chaperones. The large width of the distribution and the discrete state observed in the ATP state reveals the different possible locations of the SBD-α onto the NBD in ATP-bound hHsp70. These findings support the hypothesis that the SBD-α of ATP-bound hHsp70 is able to switch between different structural states in the absence of the Hsp40 co-chaperone (Gao et al., 2012).
The mean of the distance distribution between the NBD-II and the SBD-β strongly varied between the ADP-and ATP-bound states of the Hsp70 chaperones (Figure 6(a) and (b)). The MD simulations of hHsp70 confirm that the SBD of the Hsp70 chaperones occur in multiple conformations both in ADP-and ATP-bound states which are characterized by different SBD-α/SBD-β distances (Figures 5(e) and (f) and 6(c) and (d)). The insets of Figures 5 and 6 provide a picture of the origin of this heterogeneity: it is due to different SBD-α positions onto the NBD-I in the open state and to different conformations of the interdomain linker and of the kink between the helix A and helix B of the SBD-α in the closed state. We emphasize that the helix A and helix B were aligned, forming a single straight helix, in the Hsp110 template used to build ATP-bound hHsp70. The helix A + B is broken in hHsp70 in the open state relaxed by MD (see, e.g. Figure 5(f)). This finding could be revealed by a spFRET experiment with one fluorophor attached to helix A and another one to helix B.
The multiple conformational states of ATP-bound and ADP-bound Hsp70 found in the all-atom MD simulations suggest multiple pathways from the ensembles of local minima located around the closed and open structural states of the computed 2D FEL (Figure 4). Simulation of the different pathways between these structural states certainly will require crossing activation barriers on a simulation timescale much larger than 1 μs. Therefore, all-atom simulations of the transition between the open and closed states of hHsp70, including the detailed interactions of the protein with the nucleotide along the conformational pathway, are challenging. For that reason, we carried out recently coarse-grained simulations of a nucleotide-free bacterial Hsp70 protein by using implicit solvent (Golas et al., 2012). The effects of the nucleotides were introduced by rigid constraints on the NBD. A spontaneous transition between the closed and open states was observed (Golas et al., 2012). However, in order to explore the role of the ATP binding and hydrolysis, explicit all-atom simulations of hHsp70 with nucleotides are needed. This could be achieved by applying biased MD techniques (Laio & Parrinello, 2002;Wereszczynski & McCammon, 2012) which speed up the exploration of the FEL compared with the present unbiased all-atom MD simulations: this is currently under investigation in our laboratory in Dijon by applying metadynamics (Laio & Gervasio, 2008).