Dynamic properties of the native free antithrombin from molecular dynamics simulations: computational evidence for solvent- exposed Arg393 side chain

While antithrombin (AT) has small basal inhibitory activity, it reaches its full inhibitory potential against activated blood coagulation factors, FXa, FIXa, and FIIa (thrombin), via an allosteric and/or template (bridging) mechanism by the action of heparin, heparan sulfate, or heparin-mimetic pentasaccharides (PS). From the numerous X-ray structures available for different conformational states of AT, only indirect and incomplete conclusions can be drawn on the inherently dynamic properties of AT. As a typical example, the basal inhibitory activity of AT cannot be interpreted on the basis of “non-activated” free antithrombin X-ray structures since the Arg393 side chain, playing crucial role in antithrombin-proteinase interaction, is not exposed. In order to reveal the intrinsic dynamic properties and the reason of basal inhibitory activity of antithrombin, 2 μs molecular dynamics simulations were carried out on its native free-forms. It was shown from the simulation trajectories that the reactive center loop which is functioning as “bait” for proteases, even without any biasing potential can populate conformational state in which the Arg393 side chain is solvent exposed. It is revealed from the trajectory analysis that the peptide sequences correspond to the helix D extension, and new helix P formation can be featured with especially large root-mean-square fluctuations. Mutual information analyses of the trajectory showed remarkable (generalized) correlation between those regions of antithrombin which changed their conformations as the consequence of AT–PS complex formation. This suggests that allosteric information propagation pathways are present even in the non-activated native form of AT.


Introduction
Antithrombin (AT) (or antithrombin III, ATIII (Seegers, Johnson, & Fell, 1954)) is the member of the serpin superfamily of proteins which can be characterized by a special type of folding. The acronym corresponds to "serine protease inhibitors" since majority of this family have inhibitory activity against serine proteases. Serpins are the main physiological inhibitors of these enzymes in tightly regulated proteolytic pathways in higher organs (Gettins, 2002;Silverman, 2001), including the haemostatic and fibrinolytic systems (Huntington, 2011;Rau, Beaulieu, Huntington, & Church, 2007).
Antithrombin is the most abundant serpin that circulates in blood vessels and it is the most important physiological inhibitor of the blood clotting cascade (Björk & Olson, 1997;Muszbek, Bereczky, Kovacs, & Komaromi, 2010;Quinsey, Greedy, Bottomley, Whisstock, & Pike, 2004;Rau et al., 2007). Antithrombin in its native form is a poor inhibitor (Olson et al., 1992), which is a necessary condition to avoid unwanted bleeding which would be the consequence of the high concentration of antithrombin in blood plasma. Specific mechanisms (Huntington, 2011;Langdown, Belzar, Savory, Baglin, & Huntington, 2009;Langdown, Carter, Baglin, & Huntington, 2006;Olson, Bjork, & Bock, 2002;Olson, Richard, Izaguirre, Schedin-Weiss, & Gettins, 2010;Rau et al., 2007;Whisstock, et al., 2000) need to attain its full (two or three orders of magnitude higher) inhibitory potency (Olson et al., 1992) only when and where it needs, e.g. to prevent excessive coagulation at the injury of blood vessels. These mechanisms are the template (or bridging) mechanisms against thrombin and allosteric activation against FXa and FIXa (for excellent reviews, see references (Huntington, 2005(Huntington, , 2011Olson et al., 2010)). The template mechanism requires longer chain heparin or heparan sulfate components (Bray, Lane, Freyssinet, Pejler, & Lindahl, 1989), while for allosteric activation, a suitably sulfated pentasaccharide fragment is sufficient. It should be noted that template mechanism can play a role in the inhibition of FXa and FIXa as well at the case when approximately twice the length of heparin chains compared to those required for thrombin inhibition are available (Bedsted, 2003;Rezaie, 1998;Rezaie & Olson, 2000).
Until now, a great number of serpin X-ray structures (Dunstone & Whisstock, 2011) and among them many antithrombin structures (Izaguirre & Olson, 2006;Johnson, Langdown, & Huntington, 2010;Johnson, Li, Adams, & Huntington, 2006;Langdown et al., 2009;McCoy, Pei, Skinner, Abrahams, & Carrell, 2003;Whisstock et al., 2000) with and without sulfated pentasaccharide or heparin fragments and even in ternary complex with serine proteases, have been deposited to the protein data bank (PDB, www.rcsb.org). From these structures, a three distinct step mechanism has been proposed and a consensus has been established on how the antithrombin achieves its full inhibitory potential against FXa and FIXa (Langdown et al., 2009;Schedin-Weiss et al., 2010). According to this model, at first, the pentasaccharide ligand is loosely attached to the highly positive surface of the antithrombin. In the next step, a stronger binding is evolved which is accompanied by conformational changes mainly at the ligand-binding area. Finally, this is followed by an additional conformational transition during which the conformational change from the ligandbinding site is propagated to the hinge region part of the antithrombin. It resulted in the hinge region expulsion from the β-sheet-A and the formation of an even more stable antithrombin-pentasaccharide complex.
While these resolved X-ray structures provide us invaluable information on the allosteric activation mechanism, these are basically static snapshot pictures from complex dynamic processes. The structures obtained this way are influenced by crystal forces. The latter can be demonstrated by the co-crystallization of the sulfated glycosaminoglycan-free native antithrombin with its latent form (McCoy et al., 2003) which results in substantially different conformations for the reactive center loop (RCL) compared to that observed in the case of the "pure" native antithrombin crystals (Johnson, Langdown, et al., 2006). More importantly, neither of these conformations can be used to explain how the glycosaminoglycan-free antithrombin displays small but non-negligible inhibitory activity against FXa or thrombin. In these structures, the Arg393 side chain pointing to the inner side of antithrombin cannot be accessed by the active site of proteases. Therefore, to explain the basal inhibitory potential of AT against serine proteases, the existence of at least one additional, considerably different conformation has to be supposed (Huntington, 2011;Li, Johnson, Esmon, & Huntington, 2004).
These facts and those conformational variabilities that were obtained from the analysis of available X-ray structures are the straightforward evidences which support the strongly dynamic nature of AT as a member of serpin superfamily (Whisstock & Bottomley, 2006).
It is expected that every piece of information on the dynamic property of antithrombin would help to understand the basal inhibitory potential of antithrombin itself and would help to recognize the "channel of information flow" of allostery, resulting in finally remarkable conformational transitions. Owing to the size of this protein, obtaining detailed experimental information with sufficient time and space resolution on dynamic properties of antithrombin is still beyond the limits of the current experimental (e.g. nuclear magnetic resonance, NMR) methods. In addition, the co-existence of different conformational states would make the extraction of atomic resolution structure and its progression a function of time to be extremely difficult from these experiments.
On the other hand, we are witnessing a continuously growing importance of in silico methods in structural studies of blood coagulation proteins (Villoutreix & Sperandio, 2010). Molecular dynamics simulations can be used e.g.in principleto reach arbitrary time and space resolution at the expenses of sufficient computational effort. Although the results obtained this way depend strongly on the quality of force field applied, the time interval can be covered by simulations and on some other parameters. Nevertheless, there is a long record of success achieved by molecular dynamics simulations in revealing hidden structural events of protein dynamics (Schlick, 2010). Molecular dynamics simulations on serpins have been reported recently and can be regarded as the demonstrative examples (Chandrasekaran, Lee, Lin, Duke, & Pedersen, 2009;Kass, Knaupp, Bottomley, & Buckle, 2012;Kass, Reboul, & Buckle, 2011;Verli & Guimarães, 2005).
In the present work, we carried out in silico molecular dynamics experiments, at first, on the antithrombin itself. First of all, our goal was to reveal the flexibility of the RCL and to see whether a RCL conformation that is suitable for a Michaelis-type complex formation can exist during the simulation interval. Our aim was also to gain information on the flexibility of pentasaccharide binding region since it is known that remarkable conformational changes should take place at this site for a sufficiently strong PS binding. We were wondering if there is any sign for the helix D extension at this site in the trajectory of the PS-free AT. During allosteric regulation, the information on the interaction should propagate from the allosteric site to the site of action on allosteric pathways. We were wondering whether such pathways are preexisting at the free native antithrombin and it is selected and "widened" by heparin/pentasaccharide binding or, alternatively, it is newly created by those bindings. The information theory based on "mutual information" can be extracted from the dynamic trajectories and it can help in finding the regions with coupled motions, i.e. which have information from each other.

Computational details
Three systems were considered in our simulations. The first one is based on the 1e04 pdb structure and represented the conformation which is observed (McCoy et al., 2003) when the native form is co-crystallized with the latent one (Simulation 1). The second one is based on the 1t1f pdb structure representing the AT conformation which is crystallized in monomeric form and featured significantly different RCL positions  relative to the body of AT. Since, in this case, the engineered Val317Cys-Thr401Cys mutations were introduced that ensured a new disulfide bridge formation in order to prevent the native-latent transition, the Cys317 and Cys401 residues were virtually mutated back to the residues present in the wild-type sequence (Simulation 2). The third simulation was carried out on the 1t1f Val317Cys/Thr401Cys mutant (Simulation 3).
In order to carry out molecular dynamics simulation for the whole AT protein chain the missing peptide sections in the AT structures (McCoy et al., 2003; retrieved from the protein data bank (PDB id: 1e04, 1t1f) had to be completed. Besides the short N-terminal region and the missing C-terminal Lys432 residue, loop structure for an additional unresolved >10-residuelong peptide sequence had to be derived. The latter has special importance since it is in a PS-binding area of the protein. These were done either by the automatic loop generation capability of the MODELLER (Fiser & Šali, 2003) software (1t1f ) or by the Schrödinger package (SchrödingerLLC, 2013) (1e04). In the latter case, the derived loop structure was refined by the PLOP (Jacobson, 2004) software tools developed in the Friesner group and implemented into the Schrödinger modeling package. The structures were obtained by homology modeling and were checked by the Procheck (Laskowski, MacArthur, et al., 1993) software.
The energies of the completed structures were minimized by the GROMACS software package (Pronk, Pall, et al., 2013) using the AMBER '03 force field (Duan, Wu, et al., 2003;Lee & Duan, 2004). Structural water molecules were kept according to the original X-ray file. The model was then solvated in a dodecahedral box. The distance between the box wall and the closest protein atom was set to 12 Å at the 1e04 and 15 Å at the 1t1f. The latter value was chosen because of the more compact structure of 1t1f compared to the 1e04. The systems were neutralized and additional Na + and Cl − ions were added to set the ionic strength to .15 M. A short simulated annealing period (2 ns) was applied for heating up the system to 310 K, then 2000 ns constant particle number (N), constant pressure (p = 10 5 Pa), and constant temperature (T = 310 K) production dynamics with periodic boundary (PBC) condition took place. For the water, the TIP3P (Jorgensen, Chandrasekhar, Madura, Impey, & Klein, 1983) explicit water model was applied. For the short-range electrostatic and van der Waals energy terms, 10 Å cut-off distances were used. The long-range electrostatic energy corrections were calculated by means of the particle mesh Ewald (PME) method (Darden, York, & Pedersen, 1993). The v-rescale (Bussi, Donadio, & Parrinello, 2007) and Berendsen algorithms (Berendsen, Postma, DiNola, & Haak, 1984) were used for temperature and pressure couplings, respectively. To have a chance to observe conformational fluctuation, which resembles the large-scale conformational transition that accompanies activation, the simulation time should be sufficiently long. It can be efficiently achieved using longer time steps during simulations. Virtual sites protocol is an alternative to mass repartitioning method which can be used to increase the length of time step in simulations. At this case, the position of H atoms are defined as a function of the neighboring heavy atoms (Berendsen & van Gunsteren, 1984;van der Spoel et al., 2010). An increased time step (4 fs) was used in all the simulations presented in this work by means of the virtual site protocol applied for hydrogen atoms. The system sizes (N), including the virtual sites, were 85,566, 88,291, and 88,165 for Systems 1, 2, and 3, respectively. The GROMACS software suites were used for all the simulations and for the analyses of trajectories. Protein structure visualizations were done by the VMD 1.9.1 (Humphrey, Dalke, & Schulten, 1996) software tools.
The "mutual information" or "generalized correlation" proposed by Lange and Grubmüller (Lange & Grubmuller, 2006) was calculated by means of the g_correlation utility downloaded from the http://www. mpibpc.mpg.de/grubmueller/g_correlation website. For this purpose, the frames from the trajectory were selected at the end of every nanosecond.
In order to extract the principal components of the residue movements, principal component analysis (also known as essential dynamics analysis) calculations were carried out on the dynamic trajectories using the ProDy package (Bakan, Meireles, & Bahar, 2010), which is interfaced as the "normal mode wizard" to the VMD 1.9.1 molecular graphics software. Two thousand frames, saved at the end of every ns, were considered in Principal component analysis (PCA) calculations. The covariance matrices were calculated only for the Cα atomic displacements. Only the 10 largest amplitude motions were calculated. The default parameters were applied in the normal mode wizard calculations to visualize the eigenvectors.
Allosteric pathways were also derived from 2000 trajectory snapshots selected from the end of each nanosecond simulation using the WISP software (Van Wart, Durrant, Votapka, & Amaro, 2014) which was interfaced to the VMD 1.9.1 molecular graphics software as well.
Only the Cα atoms of residues, as "nodes" were considered during covariance calculations. The allosteric pathways from Lys114 to Ser380 were calculated. The contact map limit of WISP was set to an average of 4.5 Å for the contact node pair generator. Only 20 allosteric pathways were generated.
While there are many already-existing methods to map (to find local energy minima and barriers between them) the modestly complex potential energy hypersurface, in this work, we used "brute force" NPT dynamics. The reason was that we were interested in the properties of "equilibrium" dynamics.
Nevertheless, an attempt was made to have a rough estimation on the free energy surface for the conversion between RCL conformational states resembling the 1t1fand the 1e04 PDB structures applying metadynamics method (Barducci, Bonomi, & Parinello, 2011) by the PLUMED software (Tribello, Bonomi, Branduardi, Camilloni, & Bussi, 2014) interfaced to the GROMACS package (Pronk et al., 2013). An angle defined by the Met251 CA, Glu255 CA, and the center of mass of the Ala384-, Ser385-, Thr386-, Ala387-, Val388-, Val389-, and Ile390 CA atoms of the RCL as well as a distance between the Lys236 and the center of mass of Val388-, Val389-, and Ile390 CA atoms were used as collective variables (CVs). Using these CVs the conformational states resembling the 1e04 and 1t1f structures of the RCL, can be satisfactorily separated. The simulation parameters were the same for the preceding simulation. For both the CVs, the weight factor and the width of the deposited biasing Gaussian potential were set to .1 and .1, respectively, measured in kJ/mol as well as nanometers and radians. The frequency of deposition was 1000 in MD steps. The starting (equilibrated) trajectory frame was the last one in the simulations carried out on the wild-type 1t1f system. The total simulation time was 300 ns.
Although simulations for the three systems (Systems 1, 2, and 3) were carried out, not all of them will be discussed in detail. In all native-latent dimeric and AT-proteinase productive complex structures, the RCL orientation was closer to the one which was found in 1e04 structure than in 1t1f one. Therefore, the simulation based on the 1e04 starting geometry will be used as the "lead" simulation and the characteristic similarities and/ or differences of Simulations 2 and 3 relative to Simulation 1 will be emphasized only in the text. In some cases, figures will be given only for Simulation 1 and the corresponding figures for Simulations 2 and 3, if they have some importance, will be available only in the Supplemental Online Material.

Homology modeling
Both the Modeller and the Prime module of Schrödinger software package, the latter with the rather lengthy PLOP optimization, resulted in good starting geometries which were suitable for molecular dynamics simulations. No non-glycine φ-ψ dihedral angle pairs were found in the disallowed regions and even those pairs which fell into the "generously allowed regions" are close to the "allowed" ones. Especially, excellent quality structure was obtained by means of the PLOP loop optimization protocol. It should be noted, however, that because of the length of the structurally unresolved sequence at the N-terminal region and because of the fact that it is unresolved in all the known X-ray structures, there is a high probability for the co-existence of several different loop conformations. It means that the loop structure, we have derived, represents probably only one of the many possibilities.

Molecular dynamics simulations
Two microsecond dynamics simulations intervals are insufficient to observe native → latent transitions or even activation of antithrombin, since these are rare events on this time scale. Nevertheless, other molecular dynamic properties e.g. loop conformational flexibilities, root-mean-square fluctuations (rmsf), correlated motions between regions of the protein can be extracted from the trajectories. Based on these properties, valuable information on the first phase of conformational transitions and on the pre-existence of allostery pathways can be expected.
Regarding the total energies of the systems, it can be recognized that the systems even after the rather short 2 ns heating-up simulated annealing period are well equilibrated (Figure 1). While the energy fluctuations are remarkable (between ±4000 kJ/mol due to the large number of particles in simulation boxes and the finite time step integration), the running-averaged values (averaging 10 ns time period) show an almost constant total energy with no significant drift. Therefore, it can be stated that the simulations were converged from the total energy point of view. From the conformational point of view, drawing out similar conclusion is much more difficult since for such large proteins, many co-existing conformations can occur and in a dynamics trajectory, a continuous transition between them can be expected. However, a "stabilization" of the radius of gyration value (not shown here) and root-mean-square deviation (rmsd) from the starting geometry are good indicators of this kind of convergence (Verli & Guimarães, 2005). The latter can be seen later in the "root mean square deviations" section.

Secondary structure dynamics
The time evolution of the secondary structure elements of the native AT extracted from the dynamics trajectory by means of the DSSP software (Kabsch & Sander, 1983) showed that all these structural elements are quite stable, i.e. no long helical and β-string elements are broken or such type of new elements are formed ( Figure S1 in Supplemental Material). It is in good agreement with the stable nature of the native AT structure in spite of the fact that its "native" form is thermodynamically unstable. Nevertheless, at certain regions, the secondary structure elements showed less stability, and some interconversion between them can occur during the simulation interval we reached ( Figure S1). The latter phenomenon can be observed at or near to the helices G and H and helices C and D.
From X-ray structures resolved for different states of the protein, it is well established that in the pentasaccharide-binding region, a new helix (helix P) between the helices C and D (marked by arrows in Figure 2 and Figure S1) exists when PS is bound to the AT. It was thought that this is a consequence of a PS-AT interaction. Surprisingly, this helix is shown to be formed even at the free monomeric AT crystal structure (1t1f ) too (Johnson, Langdown, et al., 2006). It was explained by the sensitive equilibrium with the co-existence of different conformational states even at free monomeric native AT (Figure 2). According to this assumption, the equilibrium can be shifted into one of the directions (1e04-like disordered loop or 1t1f-like "helix" P) depending on the environment. Elongation of the C-terminal part of the helix D is also known as a structural consequence of pentasaccharide binding. Nevertheless, an intermediate structure when the PS is already bound, but the helix D is still not extended was also captured (Johnson & Huntington, 2003).
Regarding the existence of helix P in simulations, it is especially interesting whether this structural element appears accidentally starting from a basically coiled structural element in the 1e04 structure. On the other hand, the question of how stable is the already-formed helix P in the 1t1f structure is a very exciting one as well. When the extension of helix D is considered, one might wonder whether any sign for the helix D extension or for the hinge region expulsion can be observed.
A closer look in Figure 2 shows, during the simulation carried out for the 1e04 structure, the 3-helix among the secondary structure elements of the corresponding Figure 1. The total GROMACS energies as the function of time for the molecular dynamics simulations carried out for the different conformational states of antithrombin, corresponding to 1e04 and 1t1f PDB structures. At the latter case, simulations were carried out on both the wild-type (1t1f-w) and the Val317Cys/Thr401Cys disulfide-bridged variants. region appearing occasionally. While this is not an exact proof of the co-existence of the helix P, it shows that the corresponding sequence has some (even if minor) preference for helix formation. The helix P, once it is formed, showed remarkable stability ( Figure 2) according to our simulations on the structures starting from the 1t1f structure. As expected, much less preference for the helix D extension can be seen from Figure 2. It can be rationalized by the fact that, at first, the new helix P is being formed (in the second step of AT-PS complex formation according to the three steps model or even earlier) (Langdown et al., 2009;Schedin-Weiss et al., 2010). The extension of helix D is the final (third) step of the AT-PS complex formation in which the gap between the third and fifth strands of the β sheet A is closed as well and the hinge region is expelled finally, resulting in an AT conformation which is fully activated against FXa.

Root-mean-square deviations
The rmsdsfrom the starting geometries as a function of simulation time are shown in Figure S2 and Figure S3 (Supplemental Material). Certain deviations from the X-ray-based starting structures are quite usual since simulations are carried out, in principle, in the solution phase. Nevertheless, in most of the cases only a modest deviation can be observed. It is in agreement with the fact that the reference AT structures represent kinetically trapped local minima and the simulation does not disturb these structures too much. The first 45 N-terminal residues with the exception of the disulfide-bridged C8-C128 residues and their vicinity are badly or even not resolved from the diffraction patterns, which clearly indicate that this part is flexible in both of the reference 1e04 and 1t1f X-ray structures. Remarkable flexibility is assumed also for the RCL as it was found earlier (Verli & Guimarães, 2005). Therefore, omitting the highly flexible 45 N-terminal residues as well as the RCL region from the calculation, substantially smaller rmsd values can be expected as demonstrated in Figure S3. The equilibrated values are now around~.2 nm which fall into the range of resolution of the X-ray structure.
It can be observed also from Figures S2 and S3 (Supplemental Material) that the rmsd values during the first 1000-1500 ns simulation "converged" to their "equilibrated" values. Figure 2. Secondary structure elements of the antithrombin are shown as a function of time, starting from different conformational states, corresponding to 1e04 and 1t1f PDB structures. At the latter case, simulations were carried out on both the wild-type (1t1f-w) and the Val317Cys/Thr401Cys disulfide-bridged variants. Only the regions that comprise helices C and D, the potential helix P, and the C-terminal extension of the helix D are depicted.

Root-mean-square fluctuations
The rmsfvalues give us valuable information on the flexibilities of different regions of proteins. Calculating the rmsf values for the whole proteins, it was found that the N-terminal regions of the ATs considered in these simulations can be featured by quite large rms fluctuations (Figure 3, Figure S4(a), and Figure S4(b)). It is not surprising since for two peptide sequences of these regions, the X-ray diffraction patterns cannot be resolved and the completed structures are expected to reflect this uncertainty. Also, the remarkable fluctuations found for the RCLs are not surprising as well since the basal inhibitory activity cannot be interpreted without conformational changes in this region. These results are in a good agreement with the 8 ns molecular dynamics results published (Verli & Guimarães, 2005) for the free and PS-bound ATs derived from the already-activated form of AT.
In Figure 3, Figure S4(a), and Figure S4(b), the color code of rmsf values projected to the " cartoon" representations of the AT structures considered in this paper are shown. The color scale from red to blue corresponds to small to large rmsf values, respectively. It is apparent that the overwhelming majority of residues can be featured with small rms fluctuations (represented by red color) at all systems we investigated. These residues correspond roughly the first (and the largest) rigid body of the AT as it was defined comparing the X-ray structures available for different states of activation (Langdown et al., 2009). The agreement is, however, not fully perfect. It can be observed that a few of the loops even in these rigid parts have considerable fluctuations (marked by red arrows). It is interesting that some peripheral helices and even the strand built from 400-410 residues in structures without engineered Val317Cys-Thr401Cys mutations also show remarkable fluctuations in the first rigid parts of AT. The considerably large fluctuations at the PS-binding regions bordered by helices A, D and the newly formed helix P as well as the N-terminal region of the protein (in Figure 3, the latter is marked by a large arrow) were not surprising. These regions should form an optimal PS-binding area, and the reorganization of the otherwise repulsive positively charged side chains presumes remarkable flexibility at this site.
For the sequence between the C-terminal end of helix D and the subsequent third β-strand (which is considered as the extension of helix D), larger rms fluctuations were seen compared to the other loop regions which were obtained from the trajectory. From the X-ray structures of free (i.e. the no PS bound) AT, it can be characterized as an extremely flexible loop (which sometimes cannot be resolved from the diffraction pattern (Schreuder et al., 1994)) as well. The flexibility of the C-terminal part of helix F can be understood considering that this region has to move during the pentasaccharide binding and also when the RCL incorporates as fourth β string to the β sheet A. The rms fluctuation of the regions mentioned above exceeds the fluctuations of the rest of the loops in AT. Also, much smaller flexibilities were observed for the already-formed P-helix from Simulations 2 or 3 ( Figure S4(b), Supplemental Material), which means that once it is formed, this secondary structure element is relatively stable.
In Figure 4, the importance of a few residues which are assumed to have a key role in conformational changes (Arocas, Bock, Raja, Olson, & Björk, 2001;dela Cruz, Jairajpuri, & Bock, 2006;Desai, Swanson, Bock, Björk, & Olson, 2000;Jairajpuri et al., 2003;Schedin-Weiss et al., 2002) required for disordered loops → helices transformations (helix P formation and helix D extension) in Simulation 1 is demonstrated (Figure 4). It is shown in this figure that the side chains of helix D residues, independently from their buried or solvent-exposed nature, have remarkable flexibility. In accordance with the available experimental data, from Simulation 1, especially large side chain fluctuations can be observed at those aromatic and basic side chains where remarkable side chain movement will take place during PS binding. It is worth noting that two residues, the Phe121 and Arg129, have significantly larger fluctuations than their immediate neighbors. These results shed light on the possible key roles of these residues in conformational transition and/or PS binding in accordance with the observed binding "hotspots" nature of Arg129 (Desai et al., 2000) and the observed importance of nonionic contribution to the stabilization of the activated conformation (Jairajpuri et al., 2003). It can be also recognized, that the regions undergoing conformational transitions (helix P formation and helix D extension) have highly above -average RMSFs which are not limited to the side chains in Simulation 1.

RCL structures
The dynamic property of the RCL even in its free native conformation has an exceptional importance in the structure-function studies of AT. All the X-ray structures of the free AT have been resolved so far with the guanidinium group of Arg393 points to the inner side of AT. Other types of RCL conformations, where the side chain of Arg393 points outward from the main body of AT (that has been found so far only in the AT-proteinase complex (Dementiev, Petitou, Herbert, & Gettins, 2004;Johnson et al., 2010;Li et al., 2004)) must, straightforwardly, exist. This conformation either should appear in non-negligible population in the free AT or can be induced by AT-proteinase interaction.
It is assumed that the RCL conformation adopted in the 1e04 (McCoy et al., 2003) structure in which the guanidinium group is surface bound is the consequence of crystal forces in the co-crystallized dimer (native + latent) AT structure. In the 1t1f structure, the monomeric native AT without PS showed completely different RCL conformations in which the R393 side chain was also surface-bound, but by an Arg393-Glu237 salt bridge (Johnson, Langdown, et al., 2006). Starting from the 1e04 conformation, during the 2 μs dynamics simulation, a large-scale conformational change was observed in the RCL conformation. In Figure 5, the distance of the guanidinium Cζ atom from any atom of the main body of AT as a function of time is depicted for all the systems we considered. For the 1e04 system, (Simulation 1, Figure 5) the~.3 nm distance clearly corresponds to a surfacebound guanidinium group. The larger, .5 -1.4 nm values correspond to conformations that are at least partially solvent exposed, i.e. which potentially can form a Michaelis complex with proteases. In simulations carried out for the 1t1f wild-type protein, even larger flexibility can be seen from the trajectory for the Arg393 side chain position ( Figure 5). Interestingly, when the C-terminal end of the RCL is fixed by a disulfide-bond, the flexibility of the RCL is not large enough for the guanidinium group to escape from a salt bridge formed with the Glu237 side chain ( Figure 5).
Despite the large fluctuation in the position of Arg393 side chain in Simulation 2, the RCL position still resembles those which can be found in the 1t1f PDB structure and differs substantially from the loop conformations that can be observed in Simulation 1. To check whether transition between these two classes of RCL conformations can take place, a metadynamics simulation was carried out. In order to achieve nearly exact (within the framework of the model we used) free energy surface for such large and flexible systems, a tremendous computational effort would be required. Therefore, the results we obtained for conformational conversion of RCL should be regarded as a rough estimation of the activation-free energy for conformational transition state between "1t1f-" and "1e04 RCL states." It can be seen from Figure S5 that a~15-20 kJ/mol barrier prevents transition from the 1t1f-like RCL conformation to the 1e04-like ones. Assuming that in our estimation the error is not too high, it seems to be enough to prevent the mixing of the different kinds of RCL conformational classes in molecular dynamics simulation. On the other hand, it is in the range which can be easily overcome by "real life" thermal motions. It is important to note that there is no need to hinge region expulsion for such conformational change. This fact suggests that these conformational states are in equilibrium, even though the equilibrium is shifted. Since the helix P is already formed in the 1t1f structure, it also suggests that both types of RCL conformations can co-exist with an already-formed helix P, i.e. pentasaccharide binding can be preceded by the helix P formation.
A few characteristic snapshots extracted from the trajectory of Simulation 1, only those for the 1e04 system, are depicted in Figure 6. The Arg393-Glu237 salt bridge, which was originally assumed in the late eighties (Owen, Beresford, & Carrell, 1988) and observed later in 1t1f structure (Johnson, Langdown, et al., 2006), is formed after~400 ns of simulation time. The conformation in which the Arg393 is solvent exposed can exist multiple times during simulation. The backbone structure of the RCL, however, still differs from those that can be seen in 1t1f X-ray structure (not shown). The probable reason can be that the simulation time was insufficient to sample the full conformational space available for the RCL. More importantly, the system, in our model, can escape from the reasonably populated (Arg393-Glu237 salt-bridged) RCL conformation and can populate those ones in which the Arg393 side chain points outward from the body of AT ( Figure 6). It required neither the PS-AT complex formation nor the hinge region expulsion. In such a conformation, the RCL can form a Michaelis complex with proteinases. It should be mentioned that an 8 ns molecular dynamics simulations, in contrast to our study on the already-activated form of AT (PDB id: 1e03), has been already published (Verli & Guimarães, 2005). It showed also a flexible Arg393 residue with increased solvent exposure compared to the corresponding crystal structure.

Generalized correlation and allostery
In the structural studies carried out on antithrombin and in which the different forms of AT (native, latent, pentasaccharide-bound and protease-bound) were compared to each other, it is usually speculated on the possible path of allosteric signal propagation (Huntington, 2011;Olson et al., 2010). Nevertheless, it is still not fully understood how the conformational changes caused by the AT-PS complex formation launches a sign propagation which finally results in a RCL hinge region expulsion on the other site of the protein and how the AT becomes fully activated against FXa and FIXa. Since, in this work, molecular dynamics simulations were carried out only on the native non-complexed forms of AT, such structural changes cannot be expected. However, one can expect structural information on the pre-existence of such pathways even in the native free AT. Such information will be a useful complement to those we have already received from the available X-ray data for the "static" states.
One of the plausible assumptions is that the effect is conducted across the network node points (residues, regions) which have information on the changes/movements of the other nodes on that network. Therefore, covariance and generalized correlation between regions would represent a possible way of propagation of the allosteric signal. Note that while the covariance matrix misses part of correlated motions between residues/ regions/domains (Ichiye & Karplus, 1991;Lange & Grubmuller, 2006), generalized correlations (information theory-based "mutual information", MI (Lange & Grubmuller, 2006)) show more complex and more sensitive signs on the correlated movements in proteins, e.g. on the propagation of the allosteric signal.
Regarding the graphical representation of the MI matrix calculated, as proposed by Lange and Grubmuller (2006), remarkable correlations can be seen in Figure 7A between many residues located in the 80-200 residue region in Simulations 1 we carried out. Interestingly, it is in agreement with the proposal that the motions of the whole helix D with the peptide sequence at its N-terminal end as well as the C-terminal ends of helices E and F show remarkable correlation (Langdown et al., 2009). The considerable correlation of all of these peptide sequences with residues 412-420 as well as with the residues 305-310 can be partially explained by the fact that these are in the proximity of each other. The (generalized) correlations between the corresponding part of helices D-, E-, and F with the RCL as well as with residues 340-350 (situated on the back side viewing from the PS-binding area) are less straightforward. Essentially, the same information with some remarkable differences can be seen from the generalized correlation matrices depicted for Simulations 2 and 3 ( Figures S6,  Supplementary Material). Both of them show the same Figure 6. The "cartoon" and stick (Gly237 and Arg393) representation of a few characteristic snapshots of antithrombin structures during the simulation carried out on the 1e04 conformational state are shown. The helical, β-strand, and bend/coil secondary structure elements are colored in purple, yellow, and cyan/gray, respectively. Figure 7A. Generalized correlation (mutual information) matrix for the movements of the CA atoms of antithrombin during the simulation carried out on the 1e04 conformational state is depicted. The correlated regions with potential role in allosteric signal propagation are marked by dashed rectangles. The positions of helices A-I are marked by the corresponding red capital letters. regional separation of correlated motions, which were observed in the case of Simulation 1. It is worth noting that the corresponding simulations starting from 1t1f structure show larger weakly correlated regions. It is especially true for the system where there was no extra disulfide bond introduced ( Figure S6(A)).

Weighted implementation of suboptimal path (WISP) analysis
Very recently a new method for extracting information on allostery from molecular dynamics trajectory has been proposed (Van Wart et al., 2014). Using this method for the trajectories, we obtained the allosteric pathways from the AT-PS interaction site to the site of action; in our cases, the hinge region was proposed for the free AT structures. Surprisingly, three different pathways were proposed. While Simulation 1 showed a signal path which propagated along the helix D ( Figure 7B), Simulation 2 proposed an inner path in which the N-terminal end of helix A is involved ( Figure S7(A)). It can also be rationalized since the Trp49 was found to play an important role in stabilizing activated state (Monien, Krishnasamy, Olson, & Desai, 2005). Simulation 3 predicted a preferable pathway through the helices P and E and the second and third strands of sheet A ( Figure S7 (B)). While the path obtained from Simulation 1 seems to be the most obvious, even the latter ones cannot be excluded in advance. It should be noted, however, that in AT-proteinase complexes, the RCL conformation is more similar to those which were observed in Simulation 1 (1e04). Thus, it can be assumed that the 1t1f → saltbridged 1e04-like conformation transition precedes the activation.
From the results mentioned above, it follows that even for similar systems slightly different allosteric pathways can be predicted for activation. The predicted pathway can also depend on the applied method. Nevertheless, it seems that the whole helix D (especially, the Ile119, Phe121-Phe123, and Leu130 residues), the C-terminal part of the helix F (Val165, Tyr166 residues) as well as the N-terminal end of the second strand of β-sheet A (Leu140) and the C-terminal end of the third strand of the β-sheet A (Phe221) have particular importance.

Principal component analysis
Global internal motions (obtained by removing the translation and rotation degrees of freedoms) can be decomposed into principal components. These principal components can be regarded as large amplitude (low frequency) modes and can be assumed that conformational transitions begin alongside these modes. The first (largest amplitude) motions calculated by the ProDy software are shown in Figure 8. It is immediately apparent that to Figure 7B. Possible allosteric pathways between the potential helix P and the hinge region of antithrombin calculated by the WISP software (Van Wart et al., 2014). The simulation was carried out starting from the conformational state corresponding to the 1e04 PDB structure. these modes, the contribution of those regions is dominant which have the largest fluctuations (N-terminal region, pentasaccharide-binding region, helices E and F, as well as the RCL) demonstrating coupled motions to each other. While it is not an exact proof for the direct role of these motions in propagation of allosteric information (Cui & Karplus, 2008;Van Wynsberghe & Cui, 2006) (linear combinations of these motions can be used as well and even the contribution of these motions to the trajectory snapshots can be different from time to time), it definitely helps to explain allostery.

Conclusion
Two microsecond molecular dynamics simulation were carried out for the native free forms of antithrombin (AT). By the evaluation of simulation trajectories, it was pointed out that the RCL can populate conformational states with solvent-exposed Arg393 side chain conformation, which can form the antithrombin-protease Michaelis complex. It also showed that the peptide sequences corresponding to the helix D extension can be featured by especially large rmsfs populating even helix-like conformational states. In Simulation 1, where the helix P was still not formed, the corresponding sequence showed especially large fluctuation as well. Mutual information analyses showed remarkable (generalized) correlation between those regions which were found to change their conformation and/or orientation in X-ray structures of AT-PS complexes. It has been revealed that allosteric information propagation pathways exist even in the nonactivated native form of AT. The same conclusion can be drawn by PCA extracting the principal components of fluctuations from the trajectory of simulation as well as WISP analysis of dynamics trajectories. From metadynamics calculations, the conclusion can be drawn that the helix P formation might precede the PS binding.

Supplementary material
The supplementary material for this paper is available online at http://dx.doi.10.1080/07391102.2014.986525.