Conformational changes and allosteric communications in human serum albumin due to ligand binding

It is well recognized that knowledge of structure alone is not sufficient to understand the fundamental mechanism of biomolecular recognition. Information of dynamics is necessary to describe motions involving relevant conformational states of functional importance. We carried out principal component analysis (PCA) of structural ensemble, derived from 84 crystal structures of human serum albumin (HSA) with different ligands and/or different conditions, to identify the functionally important collective motions, and compared with the motions along the low-frequency modes obtained from normal mode analysis of the elastic network model (ENM) of unliganded HSA. Significant overlap is observed in the collective motions derived from PCA and ENM. PCA and ENM analysis revealed that ligand selects the most favored conformation from accessible equilibrium structures of unliganded HSA. Further, we analyzed dynamic network obtained from molecular dynamics simulations of unliganded HSA and fatty acidsbound HSA. Our results show that fatty acids-bound HSA has more robust community network with several routes to communicate among different parts of the protein. Critical nodes (residues) identified from dynamic network analysis are in good agreement with allosteric residues obtained from sequence-based statistical coupling analysis method. This work underscores the importance of intrinsic structural dynamics of proteins in ligand recognition and can be utilized for the development of novel drugs with optimum activity.


Introduction
Understanding of the fundamental mechanisms of molecular recognition is central to understand biology at the molecular level. The conformational transitions that originate in the ligand binding are explained on the basis of two representative models: the induced-fit model proposed by Koshland (1958) and the preexisting equilibrium dynamics or conformational selection model (Boehr, Nussinov, & Wright, 2009;Ma, Kumar, Tsai, & Nussinov, 1999;Monod, Wyman, & Changeux, 1965). In induced-fit model, ligand binding induces a conformational change in the macromolecule to achieve the preferred ligand-bound conformation, whereas according to conformational selection model, ligand selectively binds to a conformational substate available prior to binding with low population in an equilibrium ensemble of substates of the macromolecule, eventually shifting the equilibrium toward the ligand-bound conformation. The Protein Data Bank (PDB) (Berman et al., 2000) is rapidly and continuously growing with structure deposition and more than 89,000 structures are already available. Many proteins have multiple X-ray structures with different ligands and/or under different conditions. Although crystal structures provide significant information about binding interactions but alone they are not sufficient to understand the mechanism or origin of biomolecular recognition. In fact, proteins do not function as static entities; they are rather engaged in functional motions and interactions with other molecules. Understanding of biomolecular interactions helps in identifying the cause of diseases and ultimately to develop therapeutic strategies. Molecular docking is the most commonly used technique in structure-based drug designing. However, incorporation of protein flexibility in docking methods is a major concern because most of the algorithms are based on computationally expensive methods like molecular dynamics (MD) and Monte Carlo simulations. Thus, the prediction of a conformational change upon ligand binding remains a formidable challenge because it is difficult to explore adequate conformational space in a reasonable computation time.
After a drug is absorbed into the bloodstream, it gets distributed to various tissues of the body. These drug molecules can bind to a variety of blood components i.e. red blood cells, leukocytes, and platelets, as well as plasma proteins. Plasma proteins binding can affect the pharmacokinetics of drugs. Human serum albumin (HSA), a most abundant protein in blood plasma, can bind different classes of ligands ranging from metal ions and small molecules to macromolecular compounds such as dendrimers, fullerene derivatives, fatty acids (FAs), hormones, proteins, and drugs at multiple sites (Curry, 2009;Fanali et al., 2012;Ghuman et al., 2005). HSA also helps in increasing solubility in plasma, decreasing toxicity, and/or protection against oxidation of the bound ligand (Curry, 2009;Fanali et al., 2012;Ghuman et al., 2005). This extraordinary ability of HSA has stimulated many efforts to understand what determines the binding of a diverse range of ligands. HSA is a highly flexible 67 kDa monomer containing three homologous helical domains, I (residues 1-195),  and III (384-585), each divided into A and B subdomains and the overall structure is stabilized by 17 disulfide bridges. Both ends of HSA protein show high conformational flexibility (Sugio, Kashima, Mochizuki, Noda, & Kobayashi, 1999) and are not available in most of the crystal structures. Three homologous domains of HSA interact with each other according to physiological condition and show a wide variety of conformational changes with allosteric effect (Fanali et al., 2012;Ghuman et al., 2005). Several interaction studies of different drugs/ligands with HSA have been performed using X-ray crystallography (Bhattacharya, Grune, & Curry, 2000;Curry, 2009;Curry, Mandelkow, Brick, & Franks, 1998;Petitpas, Grune, Bhattacharya, & Curry, 2001;Simard et al., 2005;Zunszain, Ghuman, McDonagh, & Curry, 2008), MD simulations (Fujiwara & Amisaki, 2008;Sudhamalla, Gokara, Ahalawat, Amooru, & Subramanyam, 2010) and nuclear magnetic resonance (NMR) technique (Fanali, Pariani, Ascenzi, & Fasano, 2009;Krenzel, Chen, & Hamilton, 2013;Simard, Zunszain, Hamilton, & Curry, 2006). However, to the best of our knowledge, a global picture of correlation between mechanism of molecular recognition by HSA and the conformational fluctuations (that is, structural dynamics) in the native state ensemble is yet to be established.
In this article, we address a detailed theoretical study of the molecular mechanism of ligand binding in HSA and aim to identify crucial residues involved in longrange and short-range communications. Toward this goal, we have used essential dynamics-refined ENM (ed-ENM) (Orellana et al., 2010) method for studying the global fluctuations in the native state of HSA. It is well established that handful of experimentally known conformers of a protein determined under different conditions, or with small variations in sequence, capture a representative subset of the true native state ensemble (Best, Lindorff-Larsen, DePristo, & Vendruscolo, 2006). X-ray structural ensemble, derived from 84 crystal structures resolved for HSA with different ligands and/or under different conditions, has been used as the reference for the conformational space accessible to HSA. Principal component analysis (PCA) was employed to derive the functionally important collective motions from X-ray structural ensemble. Furthermore, we utilized dynamical network analysis (Sethi, Eargle, Black, & Luthey-Schulten, 2009;Vanwart, Eargle, Luthey-Schulten, & Amaro, 2012) and community analysis based on mutual information of correlated residue motions to characterize the information flow between communities in the network and identify the crucial residues involved in allosteric communication.
We found that only first few slow modes of ENM are sufficient to capture the major functional subspace explored by PCA modes, thereby suggesting that ligand selects the most favorable structure from accessible conformational space (i.e. conformational selection). Large-scale global conformational changes are observed due to FAs binding, while binding of drugs show relatively small structural deviation in comparison to FAs. This information can be utilized in docking algorithms to improve their predictive power. Major response of FAs binding is found in communities of FA2-and FA3-binding sites known for allosteric effect. We also used statistical coupling analysis (SCA) (Lockless & Ranganathan, 1999;Suel, Lockless, Wall, & Ranganathan, 2003) method based on coevolutionary information of protein sequence to predict the allosteric residues, and compare with the results obtained from community network analysis.

Principal component analysis
The complete list of PDBs used to generate experimental native state ensemble are available in the Supporting information Table S1. X-ray ensemble is generated by superimposing C α atoms of crystal structures onto the reference structure (PDB 1AO6) using Kabsch algorithm in an iterative procedure. Each structure is represented by 3N Cartesian coordinates, where N is the number of C α atoms. We build covariance matrix C ¼ XX T from X-ray ensemble; here, X ð¼ ðq 1 À qÞðq 2 À qÞ. . .ðq M À qÞ ½ 3N ÂM is a matrix of M columns (number of structures), each with 3N elements, represent the deviation vector from the average structure of the ensemble. Covariance matrix is diagonalized to get principal directions of structural variations. These principle directions are rank-ordered with decreasing variations and named as principal component 1 (PC1), principal component 2 (PC2), etc.

Elastic network normal mode analysis
ENMs describe the dynamics of a protein by its contact topology, represented as a network of nodes and springs, and only the C α atoms of the individual residues are considered as nodes and each residue pair located within a cut-off distance is connected by a spring of uniform force constant. ENMs are based on harmonic, nearequilibrium approach, and thus have problems in capturing large anharmonic motions. Many attempts have been made to improve the robustness and generality of ENMs (Atilgan et al., 2001;Hinsen, 1998;Tama, Gadea, Marques, & Sanejouand, 2000;Yang, Song, & Jernigan, 2009;Zheng, 2008). Here, we adopted ed-ENM (Orellana et al., 2010) method based on hybrid potential, which has been calibrated using a large database of MD trajectories to trace better protein flexibility. ed-ENM method captures the flexibility of NMR structural ensembles with remarkable precision. This is a parameter-free ENM method, avoiding any arbitrary cutoff, and uses simple and robust scaling of the local backbone and long-range contacts. This method treats the sequential and spatial (nonsequential) contacts differentially. A fully connected matrix is used for the first M neighbors (sequential contacts), while a continuum distance-dependent function with a calibrated size-dependent cutoff is used for residue pairs that are close in space and not included in the first group. ed-ENM method uses two kinds of inter-residue force constants: (i) very stiffsequence distance-dependent constant, and (ii) Cartesian distance-dependent constant with stiffness scaled down to the sixth power of the exponential term (for more details see Orellana et al., 2010). The overall potential energy of network is given by Here, K ij is force constant acting between any pair of nodes (residues) i and j; C ij is element of (Kirchhoff matrix) inter-residue connectivity matrix, where i and j represent residue index; R ij and R 0 ij are the instantaneous and reference (equilibrium) distances between each pair of α-carbon of residues i and j; H is the Hessian matrix of 3N × 3N, which is second derivative of the potential; DR ¼ ðDx 1 Dy 1 Dz 1 . . .Dz N Þ T is the position deviation vector for all C α atoms of the protein; and N is the number of residues (C α -atoms).
Fluctuations' correlations (cross-correlation) between residue i and residue j are calculated by The two limits of C(i,j) are 1 and −1 and represent fully correlated and fully anticorrelated motions, respectively. The overlap I J describes the similarity between the direction of conformational change along a transition pathway and a given normal mode of protein. This is computed as where a j is the eigenvector of mode j, a ij is the ith coordinate of jth mode, Dr represents the deviation vector of crystal structure from the reference structure, and Δr i is the ith coordinate of the deviation vector Dr. Collectivity index k (or information entropy) reflects the number of atoms (C α atoms or residues), which are significantly affected during the conformational change and can be calculated as where the sum is over the C α atoms of the protein, and s is a normalization factor that can be calculated by P N i¼1 ðsDr 2 i Þ ¼ 1 and Dr 2 i ð¼ Dx 2 i þ Dy 2 i þ Dz 2 i Þ is square fluctuation of ith residue (C α atom); k is confined to the interval between N −1 and 1; k = 1 means maximally collective conformational change, and all residues have identical Dr 2 i . The B-factor or the crystallographic Debye-Waller factor is related to the expected residue fluctuations and defined as where a k and k k are eigenvector and eigenvalue of kth mode, respectively.

MD simulations and dynamic network analysis
For dynamic network analysis, we have used the combination of MD simulations, generalized cross-correlations (Lange & Grubmuller, 2006) based on mutual information theory and graph theory. Atomistic MD simulations for both states apo-HSA (PDB 1AO6) and HSA-FAs (PDB 1BJ5) were carried out using GROMACS (Van Der Spoel et al., 2005) version 4.5 and AMBER-ff99SB force field. The topologies and force-field parameters (except partial atomic charges) of myristate molecules were generated using ACPYPE (Wang, Wang, Kollman, & Case, 2006) tool, which uses the ANTECHAMBER module with generalized amber force field (Wang, Wolf, Caldwell, Kollman, & Case, 2004). After ab initio optimization at the HF/6-31G* level using Gaussian09 (Gaussian Inc., Pittsburgh, USA), a restrained electrostatic potential fit procedure was used to generate the partial atomic charges of the myristate molecules. The initial structures were solvated in a cubic box with 43,712 and 43,580 TIP4P-Ew water (Horn et al., 2004) molecules for apo-HSA and HSA-FAs, respectively. The minimum distance between protein atoms and the box surface was 10 Å. The LINCS (Hess, Bekker, Berendsen, & Fraaije, 1997) algorithm was applied to constrain hydrogen atoms, and an integration time step of 2.0 fs was used. The bonds and the angles of the water molecules were constrained using the SETTLE (Miyamoto & Kollman, 1992) algorithm. To neutralize the systems, 15 Na + and 14 Na + were added to apo-HSA and HSA-FAs complex, respectively. Eight replicas of each energy-minimized system were equilibrated starting from different initial atomic velocities. Energy minimization and equilibration steps consist of: (1) a steepest descent energy minimization, (2) a 200 ps position-restrained (all heavy atoms of HSA and myristate molecules) NVT simulation with restraining force constant of 1000 kJ mol −1 nm −2 , (3) a 500 ps position-restrained NPT simulation with restraining force constant of 1000 kJ mol −1 nm −2 , and (4) a 5 ns of NPT equilibration at 300 K and 1 bar pressure using the Berendsen coupling scheme (Berendsen, Postma, Vangunsteren, Dinola, & Haak, 1984) for both temperature and pressure with coupling constants .1 ps and .5 ps, respectively. Each replica was simulated for another 50 ns of production dynamics at 300 K, using separate v-rescale thermostats (Bussi, Donadio, & Parrinello, 2007) for protein, myristate, ions, and the water, with a time constant of τ = .1 ps. The pressure was maintained at 1 bar using the Parrinello-Rahman (Parrinello & Rahman, 1981) algorithm with a coupling constant of 2.0 ps. The electrostatic interactions were evaluated using the particle mesh Ewald (Darden, York, & Pedersen, 1993) method, with a cutoff of 10 Å. The van der waals interactions were evaluated using a cutoff of 10 Å. Generalized cross-correlations of residues in the apo-HSA and HSA-FAs-bound states are calculated based on mutual information between all C α atoms in the protein using the gromacs tool g_correlation developed by Lange and Grubmuller (2006). Dynamic network analysis is carried out using the NetworkView plugin  in VMD (Humphrey, Dalke, & Schulten, 1996). In dynamical network, each C α atom of residues represents a node and two residues are connected by an edge if in any two of their heavy atoms are within 5 Å for more than 75% of the simulation time. Each edge is weighted by averaged generalized cross-correlations obtained from all (eight) MD trajectories.

Statistical coupling analysis
Homologues sequences for HSA (PDB 1AO6) were retrieved from NCBI nr database using psi-blast (Altschul et al., 1997) with cutoff 1.0E-3 e-value. Muscle tool (Edgar, 2004) was used to generate multiple sequence alignment (MSA) considering only less than 96% sequences similarity with more than 80% sequence coverage. Columns having less than 20% gap were considered for all calculations. SCA of MSA consisting of 231 sequences was done using SCA toolbox (SCA 5.0). Further details of the SCA method can be found elsewhere (Lockless & Ranganathan, 1999;Smock et al., 2010;Suel et al., 2003).

PCA results
PCA is a widely used technique to reveal the concerted fluctuations with large amplitudes from a set of configurations. PCA results are obtained for HSA using the ensemble of X-ray crystallographic structures. Figure 1 shows the projection of HSA structures onto the subspace spanned by first two principal components (PCs) determined for the X-ray structure ensemble. PC1 and PC2 clearly divide the ensemble of structures into two well-defined clusters. Interestingly, these two clusters differ only on the basis of FAs bound structures and FAs-free structures; cluster-1 (with 45 structures) represents structures without FAs, while cluster-2 (with 39 structures) represents structures with FAs. Minimum and maximum root mean square deviation (RMSD) within cluster-1 are .36 and 2.48 Å from its average structure, while for cluster-2 they are .22 and 2.53 Å from its average structure, respectively. Minimum and maximum RMSD between these two clusters are 3.79 and 6.73 Å, respectively. When we plotted the structural ensemble on first three PCs, we did not observe any change in cluster classification. This is consistent with the fact that PC1 and PC2 together contribute 92% to the total variation (PC1 and PC2 contribute 89% and 3%, respectively) and, is therefore sufficient to explain the global structural changes. The structural variation represented by highly dominated PC1 is shown in Figure S1. HSA is a well-known monomeric protein for its allosteric properties. FAs and heme both are known as allosteric modulators of HSA (Ascenzi, Bocedi, Notari, Menegatti, & Fasano, 2005;Fanali et al., 2012;Simard et al., 2006). Although cluster-2 in PC1-PC2 projection (Figure 1) includes the structure that is bound to FAs and heme (PDB 1N5U), but the structure bound to FAs and hemin (PDB 1O9X) has slightly different conformation than typically observed for the FA-bound state; this may be because of the crystal packing effect as reported earlier (Zunszain, Ghuman, Komatsu, Tsuchida, & Curry, 2003). Further it is important to note that the medium/ long chain FAs (e.g. myristate) are known to act as allosteric regulators in HSA, but FAs with shorter chain length (e.g. octanoate) are not (Fanali et al., 2012). As structures of HSA complexed with short chain FAs are yet not available, the structural ensemble considered in this study consists of only medium/long chain FAs-bound structures. From the projection of structural ensemble on PC1 and PC2 (Figure 1), it is clearly evident that FA-bound complexes show large-scale global conformational changes, whereas all other ligands (e.g. drugs, proteins, etc.) show relatively small structural deviation in HSA.

ed-ENM analysis
In contrast to PCA, ed-ENM analysis is performed for a single unliganded structure, that is, for apo-HSA (PDB 1AO6). Low-frequency modes (the first six low-frequency modes are zero modes corresponding to the three translational and three rotational degrees of freedom) are assumed to be functional modes and contribute to global molecular motions of protein. To validate the molecular motions predicted by ed-ENM, we have compared the theoretical B-factor obtained from ed-ENM with the experimental B-factor (PDB 1AO6) and found a very Figure 1. Distribution of X-ray ensemble (84 structures) on the subspace spanned by first two slowest PC modes (PC1 and PC2). The distributions of structures along the individual modes are shown by the histograms. Structures are divided into two groups using k-mean clustering; blue circles represent structures that are grouped into cluster-1 (HSA without FAs), while red circles represent structures grouped into cluster-2 (HSA with FAs). In PDB 4K71, 4N0F, 4N0U, and 2VDB, HSA is complexed with other proteins. good agreement with a correlation coefficient, R = .7 (Figure 2). We also calculated collectivity index, which evaluates how many residues are significantly affected in a particular mode, and found the values .42, .40, and .63 for first three modes, respectively. This suggests that the motions in ed-ENM mode three (ed-ENM3) involve collective displacements of relatively large number of residues compared to the first two modes. Previous experimental studies have shown that subdomains IB-IIA and FA2-FA3-binding sites are allosterically coupled. Binding of heme in subdomain IB allosterically inhibits the binding of warfarin in subdomain IIA, and vice versa (Ascenzi et al., 2005;Ascenzi & Fasano, 2010). Binding of myristate at the FA2-binding site (interfaces of IA, IB and IIA) and at the FA3 site (subdomain IIIA) drives allosteric transitions in HSA (Curry, 2009;Fanali et al., 2012). Our results also show positive cross-correlations emphasizing allosteric links between these ligand-binding sites (Figure 3). It should be noted that anticorrelated motions observed between subdomain IB and IIIB, and mixed (positive and negative correlated) motions between IA and IIIB ( Figure 3) helps in producing large conformational changes in HSA.

Comparison of PCA and ed-ENM results
To verify the relevance of modes derived from PCA analysis and ed-ENM analysis, we calculated the overlap between the subspaces spanned by the eigenvectors. Figure 4 shows the overlap between the first seven lowest frequency PCA and ed-ENM modes. Such comparisons allow us to identify PCA and ed-ENM modes that are in close correspondence with each other.
The counterpart of highly dominated PC1 is found to be ed-ENM1 and ed-ENM3 with overlap values of .50 and .60, respectively (Figure 4). Comparison of structural variations along ed-ENM1, ed-ENM3, and PC1 is shown in Figure 5   .99, respectively. To understand which low-frequency mode makes a significant contribution to the structural changes from unbound to FA-bound state, we calculated the overlap between ed-ENM modes and deviation vector from unliganded HSA (PDB: 1AO6) to FA bound structures. Interestingly, we found that all FA-bound structures show very similar overlap pattern with ed-ENM modes, where largest contributing modes are ed-ENM1 and ed-ENM3 (Figure 6(A), Supporting information Table S2). Overlap between ed-ENM modes and deviation vector from unliganded HSA (PDB: 1AO6) to structures complexed with different ligands (other than FAs) does not show any fixed pattern; however, slow modes do contribute to the binding of these ligands (Figure 6(B), Supporting information Table S2). Further, we calculated the cumulative overlap to check how many ed-ENM modes are required to explore the subspace spanned by PC1 and PC2. Cumulative overlap represents the square root of sum of individual squared overlaps, starting from the lowest nonzero frequency modes. Figure 6(C) shows the cumulative overlap of both PC1 and PC2 with 10 slowest ed-ENM modes. Clearly, first eight ed-ENM modes produced more than 90% overlap with both PC1 and PC2. The present comparison demonstrates that the subspace sampled by the softest ed-ENM modes is able to explore the experimentally accessed conformational space. In other words, the observed conformational changes after ligand binding are likely to be facilitated by low-frequency motions that are intrinsic to the structure. Our results therefore revealed that in the absence of ligands, protein fluctuates into conformations that resemble to the ligand-bound states. This suggests that preexisting equilibrium dynamics could be a preferred mechanism of binding of ligands to highly flexible HSA. To this end, it should be noted that our results are in support of previous findings (Bakan & Bahar, 2009;Yang, Song, Carriquiry, & Jernigan, 2008) where a close correlation between ENM and PCA was observed, thereby confirming the validity of using ENM to understand the functional dynamics of proteins.

Dynamic network analysis
From PCA results of X-ray ensemble, it is clear that FAs binding induce large-scale global conformational changes in HSA. MD simulation is a powerful method to provide a detailed atomistic description of these large rearrangements due to FAs binding. Dynamical weighted network for both HSA and HSA-FAs are examined through community network analysis using the NetworkView  plugin in VMD (Humphrey et al., 1996). Average values of the generalized crosscorrelations of residues obtained from MD simulations are used to weigh the edges of a network. The time-averaged connectivity of the nodes is used to identify subnetworks or communities in the network. Figure 7 shows graph network representation of the optimum structure of communities obtained for HSA and HSA-FAs. The Girvan-Newman algorithm splits the weighted network of HSA into 12 communities and that for HSA-FAs to 13 communities. Flow of information among communities can be traced using coarse-grained network where node represents a community and edge represents intercommunity total betweenness centrality. Comparison of coarse-grained network of both states, HSA and HSA-FAs, shows HSA-FAs has more robust community network than HSA with several routes to communicate among different parts of the protein. Furthermore, intercommunity information flow is strong in HSA-FAs than in HSA community network. Community 7 and 3ʹ in HSA and HSA-FAs network, respectively, are important as they are strongly connected with communities of all domains; they include helix h1 (residue 196-205) which is a major molecular hinge site (predicted by ed-ENM analysis) responsible for large movement of domain I and domain III with respect to domain II. Subdomain IIB, which has very less movements in ed-ENM and PCA modes, has only one independent community 6 and two shared communities 2 and 3 in the community network of HSA; however, there are two independent communities 6ʹ and 9ʹ and two shared communities 8ʹ and 13ʹ for HSA-FAs. This less coarse-grained structure of subdomain IIB in HSA-FAs shows that this subdomain is now more actively communicating with other subdomains. Major rearrangements of residues are observed between the communities of domain I and II (Supporting information Figure S2). Interface of subdomain IA and IIA where FA2 binds and interface of domain II and domain III near the FA3-binding site have large rearrangement of residues among communities after FA binding. These two sites (subdomains IA, FA2-binding site and IIIA, FA3-binding site), known for major allosteric effect, are responsible for the perturbation of correlated motions of residues in HSA and can be seen in community network analysis; community network structure depends on correlated motions of residues obtained from dynamics. Intercommunity information flow depends only on few residues that connect them. Mutations in these sites can disrupt complete information flow in distant communities of the network. Therefore, these residues can be considered as critical nodes. Comparisons of these critical residues (Supporting information Table S3) between HSA and HSA-FAs community network shows many of them are common and can be considered as structurally and functionally important for HSA.

SCA and allosteric hot spots
A single mutation away from the binding site or functional site can change the whole dynamics of the protein as well as its binding specificity. These single site mutations are not independent; they are always linked with other sites, and if one mutation occurs, it needs to be compensated with other to preserve the functional dynamics. How coevolutionary processes are related to functional diversification within protein families has been studied successively in recent years, and continues to be a topic of central importance. Serum albumin is one of the most rapidly evolving proteins, where~70-80% of its residues have been replaced in 500 Myr. Thus, it would be interesting to know the biologically relevant correlated residue positions and their role in functional dynamics in HSA. We adopted a sequence-based method to estimate energetic coupling between residues in proteins, named SCA method (Lockless & Ranganathan, 1999). Figure 8(A) shows hierarchical clustering of SCA correlation matrix, which is a global representation of evolutionary coupling between all pairs of residues. Hierarchical clustering clusters residues displaying Upper part shows communities' map on structure and lower part shows coarse-grained community network where nodes represent communities and edge represents connectivity. Widths of edges are proportional to intercommunity total betweenness centrality. similar and interconnected pattern of significant energetic coupling values. We found that most of the amino acid positions are weakly correlated and evolving independently (Figure 8(A)), only few positions (less than 20%) (Supporting information Table S3) show strong coupling and can be considered as allosteric hotspots. These positions are not clustered in any particular domain of the protein; they are scattered and form a continuously connected network of residues in 3D space, which indicates that different domains of HSA are allosterically linked with cluster of energetically coupled residues. Further, allosteric residues from SCA are in good agreement with residues obtained using dynamic network analysis (Supporting information Table S3). It should be noted that dynamic network analysis and SCA are based on very different principles and assumptions where the former one is purely topology dependent, while SCA relies on information based on MSA.
In several studies, the fluctuation profile of residues along the low-and high-frequency ENM modes has successfully been utilized to identify hinge sites responsible for global conformational changes, and kinetically important residues responsible for protein stability, respectively (Bahar, Atilgan, Demirel, & Erman, 1998;Emekli, Schneidman-Duhovny, Wolfson, Nussinov, & Haliloglu, 2008;Haliloglu, Keskin, Ma, & Nussinov, 2005;Kundu, Sorensen, & Phillips, 2004). In fact, fluctuation minima in slow modes are generally identified as hinge sites, and peaks in fast modes indicate kinetically hot spots responsible for protein stability. To understand the role of critical residues from dynamical network analysis and SCA in intrinsic dynamics, we carried out the fluctuation profile analysis in slow modes and fast modes. Critical residues that are found to be common in both methods (dynamic network analysis and SCA) reside either in minima or near the minima of slow/fast modes (Figure 8(B)). This indicates that these residues are likely to act as hinge sites, driving major conformational changes in HSA. We note that only two residues (Tyr341 and Val469), predicted as critical residues in dynamic network analysis of HSA and HSA-FAs, are at the peak in the fluctuation profile of high-frequency modes (Figure 8(B)) and can be considered as kinetically important residues responsible for protein stability.

Conclusion
Our network analysis captures information about highly cooperative domain motions important for the protein function (i.e. drug binding) or allosteric communications. From experimental (PCA analysis of X-ray structure ensemble) and theoretical (ed-ENM) analysis, it is clear that low-frequency modes calculated for unbound state account for the conformational changes observed upon ligand binding. This observation reveals that ligand selects the most favored conformation from accessible equilibrium structures of HSA. As small subset of lowfrequency modes is able to capture the major functional changes, this information could be used as structural refinement, molecular docking, and the development of drugs that target particular functional motions. Dynamic network analysis of both states HSA and HSA-FAs shows that community network of HSA-FAs is more robust and has more information flow between different domains of the protein. Further, critical residues obtained using community network analysis are common in both states and are in good agreement with allosteric residues obtained from sequence-based SCA method. Predicted allosteric key residues can be utilized to improve the function of HSA. Our study will also facilitate efforts to modify the drugs to control their interactions with HSA, and thereby optimize their distribution across the body. The combination of structural dynamics and sequencebased bioinformatics approach, used in this study, can further be utilized to understand the molecular recognition processes in other biological systems.

Supplementary material
The supplementary material for this paper is available online at http://dx.doi.10.1080/07391102.2014.996609.