Uncovering allostery and regulation in SORCIN through molecular dynamics simulations

Abstract Soluble resistance-related calcium-binding protein or Sorcin is an allosteric, calcium-binding Penta-EF hand (PEF) family protein implicated in multi-drug resistant cancers. Sorcin is known to bind chemotherapeutic molecules such as Doxorubicin. This study uses in-silico molecular dynamics simulations to explore the dynamics and allosteric behavior of Sorcin in the context of Ca2+ uptake and Doxorubicin binding. The results show that Ca2+ binding induces large, but reversible conformational changes in the Sorcin structure which manifest as rigid body reorientations that preserve the local secondary structure. A reciprocal allosteric handshake centered around the EF5 hand is found to be key in Sorcin dimer formation and stabilization. Binding of Doxorubicin results in rearrangement of allosteric communities which disrupts long-range allosteric information transfer from the N-terminal domain to the middle lobe. However, this binding does not result in secondary structure destabilization. Sorcin does not appear to have a distinct Ca2+ activated mode of Doxorubicin binding. Communicated by Ramaswamy H. Sarma


Introduction
Soluble resistance-related calcium-binding protein, or Sorcin, is a 22 kDa calcium-binding allosteric protein in humans that is implicated in, and overexpressed in, multidrug-resistant (MDR) cells of different cancers (Colotti et al., 2014; Deng et al., 2010; Kawakami et al., 2007; Meyers et al., 1987; Qi et al., 2006; Qu et al., 2010; Zhou et al., 2006). Structural studies on Sorcin are now almost two decades old. The earliest published structures were simultaneous X-ray diffraction (XRD) studies of Chinese hamster and human-derived Sorcin published in 2001 by Ilari (Ilari et al., 2002) and Xie (Xie et al., 2001) respectively. Sorcin is a homodimeric protein; each monomer is formed by two domains: (i) a flexible and hydrophobic Gly/Pro-rich N-terminal domain (NTD, residues 1-32) and (ii) a calcium-binding C-terminal domain (SCBD). The flexibility of the NTD renders it ill-suited to crystallization. The SCBD is made up of eight α-helices (named A-H) connected by loops to form five EF-hand motifs. (Details about Sorcin's topology, including the EF hand indices, are provided in Supplementary Table S1.) Thus, Sorcin is a member of the penta-EF hand (PEF) family of proteins. Other members of this family include the Ca2+-dependent calpains, peflin, grancalcin (Jia et al., 2000) and PDCD6 (Jia et al., 2001; Maki et al., 2011). This family is known for three distinct structure-function properties: (i) the Gly/Pro-rich N-terminal region, (ii) dimerization mediated by an EF hand, and (iii) Ca2+-triggered translocation to membranes (Maki et al., 2002). The five Sorcin EF-hands form the following two pairs: EF1 with EF2 and EF3 with EF4, connected by small two-stranded β-sheets. The unpaired EF5 hand of each monomer is involved in homodimer formation by pairing with the unpaired EF5 of the other Sorcin monomer in a closely intertwined motif almost akin to domain swapping. The Sorcin monomer SCBD structure can be further visualized as composed of the following subdomains: (i) the N-terminal lobe, which comprises helices A through D, terminating in the EF3 loop, (ii) the middle lobe, which comprises helices E & F, terminating in the EF4 loop and finally (iii) the inner core, which comprises helices G & H and the EF5 loop. It is the inner core subdomain that is principally involved in dimer formation. Subsequent work by Ilari and co-authors in 2015 (Ilari et al., 2015) revealed the allosteric mechanism of this protein: Ca2+ triggers an opening of EF hands 1 and 3 (by 18.4 and 14.5 degrees respectively), which rotates the entire N-terminal lobe and middle lobe outwards with EF3 acting as a hinge point, thus exposing hydrophobic patches identified as Pockets 1 and 2, which are potential partner protein binding sites.
Sorcin is found in a wide variety of tissues including the gut, spinal cord, brain, and colon; it is among the top 5% of expressed proteins as per the PAXdb protein abundance database. The PEF motif naturally suggests a role for Sorcin in Ca2+ homeostasis. Part of this function can be attributed to its micromolar affinity for Ca2+ ions (Meyers et al., 1995; Zamparelli et al., 1997). The remainder of this function can be attributed to Sorcin's interaction with calcium channels and other proteins such as RyR, SERCA and NCX (Lokuta et al., 1997; Matsumoto et al., 2005; Meyers et al., 1995; Seidler et al., 2003; Zamparelli et al., 2010). The Sorcin-encoding gene in both humans and rodents is found on chromosome 7, at the same locus as the genes encoding multidrug resistance proteins MDR1 and MDR2 (themselves both members of the ATP Binding Cassette Transporter or ABC family) (van der Bliek et al., 1986, 1988). High Sorcin expression levels are associated with escape from ER stress and apoptosis, while Sorcin knockdown is associated with defective mitosis and cytokinesis and increased apoptosis (Lalioti et al., 2014; Maddalena et al., 2011). The full story of Sorcin's participation in the MDR phenotype of cancer cells is not yet understood. It is known that Sorcin plays a role in apoptotic regulation and that its silencing results in stalled mitosis. However, the direct interaction of Sorcin with chemotherapeutic drugs had not been structurally investigated until 2017, when Genovese et al. (Genovese et al., 2017) reported that this protein binds to chemotherapeutic agents including cisplatin, vinblastine, paclitaxel and doxorubicin. They postulated the existence of two drug binding sites on the protein surface, one operating with low nanomolar affinity and the other with 100-fold weaker affinity. These two sites were the same as the previously identified protein-binding Pockets 1 and 2. Genovese et al. solved an XRD structure of Doxorubicin bound to the low affinity site with Mg2+ liganded to the protein instead of Ca2+. This structure contains two dimers in the asymmetric unit (corresponding to chains AB and CD) and is remarkably similar to the apoSorDim (dimeric apo Sorcin) structure, in that it adopts a relatively compact conformation.
The overall motivation for this study comes from the fact that while the allosteric triggering of Sorcin by Ca2+ is understood, its consequences for downstream signaling and binding with other metabolic partners are not. Furthermore, the ability of Sorcin to directly bind and sequester chemotherapeutic drugs is understood only minimally at the level of enzyme kinetics studies, and not well at the structural level. This presents a significant gap in the development and delivery of new chemotherapeutic drugs which can bypass the Sorcin block. Therefore, in this study, we have attempted to answer the following broad questions: (i) what are the effects of Ca2+ binding upon the dynamics of Sorcin, (ii) what are the effects of Doxorubicin binding on the dynamics of Sorcin and (iii) is there any evidence of allostery and cooperative behaviour in the context of drug binding? Our results indicate that Ca2+-bound Sorcin samples both open and closed conformations as defined by rotation of the NTD and middle lobe around the EF3 hinge point. Sorcin dimer stabilization was found to be linked to a reciprocal allosteric handshake around the EF5 motif. In terms of allosteric pathways related to the binding of Doxorubicin, our results did not uncover any 'smoking gun' suggesting cooperative behaviour.

Simulation setup
We used the tetrameric crystal structure from Protein Data Bank entry 5MRA (Genovese et al., 2017) to generate two variants: the regular tetrameric variant (called DoxSorTet, for tetrameric Sorcin bound to Doxorubicin) with four monomers, labeled A, B, C and D, along with the bound Doxorubicin molecule, and the apo form of the complex (called SorTet, for tetrameric apo Sorcin) in which the Doxorubicin molecule was deleted. In both tetrameric structures, the Mg2+ ions in the crystal structure were retained. Two dimeric variants were also generated from the crystal structures 4UPG and 4USL (Ilari et al., 2015), also obtained from the Protein Data Bank. The latter complex (4USL) contained three Ca2+ ions in each monomer, while the 4UPG-based structure did not contain any ions. The 4UPG- and 4USL-based variants were named apoSorDim and CaSorDim respectively. A third dimeric variant was created by placing the Doxorubicin molecule in the 4USL complex at the same location as in the tetrameric (5MRA-based) structure after aligning both complexes. This was called DoxSorDim (also written DoxCaSorDim below). The two chains (monomers) in the dimeric complex are named A and H, following the labels in the PDB structure. In addition, two monomeric structures were generated based on the 4UPG and 4USL structures, by retaining only one monomer of each complex, along with the bound Ca2+ ions in the case of the 4USL-based structure. In all structures, the crystallographic waters, if present, were retained. Therefore, a total of seven structures were used in this study: two tetrameric, three dimeric and two monomeric. Each system was immersed in pre-equilibrated TIP3P water molecules. Na+ and Cl- ions were added at random positions to bring the molar concentration to 0.15 mol/L while maintaining the neutrality of the system. The tetrameric systems consisted of approximately 100,000 atoms and measured 10.8 × 10.8 × 9.0 nm³. Figure 1a shows the DoxSorTet complex (derived from PDB 5MRA) with the four monomers in distinct colors. The dimeric systems contained approximately 63,000 atoms each and measured 8.1 × 8.1 × 9.7 nm³. The monomeric systems contained approximately 42,500 atoms each and measured 6.8 × 8.1 × 8.0 nm³. Figure 1b shows the CaSorDim (derived from PDB 4USL) in ribbon representation with the bound Ca2+ ions as yellow spheres. The two monomers are indicated by red and grey ribbons. Figure 4a shows the DoxCaSorDim with the Doxorubicin represented by sticks.
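As a rough illustration of the ionic strength quoted above, the number of Na+/Cl- pairs needed for 0.15 mol/L can be estimated from the box dimensions. The sketch below is not the procedure used to build the systems: it assumes the whole box volume is solvent and ignores any extra counter-ions needed for neutrality, and the function name `ion_pairs_for_box` is purely illustrative.

```python
AVOGADRO = 6.02214076e23  # particles per mole


def ion_pairs_for_box(lx_nm, ly_nm, lz_nm, molar=0.15):
    """Approximate number of Na+/Cl- pairs giving `molar` mol/L in a box.

    Assumption: the full box volume is treated as solvent volume.
    """
    volume_nm3 = lx_nm * ly_nm * lz_nm
    volume_litres = volume_nm3 * 1e-24  # 1 nm^3 = 1e-24 L
    return round(molar * AVOGADRO * volume_litres)


# Box dimensions (nm) as quoted in the text.
boxes = {
    "tetramer": (10.8, 10.8, 9.0),
    "dimer": (8.1, 8.1, 9.7),
    "monomer": (6.8, 8.1, 8.0),
}
for name, dims in boxes.items():
    print(name, ion_pairs_for_box(*dims))
```

Under these assumptions the estimate comes to roughly 95, 57 and 40 ion pairs for the tetrameric, dimeric and monomeric boxes; the true counts depend on the volume excluded by the protein and on the net charge of each system.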

MD simulation protocols
Molecular dynamics (MD) simulations were performed using NAMD 2.9 (Kalé et al., 1999; Phillips et al., 2005). All simulations employed periodic boundary conditions and multiple time-stepping, wherein local interactions were calculated every 2 fs and full electrostatic evaluations were performed every two timesteps. The particle mesh Ewald method (Darden et al., 1993) was employed for long-range electrostatic calculations. CHARMM36 (Klauda et al., 2010; MacKerell et al., 1998) force fields were employed along with the TIP3P water model. Covalent bonds involving hydrogen were held rigid by the RATTLE (Andersen, 1983) and SETTLE (Miyamoto & Kollman, 1992) algorithms. Each system was minimized for 3000 steps using the conjugate gradient method and then equilibrated in the NPT ensemble using the Nosé-Hoover Langevin piston pressure control at 310 K for at least 5 ns. Following equilibration, all NAMD simulations were performed in the NVT ensemble with the temperature maintained at 310 K using the Langevin thermostat. Data were recorded at 10 ps intervals. A list of all simulations conducted with monomeric, dimeric and tetrameric forms of the Sorcin complex is given in Table 1. The longest trajectories were generated for the dimer complexes, which are the biologically relevant assemblies. Shorter trajectories were generated for the monomer and tetramer systems, primarily for comparison.

Correlation analysis
An analysis of the correlated motions within a protein often helps to unravel long-range effects such as information flow, allosteric dynamics, etc. (Dutta et al., 2017; Ghosh & Vishveshwara, 2007; Grant et al., 2006; Long & Brüschweiler, 2012). In this study, we have used LMI (linear mutual information) calculations based on MD trajectories to find the correlation between the Cα atom of each residue and that of every other residue. The LMI calculation returns a matrix of all atom-wise linear mutual information, whose elements are denoted as $I_{ij}$. The correlation coefficient $I_{ij}$ between two atoms or nodes (i and j) from a long MD trajectory (giving N samples for each atom) can be calculated as follows,

$$I_{ij} = \frac{1}{2}\left[\ln \det C_i + \ln \det C_j - \ln \det C_{ij}\right]$$

where $I_{ij}$ is the correlation coefficient between two atoms (i and j), $C_i$ is the covariance matrix for the displacement of the Cα atom of the ith residue and $C_{ij}$ is the pair covariance matrix for residues i and j.
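The calculation described above can be sketched for a single pair of atoms. This is a minimal standard-library illustration, not the Bio3D implementation used in the study. It assumes the usual Gaussian form of linear mutual information, I_ij = ½[ln det C_i + ln det C_j − ln det C_ij], and, as a further assumption about how values were scaled to the 0-1 range quoted in the Results, converts it to a generalized correlation coefficient r = sqrt(1 − exp(−2 I_ij / 3)). All function names are illustrative.

```python
import math
import random


def covariance(rows):
    """Sample covariance matrix of a list of equal-length coordinate rows."""
    n, d = len(rows), len(rows[0])
    mean = [sum(r[k] for r in rows) / n for k in range(d)]
    cov = [[0.0] * d for _ in range(d)]
    for r in rows:
        dev = [r[k] - mean[k] for k in range(d)]
        for a in range(d):
            for b in range(d):
                cov[a][b] += dev[a] * dev[b] / (n - 1)
    return cov


def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    m = [row[:] for row in matrix]
    size, det = len(m), 1.0
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-300:
            return 0.0
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det
        det *= m[col][col]
        for r in range(col + 1, size):
            factor = m[r][col] / m[col][col]
            for c in range(col, size):
                m[r][c] -= factor * m[col][c]
    return det


def linear_mutual_information(xi, xj):
    """Return (I_ij, r_ij) from per-frame [x, y, z] lists of two atoms."""
    c_i = covariance(xi)                                 # 3x3
    c_j = covariance(xj)                                 # 3x3
    c_ij = covariance([a + b for a, b in zip(xi, xj)])   # 6x6 pair covariance
    info = 0.5 * (math.log(determinant(c_i)) + math.log(determinant(c_j))
                  - math.log(determinant(c_ij)))
    info = max(info, 0.0)  # guard against tiny negative round-off
    r = math.sqrt(1.0 - math.exp(-2.0 * info / 3.0))  # assumed normalization
    return info, r


# Toy data: atom_j follows atom_i closely, atom_k moves independently.
rng = random.Random(0)
atom_i = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(2000)]
atom_j = [[x + rng.gauss(0, 0.1) for x in frame] for frame in atom_i]
atom_k = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(2000)]
r_coupled = linear_mutual_information(atom_i, atom_j)[1]
r_independent = linear_mutual_information(atom_i, atom_k)[1]
print(r_coupled, r_independent)
```

In this toy example the coupled pair gives a coefficient close to 1 and the independent pair a coefficient close to 0. In practice the coordinates would be the superposed Cα positions from each trajectory frame.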
Consensus correlation map: A consensus correlation map is made from the average of several correlation matrices to ensure statistical consistency. We divided the available trajectories into several overlapping 50 ns windows (e.g., 0-50 ns, 10-60 ns, ..., 250-300 ns). Since there were multiple independent trajectories for some of the systems (see Table 1), the ensemble was defined as the collection of all the 50 ns windows from all the trajectories corresponding to each system. The resulting average matrix from the ensemble, termed the consensus LMI matrix, was pruned using a cutoff of 0.5 and used to generate a weighted network graph representing the dynamic correlation between different parts of the protein. The network nodes represent the Cα atoms, connected through edges weighted by the negative of the logarithm of the LMI values. The network was then clustered into highly intra-correlated regions (communities) that are loosely coupled to the rest of the system based on the Girvan-Newman algorithm (Girvan & Newman, 2002). These calculations were performed using the Bio3D software (Grant et al., 2006). The results are presented in Figures 3 and 5 and in Figure S3 in the Supporting Material.
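The windowing, averaging and pruning steps described above can be sketched as follows. This is an illustrative standard-library outline rather than the Bio3D workflow used here: the 10 ns stride is inferred from the example windows in the text, the function names are hypothetical, and the Girvan-Newman community detection step itself is left out.

```python
import math


def sliding_windows(total_ns, width_ns=50, stride_ns=10):
    """(start, end) times in ns of overlapping windows over a trajectory."""
    return [(start, start + width_ns)
            for start in range(0, total_ns - width_ns + 1, stride_ns)]


def consensus_matrix(matrices):
    """Element-wise mean of equally sized square correlation matrices."""
    n = len(matrices[0])
    return [[sum(m[i][j] for m in matrices) / len(matrices)
             for j in range(n)] for i in range(n)]


def weighted_edges(consensus, cutoff=0.5):
    """Edges (i, j, weight) for pairs at or above the cutoff.

    The weight is -ln(LMI), so strongly correlated pairs are 'close'.
    """
    n = len(consensus)
    return [(i, j, -math.log(consensus[i][j]))
            for i in range(n) for j in range(i + 1, n)
            if consensus[i][j] >= cutoff]


windows = sliding_windows(300)  # 0-50, 10-60, ..., 250-300 ns

# Two toy 3-node LMI matrices standing in for per-window results.
window_a = [[1.0, 0.8, 0.2], [0.8, 1.0, 0.6], [0.2, 0.6, 1.0]]
window_b = [[1.0, 0.6, 0.4], [0.6, 1.0, 0.3], [0.4, 0.3, 1.0]]
consensus = consensus_matrix([window_a, window_b])
edges = weighted_edges(consensus)
print(len(windows), edges)
```

For a 300 ns trajectory this yields 26 windows. In the toy matrices only the pair (0, 1), with a mean LMI of about 0.7, survives the 0.5 cutoff; the pair (1, 2) is dropped even though one window scored it at 0.6, which is the kind of window-specific correlation the consensus step is meant to filter out.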

Results
The results presented here were obtained from the molecular dynamics calculations of monomeric, dimeric, and tetrameric forms of Sorcin as described in the Methods section. Table 1 lists the simulations performed.

Ca2+ binding induces large, but reversible rigid body reorientation while preserving local secondary structure
A comparison of the global backbone RMSD (root mean squared deviation) for both trajectories of the Mg2+-bound Sorcin tetramer (Figure 1a), derived from the PDB structure 5MRA, is shown in Figure 1c. The RMSD is seen to fluctuate around 3 Å, which is similar to the results for the Mg2+-bound tetramer liganded with Doxorubicin, as seen in Figure 1c. This suggests that the overall Sorcin tetramer structure is relatively stable and not markedly affected by the binding of Doxorubicin in pocket 1 near the EF5 hand. This is in line with the conclusions of Genovese et al. (Genovese et al., 2017).
The conformational changes induced upon binding of Ca2+ to the dimer structure (Figure 1b) are more radical. The first observation we make is that the global backbone RMSD for the apo-Sorcin dimer (derived from 4UPG.pdb) fluctuates between 2 and 3 Å. This is much lower than the corresponding RMSD for the Ca2+-Sorcin dimer (derived from 4USL.pdb), which fluctuates between 4 and 6 Å, as seen in Figure 1d. Considering only the RMSD derived from the Ca2+ binding domain (SCBD), i.e. residues 30-198 for both structures, we see that the results for apo-Sorcin do not change, as seen in Figure 1d. This outcome is expected since the apo-Sorcin dimer XRD structure did not yield adequate electron density to model the N-terminal domain, and the deposited 4UPG.pdb structure count begins at residue index 30. However, the Ca2+-Sorcin dimer structure count begins at residue 26.
Calculating the RMSD of this holoenzyme dimer from residue 30 onwards, we observe values fluctuating approximately between 3 and 4 Å. Thus, the elevated dynamics of the Ca2+-bound SCBD dimer can only be partially attributed to the flexible Gly-rich N-terminal domain. The Ca2+-bound SCBD dimer displays higher intrinsic dynamics compared to the apoenzyme dimer.
Having discussed the overall dynamics of the tetrameric and dimeric forms of Sorcin, we now move on to the dynamics of the apoSor and CaSor monomers, which are derived from 4UPG.pdb and 4USL.pdb respectively. The backbone RMSD plots for both structures are shown in Supplementary Figure S1a, where we see that the apoSor monomer RMSD fluctuates between 4 and 5 Å, while the value for the CaSor monomer fluctuates between 5 and 6 Å. The residue-specific RMSF plots for both structures are shown in Supplementary Figure S1b, where we see that while the RMSF values for the CaSor monomer are slightly higher overall than those for the apoSor monomer, the same plot pattern is present in both structures. There is one notable exception to this pattern, which appears towards the C-terminal end of the protein, where the region approximately between residue indices 175 and 185 shows significantly higher RMSF values for the CaSor monomer. This is the region corresponding to the EF5 hand loop motif (Supplementary Table S1 lists the secondary structure elements). Thus, the CaSor monomer is, overall, a more dynamic entity than the apoSor monomer.
Analysis of the radius of gyration of both monomers (Supplementary Figure S1c) reveals that the apoSor monomer has an approximately uniform R_gyr of around 16.5 Å throughout the ~500 ns trajectory, while the CaSor monomer starts with a much larger gyration radius of around 19 Å, but drops to a more compact ~17 Å after about 300 ns. This suggests some degree of rigid body reorientation occurring within the CaSor monomer. This bimodality is evident in a histogram plot of the R_gyr values for monomeric apoSor and CaSor, as shown in Supplementary Figure S1d in the Supporting Material. Figure 6 shows two snapshots from the trajectories of the apoSor and CaSor monomers at 0 ns and 390 ns. The change in the overall compaction of the structure for the CaSor monomer is visible in panels (c) and (d). Monomeric apoSor appears to have a unimodal R_gyr distribution, thus suggesting a single, compact, and stable structure, whereas monomeric CaSor is seen to have a more extended and bimodal distribution with minimal overlap with the apoSor plot. A histogram plot of the R_gyr values for the apoSorDim and CaSorDim structures is also shown in Supplementary Figure S1f, where we see that while the CaSorDim is, overall, a less compact molecule, it does have some overlap with the apoenzyme. The R_gyr plots for the 5MRA-derived Sorcin tetramer show a unimodal distribution (Supplementary Figure S1e). We have investigated possible rigid body reorientations within the Sorcin monomers by looking at the behavior of various helices associated with the EF hands. Genovese and co-authors pointed out that Ca2+ binding results in a particularly significant conformational change at the EF3 loop, resulting in the long helix D (F92-F112) pivoting around EF3 and away from helix E (P122-M132). Thus, Genovese et al. postulate that upon Ca2+ binding, the first half of the SCBD (comprising helices A, B, C, D) rotates around EF3 as a hinge point away from the second half of the SCBD (comprising helices E, F, G, H). This results in the higher R_gyr of the CaSor monomer as compared to the apoSor monomer. However, the gradual reduction of the CaSor monomer's R_gyr value suggests that the 'expanded' conformation adopted by the Sorcin monomer upon Ca2+ binding is not set in stone. Thus, we have studied the variation of the interhelical angles between helices A & B (i.e. the EF1 angle), C & D (EF2 angle), D & E (EF3 angle) and finally between D & G. Supplementary Figure S2 shows these angles in the context of the individual helices that they belong to, as well as the location of each angle in the context of the overall structure. These interhelical angles, along with their variations, are plotted in Figure 2. The mean values of these angles, as derived from the MD trajectories of the dimeric systems, are shown in Table 2.
We observe that the EF1 and EF2 angles (Figure 2b and c) do not show any major trend over the 500 ns timescale of the simulation of the monomeric systems. The EF3 angle (Figure 2d) between helices D and E, however, starts at around 90° for the CaSor monomer, but eventually decreases to around 45° after 300 ns, which is comparable to the apoSor monomer. A similar trend is observed with the angle theta (Figure 2e) between helices D and G, which starts at around 100° and then drops to about 60° after about 300 ns. Thus, we note that while the CaSor monomer starts out with the outer N-terminal lobe in the extended conformation, this lobe eventually rotates back inward to adopt a conformation closer to the apoSor monomer. The population distribution of the D-G angle theta shown in Figure 2f shows that while the apoSor monomer is represented by a unimodal distribution with a peak angle at ~60°, the CaSor monomer displays a wide bimodal distribution, with one peak at <70° corresponding to a compact conformation and a second peak at ~95° corresponding to an extended or relaxed conformation.
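An interhelical angle of the kind tracked above can be estimated from Cα coordinates alone. The following is a simplified sketch, not the method used to produce Figure 2: it assumes each helix axis can be approximated by the vector joining the centroids of its first and last four Cα atoms (a least-squares axis fit would be more robust), and all names, including the toy helix generator, are illustrative.

```python
import math


def centroid(points):
    """Geometric centre of a list of (x, y, z) points."""
    n = len(points)
    return tuple(sum(p[k] for p in points) / n for k in range(3))


def helix_axis(ca_coords, span=4):
    """Approximate helix axis from the first and last `span` C-alpha atoms."""
    start = centroid(ca_coords[:span])
    end = centroid(ca_coords[-span:])
    return tuple(e - s for s, e in zip(start, end))


def interhelical_angle(helix_1, helix_2):
    """Angle in degrees between the approximate axes of two helices."""
    a, b = helix_axis(helix_1), helix_axis(helix_2)
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    cos_theta = max(-1.0, min(1.0, dot / norm))  # clamp against round-off
    return math.degrees(math.acos(cos_theta))


def ideal_helix(n_res, rise=1.5, radius=2.3, turn_deg=100.0):
    """Toy C-alpha trace of an ideal alpha-helix running along +z."""
    return [(radius * math.cos(math.radians(turn_deg * k)),
             radius * math.sin(math.radians(turn_deg * k)),
             rise * k) for k in range(n_res)]


helix_1 = ideal_helix(12)
helix_2 = [(z, x, y) for x, y, z in helix_1]  # same helix re-oriented along +x
angle = interhelical_angle(helix_1, helix_2)
print(round(angle, 1))
```

For the two perpendicular toy helices the estimate lands within a few degrees of 90°; the small residual comes from the end-centroid approximation of the axis. Applied frame by frame to helices D and G, the same calculation would give a time series comparable to the angle theta discussed above.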

The linchpin of Sorcin dimer stabilization is a reciprocal allosteric handshake around the EF5 motif
Dynamic cross-correlation analysis was performed on three dimeric forms of Sorcin (apoSor, derived from 4UPG.pdb, and CaSor and DoxCaSor, both derived from 4USL.pdb). Here, we employ LMI (Linear Mutual Information) as an indicator of correlations between residues. The LMI matrix is calculated as per the protocols explained in the Methods section using the Bio3D software (Grant et al., 2006). The first observation from the LMI matrices is that the Sorcin dimers are relatively weakly correlated across monomers (Figure 3a-c). For the apoSor dimer, the correlation score stays relatively low (~0.3) across monomers while rising to somewhat higher values (~0.6) within each monomer in some regions. There is one notable exception to this trend, seen in the inset boxes in Figure 3a, which correlate the C-terminal region of chain A to chain H and vice versa. We will postpone delving into the structural significance of this cross-correlation, noting for now that this phenomenon is replicated in the CaSor dimer as well as the DoxCaSor dimer, as seen in Figure 3b and c. We also note that the CaSor and DoxCaSor dimers both present overall elevated cross-correlations compared to the apo structure, with both intra- and inter-chain values routinely cresting 0.6. Thus, the Ca2+-bound holoenzyme structures exhibit a higher level of global cross-correlations compared to the apoenzyme. The detailed community partitioning of Sorcin dimers is shown in Table 3.
A similar analysis was performed for the monomeric apoSor and CaSor structures, which showed the same trend of the Ca2+-bound holoenzyme exhibiting markedly higher internal correlations as compared to the apoenzyme (Supplementary Figure S3a vs S3b). We observe in the CaSor monomer that, despite having overall stronger correlations compared to the apoSor monomer, the number of correlated communities is higher, at 14 instead of just 9 communities for the apoenzyme (Supplementary Figure S3c). We also note that for both monomers, the correlated communities are highly balkanized, to the point that individual communities are rarely much more than a helix followed by, or preceded by, part of an EF loop, as seen in the Supporting Information (Supplementary Figure S3d-g and outlined in Supplementary Table S2). The distribution of communities for the Sorcin dimers (Table 3 and Figure 3) is very different from that of the monomers. Here, we observe that as opposed to the numerous small communities comprising essentially individual helices in the monomer, there is significant consolidation of communities, the most noteworthy being the entire first half of the SCBD (comprising helices A, B, C, D, i.e. the N-terminal lobe) being consolidated into one community (C1 for chain A and C4 for chain H). This is true for both monomers of all three dimeric systems under study. The apoSor dimer's next major community group comprises the region starting with the EF3 loop and ending with the EF4 loop. This is essentially the middle lobe of the protein. This region corresponds to communities C2 (for chain A) and C5 (for chain H) for apoSorDim (see Figure 3d, g, j). In the case of CaSorDim and DoxCaSorDim, a part of helix D in the EF3 loop motif is also part of this community for chain A. The relevant communities are labeled C2 and C5 for chains A and H respectively (Table 3). The last major community grouping for the apoSor dimer is C3. This community is particularly interesting because it encompasses helices G and H of both monomers, i.e. it constitutes an allosteric handshake between the chains. In considering the formation of these communities, we look at matched pairs in the network model shown in Figure 3d, e, and f. We see that the matched pair of C1 and C4 have no direct communication. This is also the case for the matched pair of C2 and C5. C3 is directly connected to all the major communities: C1, C2, C4 and C5. Thus, we have found evidence for a reciprocal allosteric handshake mediated via helices G & H along with the EF5 loop. This handshake interaction appears to exist in all three forms of the Sorcin dimer that were studied. The DoxCaSorDim also showed smaller communities, labeled C6, C7 and C8, not seen in the other two complexes.

Effect of Doxorubicin binding
The effect of Doxorubicin binding upon the dynamics of both the Ca2+-bound dimer (Figure 4a) and the Mg2+-bound tetramer (Figure 4b) was studied via simulations. We observe that the binding of Dox to CaSorDim resulted in a small decrease in overall dynamics, as seen in the all-atom RMSD shown in Figure 4c. We do note that the trend is for the DoxCaSorDim structure RMSD to approach the ~4 Å value that is attained by the CaSorDim structure, but the simulation was not long enough to confirm this. In particular, the Doxorubicin molecule detaches from the dimer structure (at 150 ns for Trajectory 1 and 250 ns for Trajectory 2), at which point the simulation was terminated. The effect of Dox binding on overall dimer (derived from 4USL.pdb) compactness was evaluated via the probability distribution of the radius of gyration and is shown in Figure 4e, where we note that the DoxCaSorDim structure is marginally more 'open' than the CaSorDim structure. In contrast, the gyration radius distributions for the MgSorTet and DoxMgSorTet structures (both derived from 5MRA.pdb) are practically identical, as shown in Figure 4f. This is also in agreement with the overall RMSD plots of both trajectories of DoxMgSorTet being quite similar to those of MgSorTet, as seen in Figure 4d. A more detailed look at system dynamics by way of residue-specific RMSF plots for both chains of both dimer structures is shown in Supplementary Figure S4a and b, where it is apparent that Doxorubicin binding does not significantly affect local dynamics. Similar RMSF plots for the DoxMgTet structure are shown in Supplementary Figure S4d, and those for the MgTet model in Supplementary Figure S4c. Like the dimer structures, the tetramer structures do not show any major changes in dynamics upon Doxorubicin binding.
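The compactness comparison above rests on the radius of gyration and its probability distribution. A minimal standard-library sketch of both quantities is given below, assuming equal atomic masses unless masses are supplied; the function names and the toy R_gyr values are illustrative and are not taken from the trajectories.

```python
import math


def radius_of_gyration(coords, masses=None):
    """Mass-weighted radius of gyration of a set of (x, y, z) coordinates."""
    if masses is None:
        masses = [1.0] * len(coords)  # assumption: equal masses
    total = sum(masses)
    com = [sum(m * c[k] for m, c in zip(masses, coords)) / total
           for k in range(3)]
    sq = sum(m * sum((c[k] - com[k]) ** 2 for k in range(3))
             for m, c in zip(masses, coords))
    return math.sqrt(sq / total)


def histogram(values, lo, hi, n_bins):
    """Fraction of values falling in each equal-width bin over [lo, hi)."""
    counts = [0] * n_bins
    width = (hi - lo) / n_bins
    for v in values:
        if lo <= v < hi:
            counts[min(int((v - lo) / width), n_bins - 1)] += 1
    total = sum(counts) or 1
    return [c / total for c in counts]


# Sanity check: the 8 corners of a cube of side 2 lie sqrt(3) from the centre.
cube = [(x, y, z) for x in (-1, 1) for y in (-1, 1) for z in (-1, 1)]
rg_cube = radius_of_gyration(cube)

# Toy per-frame R_gyr values (in angstrom) binned into a distribution.
toy_rg = [16.4, 16.5, 16.6, 19.0]
distribution = histogram(toy_rg, lo=16.0, hi=20.0, n_bins=4)
print(rg_cube, distribution)
```

Evaluating `radius_of_gyration` on every saved frame and passing the resulting series to `histogram` gives a distribution of the kind compared in Figure 4e and f; a bimodal distribution, as reported for the CaSor monomer, would show up as two separated groups of populated bins.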
Recapitulating the previously discussed results of community analysis on the dimer structures, as shown in Figure 3, the binding of Doxorubicin did not have any significant effect on the size or number of the major communities seen in the CaSorDim structure. As we will discuss next, this is in marked contrast to the results that we observe for the binding of Doxorubicin to the Mg2+-liganded tetramer.
As a prelude to discussing the community analysis of Doxorubicin bound to the Sorcin tetramer, we must first discuss the communities of the drug-free Sorcin tetramer derived from 5MRA.pdb. As can be seen in Figure 5a, the correlation map of apoSorTet reveals the presence of cross-correlation values reaching ~0.6 across the chains of a homodimer (A-B and C-D), but reaching those values less frequently when considered across the chains of adjacent homodimers. Nevertheless, there are no obvious 'hot spots' which are seen to be very strongly correlated. The presence of Doxorubicin retains the overall pattern of cross-correlation while uncovering small regions of higher correlation across the chains of homodimers close to the C-terminus. This is, of course, close to the EF5 hand which, as we have already discussed, is directly involved in an allosteric handshake mediating dimerization.
The distribution of communities in the SorTet (i.e. tetrameric apo Sorcin) structure is not symmetric. Considering that the biological assembly of Sorcin is understood to be a homodimer (here shown as chains A-B or chains C-D), we will investigate the distribution of communities within this context. Table 4 lists the members of the communities obtained from the correlation analysis of the tetrameric systems. Each chain of the SorTet structure is approximately divided into two communities, one of which is a larger community (from residues 32 to ~170) while the other is the smaller community that dynamically couples two neighboring chains. Chain A is an exception, where a smaller strip (residues 145-147) constitutes a third community. Considering the homodimer corresponding to chains A & B, the first major community is C1, which contains residues in chain A from 32 to 176, with some minor cutouts. The mirror community in the homodimeric partner chain B is C2, which runs from residue 32 to 167 (with minor cutouts and a short segment in chain A from residue 145 to 148). Thus, we note that community C2 is smaller than C1. C1 contains almost all secondary structure elements of chain A except for the EF5 loop and helix H, which belong to community C3. Community C2, on the other hand, relinquishes about half of helix G to C3. Community C3, of course, represents the allosteric handshake involving the unpaired loop EF5, which is the core of the dimerization interaction between Sorcin monomers. Considering the other homodimer made from chains C & D, the first large community is C4, which extends mostly from residue 32 to 168 of chain C, thus appearing similar to community C1 and including everything from the N-terminus to half of helix G. Community C6, which mirrors C2 in this homodimer, on the other hand runs from residue 32 to 154 of chain D (with a minor cutout at residue 152). Thus, community C6 stops at the EF4 loop of chain D and does not include helix G at all. Community C5, of course, is made up of the remnants of chains C and D and represents the allosteric handshake stabilizing this dimer. Thus, we note that both dimers display a relatively simple communication pattern, with a pair of large communities comprising both pairs of EF hands and their attendant helices, and a smaller community representing the remaining EF hand of each monomer and the allosteric handshake across monomers. However, this pattern is not symmetric, with the major difference occurring after the EF4 loop.
While considering the community networks in the drug-loaded DoxMgTet structure, we first note that the drug Doxorubicin is found bound to only one location in the tetramer, near the EF5 hand loop and helix G of chain B. Given the presence of only one drug molecule in the tetramer, we might expect that the changes to the tetramer communities are minimal. This is not so. The first major community is C1, which runs from the N-terminus of chain A to residue 106, which is about halfway through helix D. The mirror community is C2, which has the same extent in chain B, although it does include a short segment in chain A's helix F. Community C3 is the allosteric handshake involving the EF5 loop for the homodimer with chains A and B. This is very similar to the equivalent community in the drug-free SorTet structure. This sequence of communities is repeated for the other homodimer involving chains C and D, wherein communities C4 and C6 run from the N-termini of their respective chains to halfway through helix D and community C5 encompasses the allosteric handshake. A new sequence of communities emerges here, which runs from about halfway through helix D to where the allosteric handshake communities pick up. This corresponds to the end of helix G for chain A, the second turn of helix G for chains B and C, and the very beginning of helix G for chain D. Thus, the binding of Doxorubicin has resulted in the fragmentation of tetramer communities such that, where the apo SorTet structure had the N-terminal and middle lobes of individual monomers conjoined into single communities, the two lobes have now approximately separated into distinct communities. The allosteric handshake communities appear to be unchanged.

Discussion
Calcium-based regulation of protein activity is ubiquitous in mammals and thus extensively studied in biochemistry. The prototypical example is, of course, Calmodulin, where Ca2+ binding to its 4 EF hand motifs triggers major allosteric changes that open hydrophobic recognition patches. From an evolutionary standpoint, Ca2+ triggered allosteric changes are particularly useful given that the Ca2+ concentration difference between the cytosol and the extracellular fluid can be 4 orders of magnitude. Access to such a large dynamic range of Ca2+ concentrations provides a useful avenue for triggering downstream signal transduction processes. Calpain was identified as a Ca2+ dependent cysteine protease in the 1980s (Ohno et al., 1984; Sakihama et al., 1985). This protein, so named because of its similarities to both Calmodulin and Papain, became the prototypical example of a novel structural motif called the penta-EF hand or PEF (Blanchard et al., 1997; Kitaura et al., 1999; Lin et al., 1997). Other proteins containing this novel motif were soon discovered. These so-called PEF proteins feature three common structure-function characteristics: (i) dimerization through the 5th EF hand, (ii) a G/P rich N-terminal region and (iii) Ca2+ dependent translocation to membranes. Maki and co-authors classified PEF proteins into two groups in 2002 (Maki et al., 2002). Group I PEF proteins such as ALG-2 are the most widely distributed in Eukarya. Group II PEF proteins such as Calpain, Sorcin and Grancalcin are seen in higher mammals (Maki, 2020).
The role played by Sorcin in signal transduction depends on its ability to bind partner proteins in a Ca2+ dependent fashion. This has been elucidated by Ilari and co-authors, who discovered a Ca2+ triggered opening of the EF1 motif and the exposure of a consensus hydrophobic binding pocket. We have extended that X-ray crystallographic study by probing the dynamics associated with this allosteric change. The first result we have uncovered indicates that the Ca2+ triggered conformational change is indeed reversible. This is seen in both the reduction of the radius of gyration of CaSor and the reduction of the angle Theta between the D and G helices of CaSor. Both events occur at around 300 ns in the simulated trajectory and together signify the transition from a relaxed to a tight conformation. This, by itself, is not surprising. After all, Ca2+ induced conformational changes are supposed to be reversible, as evinced by the dynamics displayed by the best known of all calcium-binding proteins, Calmodulin itself. What is not anticipated here is the ability of the Ca2+ bound protein to swing from a relaxed to a tight conformation while still bound to Ca2+. This suggests that Sorcin may be capable of sampling multiple conformations while bound to Ca2+, a behaviour that has not previously been captured by X-ray crystallography. It may be hypothesized that Sorcin is capable of sampling multiple structures and that Ca2+ binding preferentially stabilizes the relaxed conformation.
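The two order parameters used above, the radius of gyration and the inter-helix angle Theta, can be computed per trajectory frame from atomic coordinates. The following is a minimal Python sketch under stated assumptions: the function names are hypothetical, and the helix axis is approximated as the vector joining the centroids of the first and last four C-alpha atoms, which is one common simplification and not necessarily the definition used in this study.

```python
import math


def radius_of_gyration(coords, masses=None):
    """Mass-weighted radius of gyration of a list of (x, y, z) points."""
    if masses is None:
        masses = [1.0] * len(coords)
    total = sum(masses)
    com = [sum(m * c[k] for m, c in zip(masses, coords)) / total
           for k in range(3)]
    weighted_sq = sum(
        m * sum((c[k] - com[k]) ** 2 for k in range(3))
        for m, c in zip(masses, coords)
    )
    return math.sqrt(weighted_sq / total)


def _centroid(points):
    n = len(points)
    return [sum(p[k] for p in points) / n for k in range(3)]


def helix_axis(ca_coords, window=4):
    """ASSUMPTION: axis = centroid(last `window` CA) - centroid(first `window` CA)."""
    if len(ca_coords) < 2 * window:
        raise ValueError("helix too short for the chosen window")
    start = _centroid(ca_coords[:window])
    end = _centroid(ca_coords[-window:])
    return [e - s for s, e in zip(start, end)]


def interhelix_angle(ca_a, ca_b, window=4):
    """Angle in degrees between the approximate axes of two helices."""
    u = helix_axis(ca_a, window)
    v = helix_axis(ca_b, window)
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    # Clamp to guard against rounding slightly outside [-1, 1].
    cos_theta = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cos_theta))
```

Evaluating both quantities frame by frame and plotting them against simulation time would show a relaxed-to-tight transition as a simultaneous drop in the two traces.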
Information transfer within the Sorcin structure has been investigated by Mella and co-authors using mutagenesis and biochemical assays such as CD and fluorescence spectroscopy (Mella et al., 2003). They postulated that the key to Sorcin activation is the binding of Ca2+ to the EF3 hand, followed by allosteric transmission to the EF2 hand via the D-helix and beyond to EF1. This information flow is difficult to visualize in the context of the highly balkanized monomer community structure. However, the dimeric community structure shows the entire N-terminal lobe, from residue 30 to residue 110 (approximately), grouped into one community which comprises helices A through D. This community is directly connected to the middle lobe, which runs from the EF3 hand to the EF4 hand (approx. residues 110 to 155). Thus, allosteric information flows directly from the first, outer N-terminal lobe community to the middle community, which is bracketed by EF3 and EF4. This is in line with the conclusions of Mella as cited above. More interesting is that we have uncovered evidence of allosteric communication across the monomer chains in the dimeric structure. This communication occurs via a reciprocal allosteric handshake from helix G of one chain to the EF5 hand and the H helix of the partner monomer, and vice versa. This novel result has not been anticipated by X-ray crystallography and sheds new light on the flow of information in the biologically relevant dimer form of the protein.
This study has provided answers to three broad questions: (i) what are the effects of Ca2+ binding upon the dynamics of Sorcin, (ii) what are the effects of Doxorubicin binding on the dynamics of Sorcin, and (iii) is there any evidence of allostery and cooperative behaviour in the context of drug binding? However, these questions address matters at an operational level, while the higher, systems-level question has not been addressed. This question, obviously, is whether Sorcin acts merely as a passive scavenger for Doxorubicin (and other chemotherapeutics) or whether it possesses a trigger mechanism that turns it into an active scavenger for these drugs. It is hence apposite to discuss signaling mechanisms briefly. Unlike Mg2+, which is maintained at a millimolar concentration both inside and outside mammalian cells, Ca2+ is found at a concentration of 1-2 mM in the extracellular fluid and at ~100 nM inside the cytosol (Carafoli & Krebs, 2016). A single chain of Sorcin binds two Ca2+ ions with an affinity of 1 µM (Zamparelli et al., 1997). This is a rather weak interaction, especially given the vastly lower cytosolic concentration of Ca2+. Coupled with this is the fact that Ca2+ bound Sorcin translocates to membranes. Thus, Ca2+ bound Sorcin in the cytosol is in the minority relative to apo Sorcin. Binding studies by Ilari and co-workers show that Doxorubicin binds to apo Sorcin according to a two-site model, with affinities of 10 nM and 1 µM respectively. These numbers suggest differential, but not cooperative, binding (Acerenza & Mizraji, 1997; Saroff, 1991; Stefan & Le Novère, 2013). These affinities change to 22 nM and 2 µM for Ca2+ bound Sorcin (Genovese et al., 2017). Thus, even if Ca2+ could trigger an active scavenging mechanism in Sorcin, it would not be a particularly effective one. It is therefore tempting to write off the Dox binding of Sorcin as being both structurally and mechanistically unrelated to the binding of the Calcium ions. However, the two Dox-Sorcin binding affinities are ~1 and ~350 nM for Mg2+ bound Sorcin. This suggests that an active scavenging mechanism may simply be irrelevant in the context of Calcium binding, and that Sorcin merely acts as a promiscuous binding agent for Doxorubicin. Indeed, it also binds paclitaxel, vinblastine, and cisplatin, albeit not with high affinity. But this may not be the case for Mg2+ binding to Sorcin, where some aspect of cooperativity with respect to Dox binding may well be present (Cattoni et al., 2015; de Vries et al., 2021; Meijer et al., 2019; Whitty, 2008). Ultimately, the fact remains that Sorcin is overexpressed in MDR cancers and chemoresistant cell lines. Thus, probing the dynamics of Sorcin, and indeed the characteristics of its Dox binding site, is of great importance.
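The occupancy argument above can be made concrete with a short calculation. The following Python sketch uses the concentrations and affinities quoted in the text, but it rests on simplifying assumptions that are not claims of this study: each Ca2+ site is treated as an independent single site with a dissociation constant of 1 µM, free Ca2+ is taken to equal total Ca2+, and the extracellular value is set to 1 mM, the lower end of the quoted range. All names in the code are hypothetical.

```python
def fraction_bound(ligand_conc, kd):
    """Single-site occupancy: [L] / (Kd + [L]), both in the same units."""
    if ligand_conc < 0 or kd <= 0:
        raise ValueError("concentration must be >= 0 and Kd must be > 0")
    return ligand_conc / (kd + ligand_conc)


# Values quoted in the text, in mol/L.
KD_CA = 1e-6              # Ca2+ affinity of Sorcin
CYTOSOLIC_CA = 100e-9     # ~100 nM
EXTRACELLULAR_CA = 1e-3   # ASSUMPTION: lower end of the 1-2 mM range

occupancy_cytosol = fraction_bound(CYTOSOLIC_CA, KD_CA)
occupancy_extracellular = fraction_bound(EXTRACELLULAR_CA, KD_CA)

# Two-site Doxorubicin affinities quoted in the text, in nM.
DOX_KD_NM = {
    "apo": (10.0, 1000.0),
    "Ca2+": (22.0, 2000.0),
    "Mg2+": (1.0, 350.0),   # quoted as approximate values
}

# Fold change in Kd relative to apo Sorcin; values below 1 mean tighter binding.
fold_vs_apo = {
    state: tuple(kd / ref for kd, ref in zip(kds, DOX_KD_NM["apo"]))
    for state, kds in DOX_KD_NM.items()
}
```

Under these assumptions the resting cytosolic occupancy of a Ca2+ site is 1/11, or roughly 9%, consistent with Ca2+ bound Sorcin being the minority species. The fold changes show Ca2+ weakening both Doxorubicin sites about twofold, whereas Mg2+ tightens the high-affinity site about tenfold.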
The role of allosteric regulation in Calcium/Magnesium binding proteins has been studied for a few decades now (da Silva & Reinach, 1991; Ikura, 1996; Mills & Johnson, 1985; Sekharudu & Sundaralingam, 1988). Calmodulin has been the archetype for such regulation, motivating stellar work by the Bax group and others (Wu and Bax, 2002; Mills & Johnson, 1985) that illustrated the connection between the flexibility of Calmodulin and its diversity of binding partners (Yamniuk & Vogel, 2004). The role of conformational entropy and energy landscapes in this context was further explored by Joshua Wand's group and others (Frederick et al., 2007; Li et al., 2014). The field has advanced considerably since these early studies on Calmodulin. Other proteins with EF-hand motifs, such as mitochondrial D/E carriers, have been investigated (Thangaratnarajah et al., 2014); these also show similarities to Sorcin in that some of the EF-hands are involved in dimerization, while other EF-hands form a Calcium sensing allosteric scaffold.
A future extension to this study will attempt to extend the simulation time for Ca2+ bound Sorcin to investigate whether multiple conformations are indeed sampled and, if so, with what frequency. Another extension would be to investigate the conformations sampled by the Mg2+ bound Sorcin dimer (obtained from 4U8D.pdb, also deposited by Fiorillo and co-authors). A third suggested extension would be to employ experimental methods which explicitly look at system dynamics. Analytical ultracentrifugation would be a likely choice, except that in this case the conformational changes might be too subtle to investigate using Sedimentation Velocity AUC (Cole et al., 2008; Cole & Hansen, 1999). If we consider the two endpoints of relaxed and tight conformation of the Sorcin monomer to be derived from the Ca2+ bound 4USL.pdb and the apo-form 4UPG.pdb, then the differences in overall shape are indeed very subtle. Hullrad calculations show the axial ratio of the apo form to be 1.3, with a predicted Sedimentation Coefficient of 2.06 Svedberg, while the axial ratio of the Ca2+ bound form is 1.37, predicted to sediment at 2.12 Svedberg (Fleming and Fleming, 2018). This is too marginal a difference to detect experimentally. The predicted Sedimentation Coefficient of the dimeric apo form of the protein is 3.23 Svedberg, which is in excellent agreement with the results obtained from SV-AUC experiments conducted by Zamparelli and co-authors (Zamparelli et al., 1997). These authors noted that both apo and Ca2+ bound Sorcin sediment between 3.1 and 3.2 Svedberg (Zamparelli et al., 2000). This, however, still implies that any SV-AUC attempt to discriminate between the apo and Ca2+ liganded dimeric protein will be stymied by the marginal differences in sedimentation coefficient. Systems like Sorcin, where local secondary structure remains intact but the overall tertiary structure changes due to reorientation of intact helices, are well suited to biophysical methods which specifically track distances between residue pairs. Thus, an orthogonal approach might be to use FRET spectroscopy with suitable donor-acceptor fluorophore pairs chemically conjugated to suitable residues, such as C163 and Q97, which are located on helices G and D respectively. The FRET data obtained, especially via time-resolved single molecule experiments, can yield valuable insights into the relative time spent by the molecule in open vs closed conformations (Donoghue, 1991; Lankiewicz et al., 1997; Moraczewska et al., 1996). However, FRET measurements only provide a single distance parameter for measuring the dynamics of a protein. X-ray crystallography derived structures reveal the end points of a protein's dynamic trajectory or, to put it another way, the eigenstates of its conformational space. MD simulations provide insight into the conformational space sampled by the protein as it transits between these eigenstates, as our study has shown by demonstrating that Calcium bound Sorcin samples a closed conformation similar to apo Sorcin. This insight calls for experimental validation from a technique which can show the transition between the crystallographically defined eigenstates.
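The sensitivity of FRET to an inter-helix distance change follows from the standard Förster relation, E = 1 / (1 + (r/R0)^6), where r is the donor-acceptor distance and R0 is the Förster radius of the dye pair. The following is a minimal Python sketch of that relation; the Förster radius of 5.0 nm and the two inter-dye distances are hypothetical values chosen only for illustration, and are neither measurements nor simulation results from this study.

```python
def fret_efficiency(distance_nm, forster_radius_nm):
    """Foerster relation: E = 1 / (1 + (r / R0)**6)."""
    if distance_nm <= 0 or forster_radius_nm <= 0:
        raise ValueError("distance and Foerster radius must be positive")
    return 1.0 / (1.0 + (distance_nm / forster_radius_nm) ** 6)


def distance_from_efficiency(efficiency, forster_radius_nm):
    """Invert the Foerster relation: r = R0 * (1/E - 1)**(1/6)."""
    if not 0.0 < efficiency < 1.0:
        raise ValueError("efficiency must lie strictly between 0 and 1")
    return forster_radius_nm * (1.0 / efficiency - 1.0) ** (1.0 / 6.0)


# ASSUMPTIONS: hypothetical dye-pair radius and inter-dye distances.
R0_NM = 5.0
TIGHT_NM = 4.0      # hypothetical distance in a tight conformation
RELAXED_NM = 5.5    # hypothetical distance in a relaxed conformation

e_tight = fret_efficiency(TIGHT_NM, R0_NM)
e_relaxed = fret_efficiency(RELAXED_NM, R0_NM)
```

Because the efficiency depends on the sixth power of r/R0, the method is most sensitive when the labelled residues sit at a separation close to R0, which would guide the choice of dye pair and labelling sites.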
Thus, the best approach to studying the dynamics of this system might be NMR spectroscopy. NMR spectroscopy offers a wide array of methods, with site-specific information available on timescales from picoseconds to seconds (Chao & Byrd, 2018; Ikeya et al., 2018; Kleckner & Foster, 2011; Palmer, 2015; Salmon et al., 2011; Vögeli, 2010). The 20 kDa monomer mass suggests that inexpensive 15N labelling strategies might shed light on the dynamics of individual amino acids. A more sophisticated triple labelling (15N/13C/2H) approach along with lifetime experiments (R1/R2/NOESY) would yield a trove of information and is well within the size limitations of NMR spectroscopy (Cavanagh et al., 2007; Chao & Byrd, 2018; Lipari & Szabo, 1982; Vögeli, 2010).
It is the combination of such experimental methods with in-silico simulations which promises to uncover new ground in the understanding of the interplay between protein dynamics and function (Fenwick et al., 2014; Guerry et al., 2013; Hirano et al., 2021; Narayanan et al., 2017), and that is what must be brought to bear in order to understand the structural and mechanistic role of Sorcin in MDR cancers. The objective of this study has been to shed light on the dynamics of a protein which is of potential importance in developing oncotherapeutics. While our study has relied exclusively on MD simulations, we hope that this will motivate experimental biophysics to explore this system further.

Figure 1. Cartoon representations of the crystal structures of (a) Sorcin tetramer (with Mg2+ and Doxorubicin) from PDB 5MRA. The four monomers are represented by ribbons of different colors. Mg2+ ions are indicated by magenta spheres. (b) Sorcin dimer (with Ca2+) based on the X-ray structure 4USL. The time variation of the root mean squared deviation (RMSD) of the backbone of the complex of (c) the apo tetramer (derived from PDB 5MRA) and (d) the tetramer with Doxorubicin (based on 5MRA). (e) The RMSD variation with time for the apo dimer (based on 4UPG) and the Ca-Sor (based on 4USL) calculated using the entire backbone available from the crystal structures. (f) The backbone RMSD of the 4UPG and 4USL dimers based on the residues 30-198 from the same trajectories used in (e).

Figure 2. (a) Snapshot of the Sorcin monomer with the bound Ca2+ ions indicated by yellow spheres. Helices A, B, C, D and E are indicated by light blue, red, orange, purple and dark blue cylinders, respectively. (b)-(e) Variation of the angles between the helices constituting the EF-arms calculated from the MD trajectories of the Sorcin monomers. Orange and violet lines indicate the 4UPG and 4USL (with bound Ca2+) derived systems, respectively.

Figure 3. Community analysis for dimers. Dynamic cross-correlation (based on LMI) for the dimeric systems based on (a) 4UPG, (b) 4USL, (c) 4USL with bound Doxorubicin (Trajectory 1). Regions of high correlation are indicated by black boxes. (d)-(f) The network structure obtained from the correlation analysis. The communities are represented as spheres with the labels indicated alongside. (g)-(i) Bar diagram showing the correlated communities in the dimer. The two monomers, labelled A and H, are represented as separate bars. The color of the bars represents the communities. (j)-(l) The dimeric complex is shown in ribbon representation colored according to communities. The same color scheme is used as in the panels vertically above. E.g., panels (d), (g) and (j) represent the same system (i.e. the apo-Sor dimer based on 4UPG) and have the same color scheme for the communities. Similarly, (e), (h) and (k) represent the Ca-Sor dimer based on 4USL and the panels (f), (i) and (l) represent the Doxorubicin bound complex.

Figure 4. Cartoon representations of the crystal structures of (a) Sorcin dimer (with Ca2+) based on the X-ray structure 4USL. The two chains are indicated by purple and grey ribbons. The Doxorubicin molecule was inserted in the structure after alignment with the tetrameric structure based on 5MRA. (b) Sorcin tetramer (with Mg2+ and Doxorubicin) from PDB 5MRA. (c) The time variation of the backbone RMSD of the CaSorDim with bound Doxorubicin (in red). For comparison, the corresponding plots for the apoSorDim and CaSorDim are also included. (d) The backbone RMSD vs. time plot for the apoSorTet and the DoxMgTet structures. (e) The distributions of the radius of gyration of the dimeric complexes. (f) The distributions of the radius of gyration of the tetrameric complexes. The color scheme in (f) is the same as in (d).

Figure 5. Distribution of communities in tetrameric Sorcin. (a) and (d) The dynamic cross-correlation matrix (Linear Mutual Information) between residues of the two systems: (a) 5MRA tetramer and (d) 5MRA tetramer with Doxorubicin. The boxes B1, B2, B3 and B4 indicate the correlations between the chains A-B and C-D. (b) and (e) Snapshots of the two systems in ribbon representation, colored according to the communities. The color code is chosen so that similar communities in both systems can be given the same label. (c) and (g) The community structure indicated by bar diagrams for the 5MRA and the 5MRA + Doxorubicin systems. The four bars in each panel represent the four monomers, each colored according to the communities. The color code is the same as in the panels (b) and (e) and is indicated by colored boxes above panel (c). The network structures of the 5MRA system and the 5MRA + Doxorubicin system are shown in panels (f) and (h).

Figure 6. Snapshots of the apoSor (or 4UPG derived) and CaSor (or 4USL derived) monomer, at 0 and 390 ns of the MD trajectories. The helices E and D are shown in magenta.

Table 1. List of all the systems studied along with the number and duration of the MD trajectories in each case.

Table 2. Mean angles for the EF hands and for angle Theta between helices D and G for the different dimer trajectories.