Canonical structural-binding modes in the calmodulin–target protein complexes

Abstract Intracellular calcium sensor protein calmodulin (CaM) belongs to the large EF-hand protein superfamily. CaM shows a unique and not fully understood ability to bind to multiple targets, which allows them to participate in a variety of regulatory processes. The protein has two approximately symmetrical globular domains (the N- and C-lobes). Analysis of the CaM-binding sites of target proteins showed that they have two hydrophobic ‘anchor’ amino acids separated by 10 to 17 residues. Consequently, several CaM-binding motifs, {1–10}, {1–11}, {1–13}, {1–14}, {1–16}, and {1–17}, differing in the distance between the two anchor residues along the amino acid sequence, have been identified. Despite extensive structural information on the role of target–protein amino acid residues in the formation of complexes with CaM, much less is known about the role of the CaM amino acids contributing to these interactions. In this work, a quantitative analysis of the contact surfaces of CaM and target proteins has been carried out for 35 representative three-dimensional structures. It has been shown that, in addition to the two hydrophobic terminal residues of the target fragment, the interaction also involves residues located four positions away from each terminal anchor, toward the middle of the fragment (binding mode {1–5}). It has also been found that the N- and C-lobes of CaM bind the {1–5} motif located at the ends of the target in a structurally identical manner. Methionine residues at positions 51 (corresponding to 124 in the C-lobe), 71 (144), and 72 (145) of the CaM amino acid sequence are key hydrophobic residues for this interaction. They are located at the N- and C-boundaries of the even EF-hand motifs.
The hydrophobic core of CaM (‘Ф-quatrefoil’) consists of 10 amino acids in the N-lobe (and in the C-lobe): Phe16 (Phe89), Phe19 (Phe92), Ile27 (Ile100), Thr29 (Ala102), Leu32 (Leu105), Ile52 (Ile125), Val55 (Ala128), Ile63 (Val136), Phe65 (Tyr138), and Phe68 (Phe141). It does not intersect with the target-binding methionine residues. CaM belongs to the ‘dynamic’ group of EF-hand proteins, in which calcium and protein ligand binding causes only global conformational changes but does not alter the conserved ‘black’ and ‘grey’ clusters described in our earlier work (PLoS One. 2014; 9(10):e109287). The membership of CaM in the ‘dynamic’ group is determined by the triggering and protective methionine layer: Met51 (Met124), Met71 (Met144) and Met72 (Met145).

HIGHLIGHTS
Interchain interactions in 35 unique CaM complex structures were analyzed.
Methionine amino acids of the N- and C-lobes of CaM form triggering and protective layers.
Interactions of the target terminal residues with these methionine layers are structurally identical.
The membership of CaM in the ‘dynamic’ group is determined by the triggering and protective methionine layer.

Communicated by Ramaswamy H. Sarma


Introduction
Calcium is a universal second messenger directly or indirectly involved in the regulation of most biological processes through interaction with calcium-binding proteins that change their affinity for targets in response to the binding of calcium ions (Williams, 1999). The EF-hand protein superfamily is the largest family of calcium-binding proteins (reviewed in Permyakov & Kretsinger, 2011). Calmodulin (CaM) is a highly conserved, soluble, intracellular, Ca2+-binding protein that is present in all eukaryotic cells: it is found in animals, plants, fungi, and protozoa. CaM activates many enzymes and regulates many cellular functions (reviewed in Permyakov & Kretsinger, 2011). The physiological functions of CaM require its interactions with many target proteins. Most of these interactions occur only in the presence of Ca2+ ions, but some take place only in the absence of Ca2+ ions. A unique and not fully understood property of CaM is its ability to bind to multiple target proteins, which allows these proteins to participate in a variety of regulatory processes (Hoeflich & Ikura, 2002). Because of the huge number of interacting partners related to crucial biological processes, CaM has remained highly conserved during evolution.
CaM consists of eight α-helices, I-VIII, and possesses four calcium-binding sites, two in each lobe (Figure 1A) (Babu et al., 1985; Kretsinger et al., 1986; Wilson & Brunger, 2000). The calcium-binding domains 1 and 4 have the typical helix-loop-helix conformation (EF-hand). Domains 2 and 3 also have the helix-loop-helix arrangement; nevertheless, they differ from the typical calcium-binding domains by having the long helix IV-V, which is two to three times longer than all the other helices in CaM. Note that the longer helices are observed only in structures determined by X-ray crystallography. As shown in Figure 1A, helix IV-V is usually separated into helix IV and helix V. Calcium-binding loops connecting α-helices in the Ca2+-binding domains are composed of residues 20-31, 56-67, 93-104 and 129-140. The molecule is stabilized by multiple interactions between helices and by hydrogen bonds between adjacent Ca2+-binding loops.
Analysis of the CaM amino acid sequence by a set of intrinsic disorder predictors suggests that this protein contains high levels of disorder. This is illustrated by Figure 1B, which shows the functional disorder profile generated for CaM by the D2P2 platform (http://d2p2.pro/) (Oates et al., 2013). In addition to possessing long regions of intrinsic disorder, CaM is predicted to have four molecular recognition features (MoRFs), which are disorder-based protein-protein interaction sites that are capable of folding upon interaction with binding partners. Curiously, these MoRFs (residues 10-21, 64-73, 100-106, and 139-146) coincide with or are included in the α-helices I (residues 5-20), IV (residues 65-74), VI (residues 101-112) and VIII (residues 138-145) in the crystal structure of the CaM-smooth muscle light chain kinase peptide complex (Figure 1A). Furthermore, CaM possesses a large number of different posttranslational modifications (PTMs) (see Figure 1B). It is likely that structural pliability combined with multiple PTMs plays a role in the binding promiscuity of this important protein. In agreement with this notion, Figure 1C represents a STRING-generated protein-protein interaction network centered at the Gallus gallus CaM. This network includes 309 proteins involved in 4124 interactions. Therefore, on average, each protein in this network interacts with approximately 27 partners (note that CaM, located at the center of this network, interacts with at least 308 partners).
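The average number of partners quoted for the STRING network follows from a simple counting argument: each interaction involves two proteins, so the mean number of partners is twice the number of interactions divided by the number of proteins. A minimal sketch of this arithmetic, using only the two counts given in the text (the function name is illustrative and not part of STRING):

```python
def mean_degree(n_proteins: int, n_interactions: int) -> float:
    """Average number of partners per protein in an undirected network."""
    # Each interaction (edge) adds one partner to each of its two proteins.
    return 2 * n_interactions / n_proteins


# Counts reported for the network centered at Gallus gallus CaM.
average_partners = mean_degree(309, 4124)
print(round(average_partners, 1))
```

The result is slightly below 27, which is why the average is best read as an approximate value.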
According to the structural data from X-ray crystallography and NMR, Ca2+ binding causes an opening of both globular lobes of CaM, which exposes hydrophobic pockets forming binding sites for target proteins (Jurado et al., 1999; Zhang & Yuan, 1998). Apo-CaM has overall a more compact structure than calcium-bound CaM. The binding of Ca2+ induces concerted helical pair movements, and the two helices in each Ca2+-binding domain adopt a nearly perpendicular orientation.

Figure 1. A. Three-dimensional (3D) structure of the complex between CaM and target peptide derived from smooth muscle myosin light chain kinase (PDB ID: 2O5G). The structure of CaM consists of two domains: N-lobe (shown in blue orange) and C-lobe (shown in red green). Target peptide is shown in grey. Methionine residues at positions 51 (124), 71 (144), and 72 (145) of the CaM amino acid sequence are key hydrophobic residues for the interaction with residues Trp5, Gly9, Ala14, and Trp18 of the target protein. The positions of the four calcium ions are shown as black circles. B. Functional disorder profile of CaM from Gallus gallus (UniProt ID: P62149) generated by the D2P2 platform (http://d2p2.pro/) (Oates et al., 2013), which is a database of predicted disorder for a large library of proteins from completely sequenced genomes. D2P2 uses outputs of several per-residue intrinsic disorder predictors, such as IUPred (Dosztányi et al., 2005), PONDR® VLXT (Romero et al., 2001), PrDOS (Ishida & Kinoshita, 2007), PONDR® VSL2B (Obradovic et al., 2005; Peng et al., 2006), PV2 (Oates et al., 2013), and ESpritz (Walsh et al., 2012). The nine colored bars located at the top of the plot represent the location of disordered regions as predicted by these different disorder predictors. In the middle of the D2P2 plots, the blue-green-white bar shows the predicted disorder agreement between nine disorder predictors, with blue and green parts corresponding to disordered regions by consensus. Above the disorder consensus bar are two lines with colored and numbered bars that show the positions of the predicted (mostly structured) SCOP domains (Andreeva et al., 2004; Murzin et al., 1995) using the SUPERFAMILY predictor (de Lima Morais et al., 2011). The yellow zigzagged bar shows the location of the predicted disorder-based binding sites (MoRF regions) identified by the ANCHOR algorithm (Meszaros et al., 2009), whereas differently colored circles at the bottom of the plot show the location of various PTMs assigned using the outputs of the PhosphoSitePlus platform (Hornbeck et al., 2012), which is a comprehensive resource of experimentally determined post-translational modifications. C. Protein-protein interaction network centered at CaM from Gallus gallus (UniProt ID: P62149). The network was generated by STRING (Search Tool for the Retrieval of Interacting Genes [167-169]) (https://string-db.org/) using the medium confidence of 0.4 as the minimum required interaction score. The network includes a query protein and its predicted or known functional associations and uses seven types of evidence, which are indicated by the differently colored lines: a green line represents neighborhood evidence; a red line, the presence of fusion evidence; a purple line, experimental evidence; a blue line, co-occurrence evidence; a light blue line, database evidence; a yellow line, text mining evidence; and a black line, co-expression evidence [168]. The PPI enrichment p-value of the resulting network is <1.0e-16, indicating that proteins involved in this network have more interactions among themselves than would be expected for a random set of proteins of the same size and degree distribution drawn from the genome.
Analysis of NMR solution structures of CaM from different organisms in the apo-, Ca2+-, antagonist-, and protein partner-bound states revealed remarkable structural flexibility of this protein. Although detailed analysis of this phenomenon is outside the scope of this study, the observation is illustrated by Figure 2, which presents some representative examples of these structures. Figure 2 shows that the solution structures of the EF-hand CaM-like calcium-binding protein from Entamoeba histolytica (PDB ID: 1jfj) (Atreya et al., 2001), CaM from yeast (Saccharomyces cerevisiae) in the apo state (PDB ID: 1lkj) (Ishida et al., 2002), Ca2+-bound human CaM (PDB ID: 2k0e) (Gsponer et al., 2008), Ca2+-bound CaM from Xenopus laevis complexed with an antagonist, N-(6-aminohexyl)-5-chloro-1-naphthalenesulfonamide (W-7) (PDB ID: 1mux) (Osawa et al., 1998), human Ca2+-bound CaM complexed with the segment (residues 8-43) of the HIV-1 matrix protein (PDB ID: 2mgu) (Vlach et al., 2014), the Ca2+-bound form of CaM from Xenopus laevis complexed with the CaM-binding domain of a presynaptic active-zone protein Munc13-1 (PDB ID: 2kdu) (Rodríguez-Castañeda et al., 2010), a complex of Ca2+-bound CaM from Xenopus laevis with a binding peptide of the plasma membrane Ca2+ pump (PDB ID: 1cff) (Elshorst et al., 1999), human Ca2+-free CaM complexed with the IQ (isoleucine-glutamine) motif-containing peptide from the large intracellular C-terminal domain of the human cardiac voltage-gated sodium channel NaV1.5 (Chagot & Chazin, 2011), and Ca2+-bound human CaM in the complex with myosin light chain kinase (MLCK) (PDB ID: 2k0f) (Gsponer et al., 2008) represent highly dynamic conformational ensembles. The authors of the corresponding studies indicated that, because the two globular domains of CaM are connected by a flexible linker, the orientation of the two domains relative to one another is ill-defined and there are almost no interdomain interactions. However, NMR analysis of the solution structure of the Ca2+-bound human CaM in the complex with MLCK revealed that MLCK binding leads to a coupled equilibrium shift resulting in a locked, less flexible conformation, in which the N- and C-terminal domains of CaM wrap around the MLCK peptide (PDB ID: 2k0f) (Gsponer et al., 2008).
CaM forms very tight Ca2+-dependent complexes with some natural peptides (dissociation constant <10⁻⁷ M) (Permyakov, 2009; Permyakov & Kretsinger, 2011). As a rule, these peptides form basic amphiphilic α-helices with hydrophobic and positively charged residues on opposite faces. CaM-binding domains in target proteins have similar amphiphilic properties. Two features of CaM were identified that facilitate its interaction with α-helical peptides of protein targets (O'Neil & DeGrado, 1990). The first feature is that 46% of the accessible surface area of the hydrophobic patches on the C- and N-terminal lobes of CaM is occupied by methionine side chains (in positions 36, 71, 72, 76, 106, 124, 144, and 145). Since the side chain of methionine is unbranched, it has considerable conformational flexibility. Thus, a hydrophobic site with a high proportion of methionine residues may be able to alter its surface conformation without changing the conformation of the polypeptide backbone, allowing CaM to attune itself to the different topographical/topological details of the various bound peptides. The second feature that may contribute to the ability of CaM to accommodate a variety of peptides is the flexibility of the central segment connecting the two lobes. Additionally, an important element in the binding of specific peptides to CaM is an aromatic residue, frequently tryptophan, in the CaM-binding region of the target, which may help anchor the target to one of the hydrophobic patches of CaM (Gomes et al., 2000; Graether et al., 1997; Weljie & Vogel, 2000).
Analysis of various structures of CaM-based complexes in the Protein Data Bank (PDB) (Berman et al., 2000) has revealed a large number of very diverse binding modes and illustrates the range of conformational flexibility of CaM (Tidow & Nissen, 2013; Yap et al., 2000). It has been found that CaM-binding sites (CaMBS) of target proteins are characterized by an increased α-helical content and by the presence of two (or more) hydrophobic 'anchor' amino acids separated by 10 to 17 residues. Therefore, several CaMBS motifs, {1-10}, {1-11}, {1-13}, {1-14}, {1-16}, and {1-17}, have been identified on the basis of the amino acid distance between the two anchor residues (Tidow & Nissen, 2013; Yap et al., 2000). For the {1-10} and {1-14} motifs, the existence of additional hydrophobic anchor residues at positions 5 and 10 within the corresponding motifs has also been shown.
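The motif nomenclature can be made concrete with a small sequence scan. The sketch below is only an illustration of the {1-n} spacing convention: the set of residues treated as hydrophobic anchors and the 20-residue test sequence are assumptions made for the example (the sequence was chosen so that Trp and Leu fall at positions 5 and 18), not data taken from the cited analyses.

```python
# Assumed anchor alphabet (one-letter codes); not specified in the source.
HYDROPHOBIC = set("FILVWMY")

# Spacings named in the text: {1-10}, {1-11}, {1-13}, {1-14}, {1-16}, {1-17}.
SPACINGS = (10, 11, 13, 14, 16, 17)


def find_anchor_pairs(seq: str, spacings=SPACINGS):
    """Return (first_pos, last_pos, motif) for every hydrophobic anchor pair.

    Positions are 1-based. Motif '{1-14}' means that the second anchor is
    the 14th residue when the first anchor is counted as residue 1.
    """
    hits = []
    for i, aa in enumerate(seq):
        if aa not in HYDROPHOBIC:
            continue
        for n in spacings:
            j = i + n - 1  # 0-based index of residue n of the motif
            if j < len(seq) and seq[j] in HYDROPHOBIC:
                hits.append((i + 1, j + 1, f"{{1-{n}}}"))
    return hits


# Hypothetical peptide used only to exercise the function.
example_peptide = "ARRKWQKTGHAVRAIGRLSS"
print(find_anchor_pairs(example_peptide))
```

A sequence scan of this kind typically reports more than one candidate spacing for the same peptide, which is one reason why contact analysis of the 3D structures is needed to decide which residues actually act as anchors.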
Although the structural analysis of CaMBS was performed for more than 80 three-dimensional (3D) structures of complexes contained in the PDB, no structural information concerning the level of conservation of the CaM residues involved in the formation of the complexes was provided (Tidow & Nissen, 2013). Without these data, it is impossible to answer the question of whether and to what degree structural complementarity between CaM and target proteins takes place. We therefore decided to re-analyze the representative 3D complexes analyzed by Tidow and Nissen (2013). Moreover, we included in this analysis 3D structures of complexes that appeared in the PDB after 2013. Thus, the task was to determine the degree to which common structural motifs exist for the CaM-CaMBS complexes.

The choice of structures to be analyzed
The initial set of unique CaM complex structures (> 80) deposited in the Protein Data Bank (PDB) was taken from Tidow and Nissen (2013). All NMR structures and all X-ray structures with a resolution (R) > 2.00 Å were excluded from this set. As a result, 20 CaM complex structures remained. After that, 15 new 3D complexes that appeared in the PDB after 2013 were added, giving 35 structures in total.
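The selection rule can be expressed as a simple filter. The following is a minimal sketch, assuming that each candidate is described by its PDB ID, experimental method and resolution; the `Entry` class, the `keep` function and the placeholder entry "XXXX" are hypothetical names introduced for illustration, and only 2O5G (1.08 Å) and the NMR entry 2K0F are mentioned in the text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Entry:
    pdb_id: str
    method: str                  # "X-RAY" or "NMR"
    resolution: Optional[float]  # in angstroms; None for NMR entries


def keep(entry: Entry, cutoff: float = 2.00) -> bool:
    """Retain only X-ray structures with resolution at or below the cutoff."""
    return (
        entry.method == "X-RAY"
        and entry.resolution is not None
        and entry.resolution <= cutoff
    )


# Illustrative records; "XXXX" and its resolution are placeholders.
candidates = [
    Entry("2O5G", "X-RAY", 1.08),
    Entry("XXXX", "X-RAY", 2.40),
    Entry("2K0F", "NMR", None),
]
selected = [e.pdb_id for e in candidates if keep(e)]
print(selected)
```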

Modeling software for structural analysis
Structure visualization and structural analysis of interactions among amino acids in proteins (hydrogen bonds, hydrophobic interactions, other types of weak interactions) were carried out using the Discovery Studio Modeling Environment (Dassault Systèmes BIOVIA, Discovery Studio Modeling Environment, Release 2017, San Diego: Dassault Systèmes, 2016) and the ligand-protein contacts (LPC) software (Sobolev et al., 1999).
In our previous work (Denessiouk et al., 2014), we explored the conformation of all EF-hand domains (including the N- and C-terminal lobes of calmodulin) whose structures were known in the apo-form, in the Ca2+-bound form, and in the complex with target protein ligands. In the case of calmodulin, 10 different complexes were used. This analysis allowed us to conclude that calmodulin belongs to the dynamic group 4 characterized by only global conformational changes, with no changes in clusters I and II. In fact, this approach, where multiple structures of protein complexes are used, eliminates the need to use the molecular dynamics method to study the structural characteristics of the protein-protein complexes.

Calmodulin-smooth muscle myosin light chain kinase peptide complex
First, we analyzed the 3D structure of the complex between CaM and a target peptide derived from smooth muscle myosin light chain kinase (PDB ID: 2O5G; Valentine et al., 2007a; Figure 1A). At the time of writing, the 3D structure of this CaM-peptide complex had the best resolution, R = 1.08 Å, among all 35 representative structures taken for analysis (Table 1, row numbered 10). CaM is a small, highly conserved protein that is 148 amino acids long (Figures 1A and 3). The bound smooth muscle myosin light chain kinase peptide has an α-helical conformation formed by residues 4-18 (Meador et al., 1992; Valentine et al., 2007a). The target peptide sequence shows a characteristic {1-14} spacing of hydrophobic anchor residues, here Trp5 and Leu18. Trp5 of the peptide occupies the hydrophobic pocket of the C-lobe, whereas Leu18 resides in the N-lobe pocket (Figure 1A). Note that the peculiarities of Ca2+ binding to CaM were not studied in detail here, since this analysis was conducted previously (Denessiouk et al., 2014). Based on the results of that analysis, it was concluded that CaM belongs to the 'dynamic' group of EF-hand proteins, in which calcium and protein ligand binding causes only global conformational changes but does not alter the conserved 'black' and 'grey' clusters of CaM (Denessiouk et al., 2014).

[Figure 2 caption, fragment: ... Xenopus laevis with a binding peptide of the Ca2+ pump, the peptide C20W (PDB ID: 1cff) (Elshorst et al., 1999), human Ca2+-free CaM complexed with the IQ (isoleucine-glutamine) motif-containing peptide from the large intracellular C-terminal domain of the human cardiac voltage-gated sodium channel NaV1.5 (Chagot & Chazin, 2011), and Ca2+-bound human CaM in the complex with MLCK (PDB ID: 2k0f) (Gsponer et al., 2008).]
To quantify the contacts between the amino acids of CaM and the target peptide, we used the contacts of structural units (CSU) software, which reports the contact surface area (CSA, Å²) between two residues (Sobolev et al., 1999). This approach is based on a detailed analysis of interatomic contacts and interface complementarity. The CSA with the Trp5 residue of the target exceeds 25 Å² for only four amino acids in the C-lobe of CaM (Table S1, row numbered 10; Figure 1A): Leu105, Met124, Phe141, and Met144. The C-terminal amino acid Leu18 of the target contacts three residues in the N-lobe of CaM: Leu32, Met51, and Met71. When determining the common structural motif of the CaM-CaMBS complex, it is reasonable to assume that the N- and C-lobes provide the same number of structurally homologous residues for interaction with the Trp5 and Leu18 residues at the ends of the target. Indeed, the N- and C-lobes have three pairs of structurally homologous amino acids, Leu32 (Leu105), Met51 (Met124), and Met71 (Met144), involved in the contact with the terminal residues of the target. The structural analogue of Phe141 of the C-lobe is Phe68 in the N-lobe, but Phe68 does not meet the contact criterion with Leu18 of the target, which is bound in the N-lobe. Therefore, Phe68 and Phe141 were excluded from further consideration. Positions 5 and 10 of the Trp5-Leu18 target segment are occupied by Gly9 and Ala14, respectively. One pair of structurally homologous amino acids, Met72 and Met145, is in contact with Ala14 and Gly9 of the target peptide, respectively. Therefore, the four amino acid pairs of CaM, Leu32 (Leu105), Met51 (Met124), Met71 (Met144), and Met72 (Met145), were taken as the initial set of residues for compiling a final list of residues involved in the formation of a common structural motif of the CaM-CaMBS complex.
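The two operations used in this paragraph, thresholding residue-residue contacts by CSA and pairing N-lobe residues with their C-lobe homologues, can be sketched as follows. The constant offset of 73 is inferred from the residue pairs listed in the text (for example Met51/Met124) and is an assumption of this sketch; the CSA values and the label "OtherRes" are placeholders, not values from Table S1.

```python
LOBE_OFFSET = 73  # inferred: Leu32 -> Leu105, Met51 -> Met124, Met71 -> Met144


def c_lobe_homologue(n_lobe_resnum: int) -> int:
    """Map an N-lobe residue number to the homologous C-lobe position."""
    return n_lobe_resnum + LOBE_OFFSET


def strong_contacts(contacts, threshold=25.0):
    """Keep (CaM residue, target residue) pairs with CSA above the threshold (A^2)."""
    return [(cam, tgt) for cam, tgt, csa in contacts if csa > threshold]


# Placeholder CSA values for illustration only; NOT taken from Table S1.
example_contacts = [
    ("Met124", "Trp5", 40.0),
    ("Met144", "Trp5", 30.0),
    ("OtherRes", "Trp5", 5.0),
]
print(strong_contacts(example_contacts))
print([c_lobe_homologue(n) for n in (32, 51, 71, 72)])
```

With real data, the input list would be built from the per-residue CSA output of the contact analysis software for each complex.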

Variants of the anchor motif in the interaction between CaM and targets
Previously, the interaction between calmodulin and the smooth muscle myosin light chain kinase target peptide in the structure PDB ID: 2O5G was described as a canonical binding mode {1-14} (Tidow & Nissen, 2013; Valentine et al., 2007a). Our analysis of the contacts between CaM and the target showed that, in addition to the terminal amino acids of the target, residues at positions 5 (Gly9) and 10 (Ala14) must also be taken into account. Therefore, the motif present in the structure 2O5G consists of the contacts Trp5_B-Leu105_A, Trp5_B-Met124_A, Trp5_B-Met144_A, Gly9_B-Met145_A and Leu18_B-Leu32_A, Leu18_B-Met51_A, Leu18_B-Met71_A, Ala14_B-Met72_A (the suffixes _A and _B denote the CaM and peptide chains, respectively). The CSA values for these amino acid pairs are presented in Table S1.

Anchor motif {1-5}x(4){5-1}
The structural study of all 35 complexes showed that, in addition to the complex of CaM with the myosin light chain kinase peptide, 8 more complexes are characterized by a similar type of interaction (Table S1, rows numbered 10-18). Taking into account that the two pairs of residues, Trp5 and Leu18, and Gly9 and Ala14, have structurally homologous contacts, the motif can be named an 'anchor motif {1-5}x(4){5-1}' (Figure 4, row numbered 10). Therefore, in the structure of the complex, Trp5-Gly9 and Ala14-Leu18 of the target helix are in contact with the C- and N-lobes of CaM, respectively (Figures 1A and 3). The complementary structural role of the residues located at positions '5' of the target (Gly9 and Ala14) is the unambiguous spatial fixation of the amino acids at positions '1' (Trp5 and Leu18) relative to the C- and N-lobes.
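The anchor-motif notation can be read as a recipe for locating the four anchor residues: a {1-5} half, an optional spacer of x residues, and a mirrored {5-1} half. A minimal sketch of this bookkeeping is given below; the function name and its parameters are illustrative, and the reading of 'x(4)' as four spacer residues between the two halves is an assumption that is consistent with the residue numbers quoted in the text.

```python
def anchor_positions(first_anchor: int, spacer: int = 0):
    """Residue numbers of the four anchors in a {1-5}x(spacer){5-1} motif.

    first_anchor is the 1-based residue number of position '1' of the
    N-terminal {1-5} half; spacer is the number of residues between halves.
    """
    n_half = (first_anchor, first_anchor + 4)   # positions '1' and '5'
    c_five = first_anchor + 5 + spacer          # position '5' of the {5-1} half
    c_half = (c_five, c_five + 4)               # positions '5' and '1'
    return n_half + c_half


print(anchor_positions(5, spacer=4))  # numbering of the 2O5G peptide
print(anchor_positions(1, spacer=0))  # relative positions without a spacer
```

With a spacer of four residues and the first anchor at Trp5, the function returns residues 5, 9, 14 and 18, that is, Trp5, Gly9, Ala14 and Leu18, and the overall anchor spacing is {1-14}. Without a spacer, the relative positions are 1, 5, 6 and 10, which corresponds to the {1-5}{5-1} variant.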

Anchor motif {1-5}{5-1}
The second most common canonical CaM-targeting motif is the {1-5-10} binding mode (Babu et al., 1985). It was found that 8 out of the 35 analyzed complexes were of this type (Figure 4; Table S1, rows numbered 1-8). In these 8 complexes, additional interactions between the residue at position 6 of the target and the N- and C-lobes of CaM are observed, that is, a {1-5-6-10} binding mode or anchor motif {1-5}{5-1}. Therefore, the main structural difference between anchor motif {1-5}x(4) (Table S1, rows numbered 1-8). Therefore, they can be excluded from the determination of the common structural motifs of the CaM-CaMBS complexes.

Anchor motifs {1-5} and {5-1}
Therefore, our analysis revealed that the structural motifs of interaction between calmodulin and the target described in the article (Tidow & Nissen, 2013) are divided into two different groups. The first group includes complexes (Figure 4, rows numbered 1-24; see Sections 3.2.1, 3.2.2 and 3.2.3) in which the terminal amino acids of the target interact simultaneously with the two calmodulin domains. This interaction mode represents the anchor motif {1-5}{5-1} and its variants, such as {1-5}x(4)

Advantages of the new classification system for CaM substrates
In this work, a classification system for CaM substrates was introduced that, for the first time, uses not only the amino acid sequence of the substrate but also the residues of calmodulin itself that are in contact with the substrate. Furthermore, because of the structural homology between the N- and C-lobes of CaM, it was hypothesized that there is a structural homology at the ends of the target substrate: anchor motifs {1-5} and {5-1}. This made it possible to describe most of the observed types of contacts between calmodulin and its targets within a unified nomenclature. In particular, the new classification system made it possible to explain the observed change in the orientation of the target substrate by approximately 180° with respect to the CaM lobe. These observations provide grounds for constructing structural alignments of targets that take into account not only the amino acid sequence but also the type of CaM domain with which each end of the target interacts. Failure to take this requirement into account has previously led to incorrect alignments of target sequences.

Hydrophobic core of the EF-hand domain in CaM ('Ф-quatrefoil')
Previously, we discovered two invariant structural constructs (referred to as 'black' and 'grey' clusters) that are present in all known families of the EF-hand proteins (Denesyuk et al., 2017b). Three amino acids of the 'black' and three amino acids of the 'grey' clusters are located at positions X-4, -X+1 and -Z+1 of the odd and even EF-hand motifs (Figures 3, 4A, and 4B). 'X', 'Y', 'Z', '-X', '-Y', '-Z' denote structural positions of Ca2+-binding ligands within all EF-hand domains (Kretsinger & Nockolds, 1973). The amino acids that constitute the two clusters are different. The 'black' cluster is much less variable in sequence and incorporates mostly aromatic amino acids (phenylalanine, tryptophan and tyrosine). The 'grey' cluster includes a mix of aromatic (position: X-4), hydrophobic (position: -Z+1) and polar amino acids (position: -X+1) of various sizes (Denessiouk et al., 2014). However, despite their polarity, the amino acids at position -X+1 in most cases have long and potentially hydrophobic chains. Thus, both 'black' and 'grey' clusters are hydrophobic clusters involved in the formation of the hydrophobic core of the EF-hand domain (Denessiouk et al., 2014). Figure 5A and 5B, which depict 3D structures of CaM, clearly demonstrate that the 'black' and 'grey' clusters are structural components of the hydrophobic core of the EF-hand domain, but they are separated from each other by some other amino acids. In CaM, the 'black' cluster is formed by Phe16 (Phe89), Phe65 (Tyr138), Phe68 (Phe141) and the 'grey' cluster by Thr29 (Ala102), Leu32 (Leu105), Ile52 (Ile125).
A preliminary visual analysis of the tertiary structure of CaM showed that, for a correct description of the hydrophobic core of this protein, it is necessary to add to the 'black' and 'grey' clusters four amino acids that are located between them: Phe19 (Phe92) (position: X-1), Ile27 (Ile100) (position: -Y+1), Ile63 (Val136) (position: -Y+1) and Val55 (Ala128) (position: X-1).
The six amino acids of the 'black' and 'grey' clusters, as well as the four residues between them in the structure of CaM, are packed into a hydrophobic structure, which we will call the 'Ф-quatrefoil' (Figure 5A and 5B). The symbol 'Ф' (Aasland et al., 2002) is used to emphasize the hydrophobic nature of the 'Ф-quatrefoil'. The amino acid components of the Ф-quatrefoil exhibit second-order symmetry. The four amino acids at positions X-1 and X-4 are located within the last turn of the incoming helix of the odd and even EF-hand motifs. Not trivially, the four amino acids at positions -X+1 and -Z+1 are located along the first turn of the outgoing helix of the odd and even EF-hand motifs. The remaining two amino acids are located at the same -Y+1 position of the odd and even EF-hand motifs.
The pair of amino acids at the X-1 position, Phe19 (Phe92) and Val55 (Ala128), added by us to the 'black' and 'grey' clusters, ensures the tight contact between the boundary residues of the clusters due to weak hydrogen bonds: O/Phe19 (Phe92) (X-1) - CB/Glu31 (Glu104) (-Z) and O/Val55 (Ala128) (X-1) - CB/Glu67 (Glu140) (-Z). These weak hydrogen bonds are invariant not only in all EF-hand proteins, but also in proteins of other folds with EF-hand motifs (Denesyuk et al., 2017b). In addition to these two weak hydrogen bonds, amino acids at positions X-1 and -Z contact each other indirectly through the water molecules HOH502 (HOH508) and HOH539 (HOH516) (not shown in Figure 5A and 5B).
Two hydrophobic amino acids are located in the center of the Ф-quatrefoil: Ile27 (Ile100) and Ile63 (Val136) (position: -Y+1) (Figure 5A  The backbone carbonyl atoms are the key elements of the several structural units for design of metal cation-binding sites (Denesyuk et al., 2017a). Finally, the oxygen atoms of the side chains of the glutamic acid residues Glu31 (Glu104) and Glu67 (Glu140) (position: -Z) are also involved in the binding of two calcium atoms. Thus, in the EF-hand domain, the odd and even calcium-binding EF-hand motifs form a single Ф-quatrefoil covered by two calcium atoms (Figure 5A and 5B).

Structural relation between target-binding amino acids and 'Ф-quatrefoil' in CaM
In the preceding sections, we analyzed the participation of four pairs of amino acids of CaM, Leu32 (Leu105), Met51 (Met124), Met71 (Met144), and Met72 (Met145), in the interaction with targets (the target recognition site). On the one hand, variants of complexes with the {1-5}{5-1} motif show that the CaM residues Leu32 and Leu105 are not important for target binding (Table S1, rows numbered 1-8). On the other hand, only two residues from this list, Leu32 and Leu105, constitute a part of the 'Ф-quatrefoil' of CaM (Figure 5A and 5B). Consequently, the amino acids of the 'Ф-quatrefoil' and the hydrophobic residues that form the target recognition site are located side-by-side in CaM, but do not intersect with each other. Independently, when studying the structural rearrangement of the EF-hand domains upon calcium and ligand binding, we established that CaM belongs to the 'dynamic' group, in which calcium and ligand binding causes only global conformational changes, but not any structural changes localized to the 'black' and 'grey' clusters (Denessiouk et al., 2014). Calcium and ligand binding resulted in an opening of the EF-hand domains, but the conformations of the conserved 'black' and 'grey' clusters remained intact. Figure 6A and 6B). Met51 (Met124) is located next to the second member of the 'grey' cluster, Ile52 (Ile125). Met51 (Met124) and Met72 (Met145) are located at the N- and C-boundaries of the even EF-hand motifs (Figure 3). When mutagenesis is used to identify CaM amino acids that are functionally important for interaction with the substrate, it is necessary to check whether such mutations alter the tertiary structure of the CaM domains. From the results presented in this section, it follows that mutating Leu32 (Leu105) is inadvisable, because this leads to a change in the structure of the 'grey' clusters, the essential elements of the CaM hydrophobic cores.
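The statement that the hydrophobic core and the target recognition site lie side by side without intersecting can be checked with elementary set operations on the residue numbers listed in the text. The sketch below uses N-lobe numbering only (the C-lobe homologues are given in parentheses in the text); the variable names are illustrative.

```python
# N-lobe residue numbers of the ten core residues listed in the text.
QUATREFOIL = {16, 19, 27, 29, 32, 52, 55, 63, 65, 68}

# Initial set of target-contacting residues: Leu32, Met51, Met71, Met72.
INITIAL_CONTACT_SET = {32, 51, 71, 72}

# Methionine layer: Met51, Met71, Met72.
METHIONINE_LAYER = {51, 71, 72}

shared = QUATREFOIL & INITIAL_CONTACT_SET
print(sorted(shared))                           # residues common to both sets
print(QUATREFOIL.isdisjoint(METHIONINE_LAYER))  # True if the sets do not overlap
```

Only Leu32 (Leu105) is common to the two lists, and once it is set aside, the remaining target-binding residues, the methionine layer, share no residue with the hydrophobic core.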
One could argue that important information on protein-protein interactions can be retrieved from the results of molecular dynamics simulations of CaM-based complexes. Undoubtedly, this powerful approach provides a wealth of knowledge on the dynamics of protein complexes. However, the goal of our study was to find some important regularities in the 3D structures of the intracellular calcium sensor protein CaM, which belongs to the large EF-hand protein superfamily. CaM shows a unique and not fully understood capability to bind to multiple targets, thereby participating in a variety of regulatory processes. In this work, a quantitative analysis of the contact surfaces of CaM and target proteins has been carried out for 35 representative 3D structures. The approach used in this study, where multiple structures of protein complexes are used for the detailed analysis of the peculiarities of interfaces, eliminates the need to use the molecular dynamics method to study the structural characteristics of the protein-protein complexes, because the analyzed structures can be considered as snapshots of different poses existing in the dynamic conformational ensembles of CaM-based complexes. Using this approach, we established that, in addition to the two hydrophobic terminal residues of the target fragment, the interaction also involves residues located four positions away from each terminal anchor, toward the middle of the fragment (binding mode {1-5}). It has also been found that the N- and C-lobes of CaM bind the {1-5} motif located at the ends of the target in a structurally identical manner. The methionine residues Met51 (Met124), Met71 (Met144) and Met72 (Met145) of the CaM amino acid sequence are key hydrophobic residues for this interaction. The hydrophobic core of CaM ('Ф-quatrefoil') consists of 10 amino acids and does not intersect with the target-binding methionine residues. CaM belongs to the 'dynamic' group of EF-hand proteins, in which calcium and protein ligand binding causes only global conformational changes. The membership of CaM in the 'dynamic' group is determined by the triggering and protective methionine layer: Met51 (Met124), Met71 (Met144) and Met72 (Met145).

Conclusions
Thirty-five representative complexes of calmodulin with various targets were analyzed. It has been established that the methionine amino acids of the two calmodulin lobes form two triggering and protective layers. Interactions of the four target terminal residues (anchor motifs {1-5} and {5-1}) with these methionine layers are structurally identical. It is shown that the previously established membership of calmodulin in the 'dynamic' group (only global conformational changes, with no changes in the 'black' and 'grey' clusters) is determined by the triggering and protective methionine layers. In connection with these results, the question arises of how conserved the methionine layer is in calmodulin-like proteins and how possible changes would affect the function of these proteins.