Interaction of chemokine receptor CXCR4 in monomeric and dimeric state with its endogenous ligand CXCL12: coarse-grained simulations identify differences

Despite the recent resolutions of the crystal structure of the chemokine receptor CXCR4 in complex with small antagonists or viral chemokine, a description at the molecular level of the interactions between the full-length CXCR4 and its endogenous ligand, the chemokine CXCL12, in relationship with the receptor recognition and activation, is not yet completely elucidated. Moreover, since CXCR4 is able to form dimers, the question of whether the CXCR4–CXCL12 complex has a 1:1 or 2:1 preferential stoichiometry is still an open question. We present here results of coarse-grained protein–protein docking and molecular dynamics simulations of CXCL12 in association with CXCR4 in monomeric and dimeric states. Our proposed models for the 1:1 and 2:1 CXCR4–CXCL12 quaternary structures are consistent with recognition and activation motifs of both partners provided by the available site-directed mutagenesis data. Notably, we observed that in the 2:1 complex, the chemokine N-terminus makes more steady contacts with the receptor residues critical for binding and activation than in the 1:1 structure, suggesting that the 2:1 stoichiometry would favor the receptor signaling activity with respect to the 1:1 association.


Introduction
The chemokine receptor CXCR4, which was originally described as a coreceptor for human immunodeficiency virus (Feng, Broder, Kennedy, & Berger, 1996), is a class A G-protein-coupled receptor (GPCR) with a unique chemokine ligand, CXCL12, previously called stromal cell-derived factor 1 (Bleul et al., 1996; Oberlin et al., 1996). Both CXCR4 and CXCL12 are expressed by a vast array of cell types in many tissues. Targeted disruption of either gene is embryonic lethal in mice, causing defects in cardiac, hematopoietic, and cerebellar development, which evidences a broad spectrum of activities (Ma et al., 1998; Nagasawa et al., 1996). The CXCR4-CXCL12 pair, which is essential for the proper migration of leukocytes, hematopoietic stem cells, and progenitors, also controls many physiological functions, such as survival, repair, growth, and neovascularization (Puchert & Engele, 2014). In addition, this axis has pathogenic roles, notably in immune diseases and in the progression of various cancers (Kryczek, Wei, Keller, Liu, & Zou, 2006; Sun et al., 2010), including virus-related ones (Freitas et al., 2014). It is therefore considered a very attractive therapeutic target (Peled, Wald, & Burger, 2012).
Binding of CXCL12 to CXCR4 triggers the typical activation of G protein and arrestin-dependent pathways of a GPCR, which cannot solely account for the wide spectrum of physiological activities of this axis (Busillo & Benovic, 2007). It is rather hypothesized that such functional complexity originates from various determinants, including the cell context and the available receptor interactome (receptorosome), together with the reported ability of both CXCL12 and CXCR4 to oligomerize (Bachelerie et al., 2013). This is also consistent with observations made in live cells using biophysical techniques, including bioluminescence and fluorescence resonance energy transfer, which indicate that monomers, dimers, and higher-order oligomers of CXCR4 might coexist (Ferre et al., 2014; Hamatake et al., 2009). As has emerged for other GPCRs, CXCR4 dimers might display unique ligand-binding properties and functional selectivity. In line with this, dimerization is the most likely mechanism to explain the functional dominance of a gain-of-function CXCR4 mutant over its wild-type counterpart (Balabanian et al., 2012; Lagane et al., 2008) in the context of the rare WHIM disorder associated with heterozygous CXCR4 mutations (Hernandez et al., 2003).
The possibility that CXCR4 forms dimers is also supported by the recent crystal structures of CXCR4 in complex with small antagonists (Wu et al., 2010) or with the viral chemokine encoded by Kaposi's sarcoma-associated herpesvirus (vMIP-II) (Qin et al., 2015). These CXCR4 structures (PDB codes 3ODU, 3OE0 and 4RWS) revealed a dimeric arrangement of the receptor which is able to accommodate one ligand or a dimer of ligands, forming a 2:1 or a 2:2 CXCR4-CXCL12 complex, respectively. Indeed, chemokines are proposed to exist in an equilibrium of monomeric and dimeric species (Veldkamp, Peterson, Pelzek, & Volkman, 2005), with the dimeric CXCL12 being a partial agonist capable of inducing intracellular calcium mobilization, but not chemotaxis (Drury et al., 2011; Veldkamp et al., 2008). Thus, when considering the full agonist signaling process, the question of whether the CXCR4-CXCL12 complex has a preferential 1:1 or 2:1 stoichiometry with regard to the stability of their respective three-dimensional structures remains to be investigated.
The GPCR typical seven-transmembrane helices are well observed in the 3ODU, 3OE0, and 4RWS crystal structures of CXCR4. However, the receptor N-terminus (up to residue Asp22) is not visible in the electron density maps, witnessing its flexible and disordered character, even in the presence of the viral chemokine vMIP-II (Qin et al., 2015;Wu et al., 2010). In the transmembrane cavity of the receptor, several residues were found by site-directed mutagenesis studies to be important for CXCR4 signaling, including the region from Glu179 to Asp182 in extracellular loop 2 (ECL2) (Doranz et al., 1999), the residues Asp97 in transmembrane helix 2, Asp187 in ECL2, Glu288 in TMH7 (Brelot, Heveker, Montes, & Alizon, 2000;Tian et al., 2005), Tyr190 in ECL2, and Glu268 in ECL3 (Zhou, 2001). On the other hand, NMR experiments revealed that CXCL12 has a disordered N-terminus (residues 1-8), which includes residues critical for CXCR4 activation, notably Lys1 and Pro2 (Crump et al., 1997;Heveker et al., 1998;Kofuku et al., 2009;Veldkamp et al., 2008). Besides, it should be emphasized that the three receptor residues Asp97, Asp187, and Glu288 are also critical for the chemokine binding (Brelot et al., 2000). More specifically, they probably make contacts with the CXCL12 first two residues Lys1 and Pro2, as suggested by the crystal structure of CXCR4 with the viral chemokine vMIP-II (Qin et al., 2015).
Based on this evidence, a two-site, two-step model for the CXCR4-CXCL12 interactions was proposed, in which the CXCL12 motif RFFESH (residues 12-17) first binds the receptor N-terminus, and then the chemokine N-terminus KPVSLSYR (residues 1-8) enters the buried cavity within the CXCR4 transmembrane helices (Crump et al., 1997; Doranz et al., 1999; Kofuku et al., 2009; Veldkamp et al., 2008), triggering receptor activation, probably mediated by a change in the conformation of its transmembrane helices. Nevertheless, despite the important recent structural data, there is not yet a comprehensive characterization of the quaternary structures and dynamics of the CXCR4-CXCL12 associations, which could confirm this model and provide information on the stoichiometry of the functional receptor-ligand complexes.
We investigated this issue using coarse-grained molecular modeling tools, following a two-step strategy similar to the one employed by Xu et al. and by Tamamis and Floudas in their recent theoretical studies (Tamamis & Floudas, 2014; Xu, Li, Sun, Li, & Hou, 2013), and extending it to investigate the CXCR4-CXCL12 1:1 or 2:1 stoichiometry. The first step consists of generating plausible initial conformations of the complex by coarse-grained rigid-body docking of the chemokine onto the receptor. Then, starting from docked conformations having the chemokine N-terminus located in the receptor transmembrane cavity, we explored the conformational ensemble of the complexes, using coarse-grained MD simulations, to search for the most populated structures. This approach enabled us to identify models of the CXCR4-CXCL12 association in which the chemokine residues Lys1 and Pro2 are in close contact with the three receptor residues Asp97, Asp187, and Glu288, known to be critical for both binding and activation (Brelot et al., 2000). In the modeled 1:1 stoichiometry, we noticed that the chemokine core domain would be more stable when positioned above the transmembrane helices of a putative other protomer if CXCR4 were considered as a dimer. We thus hypothesized that such a position of CXCL12 could better fit a dimeric receptor. Using the same procedure, which consists of docking CXCL12 on the dimeric receptor and relaxing the complex with MD simulations, we propose here a model of CXCR4-CXCL12 association that is better accommodated by the 2:1 combination.

Building CXCR4-CXCL12 complexes using docking calculations
Quaternary structures of the CXCR4-CXCL12 complex were first generated by rigid-body protein-protein docking, using the molecular modeling toolbox PTools, which manipulates biomolecules at a coarse-grained level (Saladin, Fiorucci, Poulain, Prévost, & Zacharias, 2009). For these calculations, we used the SCORPION coarse-grained force field, which was shown to correctly predict several protein-protein interfaces (Basdevant, Borgis, & Ha-Duong, 2007). PTools performs systematic docking as follows: first, the ligand is placed at regular positions around the receptor surface, at a distance slightly larger than its largest dimension. Depending on the receptor size, the number of initial positions typically varies from about 100 to 300. For each position, about 250 regular orientations of the ligand were generated by systematically changing its three Euler angles. Then, each of these several tens of thousands of initial conformations was submitted to six consecutive minimizations (with decreasing cut-off distances) of the protein-protein interaction energy, using the six translational and rotational degrees of freedom of the ligand. The minimized complex conformations were finally clustered by similarity and ranked according to their interaction energy.
The receptor conformation was extracted from the crystal structure 3ODU (Wu et al., 2010) taken from the PDB, by excising the lysozyme fragment (residues 1002-1161) and all the cocrystallized ligands. It should be noted that the first 26 N-terminal residues of the receptor are missing in the crystal structure. We did not include these missing residues in the docking calculations, since this first step mostly aims to generate plausible initial conformations of the CXCR4-CXCL12 complex, prior to molecular dynamics (MD) simulations. These residues were added to the selected protein-protein models before running the simulations (see below). The dynamic behavior of the CXCR4 N-terminus will be briefly discussed at the end of the Results and Discussion section. The chemokine structures were taken from the PDB file 2K04 (Veldkamp et al., 2008), which contains 20 NMR-resolved conformations of a dimeric form of CXCL12 complexed with the first 38 residues of the CXCR4 N-terminus. Note that the N-terminal segment 1-21 of the human CXCL12 sequence (UniProt entry P48061) is annotated as a signal peptide. These residues are therefore removed in the mature protein and do not appear in the chemokine NMR structure. The 20 chains A were isolated and docked onto the CXCR4 receptor, starting from positions exclusively located on the extracellular side of the receptor. Each docking calculation yielded about 28,000 quaternary structures of the CXCR4-CXCL12 complex, which were subsequently clustered by similarity. The five lowest-energy clusters generated by each of the 20 docking runs were then visually analyzed to identify those having the chemokine N-terminus inside the receptor transmembrane cavity.
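As a rough sketch of the bookkeeping behind this systematic search, the number of starting conformations is simply the number of starting positions multiplied by the number of orientations per position. The function name below is illustrative and is not part of PTools; only the figures of about 100 to 300 positions and about 250 orientations come from the text.

```python
def count_starting_conformations(n_positions: int, n_orientations: int = 250) -> int:
    """Number of rigid-body starting conformations in a systematic search.

    n_positions: starting positions around the receptor surface.
    n_orientations: ligand orientations generated at each position
    (about 250 in the text).
    """
    return n_positions * n_orientations


# Range of starting positions quoted in the text (about 100 to 300).
low = count_starting_conformations(100)
high = count_starting_conformations(300)
print(low, high)
```

With these values, the search starts from 25,000 to 75,000 conformations, which matches the "several tens of thousands" quoted above.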

MD simulations of CXCR4-CXCL12 complexes
In a second step, representative conformations of the CXCR4-CXCL12 complex were submitted to extensive MD simulations in order to explore its conformational space and to examine its most populated structures. Before that, the missing N-terminal region 1-26 in the CXCR4 crystal structure was reconstructed on the basis of the complex conformations generated by docking: the NMR structure of CXCL12 complexed with the N-terminal residues 1-38 of CXCR4 (PDB file 2K04) was first superimposed onto the chemokine structure obtained by docking. Then, the segment 27-38 of CXCR4 in the NMR structure was deleted, and the receptor residue 26 of the NMR complex was linked to the residue 27 of the receptor crystal structure (PDB file 3ODU). Finally, the reconstructed structure was minimized in order to optimize the length of the newly created peptide bond. In this procedure, we had the choice of removing the redundant segment 27-38 of the receptor either from the NMR file or from the crystal structure. We chose to delete the NMR segment 27-38 in order to preserve the disulfide bridge present in the receptor crystal structure between Cys28 and Cys274.
MD simulations were performed with the GROMACS software package (Hess, Kutzner, van der Spoel, & Lindahl, 2008), using the MARTINI coarse-grained models of proteins and lipids (Marrink, Risselada, Yefimov, Tieleman, & de Vries, 2007; Monticelli et al., 2008). This change in coarse-grained model is explained by the fact that the SCORPION force field was designed for protein-protein docking calculations and is not suitable for membrane protein MD simulations. Conversely, the MARTINI model was not optimized for protein-protein docking, but was extensively tested on membrane proteins and is now very reliable for simulating these systems. A similar strategy, using one force field for docking calculations and another one for MD simulations, was used by Xu et al. and by Tamamis and Floudas, but at the atomic level, to predict the CXCR4-CXCL12 quaternary structure. Specifically, they used the ZDOCK force field to perform the docking of the chemokine on the receptor and then the AMBER or CHARMM models for the MD simulations of the complex (Tamamis & Floudas, 2014; Xu et al., 2013). We adopted the same approach, but at the coarse-grained level, which allows the membrane protein conformational space to be explored more widely than with all-atom simulations.
We employed the improved version of MARTINI for proteins (de Jong et al., 2013) in combination with the elastic network model ELNEDIN (Periole, Cavalli, Marrink, & Ceruso, 2009) to maintain the overall shape of proteins. In this approach, all pairs of backbone grains separated by at least three covalent bonds and closer than the cut-off distance Rc = 0.9 nm are linked by a spring with the force constant Fc = 500 kJ/mol/nm². These parameter values make it possible to accurately reproduce the structural deformations and dynamic fluctuations of the protein backbone (Periole et al., 2009). It should be remarked that, although the elastic network maintains the secondary structures and their close interactions, it does not much restrain the solvated disordered regions, especially the N-terminal and C-terminal tails, which keep most of their intrinsic flexibility (Figure S1 of the Supplementary Information). Most importantly, elastic networks are defined for each protein (no intermolecular springs) and the six global degrees of freedom between two proteins are completely unrestrained. In the coarse-grained simulations, the chemokine core domain and N-terminus are thus still able to explore various positions and orientations relative to the receptor activation site.
Each CXCR4-CXCL12 complex previously identified was embedded in a rectangular box containing about 300 molecules of 1-palmitoyl-2-oleoyl-sn-glycero-3-phosphatidylcholine (POPC) with initial random position and orientation, along with between 8500 and 12,500 coarse-grained non-polarizable water particles and the appropriate number of counterions required to neutralize the system (Marrink et al., 2007). A relative dielectric constant εr = 15 was used to screen Coulombic interactions, as recommended by Marrink et al. in their original paper (Marrink et al., 2007). This solvent model was used although the more recent polarizable coarse-grained water model describes the solvent dielectric property more accurately, but at a greater computational cost (Yesylevskyy, Schäfer, Sengupta, & Marrink, 2010). Nevertheless, the non-polarizable solvent model was extensively tested and successfully used in many membrane protein simulations, as recently reviewed (Marrink & Tieleman, 2013). This led us to adopt this solvent model with confidence.
Keeping the proteins rigid, the coarse-grained POPC and water molecules were first submitted to a 200 ns MD run, using a 20 fs timestep, in order to build the lipid bilayer around the CXCR4 receptor. After removing the protein position restraints, an additional 200 ns MD simulation was then performed to equilibrate the system around the temperature T = 310 K and the pressure p = 1 bar. The system was finally allowed to evolve without any constraint for a 5000 ns production run in the NPT ensemble, using a Nosé-Hoover coupling method to keep the temperature constant and a Parrinello-Rahman algorithm for the semi-isotropic pressure. The trajectory coordinates were saved every 500 ps for structural analysis, using GROMACS tools. To characterize the chemokine position with respect to the receptor, we chose the angle between the vector joining the CXCL12 pivotal residue Ser6 to the antipodal Asn45 and the vector joining the CXCR4 residues Tyr45 to Gln200 (Figure S2).
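The pair-selection rule described above can be illustrated with a minimal sketch. It is not the ELNEDIN implementation: it assumes a single chain with one backbone bead per residue (so the index difference counts the bonds between beads) and takes each spring's rest length to be the distance in the input structure. The function name and the toy coordinates are hypothetical; only Rc = 0.9 nm and Fc = 500 kJ/mol/nm² come from the text.

```python
import math

R_C = 0.9    # cut-off distance in nm (value quoted in the text)
F_C = 500.0  # spring force constant in kJ/mol/nm^2 (value quoted in the text)


def build_elastic_network(backbone, r_c=R_C, f_c=F_C, min_separation=3):
    """Return a list of (i, j, rest_length, force_constant) springs.

    backbone: list of (x, y, z) coordinates in nm, one bead per residue,
    ordered along a single chain (assumption of this sketch).
    A spring is added when beads i and j are at least `min_separation`
    bonds apart along the chain and closer than `r_c`.
    """
    springs = []
    for i in range(len(backbone)):
        for j in range(i + min_separation, len(backbone)):
            d = math.dist(backbone[i], backbone[j])
            if d < r_c:
                # Rest length taken from the input structure (assumption).
                springs.append((i, j, d, f_c))
    return springs


# Toy chain: five beads spaced 0.25 nm apart on a straight line.
chain = [(0.25 * k, 0.0, 0.0) for k in range(5)]
springs = build_elastic_network(chain)
print(springs)
```

In this toy chain, beads 0-3 and 1-4 are linked, while beads 0-4 (1.0 nm apart) are beyond the cut-off. Because the function is applied to one chain at a time, no spring ever connects two different proteins, in line with the absence of intermolecular springs noted above.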
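The orientation descriptor defined above is the angle between two inter-residue vectors. A minimal sketch of that calculation follows; the helper names and the coordinates are hypothetical placeholders, and in practice the four positions would be read from each saved trajectory frame.

```python
import math


def vector(start, end):
    """Vector pointing from `start` to `end` (3D tuples)."""
    return tuple(e - s for s, e in zip(start, end))


def angle_deg(u, v):
    """Angle in degrees between two 3D vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    # Clamp to [-1, 1] to guard against rounding errors before acos.
    cosine = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.degrees(math.acos(cosine))


# Hypothetical bead positions (nm) for one frame, for illustration only.
ser6, asn45 = (0.0, 0.0, 0.0), (1.0, 0.0, 0.0)      # CXCL12 Ser6 -> Asn45
tyr45, gln200 = (0.0, 0.0, 0.0), (0.0, 2.0, 0.0)    # CXCR4 Tyr45 -> Gln200

theta = angle_deg(vector(ser6, asn45), vector(tyr45, gln200))
print(theta)
```

Evaluating this angle for every saved frame gives a time series describing how the chemokine core domain rotates relative to the receptor.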

Results and discussion
Structure of wild-type 1:1 CXCR4-CXCL12 complexes
Each docking of the 20 CXCL12 NMR structures on the monomeric CXCR4 generated approximately 28,000 complexes. For each of the 20 docking calculations, we visually inspected the five lowest-energy structures and found, altogether, 17 CXCR4-CXCL12 complexes in which the chemokine N-terminus is located within the receptor transmembrane pocket. In these complexes, the chemokine core domain is positioned mostly outside the receptor transmembrane helix bundle, as shown in Figure 1. In all other low-energy conformations of the complex, the CXCL12 core domain is positioned and centered above the CXCR4 helix bundle, but its N-terminus points towards the solvent (data not shown). The rigid-body protein-protein docking was probably able to identify several complexes with the CXCL12 N-terminus in the CXCR4 activation site because the receptor conformation used for these calculations comes from a co-crystal structure of CXCR4 in complex with a ligand in its transmembrane pocket (PDB ID: 3ODU). This "bound" structure would allow the receptor to accommodate the chemokine N-terminus without a large conformational change of its activation site. The 17 complexes found with the CXCL12 N-terminus inside the CXCR4 transmembrane cavity were clustered by visual inspection of the chemokine core domain position with respect to the receptor helix bundle (Figure 1). We identified four clusters, two of which have only one representative conformation (clusters 3 and 4). We observed that the vector joining the CXCL12 pivotal residue Ser6 to the antipodal Asn45 points towards the receptor transmembrane helices II and III in the first cluster (7 structures), towards helices III and IV in cluster 2 (8 structures), towards helix V in cluster 3 (1 structure), and towards helix VI in cluster 4 (1 structure).
Figure 1. Top view of the CXCL12 poses on the extracellular side of the CXCR4 monomeric receptor (cyan ribbon). The chemokine is displayed with graduated orange (cluster 1), green (cluster 2), magenta (cluster 3), and purple (cluster 4). The second protomer in the dimeric crystal structure is represented as a white ribbon to provide a visual reference for the chemokine positions. The CXCL12 residue Lys1 is indicated by black spheres and its residue Asn45 by colored ones.
Then, using coarse-grained MD simulations, we explored the conformational dynamics of the two most populated complexes having the chemokine N-terminus within the receptor transmembrane cavity (clusters 1 and 2, with simulations 11-WTA and 11-WTB, respectively). Nevertheless, the question of whether the CXCR4-CXCL12 complex has a 1:1 or a 2:1 stoichiometry prompted us to also consider the cluster 3 structure (simulation 11-WTC), since it interestingly has the CXCL12 core domain located above another putative protomer if CXCR4 were dimeric (Figure 1). In contrast, we did not perform an MD simulation of cluster 4, since this single conformation has a chemokine core domain position that would not be influenced by a putative other protomer. The RMSD relative to the initial conformation of the receptor and the chemokine, as well as the position of the CXCL12 core domain, are shown as a function of time in Figure 2. Due to the elastic network model used here, both receptor and chemokine of the simulated complexes rapidly reached equilibrated tertiary structures which did not fluctuate much during the 5 μs trajectories. In contrast, CXCL12 shows larger-amplitude global motion relative to CXCR4, especially in the case of complex 11-WTC, whose chemokine exhibited broad translational and rotational movements. A conformational change of the receptor occurred around 1 μs of the 11-WTC simulation. It is analyzed as a local transition from a compact to an extended form of its C-terminal region 302-328 (data not shown). However, since the CXCR4 helix bundle is maintained by an elastic network, which biases its conformational changes, the structural transition of the receptor C-terminus, located on the intracellular side, can hardly be related to the dynamics of the chemokine/receptor interface, which occurs on the extracellular side.
Figure 2. Time evolution of the RMSD (top row) of the receptor (cyan line), the chemokine (magenta line), the chemokine with respect to the receptor (green line), and the chemokine position relative to the receptor (bottom row), for the three wild-type complexes 11-WTA, 11-WTB and 11-WTC. The position of the chemokine with respect to the receptor is indicated by the angle between the vector joining the CXCL12 pivotal residue Ser6 to the antipodal Asn45 and the vector joining the CXCR4 residues Tyr45 to Gln200. In the inset pictures, the second protomer of the dimeric CXCR4 is represented in white ribbon as a visual reference for the chemokine positions. The CXCL12 residue Lys1 is indicated by black spheres and its residues Ser6 and Asn45 by colored ones.
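For readers unfamiliar with the RMSD traces reported in Figure 2, the quantity can be sketched as follows. This is a deliberately simplified version that compares coordinates directly, without the least-squares superposition that trajectory analysis tools normally apply first; the function name and the toy coordinates are hypothetical.

```python
import math


def rmsd(coords, reference):
    """Root-mean-square deviation between two equally sized bead sets.

    No superposition is performed here (simplifying assumption), so any
    rigid-body displacement also contributes to the value.
    """
    if len(coords) != len(reference):
        raise ValueError("coordinate sets must have the same length")
    squared = sum(math.dist(a, b) ** 2 for a, b in zip(coords, reference))
    return math.sqrt(squared / len(coords))


# Toy example: two beads, both shifted by 0.3 nm along x.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
frame = [(0.3, 0.0, 0.0), (1.3, 0.0, 0.0)]
value = rmsd(frame, reference)
print(value)
```

Keeping the rigid-body contribution is in the spirit of the "chemokine with respect to the receptor" trace, where the global motion of the chemokine relative to the receptor is precisely what is being monitored.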
Among the previous 1:1 complexes, the most plausible quaternary structure was identified by examining contacts between the CXCR4 and CXCL12 residues involved in the receptor activation as previously identified by site-directed mutagenesis studies (Brelot et al., 2000;Crump et al., 1997;Doranz et al., 1999;Heveker et al., 1998;Kofuku et al., 2009;Murphy et al., 2007;Qin et al., 2015;Tian et al., 2005;Veldkamp et al., 2008;Zhou, 2001). Figure 3 indicates the percentages of simulation time for which the chemokine first eight residues are in contact with the receptor key residues. Among the latter, Glu179, Asp187, Tyr190, and Asp193 are located on the ECL2, and Glu268 on the ECL3. According to the subdivision of the receptor transmembrane cavity into a major sub-pocket surrounded by helices III-VII and a minor one by helices I-III and VII (Roumen et al., 2012), the receptor residues Asp171, Tyr255, and Asp262 belong to the major sub-pocket, the residues Trp94 and Asp97 to the minor one, and residues Tyr116 and Glu288 are at the edge of the two subpockets. Figure 3 shows that the chemokine N-terminus mainly lies in the major sub-pocket in the complex 11-WTA, its residue Lys1 predominantly contacting the receptor residue Asp262. In the complex 11-WTB, the chemokine N-terminus is partially buried in the transmembrane cavity, with its residue Lys1 pointing towards the ECL2 β-turn, but establishing no steady contact with any of the minor sub-pocket residues. In contrast, the complex 11-WTC exhibits a chemokine N-terminus that partially occupies the CXCR4 major sub-pocket but interacts mostly with the minor one ( Figure 4). Indeed, the CXCL12 first two residues Lys1 and Pro2 make durable contacts with residues Trp94 and Asp97, as well as with residues Tyr116 and Glu288 at the edge of the two sub-pockets, and to a lesser extent with residues Asp171 and Tyr255 in the major one ( Figure 3). 
Notably, not only does the CXCL12 residue Lys1 make steady contacts with the receptor residues critical for binding and activation (i.e. Asp97, Asp187, and Glu288), but the remaining seven N-terminal residues also interact with the important CXCR4 residues Glu179, Asp187, and Tyr190. These interactions were not recovered in the two conformations 11-WTA and 11-WTB, strongly suggesting that 11-WTC would be the most representative structure of a functional 1:1 receptor-ligand complex.
In the latter model, the position and orientation of the chemokine N-terminus within the CXCR4 minor sub-pocket are very similar to those observed in the crystallographic structure of CXCR4 in complex with the viral chemokine vMIP-II, as well as in the CXCR4-CXCL12 model built using the CXCR4-vMIP-II structure as a template (Qin et al., 2015). However, in our model, the chemokine core domain is located outside the receptor transmembrane helix bundle, whereas it is more centered above it in the crystallographic structure. This apparent discrepancy could be explained by the packing of the proteins in the crystal unit cell of the CXCR4-vMIP-II complex. Indeed, analysis of the latter reveals close contacts between two vMIP-II molecules of two adjacent head-to-tail subunits, as well as between the T4 lysozyme and the chemokine of two neighboring complexes (Figure S3). These steric interactions could constrain the ligand vMIP-II to adopt a different position and orientation from those in our liquid-phase model. Another explanation could be the disulfide bond between the receptor residue D187C and the chemokine W5C, which was introduced to trap the complex in a crystallizing conformation and which could artificially constrain the packing of the ligand N-terminal region from residue 5 onwards, leading to a different position and orientation of its core domain.

Structure of 1:1 CXCR4-CXCL12 complexes with mutated chemokine
The position of the CXCL12 N-terminal residues among those of the CXCR4 transmembrane cavity and their contacts observed in the 11-WTC complex were further assessed using simulations of CXCR4 in complex with the K1R and P2G chemokine mutants. Starting from the final conformation of the 11-WTC simulation, the time evolution of the receptor and the mutated chemokine RMSD as well as the position of the CXCL12 core domain are displayed in Figure S4. With respect to the wild-type CXCL12, the P2G mutation induces a global rotation of the chemokine core domain towards the receptor helices III and IV. In contrast, the K1R mutation does not much affect the global position of the chemokine relative to the receptor. However, we found that the K1R mutation influences the accommodation of the CXCL12 N-terminus in the receptor minor sub-pocket (Figure 4) and the contacts with the residues critical for CXCR4 activation (Figure S5). For instance, the contacts of the chemokine's first four residues with the CXCR4 residue Asp187 are partially disrupted by the K1R substitution. We also observed a loss of contacts of the chemokine second residue (Gly2 when mutated or Pro2 in the context of the K1R mutant) with the CXCR4 residues Tyr255 and Glu288 (Figure S5 compared to Figure 3). In the context of the K1R mutant, these observations are probably due to a weakening of the hydrophobic cluster that involves, in the wild-type complex, the aliphatic moiety of the Lys1 side chain and the side chains of residues Trp94 and Tyr116. Consequently, the mutated chemokine residue Arg1 is hardly held in place, resulting in turn in the destabilization of the hydrophobic interaction between Pro2 and Tyr255.
In both the K1R and P2G mutated complexes, the chemokine's first residue (Arg1 or Lys1, respectively) is flipped over with respect to its orientation in the wild type, so that its side chain amine group makes a salt-bridge with the receptor residue Glu288, instead of Asp97 as observed in 11-WTC (Figure 4). These structural effects of the chemokine mutations are reflected in the interaction energies of the chemokine first eight residues with the receptor. These non-bonded energies (Lennard-Jones + Coulomb), calculated with the MARTINI force field and averaged over each MD trajectory, are equal to −748.4 ± 65.0, −669.8 ± 48.2, and −690.0 ± 39.1 kJ/mol for the 1:1 CXCR4-CXCL12 11-WTC, K1R, and P2G mutants, respectively. Although these energy estimates are rather crude due to the simplified protein model used here, they tend to confirm that the chemokine mutations K1R and P2G induce a destabilization of the chemokine N-terminus interaction with the receptor transmembrane cavity when compared with the wild type.
From a dynamics point of view, the first residue K1R substitution also influences the duration of the chemokine N-terminus contacts with the three residues critical for the receptor activation. Indeed, as shown in Figure 5, the period during which the chemokine first residue is simultaneously in contact with the receptor residues Asp97, Asp187, and Glu288 is significantly shorter in the 11-K1R mutant than in the wild-type complex (these periods represent 48.8 and 17.3% of the 11-WTC and 11-K1R simulation times, respectively). These differences in the packing and dynamics of the chemokine N-terminus within the receptor cavity could account for the antagonist behavior of the K1R and P2G CXCL12 mutants on calcium mobilization used as a readout for CXCR4 activation (Crump et al., 1997). All together, our results indicate that the 11-WTC quaternary structure is the most probable conformation of the 1:1 CXCR4-CXCL12 functional complex. Consistently with most of the site-directed mutagenesis studies of CXCR4, our model exhibits a chemokine N-terminus lying at the bottom of the receptor minor sub-pocket and making persistent contacts with receptor residues critical for activation, including Asp97, Asp187, and Glu288.
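The percentages quoted above measure the share of simulation time during which Lys1 (or Arg1) is in contact with Asp97, Asp187, and Glu288 at the same time. A minimal sketch of such a calculation is given below; the distance cut-off defining a contact is not stated in the text, so the 0.6 nm value, the function name, and the toy distance series are all hypothetical.

```python
def simultaneous_contact_percent(distance_series, cutoff):
    """Percentage of frames in which every distance is within `cutoff`.

    distance_series: list of equal-length lists, one per receptor residue,
    each holding the distance (nm) to the chemokine residue per frame.
    """
    n_frames = len(distance_series[0])
    in_contact = sum(
        1 for frame in zip(*distance_series) if all(d <= cutoff for d in frame)
    )
    return 100.0 * in_contact / n_frames


# Toy distances (nm) over four frames, for illustration only.
asp97 = [0.5, 0.5, 0.9, 0.5]
asp187 = [0.5, 0.7, 0.5, 0.5]
glu288 = [0.5, 0.5, 0.5, 0.5]

CUTOFF_NM = 0.6  # hypothetical contact cut-off, not taken from the text
percent = simultaneous_contact_percent([asp97, asp187, glu288], CUTOFF_NM)
print(percent)
```

In the toy data only the first and last frames satisfy all three contacts, giving 50%. Requiring simultaneity is stricter than averaging the three per-residue contact fractions.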

Structure of 2:1 CXCR4-CXCL12 complexes
Since, in the 11-WTC model, the chemokine core domain is located beyond the receptor helix V, it could interact with a putative adjacent protomer if CXCR4 was a dimer. Indeed, when docking the 20 NMR structures of CXCL12 on the crystallographic receptor dimer, we found among the 20 × 5 lowest energy conformations three 2:1 CXCR4-CXCL12 complexes, in which the chemokine N-terminus is located within the transmembrane pocket of one CXCR4 protomer, whereas its core domain is partially covering the cavity of the other protomer ( Figure S6). Again, we explored the conformational ensemble of one of these 2:1 complexes by coarse-grained MD simulation (hereafter referred to as 21-WT). This revealed that the chemokine global movements relative to the receptor dimer are more restrained than in the case of the 1:1 association (Figure 6 compared to Figure 2). Moreover, during the 5-μs simulation, the two N-terminal residues Lys1 and Pro2 of the chemokine mostly occupy the receptor minor sub-pocket and make very steady contacts with the CXCR4 critical residues Asp97, Asp187, and Glu288 (Figures 7 and 8), demonstrating that the 21-WT complex enables the chemokine N-terminus to trigger the receptor signaling activity, by contacting the key transmembrane pocket residues. These contacts, which were established for significantly longer duration than those in the 1:1 complex ( Figure 5), suggest that the 2:1 stoichiometry would favor the receptor activation as compared to the 1:1 association.
Figure 5. Time evolution of the distance between the chemokine first residue and the receptor Asp97 (purple line), Asp187 (orange line) and Glu288 (green line), for the 11-WTC, 11-K1R, 21-WT and 21-E268 simulations. The inset picture displays in colored spheres the chemokine residues Lys1 (black) and Asn45 (magenta), as well as the receptor key residues Asp97 (purple), Asp187 (orange) and Glu288 (green).
Interestingly, the K1R mutation of the chemokine N-terminus did not have the same impact on the 2:1 CXCR4-CXCL12 complex as on the 1:1 association. Indeed, over the course of the simulation of the K1R chemokine mutant in the 2:1 complex (simulation 21-K1R), the chemokine N-terminus maintained the contacts with the receptor residues critical for activation established by the wild-type CXCL12 (Figure 7), whereas the same mutation partially disrupted the interactions of the CXCL12 residues Lys1 and Pro2 with the important CXCR4 residues Asp187, Tyr255, and Glu288 in the 11-K1R simulation (Figure S5). In contrast, the chemokine mutant P2G in complex with the CXCR4 dimer did not maintain its global position and orientation relative to the receptor (Figure 6). The consequence is dramatic for the residue Lys1, which cannot make any contacts with the CXCR4 residues critical for activation (Figure 7). More broadly, the first residues of the chemokine N-terminus left the receptor minor sub-pocket for the major one, close to the residue Asp262, and could even transiently exit the transmembrane cavity, supporting the observation that the P2G mutant does not promote CXCR4 activation. As for the 1:1 CXCR4-CXCL12 complexes, we calculated, using the MARTINI force field, the averaged non-bonded energy between the first eight residues of the chemokine and the receptor in the 2:1 associations. These interaction energies are equal to −742.9 ± 36.5, −782.7 ± 37.9 and −403.0 ± 60.2 kJ/mol for the wild-type chemokine and the K1R and P2G mutants of the 2:1 complex, respectively. In contrast to the 1:1 stoichiometry, and unlike the P2G mutant, the K1R mutation of the 2:1 CXCR4-CXCL12 complex does not destabilize the interaction between the chemokine N-terminal residues and the receptor transmembrane cavity when compared to the wild type. These energy data corroborate the observed lifetimes of the contacts between the chemokine N-terminus and the CXCR4 key residues for binding and activation (Figure 7). Overall, these results suggest that the K1R chemokine mutant would maintain the capacity of CXCL12 to activate CXCR4 when engaged in a 2:1 CXCR4-CXCL12 association, but not in the 1:1 complex. These observations are in good agreement with a comparative analysis of CXCR4-dependent G-protein activation promoted by the wild-type chemokine and derived mutants, which suggests partial agonism for some of the latter (Levoye et al., unpublished results).
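The comparison of the interaction energies reported for the 2:1 complexes can be made explicit by computing each mutant's shift relative to the wild type. The Python sketch below does this under an assumption that is not stated in the text: the ± values are treated as standard deviations of independent estimates and are therefore combined in quadrature. The dictionary layout and the function name are illustrative only.

```python
import math

# Averaged non-bonded energies (kJ/mol) quoted in the text for the 2:1 complexes,
# stored as (mean, spread). Treating the spread as a standard deviation is an
# assumption made for this sketch.
energies = {
    "WT": (-742.9, 36.5),
    "K1R": (-782.7, 37.9),
    "P2G": (-403.0, 60.2),
}


def shift_vs_reference(values, reference="WT"):
    """Return {name: (mean difference, combined spread)} relative to the reference."""
    ref_mean, ref_sd = values[reference]
    shifts = {}
    for name, (mean, sd) in values.items():
        if name == reference:
            continue
        # Spreads combined in quadrature, assuming independent estimates.
        shifts[name] = (mean - ref_mean, math.hypot(sd, ref_sd))
    return shifts


delta = shift_vs_reference(energies)
for name, (diff, spread) in delta.items():
    print(f"{name} - WT: {diff:+.1f} kJ/mol (combined spread {spread:.1f} kJ/mol)")
```

Under these assumptions, the K1R shift is smaller than its combined spread, whereas the P2G shift is several times larger, in line with the qualitative reading given above.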

3.4. Structure of 1:1 and 2:1 CXCR4-CXCL12 complexes with mutated receptor
In the 11-WTC structure, the chemokine N-terminus makes contacts with the receptor residues Asp97, Asp187, and Glu288, which are critical for its activation. Curiously, however, the receptor residue Asp193 makes a steady salt-bridge with the chemokine residue Arg8 in this model, whereas the mutations D193K and D193A were shown to have little effect upon CXCR4 activation (Brelot et al., 2000; Doranz et al., 1999). This apparent contradiction was investigated using a coarse-grained MD simulation of CXCL12 bound to the monomeric mutated D193K receptor (simulation 11-D193K). As shown in Figure S7, the mutation does not affect the global position and orientation of the chemokine with respect to the receptor. Moreover, most of the contacts between the CXCL12 residues Lys1 and Pro2 and the important residues of the receptor transmembrane pocket are conserved (Figure S8). In detail, the Lys193 ammonium group makes a salt-bridge with the Glu268 side chain, but the Lys193 backbone group still remains in contact with the chemokine Arg8 backbone, while its side chain tends to come closer to the residue Asp262. Notwithstanding these slight adaptations at the chemokine interface with the receptor ECL2, the CXCL12 N-terminus still occupies the D193K receptor minor sub-pocket, consistent with the finding that the D193K mutation has little effect on CXCR4 activation. This result is also consistent with the experimental observation that the CXCL12 residue Arg8 is not absolutely required for triggering the receptor signaling, as shown by studies on the activities of the chemokine R8K mutant (Crump et al., 1997; Murphy et al., 2007).
Another intriguing result in our study concerns the receptor residue Glu268, which was never observed in contact with the chemokine N-terminus (Figures 3 and 7), whereas the receptor E268A mutant was reported to be impaired in its capacity to bind and to be activated by CXCL12 (Zhou, 2001). In the 11-WTC complex, the distance from the receptor residue Glu268 to every chemokine residue was measured (data not shown) and a minimum distance of 8.9 Å was found, indicating that Glu268 was not involved in the receptor-ligand interactions. However, in the 21-WT complex, Glu268 was observed close to the CXCL12 residue Arg41, at a distance of 4.5 Å, meaning that it participates in the 2:1 CXCR4-CXCL12 interface. To further investigate the role of the CXCR4 residue Glu268, we performed an additional MD simulation of the E268A mutant of the CXCR4 dimer in complex with CXCL12 (simulation 21-E268A). The simulation analysis shows that the chemokine core domain keeps its position above one of the two protomers, whereas its orientation, which was stabilized by a salt-bridge between the receptor Glu268 and the chemokine Arg41 in the wild-type 2:1 complex, is no longer maintained in the E268A mutant (Figure 8). Accordingly, the E268A mutation influences the accommodation of the chemokine N-terminus within the receptor cavity, which is characterized by the Lys1 side chain amine group in contact with residues Glu288 and Asp187, similarly to what is observed for the 11-K1R and 11-P2G mutant complexes (Figure 8 compared to Figure 4).
This difference in the packing of the chemokine N-terminus within the receptor minor sub-pocket could again account for the experimentally observed loss of signaling activity of the E268A mutant (Zhou, 2001). Interestingly, this mutation influences the 2:1 CXCR4-CXCL12 complex, but probably has no impact on the 1:1 association, in which the receptor residue Glu268 does not interact with any of the chemokine residues, providing further support for the existence of the 2:1 stoichiometry.
Figure 6. Same as Figure 2 but for the wild-type 2:1 CXCR4-CXCL12 complex (left) and the K1R (middle) and P2G (right) chemokine mutants. The inset picture displays the two protomers of the dimeric CXCR4 with cyan and tan ribbons. The CXCL12 residue Lys1 is indicated by black spheres and its residues Ser6 and Asn45 by magenta ones.
Figure 7. Same as Figure 3 but for the wild-type 2:1 CXCR4-CXCL12 complex (top) and its K1R (middle) and P2G (bottom) chemokine mutants.
These results do not accord with the conclusions of Kufareva et al., who proposed, on the basis of functional complementation and dilution assays, that CXCR4 interacts with CXCL12 in a 1:1 stoichiometry despite its dimeric nature, thereby excluding the 2:1 hypothesis (Kufareva et al., 2014). Nevertheless, the functional rescue that can be observed upon coexpression of complementary mutants of CXCR4 (between 60 and 100%, as seen in Figure 4 of Kufareva et al., 2014), while supporting the existence of receptor dimers, does not exclude the 2:1 stoichiometry hypothesis. We herein propose a dynamic model which could fit with the coexistence of both 1:1 and 2:1 complexes. We hypothesize that during the second step of the two-site two-step mechanism of the CXCR4-CXCL12 recognition (Crump et al., 1997; Kofuku et al., 2009), the chemokine N-terminal tail ¹KPVSLSYR⁸ enters the CXCR4 transmembrane cavity, while the receptor N-terminus partially detaches from the chemokine core domain recognition site ¹²RFFESH¹⁷. Indeed, intrinsically disordered regions such as the CXCR4 N-terminus are known to bind their protein partners with high specificity but low affinity (Huang & Liu, 2009; Uversky & Dunker, 2010). For the CXCR4-CXCL12 complex, it was reported that the receptor N-terminal peptide 1-38 binds to the chemokine core domain with a dissociation constant of 4.5 μM (Veldkamp, Seibert, Peterson, Sakmar, & Volkman, 2006). This low, micromolar affinity, compared to the affinity of the chemokine for the whole receptor (Kd = 3.6 nM (Crump et al., 1997)), suggests that the CXCR4 N-terminus could easily unbind from its chemokine recognition site after or during the correct positioning of the CXCL12 N-terminus into the receptor transmembrane cavity.
In our study, the MD simulations 11-WTC and 21-WT show that the CXCR4 N-terminus, which was initially in contact with the chemokine recognition site ¹²RFFESHV¹⁸, partially unbinds from this region (Figure S9). This unbinding process would be compatible with the engagement of one CXCR4 protomer with both the chemokine recognition site and the chemokine N-terminus, in the 1:1 or 2:1 complex, consistent with both NMR data and mutagenesis experiments.
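The gap between the two dissociation constants quoted above (4.5 μM for the receptor N-terminal peptide and 3.6 nM for the whole receptor) can be expressed as a difference in standard binding free energy through ΔG° = RT ln(Kd / 1 M). The Python sketch below works through this arithmetic; the temperature of 298.15 K and the 1 M standard state are assumptions introduced here, since the text does not specify the experimental conditions.

```python
import math

R_KJ_PER_MOL_K = 8.314462618e-3  # gas constant in kJ/(mol K)
TEMPERATURE_K = 298.15           # assumed temperature, not given in the text


def binding_free_energy(kd_molar, temperature=TEMPERATURE_K):
    """Standard binding free energy (kJ/mol) from Kd, with a 1 M standard state."""
    return R_KJ_PER_MOL_K * temperature * math.log(kd_molar)


kd_nterm_peptide = 4.5e-6  # M, receptor N-terminal peptide 1-38 with the chemokine core
kd_whole_receptor = 3.6e-9  # M, chemokine with the whole receptor

ratio = kd_nterm_peptide / kd_whole_receptor
ddg = binding_free_energy(kd_nterm_peptide) - binding_free_energy(kd_whole_receptor)

print(f"Kd ratio: {ratio:.0f}")
print(f"Free energy gap: {ddg:.1f} kJ/mol")
```

Under these assumptions, the N-terminal peptide alone binds roughly three orders of magnitude more weakly than the whole receptor, which corresponds to a free energy gap of a little under 18 kJ/mol.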

Conclusion
The main advantage of coarse-grained models over all-atom descriptions is that they smooth the energy landscapes of the studied protein complexes, allowing a more efficient exploration of their conformational ensemble. In particular, coarse-grained MD simulations enable the studied complexes to visit structures far from the initial guess and to reach the most stable conformations more rapidly. This advantage was used here to model the most probable conformations of the 1:1 and 2:1 CXCR4-CXCL12 complexes, with regard to the receptor activation, taking into consideration evidence from NMR, X-ray and mutagenesis studies. In this way, the quasi-exhaustive coarse-grained docking calculations of monomeric CXCL12 on either a monomeric or a dimeric receptor generated several complexes in which the chemokine N-terminus lies in the CXCR4 transmembrane cavity, satisfying the overall geometrical criteria required to trigger its activation, as envisioned by Crump et al. and later confirmed by Kofuku et al. (Crump et al., 1997; Kofuku et al., 2009).
The coarse-grained MD simulation of one of the 1:1 complexes (11-WTC) converged towards a conformation in which the chemokine N-terminus mainly occupies the transmembrane minor sub-pocket of the receptor, with chemokine residues Lys1 and Pro2 making steady contacts with CXCR4 key residues Asp97, Asp187, and Glu288. The packing of the chemokine N-terminus within the receptor cavity is stabilized by hydrophobic interactions between the apolar groups of the Lys1 and Pro2 side chains and the CXCR4 residues Trp94, Tyr116, and Tyr255, which are disrupted upon K1R or P2G mutations. These findings are consistent with most of the site-directed mutagenesis studies of CXCR4, as well as with the crystallographic structure of CXCR4 in complex with the viral chemokine vMIP-II, with the notable difference that in our model, the CXCL12 core domain is located outside the receptor transmembrane helix bundle and above a putative adjacent protomer if the receptor were in a dimeric form.
By means of a second coarse-grained study, motivated by the hypothesis raised by the 11-WTC model, we subsequently generated a very stable 2:1 CXCR4-CXCL12 complex (21-WT), in which the chemokine core domain partially covers the cavity of one of the two CXCR4 protomers. In this model, the chemokine N-terminus is found to be located within the transmembrane minor sub-pocket of the other protomer, with very steady contacts between the receptor key residues Asp97, Asp187, and Glu288 and the CXCL12 first residues Lys1 and Pro2. More broadly, our results demonstrated that both monomeric and dimeric CXCR4 can bind a monomeric chemokine CXCL12 in a way (i.e. with its N-terminus buried in the transmembrane cavity) that would be functional regarding the triggering of CXCR4 signaling activities.

Supplementary material
The supplementary material for this paper is available online at http://dx.doi.org/10.1080/07391102.2016.1145142.