Simulating the active sites of copper-trafficking proteins. Density functional structural and spectroscopy studies on copper(I) complexes with thiols, carboxylato, amide and phenol ligands

Abstract
A series of mononuclear binary and ternary Cu(I) complexes with formato, formamide, methylphenol, and methanethiolato ligands were optimized at the DFT-B3LYP/6-31G** (BS1) and DFT-B3LYP/6-311++G** (BS2) levels of theory. The solvent effect was taken into account via the PCM method (BS1W and BS2W, respectively). The coordination arrangement was pseudo-linear for [CuI(SCH3/S(H)CH3)(OOCH)]−/0 and [CuI(SCH3/S(H)CH3)(O(H)(C6H4)CH3)]0/+, and pseudo-trigonal for [CuI(SCH3/S(H)CH3)(OOCH)(OC(H)NH2)]−/0. The [CuI(S-S(H)CH3)]+/[CuI(S-SCH3)]0 fragments link even to amide carbonyl and to general O(H)R residues (R=C6H5CH3). [CuI(SCH3)2(O(H)(C6H4)CH3)]− moved towards dissociation of the O(H)(C6H4)CH3 ligand, whereas [CuI(S(H)CH3)2(O(H)(C6H4)CH3)]+ converged smoothly, keeping the hydroxy function linked to the metal. The trends in total electronic energies appeared significant, suggesting that linear CuIS2 coordination is more suitable than the CuIS, CuIS3, and CuIS4 arrangements. The formation energies of [CuI(S(H)CH3/SCH3)(OOCH)]0/−1 were higher than those of [CuI(S(H)CH3/SCH3)2]+/−, starting from [CuI(S(H)CH3)]+/[CuI(SCH3)]0, by ca. 11–9 kcal mol−1 (BS2W). The structural arrangements, bond distances, and angles, as well as the computed spectroscopic parameters, were in good agreement with experimental data for the corresponding synthetic complexes and with the metal-site regions of several copper(I) proteins. These data help in interpreting structural data of complex biological systems and in constructing reliable force fields for molecular mechanics computations.


Introduction
Several proteins are believed to be actively involved in copper trafficking and storage, and even small biomolecules such as glutathione (GSH) play an important role in binding copper in vivo [1,2]. Copper proteins are sometimes involved in serious human diseases such as Parkinson's and Menkes diseases, and structural studies of the metal sites might help in understanding the molecular mechanisms of such disorders [3][4][5]. The solution structures of the human metallochaperone HAH1, both the apo-HAH1 monomer and the copper(I)-bound form, have been studied via NMR spectroscopy [6]. Metals play an important role in the folding and stability of proteins in vitro and in vivo, and at the same time protein conformational changes help in metal storage and trafficking, particularly for copper(I) [7]. The experimental structures of proteins in general, and of copper(I) proteins in particular, are often affected by large errors in the coordination sphere owing to the complexity of the systems. In the present work, we report selected structures, optimized via density functional theory (DFT) methods, of model copper(I) binary and ternary complexes with low-molecular-weight ligands, together with the relevant computed structural parameters, spectroscopic data, and energy aspects.

type of donors, some O-donor molecules were selected and fully optimized, i.e. HCOOH, HCOO−, OC(H)NH2, and O(H)(C6H4)CH3.
For HOOCH, 1, OOCH−, 2, OC(H)NH2, 3, and O(H)(C6H4)CH3, 4, no major difference was found between the parameters computed with the BS1 and BS2 basis sets, the solvent effects were marginal, and the maximum deviation with respect to BS2 was below 0.6%. The BS1/BS1W levels of theory (Figure 1a, Table 2) could therefore be considered adequate for structure simulations of molecules that contain CHO atoms, at least for the molecules examined in the present work.
Regarding formate anion OOCH−, 2 (Figure 1b and Table 2 Figure 3) both at BS1 and BS2 levels (S-Cu-O, 174.9 and 175.2°, respectively). The structure computed at the BS1 level had the same shape as that reported in Figure 3a, even though some parameters were significantly different. In fact, the computed Cu-S and Cu-O2 bond distances were 2.114 and 1.819 Å (BS1), and 2.181 and 1.904 Å (BS2). The O1 atom of the carboxylato group was not linked to Cu(I), Cu…O1 being 3.071 Å (BS1) and 3.253 Å (BS2) (van der Waals radii: Cu, 1.4 Å and O, 1.52 Å [14]). When the PCM solvent treatment (water) was considered, the structures were successfully optimized at both the BS1W and BS2W levels, but the respective arrangements differed significantly (Figure 3b, c). The Cu-S and Cu-O2 bond distances were 2.172 and 1.906 Å (BS2W), respectively, whereas the Cu…O1 contact distance was 3.153 Å (BS2W). Therefore, upon addition of the formato ligand, the computed Cu-S bond distance lengthened from 2.024 Å [13] to 2.114 Å (BS1), whereas the S-C bond distance did not change appreciably. On the contrary, the O2-C bond distance of formato lengthened significantly upon formation of a coordination bond to Cu-SCH3 (from 1.254 to 1.292 Å). The non-metal-bound C=O bond shortened from 1.254 to 1.230 Å. Thus, upon metal ligation, the effects on the CO bond lengths paralleled those caused by protonation, but the magnitude was smaller for the metal than for the proton. A search of the CSD [15] database retrieved the structures of several metal(II)-formato complexes. The carboxylato/carboxyl ligand often bridged the metal centers, and in those cases the two C-O bonds had almost the same length (see e.g. [16], where the two O-C bonds had lengths of 1.230-1.270 Å). On the contrary, a similar search of the CSD database showed that structures containing Cu(I) and formato are few.
One of those structures was reported for a molecule of formic acid (O-monodentate) linked to a tricoordinate Cu(I) center [17], with a co-crystallized formate anion. The S(H)CH3 metal derivative [CuI(S(H)CH3)(OOCH)], 7, was also fully optimized at both the BS1 and BS2 levels of theory, and showed different final arrangements of the carboxylato group (Figure 4), even though the starting structure was similar to that optimized for [CuI(SCH3)(OOCH)]−, 6 (i.e. with the CH3 and HCO groupings anti to each other with respect to the S-Cu-O line, Figure 3a). Using a quasi-linear input structure for O-Cu-S (175°), the optimization at BS1 gave an O,O-Cu chelate (Figure 4a), characterized by Cu-O bond distances that averaged 2.013 Å and by a Cu-S bond distance of 2.069 Å. At the BS2 level, the optimized structure had a quasi-linear arrangement around copper (S-Cu-O, 175.2°). The refined model had the methyl group syn to the carbonyl oxygen, which was also (weakly) bound to the metal center (Figure 4b). The coordination bond distances were 2.211, 1.897, and 2.651 Å for Cu-S, Cu-O2 (trans to S), and Cu…O1 (syn to CH3), respectively. When solvent effects were treated at the BS1W level, the structure converged to an arrangement very similar to that found at BS1 in the gas phase (Figure 4c), where the carboxylato chelated the metal. The computed Cu-S and Cu-O2/O1 bond distances were 2.081 and 2.039 Å (av). When the computations were carried out at the BS2W level, the refined structure showed significant differences in the coordination mode of the carboxylato group and in the orientation of the methyl group (Figure 4d). The Cu-S, Cu-O2, and Cu…O1 bond and interatomic distances were 2.211, 1.891, and 3.069 Å, respectively. Therefore, both SCH3− and S(H)CH3 linked to Cu(I) may allow a monodentate or a bidentate coordination mode for formato, and in the monodentate case a second (weaker) Cu…O linking interaction was also operative.
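The "linked / not linked" reading of the Cu…O1 distances can be illustrated with a minimal Python sketch that compares each quoted distance with the sum of the van der Waals radii given in the text (Cu, 1.40 Å; O, 1.52 Å). The helper name `within_vdw_contact` and the strict-inequality criterion are assumptions made for illustration; the paper does not state a numerical cutoff.

```python
# Van der Waals radii (in angstrom) as quoted in the text from ref. [14].
VDW_RADII = {"Cu": 1.40, "O": 1.52}


def within_vdw_contact(distance, a="Cu", b="O"):
    """Return True when the distance is shorter than the sum of vdW radii.

    This is an illustrative criterion (an assumption), not the authors' own.
    """
    return distance < VDW_RADII[a] + VDW_RADII[b]


# Cu...O1 distances (angstrom) quoted in the text for models 6 and 7.
cu_o1 = {
    "6, BS1": 3.071,
    "6, BS2": 3.253,
    "6, BS2W": 3.153,
    "7, BS2": 2.651,
    "7, BS2W": 3.069,
}

results = {label: within_vdw_contact(d) for label, d in cu_o1.items()}

for label, flag in results.items():
    status = "inside" if flag else "outside"
    print(f"{label}: Cu...O1 = {cu_o1[label]:.3f} A, {status} the vdW sum")
```

Under this criterion only the 2.651 Å contact of 7 at BS2 falls inside the van der Waals sum, which matches the weak Cu…O1 interaction described for that structure.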
The linkage of Cu(I) to carboxylato seemed to be rare from searches of databases of experimental data and was never reported for biological macromolecular systems.

optimized structures that were strictly dependent on the input model orientation. Two input structures (Figure 5a, b), having an almost planar trigonal arrangement around Cu(I) for the three donors (S, O(c, carboxylato), O(a, amide)), successfully converged at the BS1 level to an almost η2-coordination arrangement between the Cu atom and the O=C(a) ligand moiety (Figure 5c, d and Table 4). The optimized Cu-O(a) and Cu-O(c) bond distances were 1.978 and 1.969 Å, and 1.947 and 1.978 Å for the two models (8syn and 8anti, Figure 5c, d), respectively. The carboxylato ligand showed a weak chelating behavior in just one of the structures: the two Cu-O bond distances were 1.969 and 3.001 Å, and 1.978 and 2.852 Å for the two structures, respectively. The N(a)-H…O(c) interaction had N…O and Ĥ values of 2.955 Å and 160.2°, and 3.075 Å and 153.7°. When the SCH3− residue was oriented syn with respect to the carboxylato, the structure was the more stable one (by just 1.26 kcal mol−1). It is interesting to note that when the same input structure (Figure 5a) Table 4) with respect to carboxylato residue (contrary to the result obtained for the SCH3− derivative). The most stable structure was favored by a strong N-H…O hydrogen bond (N…O, 2.726 Å; Ĥ, 175.7°). On the contrary, the structure with the CH3 group syn with respect to the carboxylato ligand showed a weaker N-H…O hydrogen bond. From a coordination standpoint, the two structures were very different: the more stable one (9anti) was three-coordinate through S, O(c), and O(a), whereas the other (9syn) was four-coordinate through S, two O(c), and an O(a). The optimization of 9 at the BS2 level was started from the more stable molecule previously optimized at BS1 and gave the refined structure presented in Figure 6c.
On introducing the treatment of solvent (water, BS1W), the optimized structure (Figure 6d) did not change much with respect to those obtained for the gas phase at both BS1 and BS2. Therefore, the addition of an amide ligand to the coordination arrangement of the ternary Cu(I)-thiolato-formato complexes might introduce a variety of stabilizing intramolecular hydrogen bonds and hydrogen-bond-type interactions, and showed that the amide might behave as

optimized parameters were 2.210, 1.954 Å, and 177.8°. The (ortho)H…Cu contact distance was 2.875 Å; the agostic-type interaction that could be envisaged at BS1 was not confirmed.
On passing to [CuI(SCH3)2(O(H)(C6H4)CH3)]−, 12, several structure optimization attempts at the BS1, BS2, BS1W, and BS2W levels of theory, starting from the model reported in Figure 9a, did not reach acceptable convergence (see above, computational methods). In fact, the O(H)(C6H4)CH3 entity moved out of the metal coordination sphere and the CuI(SCH3)2 residue reached an almost linear arrangement (Figure 9b, c). The structure optimizations were halted when the total electronic energy reached a near plateau and, in general, when the last-step structures no longer had an acceptable chemical meaning in terms of coordination of the hydroxyl function through the oxygen donor. An interesting finding was that the O(H)(1,4-C6H4CH3) ligand moved towards complete dissociation from the metal and the bis(methanethiolato)copper(I) residue reached linearity [13]. On adding a second S(H)CH3 ligand to copper(I) in model 11 to give [CuI(S(H)CH3)2(O(H)(C6H4)CH3)]+, 13, the structure had a pseudo-trigonal arrangement at the metal (BS2, Figure 10 and Table 5). The S-Cu-S bond angle was larger at BS2 (148.9°) than at BS1 (126.0°), showing a significant effect of the basis set type. The computed Cu-S bond distances averaged 2.280 Å (BS2). The lengthening effect of the basis set expansion on the Cu-O bond distance was even larger: Cu-O, 1.960 Å (BS1) and 2.208 Å (BS2).
Experimental structures of small coordination molecules containing Cu(I) are not numerous (CSD [15]). The structure coded FOSTIR [18] is a molecule in which a Cu(I) is linked to (C)S− and (CO)O− donors; the Cu-S and Cu-O bond distances are 2.35 and 2.25 Å, respectively. A second molecule, coded JAQROI [19], has a Cu(I) linked to CS− donors, with Cu-S distances averaging 2.42 Å, and a Cu(I) linked to COO− (Cu-O lengths averaging 2.19 Å). Therefore, the DFT-computed bond distances from the present work can be considered in agreement with those from experiments and with values from a previous work from this laboratory [13] and citations therein.
As a partial conclusion on the structural aspects, the effect of adding a carboxylato ligand (in this work, the formato ligand was considered a model of a glutamyl or aspartyl carboxylate group) to a Cu-SCH3/-S(H)CH3 fragment (here considered a model of a Cu-S(cysteinyl) group) was small. In fact, the lengthening of the Cu-S bond was just 0.034 Å (1.6%), compared with 0.058 Å (2.7%) upon addition of a second methanethiolate anion, even though formato should demand more space. In other words, the addition of formato to a Cu-SCH3− coordination group weakened the existing Cu-S bond less than the addition of a second SCH3−/S(H)CH3 ligand. Second, amide groups were able to form adducts with the [CuI(SCH3/S(H)CH3)(OOCH)]− complexes, via Cu-O(a) and even Cu-C(a)/O(a) (η2-coordination arrangement).
Furthermore, a search of the coordination modes at Cu(I) in some metal-protein systems in the PDB [20] revealed several consistencies with the computed models; in other respects, the structures proposed from solid-state XRD studies and from solution NMR studies appear unusual when compared with the present theoretical models.
The first selected example is the structure coded 4BTE [3]. That structure has a Cu(I) ion almost linearly coordinated (S-Cu-S, 154°) by two mercapto-cysteinyl functions at the homodimer interface (Figure 1S) of the DJ-1 protein. The Cu-S bond distance obtained from the experimental structure, 1.93 Å, is unacceptably short (van der Waals radii being Cu, 1.40 Å and S, 1.80 Å [14]). On the basis of the computations reported in the present work and in [13], and of the van der Waals radii [14], one can say that the coordination arrangement is chemically reasonable and that the bond lengths are acceptable only if probable high standard deviations are taken into account. Furthermore, the structure coded 4BTE [3] has an almost linear S-Cu-S arrangement, and that situation was closely approached even upon refining the system [CuI(SCH3)2(O(H)(C6H4)CH3)]; these partially refined structures had Cu-S bond distances of ca. 2.207 Å (BS2 and BS2W), which compare well with those of the fully optimized structure of [CuI(SCH3)2]− (ca. 2.200 Å at BS2 and BS2W [13]).
Another interesting example is the series of structures coded 1KVJ [5], which show a Cu(I) center linked by two cysteinyl groups (Cu-S, 2.16 Å; S-Cu-S, 145°) and by a serine OH group (Cu-O, 1.97 Å) in the ATP7A protein associated with Menkes disease (Figure 2S). The structure is in excellent agreement with the optimized one for the model [CuI(S(H)CH3)2(O(H)(C6H4)CH3)]+, 13 (Figure 10b), which at BS2 has an S-Cu-S bond angle of 148.9°, Cu-S bond distances of 2.280 Å (av), and a Cu-O distance of 2.208 Å. One has to take into account the high standard deviations of structural parameters determined for large biomolecules via solution studies. Notably, the metal center in the structures coded 1KVJ is also linked to a carbonyl oxygen at a Cu-O distance of 2.134 Å.
A third notable example is the PDB entry coded 1TL4, which reports the solution structures of the apo and Cu(I)-loaded human metallochaperone HAH1 [6] (Figure 3S). A biscysteinato Cu(I) is weakly linked by an amide oxygen (Cu-O, 2.692 Å); the Cu-S bond distances average 2.16 Å and the S-Cu-S bond angle is ca. 153°. Therefore, the computations gave structural parameters that were similar to some experimental findings. The theory allowed a better evaluation of certain experimental bond distances that were only roughly determined because of the complexity of large biomolecules, disorder, and thermal motion. In fact, the short Cu-S bond distance of 1.925 Å claimed for structure 4BTE is barely acceptable when compared with the sum of the covalent radii (Cu, 1.38 and S, 1.02 Å) [14] and with the computed Cu-S bond distances of 2.205 and 2.243 Å at the BS2 level for [CuI(SCH3)2]− and [CuI(S(H)CH3)2]+ [13]. The experimental methods, XRD in the solid state and NMR in solution, could then benefit from the computed values at the stage of interpreting Fourier density maps and magnetic tensor values.
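The comparisons between protein-derived and computed parameters in the preceding paragraphs can be made explicit with a short Python sketch. It uses only numbers quoted in the text; the function name `percent_deviation` and the choice of the computed value as the reference are illustrative assumptions, not part of the original analysis.

```python
# Covalent radii (angstrom) quoted in the text from ref. [14].
COVALENT_RADII = {"Cu": 1.38, "S": 1.02}


def percent_deviation(experimental, computed):
    """Signed deviation of an experimental value from the computed one, in %."""
    return 100.0 * (experimental - computed) / computed


# Sum of covalent radii for a Cu-S bond.
cu_s_covalent_sum = COVALENT_RADII["Cu"] + COVALENT_RADII["S"]

# (label, experimental value, computed BS2 value) taken from the text.
comparisons = [
    ("4BTE Cu-S vs [CuI(SCH3)2]- (A)", 1.925, 2.205),
    ("1KVJ Cu-S vs model 13 (A)", 2.16, 2.280),
    ("1KVJ Cu-O vs model 13 (A)", 1.97, 2.208),
    ("1KVJ S-Cu-S vs model 13 (deg)", 145.0, 148.9),
]

deviations = {
    label: percent_deviation(exp, comp) for label, exp, comp in comparisons
}

print(f"Sum of Cu and S covalent radii: {cu_s_covalent_sum:.2f} A")
for label, dev in deviations.items():
    print(f"{label}: {dev:+.1f} %")
```

Signed percentages keep the direction of each discrepancy visible, so a bond reported shorter than the computed reference shows up as a negative value.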

Energy
The total electronic energies (see Table 1) of selected computed models allowed the evaluation of selected formation energies (Table 6). The total electronic energies were computed at the DFT-BS1, -BS2, -BS1W, and -BS2W levels of theory. Preliminary attempts to compute more reliable absolute energies and thermal parameters were made at higher levels, but only ab initio CCSD(T)/6-311++G** was found to give reliable values. Unfortunately, those computations require computational times and costs beyond the reach of the present project. Therefore, only qualitative inferences were drawn from a comparative analysis of electronic energies. The formation energy of [CuI(OOCH)], 5, was −39.30 kcal mol−1 when computed at BS2W, whereas the corresponding value for [CuI(SCH3)] was −54.89 kcal mol−1, a gap of ca. 15.6 kcal mol−1 in favor of bond formation at thiolato. Therefore, the formation energy computations gave a strong rationale for the Cu-S linkage in copper-trafficking proteins, at least when the more expanded basis set was used and the effect of solvent was taken into account (BS2W). Similar trends were observed for models computed at BS2 (different by 18

The computed formation and partial reaction energies gave a rationale for at least a couple of experimental findings on complex structural types in the solid state and in solution. First, the mononuclear tris(methanethiolato)Cu(I) anion is energetically unfavored and has no experimentally relevant structure (or at least such structures are very rare). Even the bis(methanethiolato)(O(H)R)copper(I) species did not have a significant stabilizing contribution. Thus, only structures of the type {CuI(SR)2} were found through computations (and possibly through experiments).
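The arithmetic behind the thiolato/formato comparison can be sketched in a few lines of Python. The gap uses the two BS2W formation energies quoted in the text. The helper `formation_energy_kcal` assumes the usual definition (energy of the complex minus the sum of the fragment energies, converted from hartree), which the text does not spell out, and the hartree values passed to it are hypothetical placeholders.

```python
# Conversion factor from hartree to kcal/mol.
HARTREE_TO_KCAL = 627.5095


def formation_energy_kcal(e_complex, e_fragments):
    """Formation energy as E(complex) - sum(E(fragments)), in kcal/mol.

    Inputs are total electronic energies in hartree. This definition is an
    assumption made for illustration.
    """
    return (e_complex - sum(e_fragments)) * HARTREE_TO_KCAL


# Formation energies (kcal/mol, BS2W) quoted in the text.
e_form_formato = -39.30   # [CuI(OOCH)], 5
e_form_thiolato = -54.89  # [CuI(SCH3)]

# Positive gap: the thiolato adduct is the more stabilized one.
gap = e_form_formato - e_form_thiolato
print(f"Thiolato is favored by {gap:.2f} kcal/mol")

# Hypothetical hartree values, only to show how the helper is called.
example = formation_energy_kcal(-1.1, [-0.5, -0.5])
print(f"Example formation energy: {example:.2f} kcal/mol")
```

Because only electronic energies enter, the result carries no zero-point or thermal correction, in line with the qualitative use of the energies stated in the text.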
These findings do not mean that {CuI(SR)3} or {Cu(SR)2(O(H)R)} structures do not exist or that they will not be found in the future, but based on the data the absence/paucity of structures of the type {CuI(SR)3

)], 9, revealed that computed frequencies and the respective intensities depend significantly on the basis set type and on the inclusion of solvent effects (on the contrary, the computed force constants were in general not much influenced by the basis set type and solvation). The data discussed in this paragraph will be mostly those from computations at BS2 and BS2W.

Vibrations
Selected computed infrared data at the BS2 and BS2W levels of theory are reported in Table 7 (and Table 1S) and Figure 4S. Notes for the selected model molecules are: (i) on passing from the ternary methanethiolato derivative (6) to the methanethiol one (7), the frequency of the intense absorption attributable to the ν(C-H)formato vibration increased by ca. 20 cm−1 (BS2W); (ii) the computed ν(C=O)formato stretching frequency was 1615 cm−1 for 6, while for 7 the same C=O movement showed a small blue shift to 1622 cm−1. The infrared intensities (high for several vibrations) were 191.14 km mol−1 (7, ν(C-H)formato) and 1007.50 km mol−1 (7, ν(C-O)formato), as computed at BS2W; (iii) the S-H stretches were predicted to cause low-intensity infrared absorptions centered at 2657 cm−1 (4.32 km mol−1) for 7 and 2664 cm−1 (1.09 km mol−1) for 9, as computed at BS2W; (iv) the computed stretching frequencies for the combined Oformato-Cu and S-Cu symmetric and asymmetric movements for 7 were 449 cm−1 (23.10 km mol−1) and 285 cm−1 (2.27 km mol−1), respectively.
As the reported experimental infrared bands corresponding to the S-H stretching mode lie at 2525-2600 cm−1 [21][22][23], a scale factor of 0.96 was estimated, in agreement with data previously reported [24]. The scaled frequencies are also reported in Table 7 (and Table 1S Figure 5S and listed in Table 8, whereas the structure for [CuI(SCH3
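The effect of the 0.96 scale factor on the computed S-H stretches can be checked with a minimal Python sketch. The frequencies, the scale factor, and the experimental window are those quoted in this section; the variable names are illustrative, and the sketch does not reproduce how the authors derived the factor.

```python
# Scale factor and experimental S-H stretching window quoted in the text.
SCALE_FACTOR = 0.96
EXPERIMENTAL_RANGE = (2525.0, 2600.0)  # cm^-1

# Computed (BS2W) S-H stretching frequencies, in cm^-1, for models 7 and 9.
computed_sh = {"7": 2657.0, "9": 2664.0}

# Apply the uniform scale factor to each harmonic frequency.
scaled_sh = {model: freq * SCALE_FACTOR for model, freq in computed_sh.items()}

# Check whether each scaled value falls inside the experimental window.
in_range = {
    model: EXPERIMENTAL_RANGE[0] <= freq <= EXPERIMENTAL_RANGE[1]
    for model, freq in scaled_sh.items()
}

for model, freq in scaled_sh.items():
    print(f"Model {model}: scaled S-H stretch {freq:.1f} cm^-1, "
          f"within experimental window: {in_range[model]}")
```

A single multiplicative factor is the simplest correction for harmonic frequencies; mode-specific scaling would need reference data that the text does not provide.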

1H-NMR chemical shifts
1H-NMR chemical shifts (from TMS) for selected model molecules, computed at the BS2W level of theory, are reported in Figure 6S and in Table 9. The signals for CH3

Notably, NMR-GIAO computations at the BS2W level were quite reliable for estimating 1H chemical shifts of organic molecules [26] and metal complexes [25]. Therefore, it can be reasonably assumed that the chemical shift values are significant even for the complex molecules studied in the present work.
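GIAO computations give isotropic shielding constants, and a shift "from TMS" is obtained by subtracting the shielding of the nucleus from that of the reference computed at the same level. The Python sketch below illustrates this conversion; every shielding value in it is a hypothetical placeholder (no shieldings are given in the text), and averaging the three methyl protons is a common convention assumed here rather than a documented step of this work.

```python
def chemical_shift(sigma_reference, sigma_nucleus):
    """Chemical shift (ppm) as reference shielding minus nuclear shielding."""
    return sigma_reference - sigma_nucleus


def mean_shift(sigma_reference, sigmas):
    """Average shift of equivalent protons (e.g. a freely rotating CH3)."""
    shifts = [chemical_shift(sigma_reference, s) for s in sigmas]
    return sum(shifts) / len(shifts)


# Hypothetical isotropic shieldings (ppm); not taken from the paper.
SIGMA_TMS = 31.9
methyl_protons = [29.4, 29.5, 29.6]

single = chemical_shift(SIGMA_TMS, 29.5)
averaged = mean_shift(SIGMA_TMS, methyl_protons)

print(f"Single proton shift: {single:.2f} ppm")
print(f"Averaged CH3 shift: {averaged:.2f} ppm")
```

The reference shielding has to come from the same functional, basis set, and solvent model as the sample, otherwise systematic errors no longer cancel.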

Conclusion
The work provided a rationale for several structural features of selected copper-trafficking protein systems found experimentally in the solid state and in solution via X-ray diffraction and NMR techniques, especially those that involve mononuclear coordination. Most remarkably, the estimated qualitative formation energies of the computed models indicated that the linear CuIS2 coordination mode was the most favorable, both for thiolato S(−)R and thiol S(H)R type ligands, when compared with the CuIS, CuIS3, and CuIS4 modes. Furthermore, the grafting onto the CuIS2 skeleton of certain hard donor groups, such as alcohol/phenol (O(H)R), carboxylate, and amide functions, is suitable for further stabilizing the overall coordination scaffolding.
Moreover, the structural findings of the present work are an aid for better interpreting experimental data from XRD and NMR techniques. In that regard, the work provided a comparative analysis among four levels of theory: the results computed at the BS2 level of theory should be selected when linking interactions between Cu(I) and sulfur are involved. The treatment of solvation effects via PCM methods (solvent, water; BS2W) did not have a dramatic influence on structural parameters or on reaction energies, even though it should always be adopted (unless the high computational cost makes it inadvisable).

Table 9. Chemical shifts (ppm) for the 1H-NMR spectra of the selected complex molecules with thiolato and methanethiolato ligands as computed at the BS2W level of theory.

Supporting material
Diagrams showing portions of experimental Cu(I) protein structures reported in the literature are provided: DJ-1 Parkinsonism-associated protein (Figure 1S); ATP7A protein associated with Menkes disease (Figure 2S); human metallochaperone HAH1 (Figure 3S). Selected infrared data computed at the (BS2) and {BS2W} levels of theory for the ternary and quaternary Cu(I) complex molecules with formato, methanethiolato, methanethiol, and formamide are reported in Table 1S and Figures 4S and 5S. Diagrams showing 1H-NMR spectra for selected complex molecules with thiolato and methanethiolato ligands (reference, TMS at the BS2W level of theory) are reported in Figure 6S. Cartesian coordinates for the fully optimized structures (and for a few interesting non-converging ones, partially optimized) are reported in Tables SI-SLXXVI.
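Bond distances and angles such as those discussed in the text can be recovered from tabulated Cartesian coordinates with a few lines of Python. The sketch below is a minimal illustration: the coordinates are hypothetical, built so that the S-Cu-S fragment reproduces the BS2 values quoted for model 13 (Cu-S 2.280 Å, S-Cu-S 148.9°), and are not taken from the supporting tables.

```python
import math


def distance(p, q):
    """Euclidean distance between two points given as (x, y, z) tuples."""
    return math.dist(p, q)


def angle_deg(a, b, c):
    """Angle a-b-c in degrees, with b as the vertex atom."""
    v1 = [x - y for x, y in zip(a, b)]
    v2 = [x - y for x, y in zip(c, b)]
    dot = sum(x * y for x, y in zip(v1, v2))
    norm = math.hypot(*v1) * math.hypot(*v2)
    # Clamp to [-1, 1] to guard against rounding just outside the domain.
    cosine = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cosine))


# Hypothetical coordinates (angstrom): Cu at the origin, two S atoms at
# 2.280 A placed 148.9 degrees apart in the xy plane.
theta = math.radians(148.9)
cu = (0.0, 0.0, 0.0)
s1 = (2.280, 0.0, 0.0)
s2 = (2.280 * math.cos(theta), 2.280 * math.sin(theta), 0.0)

cu_s1 = distance(cu, s1)
cu_s2 = distance(cu, s2)
s_cu_s = angle_deg(s1, cu, s2)

print(f"Cu-S1 = {cu_s1:.3f} A, Cu-S2 = {cu_s2:.3f} A, S-Cu-S = {s_cu_s:.1f} deg")
```

The same two helpers apply unchanged to any atom triple read from a coordinate table, for example S-Cu-O angles or Cu…O contact distances.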