Crystal structures of amino acids: from bond lengths in glycine to metal complexes and high-pressure polymorphs

After the discovery of X-ray diffraction by crystals, amino acids were among the first organic compounds to have their solid-state structures investigated. The Cambridge Structural Database now contains more than 3500 entries for α-amino acids alone. After a short introduction dealing with the early history of X-ray structure determination, this review provides a classification of amino acid structures, describes essential structural elements, especially hydrogen bonding preferences and coordination to metal ions, and considers recent investigations on phase transitions as the result of extreme temperatures or pressure.


Introduction
Figure 1. The 20 amino acids directly encoded by the genetic code with full names and three-letter abbreviations. Names of amino acids that are 'essential' in the sense that they need to be part of our diet, as humans cannot synthesize them, have been underlined. pKa values for acidic and basic groups have been indicated. The amino acids are depicted as their l-enantiomers; extra stereogenic centres in the side chains of Thr and Ile have been identified by an asterisk.

Chiral amino acids
An essentially correct crystal structure of dl-Ala was first determined by Levy and Corey [23] in 1941, with all calculations being made by hand. Seven years later Donohue [24] took advantage of the technological development to obtain a better model from the same data-set: 'The calculation of three-dimensional plots of interatomic vectors and electron densities has been made feasible recently through the use of punched card methods and especially by the design of a file of cards to correspond to a set of Beevers-Lipson strips. These methods have made possible the rapid calculation of one- and two-dimensional Patterson and electron density functions and have reduced to the order of several days the time required for the calculations of corresponding three-dimensional functions.'
In view of the methods available today, the use of the word 'rapid' is interesting.
Solving the phase problem is considerably more challenging for chiral than for achiral or racemic compounds. Accordingly, in 1950, when Shoemaker et al. [25] published the results of their three-year long project on l-Thr, just 14 other detailed structures of organic molecules crystallizing in Sohncke space groups were known, of which only half dealt with chiral substances (i.e. not achiral or meso forms). More amino acid structures appeared in the following years (as hydrates for l-Asn [26] and l-Arg [27]), the sequence to some extent reflecting the ease with which high-quality crystals could be obtained. The initial rush thus ended, in 1976, with the 17th amino acid, l-Leu, notorious for its thin, flaky crystals. [28] After a 14-year break (or 10 if the determination of the α form of l-Glu [29] is considered), a crude structure (R-factor 0.147) was revealed for phenylalanine (as the d-enantiomer). [30] Recently, anhydrous structures were finally reported for l-Asn, [31] l-Arg, [32] l-Trp, [33] and l-Lys, [34] meaning that all structures of the standard enantiomeric amino acids have now been investigated (Table 1).
For the corresponding racemates, a few more are missing: dl-Lys seems extremely difficult to crystallize, while both dl-Asn and dl-Thr form large, nice crystals that are racemic twins, yielding effortlessly the orthorhombic structure of the pure enantiomer. dl-Phe is a more difficult case, giving a complex diffraction pattern where every reflection is split into at least five independent peaks, but where the unit cell of l-Phe can nevertheless be identified. [36] This suggests that true racemic dl-Phe crystals cannot be obtained.

Non-standard experiments and experimental conditions
Lists of all structures discussed in this section are available as supplemental data.

Neutron diffraction: locating the hydrogen atoms
As shown in Figure 2, there was a surge of neutron diffraction investigations in the 1970s. The resulting structures, limited to amino acids that easily grow large crystals, yielded detailed information on the hydrogen bonding patterns. Neutron diffraction is sometimes used to resolve issues concerning the protonation state of a molecule or a complex, but all these amino acids unequivocally occur in the zwitterionic state in the crystal. The more recent research efforts using single crystal neutron diffraction have been motivated by the search for phase transitions in Gly suggested to be associated with anomalous electrical behaviour, [37] the search for structural evidence for parity violation (giving a preference for l- over d-amino acids in nature), [38] the investigation of a very short hydrogen bond in an amino acid complex (WEHZAL01), [39] the verification of a putative hydrogen bond between water and Pt(II) (CCAPGC11), [40] and, in conjunction with X-ray diffraction (XRD), the topological analysis of the electron density in order to analyse the effect of the multipolar refinement strategy. [41] Neutron powder diffraction, still a relatively new method, has emerged as a valuable tool for investigations of phase transitions at high pressures. [42][43][44]

Amino acids and synchrotrons
The first crystal structure of an amino acid derived from data collected at a synchrotron was the α-polymorph of dl-norleucine in 1995 (DLNLUA01). [45] After about 2002, crystallographers dealing with amino acids have been more regular clients at synchrotrons. The now 40 different investigations can be divided into two groups; either high-resolution data collections for charge density investigations or high-pressure studies using single crystal or powder diffraction (see below).

Charge density refinements
The traditional anisotropic refinement, with a spherical atom model, is adequate for most purposes, but cannot provide the details needed for a topological analysis of the experimental electron density. For amino acids this methodology was first applied by Legros and Kvick in their description of the deformation electron density of α-glycine at 120 K. [46] A series of about 40 extremely accurate structures resulting from charge density refinements have since been reported. Usually the experimental data are compared with the results of high-level ab initio calculations. The 'invariom model' concept, where the independent atom model has been replaced by transferable pseudo-atoms taken from a database for theoretically predicted multipole populations, has been used for several amino acid structures. [47][48][49] Lately, the focus has been on crystal-field effects [50] and dynamic electron densities as a function of temperature. [51]
Figure 2. The evolution over time for different types of non-standard experiments including the use of neutron (single crystal or powder) or synchrotron radiation sources, charge density investigations, data collections below the boiling point of N2(l), as well as investigations based on powder data or data recorded at high pressure. The scale is the number of published investigations and not the number of entries in the CSD, [16] which particularly for high-pressure studies is significantly higher. A single structure report may occur in more than one column, e.g. a high-pressure investigation using powder neutron diffraction.

Ultralow temperatures
Data collections at temperatures lower than what can be obtained with N2(l) cooling (77 K) have been performed on about 17 different occasions, again usually associated with charge density refinements. The lowest temperature, 7 K, was used for a data collection on cis-bis(l-alaninato)copper(II) (CIYQAC). [52]

Powder XRD
The vast majority of the structures in the CSD have been determined from single crystal diffraction experiments, but from 2004, with the ab initio structure determination of a nickel aspartate oxide (IVODOM), [53] several investigations using powder XRD have appeared. Recent studies include l-Arg (TAQBIY), [32] l-Lys, [34] two l-Ser oxalic acid salts (XENOF, XENXUL01) [54] and l-Phe polymorph II. [55] Powder XRD is now a preferred method for studies of phase transitions at high pressures as used, e.g. for the Gly polymorphs. [56][57][58]

High-pressure investigations
After the development of diamond anvil cells, structural studies at non-ambient pressure have become one of the hottest fields within structural studies of amino acids, see [59,60] for two recent reviews. Using X-ray, neutron or synchrotron radiation on either single crystal or powder samples, data have been collected at pressures up to 15 GPa, [45] giving unprecedented information on high-pressure phases and transitions between them, as described in Section 16.3.

2015: what is around?

4.1. Defining an amino acid
Considering a straightforward textbook illustration of an amino acid with a polar head and a side chain, defining an 'amino acid' may seem trivial, but is in fact not so. Figure 3 shows three molecules that in principle are amino acids, but are categorized as a peptide, [61] a carbohydrate [62] and a nucleotide [63] in the CSD. These rare examples of hybrid molecules have not been discussed in detail here. Other structures that have been omitted have very large side chains [64] or side chains with so many functional groups that the actual amino acid part becomes insignificant compared to the rest of the molecule. As a rather arbitrary limitation, a molecule will here be considered to cease to be an amino acid if the 'side chain' contains four or more functional groups, as PIVBOL in Figure 4, and the overall formula has 40 or more C atoms. Furthermore, the present review deals exclusively with 2- or α-amino acids, omitting, e.g. the much less numerous β- and γ-amino acids, such as β-alanine and γ-aminobutanoic acid. Most structures presented belong to the Cambridge Crystallographic Data Centre (CCDC)-class 'amino acids, peptides and complexes', but a fair number of obvious amino acid entries, such as l-serine hydrate (LSERMH18), [65] do not carry such a tag, so CSD searches were not limited to this subset of structures. Among N-substituted amino acids only Pro and its endocyclic derivatives will be discussed in some detail together with N-alkylated amino acids. Acid esters are dealt with quite briefly, as are N-protected and N- and C-protected amino acids.

Acid-base properties and the issue of charge
The mandatory presence of an amino group and a carboxylate group confers vital acid-base properties on the amino acids, and even the simplest amino acids can be titrated as diprotic acids. Additional acidic or basic functional groups may be present in the side chains, yielding a series of different protonation states, as shown in Figure 4. A code of type x(N) is used to define each state, where x = a, z or c for an anionic, zwitterionic, or cationic polar head, while N gives the total charge on the molecule. z(0) thus represents the normal zwitterion of apolar and polar amino acids, while an amino acid in a c(2) state has a +H3N-CHR-COOH polar head with an additional positive charge in the side chain. The code a(0) correspondingly indicates an H2N-CHR-COO− polar head where the negative charge is being balanced by a positive charge in the side chain to give an overall charge of 0 (for an example of an untraditional zwitterion, see Section 8).
Figure 4. The protonation states of amino acids in crystal structures. The charge of the polar head (vertical axis) is either −1 (anionic, a) or 0 (zwitterionic, z, or alternatively neutral, n), while the overall charge of the amino acid given in parentheses (horizontal axis) goes from +2 to −2. For each code the number of unique structures in the CSD is given in bold type face, with numbers for Gly added for z(0) and c(1). Cyclic amino acids, including Pro, are dealt with separately, with numbers appearing below in italic type face. Each state is illustrated by a selected structure with refcode indicated.
Depending on the nature of the side chain, various amino acids will move through the protonation states in various manners upon titration with a strong base. Figure 5 provides examples for the apolar amino acid Ala (A), the basic amino acids His (H) and Arg (R), and the acidic amino acids Glu (E) and O-phospho-Thr (X, see z(−1) GEDPIP [66] in Figure 4). The c(1) → z(0) → a(−1) path of Ala is shared with Gly and Pro as well as all polar and hydrophobic amino acids in Figure 1 except Cys and Tyr, which have additional, weakly acidic groups in the side chains. The difference between His and Arg comes from the fact that the imidazole moiety of the former is a weaker base (stronger acid) than the Cα amino group, while the opposite is the case for the guanidine function of Arg. On the acidic side, the Glu protonation states reflect the higher pKa of the side chain compared with the Cα carboxyl group. When dealing with non-standard acids this may not be the case; for O-phospho-Thr (X) the diprotic phosphate group is the strongest acid, so the first titration step is c(1) → c(0). A sixth path, c(1) → c(0) → z(−1) → a(−2), not shown in Figure 5, applies to a small number of non-standard amino acids with a monoprotic side chain with pKa lower than the Cα carboxyl group. One example is cysteic acid, shown in its a(−2) state in Figure 4 (AZIQAC). [67] The potential a(−3) state of X and three related amino acids has incidentally never been observed in a crystal structure. In fact, only a few amino acids have been crystallized in three different states, and Glu is the only one that has been crystallized in four, Figure 6. [68][69][70][71] In addition to the regular protonation states shown in Figure 4, there is also an intermediate state where two amino acids, usually in the c(1) and z(0) states, form a strong -COOH···−OOC- hydrogen bond and share a total charge of +1.
This surprisingly large group of 57 structures will here be simply referred to as 'Speakman salts' [72] and is discussed in Section 10.
Figure 6. ... [70] and a(−2) dianion in the Ca2+ trihydrate salt (CAGLUT10). [71] The variable side chain conformation is not directly linked to the charge; the N-Cα-Cβ-Cγ and Cα-Cβ-Cγ-Cδ orientations are from left to right (gauche-,trans), (gauche-,gauche-), (gauche+,trans) and (trans,trans).
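The x(N) codes and the titration paths described above can be reproduced with a few lines of code. The following is a minimal sketch, not taken from the review: the `Group` class, the function names and all pKa values are assumptions, with the pKa values being rough illustrative numbers chosen only so that their ordering matches the description in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Group:
    site: str         # "head_acid", "head_amine" or "side"
    pka: float        # pKa of the protonated form (illustrative value only)
    charge_prot: int  # charge while protonated: 0 for COOH, +1 for NH3+


def state_code(groups, protonated):
    """Return the x(N) code when the groups with indices in `protonated` carry H."""
    total = 0
    head = {}
    for i, g in enumerate(groups):
        q = g.charge_prot if i in protonated else g.charge_prot - 1
        total += q
        if g.site != "side":
            head[g.site] = q
    head_charge = head["head_acid"] + head["head_amine"]
    if head_charge > 0:
        letter = "c"
    elif head_charge < 0:
        letter = "a"
    else:
        # +H3N...COO- is the zwitterion z; H2N...COOH would be the neutral n
        letter = "z" if head["head_amine"] == 1 else "n"
    return f"{letter}({total})"


def titration_path(groups):
    """Deprotonate groups in order of increasing pKa and list the states passed."""
    protonated = set(range(len(groups)))
    path = [state_code(groups, protonated)]
    for i in sorted(protonated, key=lambda k: groups[k].pka):
        protonated.discard(i)
        path.append(state_code(groups, protonated))
    return path


# Assumed, approximate pKa values; only their relative order matters here.
ALA = [Group("head_acid", 2.3, 0), Group("head_amine", 9.7, 1)]
GLU = ALA[:1] + [Group("side", 4.3, 0)] + ALA[1:]
HIS = [Group("head_acid", 1.8, 0), Group("side", 6.0, 1), Group("head_amine", 9.2, 1)]
ARG = [Group("head_acid", 2.2, 0), Group("head_amine", 9.0, 1), Group("side", 12.5, 1)]
CYSTEIC = [Group("side", 1.3, 0), Group("head_acid", 1.9, 0), Group("head_amine", 8.7, 1)]

for name, aa in [("Ala", ALA), ("Glu", GLU), ("His", HIS), ("Arg", ARG), ("cysteic acid", CYSTEIC)]:
    print(name, " -> ".join(titration_path(aa)))
```

With these assumed inputs Arg passes through a(0) while His passes through z(0), which is the difference between the two basic amino acids pointed out in the text.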

The CSD
The CSD [16] contains more than 700,000 entries, among them about 3500 amino acids and complexes thereof. Structures are retrieved from the CSD using the program ConQuest. Many journals currently require that structures have been deposited with the CCDC as crystallographic information files (cif) [73] prior to publication. They can then be referred to by their deposition numbers in the ultimate paper. Web-based services like checkCIF [74] mean that gross errors are now usually avoided, and the CCDC staff furthermore identify and correct minor errors before the structure is eventually integrated into the database. Nevertheless, any search for information must take into account that older, more error-prone structures will be retrieved, and also that, even for rather new structures, the original authors unintentionally or intentionally may have made assignments or omitted data (such as H atom coordinates) crucial to the type of investigation presented here.
One example of a common type of mishap becomes obvious if one chooses to search for the fragment indicated as n(0) in Figure 4, that is a state with an uncharged polar head and an overall charge of 0. With the R-group set simply to C, this query returns 37 hits (with 19 published after 2000), all of which upon closer inspection turn out to be of amino acids in the normal z(0) state or something else. This is partly due to errors made by the original authors, often a failure to introduce three amino H atoms, or an inaccurate preparation of the associated structure diagram when the entry was introduced into the CSD, the latter incident usually being associated with entries devoid of H atoms. Notably, even in the absence of H atoms it is generally quite easy to tell the true protonation state from the C-O bond lengths of the carboxyl/carboxylate group and the proximity of obvious hydrogen bond donors and acceptors, as shown in Figure 7. [26,75] Of less importance for the present survey, but still of concern, is the observation that not all authors check the chirality of their structure before submitting it to a journal. A non-exhaustive search quickly identified 27 zwitterionic structures claimed to be the l-form, but where the coordinates in the CSD actually describe the d-enantiomer (25 cases) or the opposite (2 cases). Salts of His for some reason represent a particular case with wrong chirality being assigned to 7 out of 42 structures (17%). This serves as a reminder to always check that the coordinates supplied indeed describe the correct enantiomer with reference to the paper title and the chemicals being used. A routine in checkCIF gives chiralities (S or R) for all stereogenic centres.
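Both sanity checks mentioned above lend themselves to a quick script. The sketch below is an illustration under stated assumptions rather than the procedure used for the survey: the function names are hypothetical, the 0.05 Å tolerance is an assumed cut-off (a carboxylate typically has two similar C-O distances near 1.25 Å, while a carboxylic acid has one short and one long C-O bond), and the chirality test assumes right-handed Cartesian coordinates and the common convention that the signed volume spanned by N, C' and Cβ around Cα is positive for an l-amino acid.

```python
def carboxyl_state(d_co1, d_co2, tol=0.05):
    """Guess COO- versus COOH from the two C-O distances (in Å).

    tol is an assumed threshold, not a value given in the review.
    """
    return "COO-" if abs(d_co1 - d_co2) < tol else "COOH"


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def chiral_volume(ca, n, c, cb):
    """Signed volume (N-CA) . ((C-CA) x (CB-CA)) from Cartesian coordinates."""
    return _dot(_sub(n, ca), _cross(_sub(c, ca), _sub(cb, ca)))


def hand(ca, n, c, cb):
    """Return 'l' or 'd' for an alpha-amino acid with a C-beta atom (not Gly)."""
    return "l" if chiral_volume(ca, n, c, cb) > 0 else "d"


# Idealized test geometry (not experimental data): H on C-alpha points along +z.
CA = (0.0, 0.0, 0.0)
C = (0.0, 1.0, -0.5)
CB = (0.866, -0.5, -0.5)
N = (-0.866, -0.5, -0.5)

print(carboxyl_state(1.25, 1.26))   # similar distances, treated as carboxylate
print(carboxyl_state(1.21, 1.31))   # one short, one long bond, treated as COOH
print(hand(CA, N, C, CB))           # l for this idealized arrangement
```

Fractional coordinates from a cif would first have to be converted to Cartesian coordinates using the unit cell; that step is left out here.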
When dealing with statistics on hydrogen bond geometry, other issues must be considered. Neutron diffraction investigations, whether they are performed for deuterated samples or not, give systematically longer N-H/D distances than XRD does. The simple solution is to normalize all these covalent bonds to a common value, at the expense of removing fine details such as the systematic elongation of the N-H bond in the strongest interactions. More serious errors can be introduced during refinement of the amino group, which by some authors is constrained to take a perfectly staggered orientation, e.g. by a SHELX HFIX 33 command. It is often obvious that a small rotation away from the idealized minimum would give three shorter hydrogen bonds with more linear geometry. Figure 8 provides an anonymous example with three incorrectly bent N-H···O angles in the range of 125-147°. Needless to say, the CSD is unduly biased by such structures.
Figure 7. Illustration of l-Asn monohydrate from CSD refcode ASPARM. [26] The chemical diagram to the left shows the amino acid in the n(0) protonation state, but from the location of neighbouring hydrogen bond donors and acceptors to the right (water molecules are orange spheres) it is easy to tell that the molecule is actually a zwitterion. In this case the assumed z(0) state was later verified by high-precision redeterminations with H-atoms included, such as refcode ASPARM02 (neutron diffraction). [75]
Figure 8. The effect of amino group rotation on hydrogen bond geometry. In the (anonymous) experimental structure (side chain as small sphere) the amino group was fixed in a staggered position, while it is obvious that a ∼30° clockwise rotation would result in three shorter interactions.
Figure 9. A selection of non-standard amino acids from the CSD. Included are straight-chain amino acids such as Abu, Nva and Nle, aIle with opposite chirality of Ile at Cβ, mCys and DAP, which together with Cys and Pen are used for the formation of metal complexes, and some naturally occurring amino acids with names indicated (L-DOPA is L-3,4-dihydroxyphenylalanine). JAPJUF [78] is an example of an achiral amino acid without an H atom on Cα, BUHSOM [79] is a bulky amino acid with an endocyclic N atom, while BABREB [80] is another amino acid designed for metal ligation. In the bottom right corner are the dimeric amino acids cystine and lanthionine.
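The normalization of N-H bonds discussed in this section amounts to sliding each H atom along the N→H direction until the bond has a common length. A minimal sketch follows; the function names are hypothetical, the target of 1.03 Å is an assumed, typical neutron-type value (the review does not state which value was used), and the coordinates are made up for illustration.

```python
import math


def normalize_xh(x, h, target=1.03):
    """Move H along the X->H direction so that the X-H distance equals target (Å).

    target is an assumed normalization value, not one quoted in the review.
    """
    length = math.dist(x, h)
    return tuple(xc + (hc - xc) * target / length for xc, hc in zip(x, h))


def angle_deg(a, b, c):
    """Angle a-b-c in degrees, e.g. N-H...O with b being the H atom."""
    u = [ac - bc for ac, bc in zip(a, b)]
    w = [cc - bc for cc, bc in zip(c, b)]
    cosang = sum(p * q for p, q in zip(u, w)) / (math.dist(a, b) * math.dist(c, b))
    return math.degrees(math.acos(max(-1.0, min(1.0, cosang))))


# Made-up Cartesian coordinates (Å): a short X-ray type N-H bond and an acceptor O.
n_atom = (0.0, 0.0, 0.0)
h_xray = (0.89, 0.0, 0.0)
o_atom = (2.7, 0.6, 0.0)

h_norm = normalize_xh(n_atom, h_xray)
print(math.dist(h_xray, o_atom), math.dist(h_norm, o_atom))   # H...O before and after
print(angle_deg(n_atom, h_xray, o_atom), angle_deg(n_atom, h_norm, o_atom))
```

Normalization only rescales the bond; it cannot repair an amino group that was fixed in the wrong rotational orientation during refinement, which is the more serious problem described above.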
The current survey is based on retrieval of structures with ConQuest, with illustrations being prepared by Mercury. [76] Initial searches produced lists of amino acid structures, which were then manually scrutinized to remove hits for duplicate entries of the same compound. This measure was necessary as some amino acids have been subjected to two or more investigations, giving rise to a family of entries in the CSD with refcodes for one or more polymorphs being differentiated only by the final two digits (nn in the code ABCDEFnn). The extreme example is Gly, for which the CSD holds no fewer than 99 entries. For each refcode family, only the best entry with respect to R-factor for each polymorph was retained. Further searches were then performed on the appropriate subset of the database with the desired refcodes, e.g. a file called z(0).gcd for zwitterionic amino acids (an option was to perform searches on a subset of the database by selecting 'Entries in a pre-defined hitlist' in ConQuest and then 'Best R-factor list'. [77] This algorithm was, however, considered to be too strict for the present investigation as entries with R-factor >0.10 are excluded, as are entries with any kind of disorder).
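The duplicate-removal step described above (one entry per polymorph within each refcode family, chosen by lowest R-factor) can be sketched as follows. This is only an illustration: the entries, the dictionary keys and the polymorph labels are hypothetical, and the polymorph assignment is assumed to come from the manual inspection mentioned in the text rather than from the refcode itself.

```python
def best_per_polymorph(entries):
    """Keep the lowest R-factor entry for each (refcode family, polymorph) pair."""
    best = {}
    for entry in entries:
        family = entry["refcode"][:6]          # ABCDEF part of ABCDEFnn
        key = (family, entry["polymorph"])
        if key not in best or entry["r_factor"] < best[key]["r_factor"]:
            best[key] = entry
    return sorted(e["refcode"] for e in best.values())


# Hypothetical entries using the placeholder refcode pattern from the text.
entries = [
    {"refcode": "ABCDEF",   "polymorph": "alpha", "r_factor": 0.080},
    {"refcode": "ABCDEF01", "polymorph": "alpha", "r_factor": 0.030},
    {"refcode": "ABCDEF02", "polymorph": "beta",  "r_factor": 0.050},
    {"refcode": "GHIJKL",   "polymorph": "single", "r_factor": 0.040},
]

print(best_per_polymorph(entries))
```

Unlike the 'Best R-factor list' option mentioned above, this selection applies no R-factor ceiling and does not reject disordered structures.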

Structure overview
The number of structures belonging to the various protonation states is depicted in Figure 4, not including structures where amino acids are ligands for metal ions. In addition to the 20 standard amino acids (Figure 1), illustrations of 366 other amino acids for which crystal structures are available are provided as supplementary material. Some of the most common or biologically important non-standard amino acids are presented in Figure 9 together with some more exceptional ones to illustrate the variation in terms of side-chain size and functional groups. [78][79][80] Some non-standard amino acids can be seen as slight modifications of the standard ones. A number of these occur in nature and have various biological functions. In the supplementary material an asterisk is used to denote an acid where the origin is rather obvious, e.g. l-Tyr* for l-DOPA in Figure 9. Other synthetic amino acids have been prepared for a range of reasons and objectives and may have no link to any of the standard amino acids. Such compounds are all referred to by the same three-letter code Xaa.

Zwitterionic amino acids, z(0)

5.1. Overview
A total of 213 monomeric amino acids, including 18 of the 20 common acids shown in Figure 1 (not Arg and Lys), have been crystallized in the traditional zwitterionic state with charged amino and carboxylate groups, Figure 10. The 126 acyclic +H3N-CHR-COO− amino acids (excluding Gly) occur in 267 unique crystal structures. Additionally, there are eight dimeric amino acids. The side chain is usually neutral, but may on rare occasions be zwitterionic itself, as seen for AZASER11 [81] in Figure 4.

Hydrophobic side chains
The definition of a hydrophobic amino acid is not totally clear-cut. Here an amino acid is considered hydrophobic when no other functional groups than the amino and carboxylate groups of the polar head are involved in strong hydrogen bonds to O or N acceptors. This means that Cys is usually considered part of this group, while Trp is variable; its side chain >N-H is sometimes involved in N-H···O (or N-H···N) interactions, but often finds an acceptor in the π-electron system of a nearby Trp or some other aromatic moiety. The limited number of hydrogen bond formation possibilities within this group facilitates an analysis of intermolecular interactions. With the exception of Ala, the tetragonal form of dl-Abu [82] and a very small number of other amino acids, the structures are always divided into distinct hydrogen-bonded layers separated by hydrophobic regions containing the side chains. In a typical structure, the hydrogen bonding pattern is composed of two separate sheets, as shown for dl-Val in Figure 11. [83] Five different sheets called L1, L2, L3, Lx and LD have been identified, Figure 12. [84] These can be combined into layers in six different ways: LD-LD and L1-D1 for racemic (or quasi-racemic) structures, and L1-L1, L2-L2, L3-L3 and Lx-Lx for enantiopure substances. Notably, the difference between L1 and Lx in Figure 12 may seem insignificant, but the distinct nature of L1-L1 and Lx-Lx layers is easily spotted as a 125° relative rotation between the two sheets constituting a layer. Thus, there are no borderline cases with respect to identifying the hydrogen bonding pattern for these substances. Ab initio calculations [84] indicate that the energetically most favourable interactions occur in the L2-L2 layer, but a shift to an Lx-Lx pattern may be triggered to avoid voids that could form when side chains are linear, or to an L1-L1 pattern with a bulky side chain like for l-Phe. [84] A further increase in the size of the side chain compared to Leu and Phe may lead to single-sheet hydrogen bond architectures. dl-Trp (VIXQOK) [86] and dl-Pen (CEDFAS) [87] (for Pen, see Figure 9) share the same pattern, which is also related to l-Trp. [33] Another pattern has been observed for dl-hexafluorovaline (SIMRAH) [30] and dl-2-ammonio-3-(trimethylsilyl)propanoate (KUMKAF). [88] Alternatively, water molecules may be incorporated as parts of the hydrogen bonding network. This can happen in a number of different ways; Figure 12 shows one example observed for two fluorinated derivatives of l-Phe (EXAXAC and EXAXIK). [85]
Figure 11. Typical construction of the crystal structure of an anhydrous hydrophobic amino acid. The example shown is dl-Val (VALIDL03). [83] A hydrophilic layer is divided into two sheets, and as this racemate has an L1-D1 hydrogen bonding pattern, [84] l-enantiomers (light grey C atoms) and d-enantiomers (dark grey) form independent sheets. Hydrophobic regions include the side chains, contributions from neighbouring molecular bilayers being separated by a hydrophobic interface.
Figure 12. Hydrogen-bonded sheets found in the crystal structures of hydrophobic, zwitterionic amino acids. [84] Side chains are removed for clarity; Cβ atoms are coloured in orange. In the LD pattern d-enantiomers are shown with C atoms in a darker tone. 3-Fluoro-l-phenylalanine monohydrate (EXAXAC) [85] is included in the bottom right corner as an example of a pattern incorporating co-crystallized water molecules. The yellow arrows labelled A and B in the L3 sheet show the two repeating distances, usually corresponding to simple translation along the unit cell axes.

Polar side chains
Many standard and non-standard amino acids have functional groups in their side chains that actively take part in hydrogen bonding. For this reason, polar amino acids constitute a heterogeneous group when it comes to crystal packing arrangements. The specific hydrogen bond donating and/or accepting groups put strict demands on the positions of accepting/donating groups in every single structure, regardless of whether these are present in neighbouring amino acid molecules, in co-crystallized solvent molecules or in partners in larger complexes. Essentially every single structure of a polar amino acid has a unique hydrogen bonding pattern and crystal packing arrangement.

Pro and its derivatives
In addition to Pro itself, crystal structures of 40 zwitterionic amino acids with an endocyclic N atom and ring sizes varying from 4 to 7 atoms can be found in the CSD (Figure 13). [89,90] The ring systems may carry several additional functional groups and be part of a polycyclic system. Due to the ring, Pro-derivatives have only two amino H atoms available for hydrogen bonding in the zwitterionic state. This means that the double-sheet construction principle for hydrogen-bonded layers does not apply to Pro. Rather, this group of amino acids may form, in the absence of other hydrogen bond donating groups and co-crystallized water, a single-sheet architecture or one-dimensional (1D) tapes. The structures shown in Figure 14 illustrate how a minor change in molecular composition can alter the hydrogen bonding pattern: l-Pro (PROLIN) [91] forms sheets, while the analogue with -CγH2- replaced by -S- (NELSEC) [92] builds tapes. The pattern in Figure 14(a) is clearly related to the L1 pattern for regular hydrophobic amino acids illustrated in Figure 12, but going from top to bottom the N-H···O head-to-tail chains switch direction rather than being parallel, as in Figure 12. The effect of this change is to put side chains on alternating sides of the sheet, an arrangement that is not possible for regular amino acids, which must have all side chains on the same side of the sheet in order to form the double-sheet hydrophobic layer discussed in Section 5.2.
Figure 13. A selection of amino acids with endocyclic N atoms and variable ring sizes. CSD refcodes are indicated. Hydroxyproline (HOPROL12) [89] is produced by hydroxylation of Pro by the enzyme prolyl hydroxylase and comprises roughly 4% of all amino acids found in animal tissue, an amount greater than seven other amino acids that are translationally incorporated. CLAVIC [90] is the only amino acid with a seven-membered ring system.
Figure 15. Amino acids without an H-atom at Cα. There may be two independent, acyclic substituents, as for Aib, but often they are connected to a ring system with size from 3 to 20. SIPNIO is one of two dimeric amino acids belonging to this category. The three last, bottom compounds have endocyclic N atoms.
Figure 16. Hydrogen-bonded sheet for a Cα-dialkylated compound (JAPJUF). [78] The ring system has been curtailed at the two Cβ atoms, shown as small spheres.

α-Dialkylated amino acids
Amino acids without an H-atom on Cα may have two non-cyclic substituents (12 compounds), like α-aminoisobutanoic acid, Aib (AMMPRA01), [93] but often the substituents are connected to form a ring system or polycyclic structure (30 compounds, Figure 15). There are also two examples of substituted Pro-derivatives, and one molecule (MEPROL) [94] with both types of ring systems, seen in the bottom right corner of both Figures 10 and 15.
The steric crowding of amino acids with two substituents on C α has a profound impact on the hydrogen bonding pattern of these structures. In the absence of side-chain donors and acceptors, the propensity for co-crystallization of water is very high, occurring for 21 out of 27 structures. Achiral Aib forms a 3D hydrogen bonding network, [93] while the remaining five anhydrates form structures that are divided into layers with two-dimensional (2D) hydrogen bonding in single sheets. The pattern shown in Figure 16 is shared by three compounds: the aminocyclohexanecarboxylic acid JAPJUF (Figure 9), [78] ACYHXA01 [95] and ACXTPY [96] (Figure 15). It is furthermore related to the patterns observed for l-Trp [33] and dl-Trp. [86]

Quasi-racemates
In contrast to regular racemates, quasi-racemates are complexes between two rather similar, but still distinct chemical compounds of opposite chirality. One example is l-Val:d-Ile, where only the additional side-chain methyl group of Ile distinguishes the two components. Amino acid quasi-racemates build crystal structures with hydrogen bonding patterns familiar from ordinary racemates, i.e. with LD-LD or L1-D1 patterns. [84] A substantial number of such complexes have been investigated, Figure 17, to reveal fine details on hydrogen bonding geometry and learn how small modifications of the side chains dictate the observed hydrogen bonding pattern. [97-100] The following empirical rules have been deduced [97] for the effect of the side chain: (i) complexes with two linear amino acids have LD-LD patterns; (ii) complexes with two branched amino acids have L1-D1 patterns; (iii) complexes with one linear and one branched amino acid form L1-D1 when the acid with branching is Leu and LD-LD patterns for all other branched amino acids (Val, Ile, Phe, etc.). The only exception to this set of rules is l-Abu:d-Nle, Figure 17, which despite the linear side chains (ethyl and butyl) makes an L1-D1 structure. This irregularity was reproduced by theoretical structure prediction algorithms. [101] Dalhus [102] previously noted that complexes between aIle and other amino acids, if formed at all, invariably yielded crystals of poor quality, in marked contrast to the corresponding complexes with Ile and Leu. A comparison of the unit cell dimensions of complexes with Nva, Nle and Met shows that incorporation of aIle leads to a 6-18 Å³ increase in the volumes of the asymmetric unit compared with Ile or Leu, clearly indicating a less efficient packing of the hydrophobic side chains.
Figure 17. Crystal packing arrangements for quasi-racemates of hydrophobic amino acids, the diagonal representing the true racemates. dl-Abu is here indicated as having an LD-LD hydrogen bonding pattern. [84] This is true for the A, C and D polymorphs, while the tetragonal B polymorph has a 3D network. [82]
There is only one quasi-racemate with Ala, an l-Ile:d-Ala complex (FITHIZ) [98] that formed the expected LD-LD structure. The small side chain of Ala may make it less straightforward to find a satisfactory packing arrangement of the side chains. Differences in solubility during crystallization may also be an issue, as Ala is about five times as soluble in water as, e.g., Leu and Phe at 25 °C.
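The empirical rules (i)-(iii) for quasi-racemates, together with the single reported exception, can be written as a small lookup. The following is only a sketch: the function name and the two sets of amino acids are illustrative assumptions (in particular, treating Ala as 'linear' is inferred from the statement that l-Ile:d-Ala forms the expected LD-LD structure), and the rules are empirical observations rather than a predictive model.

```python
# Side-chain classes are assumptions for illustration, based on the
# amino acids named in the text; Ala is grouped with the linear acids.
LINEAR = {"Ala", "Abu", "Nva", "Nle", "Met"}
BRANCHED = {"Val", "Ile", "Leu", "Phe"}

# The one exception to the rules mentioned in the text: l-Abu:d-Nle.
EXCEPTIONS = {frozenset({"Abu", "Nle"}): "L1-D1"}


def predict_pattern(acid_l, acid_d):
    """Return 'LD-LD' or 'L1-D1' for a quasi-racemate of two hydrophobic acids."""
    if acid_l == acid_d:
        raise ValueError("rules are applied here to quasi-racemates only")
    for acid in (acid_l, acid_d):
        if acid not in LINEAR | BRANCHED:
            raise ValueError(f"unclassified amino acid: {acid}")
    pair = frozenset({acid_l, acid_d})
    if pair in EXCEPTIONS:
        return EXCEPTIONS[pair]
    n_branched = len(pair & BRANCHED)
    if n_branched == 0:          # rule (i): two linear side chains
        return "LD-LD"
    if n_branched == 2:          # rule (ii): two branched side chains
        return "L1-D1"
    # rule (iii): one linear and one branched; only Leu gives L1-D1
    return "L1-D1" if "Leu" in pair else "LD-LD"


for l_acid, d_acid in [("Val", "Ile"), ("Nva", "Met"), ("Nva", "Leu"), ("Abu", "Nle")]:
    print(f"l-{l_acid}:d-{d_acid} -> {predict_pattern(l_acid, d_acid)}")
```

Keeping the exception in a separate table makes it explicit that l-Abu:d-Nle is not covered by the rules themselves.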

Solvates and other complexes
Apart from hydrates and two methanol solvates, co-crystals involving amino acids are largely limited to complexes with a series of mono- and dicarboxylic acids, including formic acid, acetic acid, benzoic acid, R-2-phenoxypropanoic acid, mandelic acid (2-hydroxy-2-phenylacetic acid), fumaric acid, succinic acid, tartaric acid (2,3-dihydroxybutanedioic acid) and pyridine-2,4-dicarboxylic acid. Special complexes have been obtained between l-Ala and l-Ala-l-Val (a dipeptide, EFUCIS), [103] l-Phe (DUMJEA10) [104] or l-Ser (ZUWQEN) [105] and a nucleotide as well as Gly and urea (NUBHOH). [106] The z(0):z(0) complex between l-Glu and l-Glp (LGPYRG) [107] is the only known example of a complex between two amino acids of the same hand; such complexes seem very difficult to prepare. About 10 crystals incorporate metal salts, e.g. l-Glu·CaCl₂·H₂O (CAGLCL10). [108] Amino acids may also form complexes with large molecules, usually as a guest in the cavity of a macrocycle like a crown ether. [109,110] Some special attention may be devoted to a series of eight hydrogen peroxide solvates, [111,112] which are interesting in that several of them involve amino acids that do not crystallize as hydrates from aqueous solution, e.g. l-Ile·H₂O₂ in Figure 18 (TANCES). [111] This suggests that the interaction between the acid and H₂O₂ is stronger than the interaction with H₂O and that formation of such solvates from dilute hydrogen peroxide solutions may be a more generally applicable method for obtaining crystals of desired substances, as H₂O₂ solvates, when plain aqueous solutions fail to give useful crystals for diffraction purposes.

Unit cell parameters
The direct hydrogen bonds between charged amino and carboxylate groups dominate the intermolecular interactions in amino acid structures even when the complete 2D sheets displayed in Figure 12 are missing, as in solvates, co-crystals and salts. This puts constraints on the unit cell parameters, as shown in Figure 19. The shortest crystallographic axis of any amino acid is 4.418 Å, found in the high-pressure (5.2 GPa) polymorph II of l-Ser monohydrate (LSERMH16). [113] The shortest axis associated with hydrogen bonding between molecules related by translation is 4.616 Å for 5-methyl l-Glu (GAVRAX, L3-L3 layer), [114] while the upper limit is 6.238 Å for diammonium O-phospho-dl-threoninate (GEJKUC, no 2D layer), [66] corresponding to the distances labelled A and B in Figure 12, respectively. The vast majority of amino acid structures have an axis in this range. There is a small increase above 7.0 Å from other types of hydrogen bonding patterns with interactions between molecules related by screw axes, glide planes, etc. The 'longest shortest axis' among amino acid structures in this review is

6. z(1) structures: basic amino acid salts
6.1. Structure types
Salts where the amino acid has a zwitterionic head of 0 total charge, but where a positive charge in the side chain is balanced by a co-crystallized anion, constitute a large group of structures. The amino acid is usually one of the standard acids Arg, Lys or His or the naturally occurring Orn (see Figure 9); synthetic amino acids account for only 9 out of the 154 z(1) CSD-entries, Figure 4. Nevertheless, due to the almost endless options with regard to the nature of the anion, the z(1) group is also very varied. Anions can be split into two groups of about the same size: organic and inorganic. The organic anions are mostly carboxylates derived from the same acids as described above as well as imidazoledicarboxylic acid and longer chain dicarboxylic acids, such as glutarate, pimelate and trimesate.
Inorganic anions come from the common strong acids, but may also be somewhat more exotic like ReO₄⁻, SiF₆²⁻ and BF₄⁻ as well as larger structures like Mo₁₂O₄₀P³⁻. About one-third of the structures include co-crystallized water molecules.
The guanidine moiety in the Arg side chain has a strong tendency towards formation of R(8) ring systems (for graph set notation, see [116]). Due to the different demands on the hydrogen bonding environment, substitution of one acid for another always results in a new type of hydrogen bonding arrangement for this group of structures.

Mixing basic and acidic amino acids
A special group of salts have a z(1) basic amino acid as the cation and a z(−1) acidic amino acid as the anion. These are listed in Figure 21, to which the two racemic complexes dl-Arg:dl-Asp (SITBOM) and dl-Arg:dl-Glu (SITBIG) may be added. [120] In this way both amino acids retain their protonation state at neutral pH (or slightly lower for His). An interesting aspect of the l-Arg:d-Glu complex, obtained either as a trihydrate (DUSMAF) [119] or a monohydrate (KEMYUW), [121] and l-Lys:d-Glu (JAVSII) [122] is that hydrogen bonding occurs in two independent regions: one involving the amino groups and the carboxylate groups of the polar heads, and the other involving the functional groups of the side chains and co-crystallized water molecules for the l-Arg structures. In the example shown in Figure 21, hydrogen bonding is thus simply the LD-LD pattern familiar from hydrophobic amino acids.
… bond has been defined between the amino acid carboxylate group and the metal ion in the CSD-entry. This limitation is somewhat arbitrary, as no O-Ca bond has been defined in calcium di-l-glutamate tetrahydrate (LGLUCA), [71] where the shortest O···Ca contact is 2.31 Å, while the average O-Ca distance in structures where a bond has been defined is about 2.40 Å with some bonds being longer than 2.80 Å. This group of z(−1) structures can accordingly be considered together with metal complexes in Section 17.

z(−1) and z(−2) structures: acidic amino acid salts
Alternative (B) is interesting due to the extremely high density of hydrogen bonds, not least for four z(−2) ammonium salts. In its diammonium salt (IJICIN), [66] Figure 22, O-phospho-l-Thr is involved in no fewer than 15 hydrogen bonds including, in the absence of co-crystallized water molecules, 9 interactions with the ammonium ions. This appears to be an extreme limit for a simple amino acid.

Untraditional zwitterions: a(0) and c(0) structures
Since the carboxyl group of the polar head is a stronger acid than the side chain acid group, titration of Asp⁺ or Glu⁺ [c(1)] with base yields normal z(0) structures. The situation is, however, different for some non-standard amino acids with other types of functional groups in their side chains. For these compounds, the -COOH loses its proton first, leading to c(0) zwitterions where the negative charge sits in the side chain. The acidic group may be a carboxylate with modified pKa due to the presence of one or more nearby electron-withdrawing groups of atoms like in BUTFUR (Figure 23), [123] but is more commonly a monosulphate or a monophosphate, as for (R)-4-oxo-5-phosphononorvaline (POPJEJ) shown in Figure 4. [124] Likewise, an a(0) zwitterion has the positive charge in the side chain, which will happen when the functional group is a stronger base than the amino group of the polar head. The only example known is the naturally occurring amino acid Arg, which occurs in the a(0) state in its anhydrous form (TAQBIY) [32] as well as in the hydrates of the l-enantiomer (di, ARGIND11) [125] and the racemate (mono- and dihydrate, FUGXIO [126] and WIJNEI [127]).

Amino acids as cations
9.1. c(1) structures: salts of the apolar and polar amino acids
The c(1) structures, where the amino acids occur as cations, have been crystallized with a mix of organic and inorganic anions that resemble the group of anions discussed for z(1) structures, but due to the transition from -COO⁻, usually a three- to fivefold acceptor, to -COOH, which may accept at most one H atom, the anions are much more prominent participants in hydrogen bonding. In an attempt to classify the networks observed in these structures, it is again useful to consider the hydrophobic amino acids and then look at anions. It turns out that essentially every type of anion can be associated with a single pattern or a series of related patterns.
Using the largest group, halides, as an example, we find that they, like their zwitterionic counterparts, build double-sheet hydrogen bonding patterns. Three types of sheets have been found: CL with only one enantiomer (CD would be the mirror image), and CDL1 and CDL2, with both enantiomers (Figure 24). These sheets are combined in three different ways for enantiopure substances (Figure 25):
• 2CLp parallel sheets
• 2CLa antiparallel sheets
• 2CLa* antiparallel sheets, deviation from the above
For racemates there are also three types of layers:
This means that the halides of hydrophobic amino acids can be rationalized in the same way as regular, zwitterionic structures. The same is true also for other types or classes of anions. For enantiomeric nitrates, the largest group of structures after halides, there is consequently just one common sheet with Z = 2 (or occasionally 4), which furthermore is closely related to the sheet observed in racemic structures, Figure 26.
Figure 23. Hydrogen bonding in the structure of dl-threo-β-fluoroaspartic acid dihydrate (BUTFUR). [123] The side-chain carboxylate group, with estimated pKa ≈ 1.7, accepts five hydrogen bonds, the carboxyl group of the polar head (pKa = 1.88, Figure 1) none.

c(2) structures: Arg, Lys and His salts
Doubly charged c(2) structures occur for the three basic amino acids Arg, Lys and His together with six other amino acids. These structures are dominated by hydrogen bonds to the mostly inorganic anions, as in l-Arg 2+ dinitrate (BOQVIM01) [128] in Figure 27. Organic anions are mostly trifluoroacetate, squarate, picrate and oxalate together with some large sulphonates. The amino acid carboxyl group obviously acts as a hydrogen bond donor, but it also accepts an amino H atom on its carbonyl oxygen atom in about one half of all structures.

Speakman salts
The term 'Speakman salt' is here used to describe a complex between a (NH₃-CHR-COOH···OOC-CHR-NH₃)⁺ cation [or in a few cases (NH₃-CHR-COOH···OOC-CHR-NH₃)³⁺ cations where both side chains carry a positive charge] and an organic or inorganic anion. The -COOH···⁻OOC- hydrogen bond in these structures is exceptionally short with O···O distances in the range 2.394-2.564 Å and an average of 2.468 Å for 72 observations in 57 CSD-entries. As for amino acids in the z(0) and c(1) states, the majority of the Speakman salt structures that involve hydrophobic amino acids are divided into layers, with hydrophobic layers being constructed from two easily distinguishable sheets. Only a single type of sheet has been identified. It is here called S1 and has been found not only for halides of a number of amino acids, but uniquely also for a range of other anions as seen for l-Val + l-Val chloride (OPEFUL), [129] l-Phe + l-Phe formate (JOTKIM01) [130] and l-Phe + l-Phe malonate (RALRUS) [131] in Figure 28. Two sheets can be combined into a layer in various ways, as shown in Figure 29. [129-132] The l-Phe + l-Phe nitrate (CONZEK01) [133] and hydrogen selenite (CADMAX) [132] are isostructural to the malonate (RALRUS), [131] in Figure 29(b). Such retention of structure upon replacement of a small inorganic anion (nitrate) with a larger organic anion (malonate) is indeed uncommon. In the formate (JOTKIM01) [130] and BF₄⁻ (CADLUQ) [132] salts, sheets come together to form layers in slightly different ways (Figure 29(c) and 29(d)).
The (Ala-Ala)⁺ Cl⁻ salt (CADKID) [132] incidentally appears to be the only layered amino acid structure where hydrophobic regions are generated exclusively from the methyl side chains of Ala. As described in Section 5.6, quasi-racemates with Ala as one of the components are hard to form, which may be due to problems building, from such a small hydrophobic group, a hydrophobic layer without unfavourable voids. In CADKID a good arrangement has been found, as there are no sizable voids in the crystal structure.
Figure 4 indicates that a very small number of amino acids indeed, four in total, have been crystallized with an overall negative charge of −1 or −2 with a negatively charged polar head. This is not quite true, as many anionic amino acids act as ligands for metal atoms and are discussed in Section 17. The selected four structures are simply those that in their CSD-entries have not been registered as having any kind of bonding between the carboxylate group or the amino group and a metal ion. As described above, this is clearly a matter of how a bond to a metal ion is defined in the CSD-entry.
Figure 25. Hydrogen-bonded layers in the crystal structures of halides of hydrophobic amino acids. All use the CL sheet from Figure 24. The two sheets are parallel in 2CLp, while they are antiparallel in 2CLa. The last pattern, 2CLa*, represents a small deviation from 2CLa.

n(0) structures: Pro-derivatives
For pyroglutamic acid (Glp) (LPYGLU) [134] and related compounds, the formation of a cyclic secondary amide, a lactam, leaves an N atom without basic properties, and molecules are obtained as neutral species, Figure 30(a). Out of the 15 n(0) structures indicated for Pro-derivatives in Figure 4, 12 are lactams or thiolactams. The secondary amine function of normal Pro, with a pKa2 value of 10.60, is, on the other hand, considerably more basic than the primary amino groups of the other amino acids. Nevertheless, for this acid alone we find indisputably uncharged molecules, literally meaning that the protonation state of the gas phase, the normal subject of ab initio calculations for isolated molecules, has been retained. The reason for this anomaly is associated with the presence of only two donors. The zwitterionic state occurs whenever the negative charge on the carboxylate group is stabilized through the acceptance of three or more H atoms from strong donors, but in the borderline case of two donors proton transfer is evidently sometimes not favourable. 5,5-Dimethyl-2-phenylthiazolidine-4-carboxylic acid (YESPER), [135] Figure 30(b), represents an extremely rare example (three overall: DILXAX, QADBON and YESPER) of such a non-lactam n(0) amino acid structure. It was refined to an R-factor of 0.048, and the covalent bond lengths in the carboxyl group are 1.202(7) Å (carbonyl) and 1.306(6) Å (hydroxyl). The methyl and phenyl substituents may play a role in creating steric conflict, as the corresponding compound devoid of these groups crystallizes as a normal zwitterion (NELSEC). [92]

13. The conditions for finding isostructural crystals
13.1. Amino acid substitution
A profound difference between apolar and polar amino acids is that the former can often be interchanged without any major modifications to the crystal packing arrangement, i.e. they are isostructural.
The group of quasi-racemates discussed above provides several examples, as do the P2₁ structures with Z = 2 observed for a number of the enantiomeric amino acids. With the introduction of side-chain functional groups that participate in strong hydrogen bonds, this picture changes completely. Amino acids that incorporate the same functional group, but where the exact location (in terms of chain length) varies, always have different crystal structures and usually also distinct hydrogen bonding patterns. One such structural family is constituted by the four amino acids with -(CH₂)ₙ-COOH side chains with n = 0 (2-aminomalonic acid, AMMALA [136]), 1 (Asp, LASPRT03 [137]), 2 (Glu in its α form, LGLUAC03, [29] or β form, LGLUAC11 [69]) or 3 (l-2-aminoadipic acid, VUYJEE [138]). In fact, not a single example has been found where an interchange of Asn with Gln or Asp with Glu occurs with full retention of crystal packing. Apparently only two structures share the same overall hydrogen bonding pattern: l-Glu in the β-modification (LGLUAC11) [69] has the same interactions as l-2-aminoadipic acid (VUYJEE), [138] the extra side-chain methylene group of the latter merely resulting in a change from an orthorhombic to a monoclinic space group (P2₁2₁2₁ → P2₁).
Similarly, the standard amino acid Tyr, with its side-chain -OH group in the para position (LTYROS11), [139] has a totally different structure from the analogues with the -OH group in the meta (MTYROS01) [140] or ortho (DTYROS) [141] positions.
Figure 29. Four different ways of combining two S1 sheets into a layer. (a) Hydrogen bonding pattern in l-Val + l-Val chloride (OPEFUL) [129] common to most halides. (b) l-Phe + l-Phe malonate (RALRUS). [131] (c) l-Phe + l-Phe formate (JOTKIM01). [130] (d) l-Phe + l-Phe BF₄⁻ (CADLUQ). [132] Side chains are shown as small spheres.

Changing the anion
Groups of structures with anions have been used to identify the conditions for obtaining isostructural crystals. These include the z(1), c(1), c(2) protonation states as well as Speakman salts. Monoatomic halide anions are particularly straightforward to investigate, the results being presented in Table 2. It can be seen that Cl⁻ can be replaced by Br⁻ in 75% of all pairs. By comparison, only a single example was found of an isostructural F⁻/Cl⁻ pair (albeit with different side-chain conformations, CEGKOQ/FEQYUW).
Among other anions ClO₄⁻ and BF₄⁻ are the only ones that usually (five out of six pairs) give isostructural crystals. Other isolated examples include
Nitrates are never isostructural to other salts, with a single exception: in three special Speakman salts with formula (l-Lys)₂³⁺ 2Cl⁻ X⁻, X⁻ can be either ClO₄⁻, BF₄⁻ or NO₃⁻ (BOQVOS, MALQAT and BOQWOT). [142-144]

14. Amino acid geometry
14.1. Molecular geometry of the polar head
In the z(n) (n = 1, 0, −1, −2) states, the two oxygen atoms of the carboxylate group are equivalent. When O1 is taken to be the oxygen atom with the N-C-C-O1 torsion angle closest to 0°, there is for l-amino acids a rather broad torsion angle distribution between −50 and +30°, with 83% of the structures falling in the range −40 to 0° and a maximum at −20°. With a cationic polar head and a carboxyl rather than a carboxylate group, there is a strong preference (>80%) for having the carbonyl group close to the N atom. Calling the carbonyl O-atom O1, we thus find that the N-C-C-O1 torsion angle is in the range −30 to 10° with the maximum at −10°. There are several undisputed examples of torsion angles close to 180° as well, including l-Arg²⁺ dinitrate (BOQVIM01) [128] in Figure 27, but fewer than suggested by simple CSD searches, as closer scrutiny reveals that the hydrogen atom of the carboxyl group is quite often erroneously positioned.
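The convention of labelling as O1 the carboxylate oxygen whose N-C-C-O torsion angle is closest to 0° can be illustrated with a short calculation. This is a minimal sketch: the function names and the coordinates are hypothetical (idealized Cartesian positions in Å, not taken from any CSD-entry), and the sign convention is the usual one in which a clockwise rotation of the front bond onto the rear bond gives a positive angle.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def torsion_deg(p1, p2, p3, p4):
    """Signed torsion angle p1-p2-p3-p4 in degrees, range (-180, 180]."""
    b1, b2, b3 = _sub(p2, p1), _sub(p3, p2), _sub(p4, p3)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


def pick_o1(n, ca, c, oxygens):
    """Return (label, torsion) for the oxygen with N-CA-C-O closest to 0 degrees."""
    torsions = {label: torsion_deg(n, ca, c, xyz) for label, xyz in oxygens.items()}
    label = min(torsions, key=lambda k: abs(torsions[k]))
    return label, torsions[label]


def _oxygen(theta_deg):
    # Hypothetical helper: place an oxygen at a chosen torsion about the CA-C bond.
    t = math.radians(theta_deg)
    return (math.cos(t), math.sin(t), 1.5)


# Idealized, made-up geometry: CA at the origin, C along z, N along x.
N, CA, C = (1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0)
OXYGENS = {"Oa": _oxygen(160.0), "Ob": _oxygen(-20.0)}

label, angle = pick_o1(N, CA, C, OXYGENS)
print(label, round(angle, 1))
```

For real structures the fractional coordinates would first have to be converted to a Cartesian frame using the unit cell parameters, a step not shown here.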

Side-chain conformations
It is beyond the scope of this review to carry out a comprehensive conformational analysis of all types of amino acid side chains (for some previous work, see [14,145]), but a few general remarks will be made.
Alanine, with R = -CH₃, has no side-chain conformations, but for other amino acids each additional sp³-sp³ single bond, either to C or to a heteroatom, in principle may introduce three alternative rotamers. Examples include Ser, Cys and Abu. For an sp³-sp² bond, there are only two orientations at ±90°, which degenerate to just one due to symmetry for Phe and Tyr. These amino acids consequently have only three conformations resulting from rotation around the Cα-Cβ bond, although with rather wide distributions for the Cα-Cβ-Cγ-Cδ torsion angles due to low energy barriers for the rotation of the aromatic ring. For Met the theoretical number of conformations is 3 × 3 × 3 = 27, but as several of them are prohibited due to excessive steric conflict, only 12 have been observed in crystal structures. Another example is Leu, which in theory has 3 × 3 = 9 side-chain conformations. In crystal structures two dominating conformations have been found, the distribution between them being dependent on the protonation state, as described in Figure 31. [146,147] Conformational freedom may also be limited by the need to avoid conflict with donors and acceptors to the amino and carboxyl(ate) groups. Molecules with N1-Cα-Cβ-Cγ = trans are thus absent for l-Asp, as this would evidently put the two carboxyl functions so close to each other that hydrogen bonding preferences would be compromised.
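The theoretical conformation counts quoted above follow from multiplying the number of rotamers available at each rotatable side-chain bond. The sketch below reproduces that arithmetic; the rotamer labels and the per-bond tuples are illustrative assumptions (three staggered positions per sp³-sp³ bond, one symmetry-unique orientation for the sp³-sp² bond of Phe), and the count says nothing about which conformations are actually observed.

```python
from itertools import product
from math import prod

# Assumed labels for the three staggered positions about an sp3-sp3 bond.
ROTAMERS = ("gauche+", "trans", "gauche-")

# Rotamers per rotatable side-chain bond, starting at CA-CB (illustrative).
SIDE_CHAINS = {
    "Met": (3, 3, 3),  # three sp3-sp3 bonds
    "Leu": (3, 3),     # two sp3-sp3 bonds
    "Phe": (3, 1),     # sp3-sp3 bond, then one symmetry-unique ring orientation
}


def count_conformations(options_per_bond):
    """Theoretical number of side-chain conformations."""
    return prod(options_per_bond)


def enumerate_sp3_rotamers(n_bonds):
    """List every combination of staggered rotamers for n sp3-sp3 bonds."""
    return list(product(ROTAMERS, repeat=n_bonds))


counts = {name: count_conformations(opts) for name, opts in SIDE_CHAINS.items()}
print(counts)
print(enumerate_sp3_rotamers(2)[:3])
```

An explicit enumeration of this kind gives a list against which observed conformations, such as the 12 found for Met, could be ticked off.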

Substituted amino acids 15.1. N-alkylated amino acids
We have recently devoted some attention to amino acids where one of the amino H atoms has been replaced by an alkyl group, finding 169 such structures in the CSD. [148] These molecules have different hydrogen bonding properties from regular amino acids, as the formation of the typical hydrophilic layers with two sheets cannot take place when the essential third amino H-atom connecting the sheets is unavailable. In the absence of interfering hydrogen bond donors or acceptors, these compounds instead tend to form single-sheet hydrogen bonding patterns where every second head-to-tail chain has reversed its direction compared to a standard L1 sheet. Such motifs have previously been observed for l-Pro and some related compounds, Figure 14(a), but in a more strained version due to the ring system, which needs to be positioned on one side of the sheet. In rotationally unhindered acyclic N-alkylated amino acids, the side chain and the N-alkyl group of an individual molecule are located on opposite sides of a sheet. Consecutive molecules along the horizontal hydrogen-bonded chains in Figure 32 are rotated 180° with respect to each other, resulting in an efficient stacking of the hydrophobic groups.
N,N-disubstituted and N,N,N-trisubstituted amino acids have not been considered in detail here, as the nature of the substituents, which are often large, tends to make the borderline to other structural classes obscure, with few recurring structural characteristics. N,N-dimethylglycine, in a zwitterionic state with a single N-H donor, is a model compound for which two polymorphs [149] have been discovered: one with head-to-tail chains and one with a 20-membered ring system constructed from four amino acids, a very unusual hydrogen bonding motif.
Figure 31. (a) The structure of one of the two molecules in the asymmetric unit of l-Leu (LEUCIN02), [146] showing the typical conformation of the z(0) state (13 out of 14 observations after inversion of three d-Leu). The indicated values for the torsion angles (°) show characteristic deviations from the ideal staggered positions. (b) The second Leu conformation, here seen in a coordination compound with Re(V) (LOPTOZ), [147] is uncommon for zwitterionic molecules (one observation), but accounts for more than 50% of all structures in other protonation states such as a(−1) in metal complexes.

Amino acid esters
Esters of amino acids have been crystallized as neutral species (12 CSD-entries) and as cations (45 entries), the latter usually found in salts with inorganic anions. In the first group, the uncharged -NH 2 behaves like a rather weak donor, and in about one half of the structures there is only a single N-H· · · O interaction. Ester cations in halide salts have two unique hydrogen bonding patterns called E1 and E2. These are shown in Figure 33 [150,151] and recur in seven and three structures, respectively.

N-protected amino acids
Shifting the substituent on the N-atom from alkyl groups (Section 15.1) to groups that form ester- or amide-type bonds increases the number of options tremendously, often to a point where many would rather recognize the molecule as an amide than an amino acid. Limiting substituents to -COOCR₃, with R = C or H, which includes the most common N-terminal protecting group, tert-butyloxycarbonyl for amino acids and peptides, we find, however, just 23 simple structures with free -COOH. These structures constitute a varied group, but with frequent formation of acid dimers and separate chains of hydrogen bonds between the amide bonds, as seen for N-(t-butoxycarbonyl)-dl-Ala (BAJGOI) [152] in Figure 34.

N- and C-protected amino acids
As for N-protected amino acids, defining an N- and C-protected amino acid is not trivial. After employing three rather arbitrary constraints (total number of C atoms <40, not more than four non-amino acid functional groups, not polymeric), we are left with 90 crystal structures. There are no characteristic hydrogen bonding patterns, but interactions between the amide groups are common, and these are frequently supported by weaker C-H···O interactions in various ways. Figure 35 shows a special example for a methyl ester (GUTCAZ) [153] where four weak contacts (including the N-H···O=C< hydrogen bond) build a tape-like motif along a 5.78 Å crystallographic axis.

Polymorphism at ambient and extreme conditions
16.1. Frequency of polymorphism at ambient conditions
It is generally accepted that the number of known polymorphs for a compound is related to the time spent investigating it, and since most amino acids in the CSD have been subjected to only a single crystallization attempt, the general frequency of polymorphism for amino acids is not well explored. The standard amino acids make an exception, and Table 1 lists polymorphic forms at ambient conditions for l-Cys, l-Glu, l-His and l-Phe, i.e. 4 out of the 18 chiral amino acids with known structures (in addition to dimeric l-cystine). Among the racemates only dl-Val has two known forms. For Gly the α-form, in space group P2₁/n, is the most familiar and readily formed polymorph, while the γ-form is the thermodynamically most stable. Both α- and β-forms of Gly contain L1 hydrogen-bonded sheets, as discussed in Section 5.2, but differ in the way these sheets are stacked to produce either a 2D (α form) or a 3D (β form) overall hydrogen-bonded lattice.
Figure 33. Two hydrogen bonding patterns in crystal structures of amino acid esters with a positive charge. The E1 1D tape motif is illustrated for R-3-bromo-1-methoxy-1-oxopropan-2-aminium bromide (CAQJAH), [150] a methyl ester of fluoroalanine, while the E2 2D sheet motif is found for S-benzyl-l-cysteine methyl ester hydrochloride (FEGVOC). [151] There is a short contact with a C-H donor in the E1 pattern and a three-centre interaction in the E2 pattern.

Temperature-induced phase transitions
Of the standard enantiomeric amino acids in Table 1, transformations to new crystal forms upon cooling have been observed for l-Cys [154,155] and dl-Cys, [156] in both cases with retention of the overall space group symmetry. The first simply involves ordering of a disordered sulphydryl group, the second an overall change in side-chain conformation.
A completely different type of phase transition occurs for the racemates of Met [35,157] and other, non-standard amino acids with linear side chains, such as Abu, [159] Nva [160] and Nle. [161] These structures are divided into layers, and at a specific temperature (which is different on cooling and on heating, i.e. hysteresis occurs) every second molecular bilayer is translated along the two shortest axes to take on a modified type of hydrophobic stacking, while leaving the hydrogen bonding region intact (Figure 36). Recently we have shown that this behaviour is not limited to regular racemates, but also pertains to quasi-racemates, including l-Nva:d-Met, l-Nva:d-Nle and l-Nle:d-Met. [162] Furthermore, enantiomeric amino acids with linear side chains also have interesting phase behaviour. Met has one room temperature and two high-temperature phases where the space group remains P2₁, but with small changes to cell parameters and side-chain disorder. [35] For l-Nva, reversible displacive transformations like those seen for the racemates are observed. [35]

Pressure-induced phase transitions
A small selection of amino acids have been studied at elevated pressure, which usually produces new phases.
The stable γ-polymorph of Gly, in space group P3₁/P3₂, is converted to a new form in space group Pn over a wide pressure range. The Pn polymorph was discovered simultaneously by Boldyreva et al. [56] and Dawson et al., [57] the former group referring to it as 'a new δ-form', while the second group calls it 'ε-glycine', reserving the δ-notation for yet another form, in space group P2₁/a, obtained by compression of β-Gly. [57] Later, Tumanov et al. [58] reported polymorph number six, denoted β′, which was again obtained from the ambient β-form.
Two high-pressure forms were found for l-Ser (l-Ser-II and III) [163-165] and l-Cys (l-Cys III and IV). [166] Going from l-Ser-I (ambient) to l-Ser-II and l-Ser-III did not affect the space group (P2₁2₁2₁), but conformational changes lead to modified hydrogen bonding networks. The same type of changes took place on compression of orthorhombic l-Cys (I) to l-Cys (III). A subsequent reduction in pressure resulted in the formation of l-Cys (IV), not observed on increasing pressure, which can be regarded as an unusual intermediate phase between I and III. [166] l-Ser monohydrate has a single high-pressure transition, where particularly the hydrogen bonding role of the water molecule changes between the phases. [113] As for the racemates of these two amino acids, no phase transitions were observed for dl-Ser up to a pressure of 8.6 GPa. [167] dl-Cys, on the other hand, undergoes a phase transition at only 0.1 GPa, the lowest pressure reported for a phase transition in a crystalline amino acid, yielding dl-Cys (II), which is also obtained by cooling at ambient pressure. [168] Structures after two additional transitions at 1.55 and 6.20 GPa were not properly characterized.
Finally, single-crystal-to-single-crystal transitions have been shown to take place above 1 GPa for S-4-sulpho-l-Phe monohydrate [172] and between 1.5 and 2.4 GPa for dl-alaninium semi-oxalate monohydrate. [173]

17. Amino acids as ligands for metal ions
17.1. Binding modes
Amino acids may interact with metals in a number of different ways, Figure 37, with the bidentate ON binding mode, resulting in the formation of a favourable five-membered ring, being by far the most common. It is observed in 866 out of the 1621 amino acid-metal complexes in the CSD. Usually the amino acid is in a protonation state with an anionic polar head, but in modes not involving the amino group, such as X and OX, the head may also be zwitterionic. For amino acids without functional groups in the side chains, only the limited number of modes ON, Oo, O and N are available.

Structures
In addition to all the standard amino acids, a number of additional amino acids have been designed and synthesized in order to act as a tridentate or in some cases even tetradentate ligand.
Figure 37. Various binding modes between amino acids and metal ions. Note that the side-chain length to the ligating group(s) X may vary considerably and is not limited to -CH₂- as shown here. There is one additional bidentate binding mode called O o, where a side-chain carboxylate group makes a contact with a metal ion equivalent to Oo, and two tridentate binding modes OO o (a subgroup of (O/N)X X together with NX X) and NX N, the latter used only for dimeric amino acids.
Out of all 366 structurally characterized amino acids, 42 occur exclusively in metal complexes. There is a prevalence of side chains with heteroatoms such as N, S and O, but also amino acids devoid of additional ligating functional groups, which accordingly are limited to the mono-and bidentate binding modes described above.

Tri- and tetradentate binding modes
The Co(III) complex shown in Figure 38(a) (BAKXER) [174] is one of only three structures with tetradentate binding (ONX X in Figure 37), which can be considered as a very special type of amino acid-metal interaction. Tridentate binding, on the other hand, is quite common, and 201 examples of the ONX mode were found in the CSD as part of this survey. The most common amino acids involved in this mode are His (48 structures), penicillamine, Pen (47), Asp (40), Cys (33) and Met (11). An ONS complex between Pen and Co(III) with additional binding of Pt(II) is shown in Figure 38(b) (LIGLAP). [175] If the Pt(II) in this structure is replaced by -CH2-CH2-, a special structure of a dimeric amino acid with hexadentate coordination is obtained (CECRAD10). [178] Given the large number of structures with Asp, resulting in the formation of six-membered ring systems involving the metal ion and the amino or carboxylate group of the polar head, the absence of any structure with Glu is remarkable.
By far the most common metal ion is Co (83 structures), with much smaller numbers for Mo (27), Ni (23), Re (23), Cr (9), Cu (8), Ru (7) and V (4). The coordination number of Co is always 6 (with the exception of three complexes with a cyclopentadienyl ligand), which in complexes with Pen and Cys (but not Asp) is frequently the result of binding to two different amino acid anions, as in Figure 38(b). Coordination number 6 also dominates for other metal ions, except Mo, which has coordination number 7 in a long series of typical complexes when the central Mo-Mo interaction is included.

Figure 38. (a) Tetradentate coordination of a special amino acid in a Co(III) complex where propane-1,3-diamine (in wireframe representation) completes the coordination sphere around the metal ion (BAKXER). [174] (b) Regular tridentate ONX binding by two penicillamine (Pen) dianions to Co(III) (LIGLAP). [175] The side-chain sulphur atoms also coordinate with Pt(II) together with 2,2'-bipyridine. (c) Tridentate NS O binding to Cd(II) (NAMDAH). [176] The Cd(II) ion achieves a total coordination number of seven by binding to three different amino acids (3 + 2 + 1) as well as a water molecule. The main carboxylate group is in this case involved only in a monodentate interaction through the O atom coloured in purple. (d) A rare example of tridentate NX N binding by a dimeric amino acid (RIFJOG), [177] here lanthionine (see Figure 9) in an Re(I) complex with three additional carbonyl ligands. The two polar heads together have one neutral carboxyl group and one charged carboxylate group.
The NX X mode is by comparison much less common, as special amino acids with two side-chain ligating atoms are needed. Three out of five such structures in the CSD use S-carboxymethyl-l-Cys, as shown for NAMDAH [176] in Figure 38(c).
A third tridentate mode, which could be called NX N, is found in four structures with the dimeric amino acid lanthionine, as shown in Figure 38(d) (RIFJOG). [177]
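As a quick arithmetic check on the ONX survey numbers quoted above, the following sketch sums the per-amino-acid counts and reports how many of the 201 structures are left for other ligands. The helper name `summarise` is hypothetical, and the remainder is only meaningful under the assumption that each structure is counted once, under a single amino acid, which the text does not state explicitly.

```python
# Counts quoted in the text for the tridentate ONX mode (201 structures in total).
ONX_TOTAL = 201
onx_amino_acids = {"His": 48, "Pen": 47, "Asp": 40, "Cys": 33, "Met": 11}


def summarise(total, listed):
    """Return (sum of listed counts, remainder, listed share in percent)."""
    covered = sum(listed.values())
    return covered, total - covered, round(100 * covered / total, 1)


covered, rest, share = summarise(ONX_TOTAL, onx_amino_acids)
print(covered, rest, share)  # 179 22 89.1
```

Under that assumption the five named amino acids account for roughly nine out of ten ONX structures.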

Bidentate binding modes
The normal ON binding mode is observed for 866 amino acid structures, by far the largest group among metal-amino acid complexes. Total coordination numbers for the metal vary from three to nine. Complexes with threefold coordination occur for Hg, while fourfold coordination occurs for Pd, Pt and Cu (planar), and to a minor extent for Ag, Ir, Hg and V (variable, may involve cyclopentadienyl ligands). Fivefold coordination has been found for Ag, Cd, Tl and Ru and in somewhat larger numbers for Zn, but occurs above all for Cu. While the Zn-complexes tend to lean towards a trigonal bipyramidal geometry, Cu-complexes are always square pyramidal, and the ligand at the apex is a water molecule in about 63% of all structures (Figure 39(a), QEHTAA). [178] Coordination number 7 occurs primarily for V, Cd and Ca, while 9 is reserved for Yb.

Figure 39. (a) [...] [178] The five-membered ring generated by regular NO binding of l-Ala is given a blue shade, the six-membered ring from NN binding of l-His a red shade. (b) l-Pen being simultaneously involved in an NS contact to one Pd(II) (red shade) and an OS contact to a second Pd(II) (green shade) (IYIBIC). [179] (c) Rare example of an OO contact between l-Thr and Ho(III) (FUTVUL). [180] One of the carboxylate O atoms (in purple) forms a monodentate contact. The total coordination number of Ho is brought up to eight by interactions with five additional water molecules. The polar head of the amino acid in this case occurs in the zwitterionic z(0) state. (d) Total coordination number of 12, as a distorted icosahedron, is achieved by La(III) through 6 Oo contacts with l-Cys (orange shade) (UJECUI). [181] Two Co(III) are six-coordinated in two different ways: either by three bidentate NS contacts or by six monodentate S contacts.
The alternative binding mode NX, where the side chain is involved in metal binding, occurs for 138 amino acids, dominated by Cys (56) and Pen (36) followed by His (16), Figure 39(a), and Met (6), with the rest being non-standard amino acids. There is a 3:2 distribution between acids with an anionic polar head and those with a neutral polar head. As detailed above, the latter is a protonation state not observed for non-coordinating amino acids. Common metals are Co (56 structures), which invariably is six-coordinated, as well as Pd (23) and Pt (17), which are tetra-coordinated. The ligating X-atom of the side chain is always N or S; there are no examples of NO binding for Asp, Glu or modified amino acids with side-chain O atoms. In a large fraction of the NS structures with Cys or Pen, the sulphur atom is simultaneously ligated to another atom, often Cd, Co, Cu, Ag or Au, which for the first two may be a bidentate OS interaction as in Figure 39(b) (IYIBIC). [179] OS' is an example of the third bidentate binding mode, generally called OX. It is less frequent than NO and NX and has been found in only 20 structures, with Pen as the dominating amino acid (nine structures) in complexes with Pd, where it is, as described above, usually also involved in NS binding. Thr is involved in four interactions of this type (Figure 39(c)) (FUTVUL). [180] The ligating atom is always S or O; there are no examples of ON modes.

Figure 40. Schematic overview of amino acid structures in the CSD with polydentate metal contacts. The name of the contact is accompanied by the number of structures found, which is reflected by the size of each sphere. Orange colour is used for bidentate interactions, blue for tridentate and red for tetradentate. Overlap, with numbers indicated in white, highlights structures with more than one type of interaction. A single ONX -O o structure should be added.

Figure 41. (a) Exceptional metal coordination in a complex between l-Phe and Ag(I) where the amino acid, in its a(−1) state, does not form any bidentate contacts, but rather two monodentate contacts, the linear geometry resulting in formation of a chain structure (ADILIJ). [182] (b) A second, rare type of monodentate coordination of an amino acid in the a(−1) state where only the -NH2 group is coordinating to a metal ion. Here neutral methyl-(l-Ser-N)-Hg(II) behaves like a simple metalloorganic molecule (DUTZUN10). [183] (c) (η⁶-l-Tyr)-(η⁵-cyclopentadienyl)-Ru(II) sandwich complex (BACQUT) [184] where the amino acid is in the c(1) state. (d) Four zwitterionic l-Ala molecules and eight water molecules interacting with two Tb(III) (DUMCEU), [185] one of 20 complexes of this type where rare-earth metals have square antiprismatic coordination geometry.
Finally, bidentate binding may be realized through κ² coordination of a carboxylate group, resulting in a four-membered ring system. Usually this type of interaction involves Gly or a hydrophobic amino acid (32 out of 53 structures), but also Cys that is additionally involved in normal NS binding, Figure 39(d) (UJECUI), [181] as well as Asp and Glu. The latter, in z(−1) or a(−2) protonation states, may furthermore have their side chains involved in equivalent O o interactions. These occur for 15 Glu, 4 Asp and 2 non-standard amino acids (Figure 38(c), NAMDAH). [176] Metals involved in bifurcated interactions include Pb, Ni, Cd, Hg and Sr as well as various rare-earth elements.
In many metal complexes there are two or more amino acids in the asymmetric unit, and these may or may not coordinate with the metal ions in similar ways. In some cases two different amino acids are included, often with different binding modes. Figure 39(a) provides an example of ON bonding by l-Ala and NN bonding for His. It is furthermore possible for an amino acid to be simultaneously involved in more than one polydentate interaction. The N atom of the polar head is limited to one contact, but combinations like NX -OX (Figure 39(b)) and NX -Oo (Figure 39(d)) do occur, although in very small numbers, as is evident from the overview of polydentate interactions in Figure 40.
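The structure counts scattered through this section can be collected in one place and summed by denticity, as a rough textual counterpart to the overview in Figure 40. The sketch below uses only numbers quoted in the text. The mode labels are plain-text stand-ins for the symbols of Figure 37, the function name `totals_by_denticity` is hypothetical, and a structure showing more than one mode is counted once per mode, so the totals refer to contact types rather than distinct CSD entries.

```python
from collections import defaultdict

# (mode label, denticity, number of structures) as quoted in the text.
MODES = [
    ("ON", 2, 866),
    ("NX", 2, 138),
    ("OX", 2, 20),
    ("Oo", 2, 53),
    ("side-chain Oo", 2, 15 + 4 + 2),  # Glu + Asp + non-standard amino acids
    ("ONX", 3, 201),
    ("NXX", 3, 5),
    ("NXN", 3, 4),
    ("ONXX", 4, 3),
]


def totals_by_denticity(modes):
    """Sum the quoted structure counts for each denticity."""
    totals = defaultdict(int)
    for _label, denticity, count in modes:
        totals[denticity] += count
    return dict(totals)


totals = totals_by_denticity(MODES)
print(totals)  # {2: 1098, 3: 210, 4: 3}

# Share of the ON mode among the 1621 amino acid-metal complexes in the CSD.
on_share = round(100 * 866 / 1621, 1)
print(on_share)  # 53.4
```

No count is given in the text for the tridentate mode involving two carboxylate groups, so it is left out of the tally.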

Monodentate binding modes
When an amino acid is involved in bi- or tridentate contacts to a metal ion, the various functional groups are frequently involved in additional monodentate interactions to other metal ions. Examples are seen in Figures 38(b) and 39(c).
In rare cases, amino acids do not form any polydentate contacts with metal ions, only monodentate contacts. These structures may be divided into three groups depending on which group acts as a ligand: the -NH2 or the -COO− of the polar head, or a group in the side chain. Notably, a molecule may simultaneously participate in several monodentate interactions. Figure 41(a) shows an l-Phe Ag(I) complex (ADILIJ), [182] one of only two structures where an amino acid with an anionic polar head is not involved in any bidentate interactions. Occasionally only the -NH2 group is involved in metal coordination (5 structures with Hg and Pt), as seen for an l-Ser Hg(II) complex in Figure 41(b) (DUTZUN10), [183] or vice versa for the side chain (17 structures, mix of metals). Cyclopentadienyl, Cp, is not a common ligand for metal ions in amino acid complexes (<30 structures in total), but occurs, among others, in the complex with l-Tyr shown in Figure 41(c) (BACQUT), [184] which is unique in the sense that the amino acid is in the c(1) protonation state and that none of the functional groups of the polar head are used as ligands.
Structures with a zwitterionic head where only the carboxylate group interacts with the metal ions are much more abundant. Figure 41(d) shows an l-Ala Tb(III) complex (DUMCEU) [185] as an example of a special group of coordination compounds involving rare-earth metals, where four amino acids interact with two metal ions to form a well-defined cluster, here with an overall charge of 6+ (perchlorate anions are not shown). Alternatively, metal ions may link the amino acids into chain polymers. [186]

18. Conclusions
Amino acids are relatively simple, small organic molecules, but, as should be evident from this review, their world of crystal structures is extremely diverse. Some continents are large and well investigated, but there is an abundance of little-known or even unexplored places to go to. With the wealth of information available through the CSD, now containing more than 3500 amino acid structures, we have the ability to decipher the preferences of amino acids with respect to intermolecular interactions and crystal packing, paving the way for extended use of these molecules, isolated, in organic complexes, as salts or in coordination compounds, as building blocks in future crystal engineering efforts. The possibilities are endless, as are the challenges.