An organic geochemical perspective on terrestrialization

Abstract The colonization of land required new strategies for safe gamete/diaspore dispersal, and to cope with desiccation, harmful radiation, fire and gravity. Accordingly, the morphology, behaviour and physiology of the organisms changed. Here, we explore to what extent physiological adaptations, reflected in the molecular content of the sediments, add to our understanding of the terrestrialization. Many compounds considered characteristic of land organisms do not provide valuable information from the fossil record since (1) they were not preserved; (2) they occur or correspond to substances that evolved prior to the terrestrialization (e.g. cutan vs. algaenan, cellulose); or (3) they have been changed diagenetically and/or catagenetically. The latter leads to geo(macro)molecules without a chemical fingerprint relating them to their original bio(macro)molecules despite, sometimes, excellent morphological preservation of the organic remains. Nevertheless, some molecular markers and their stable isotopes provide independent information on the terrestrialization process. The odd predominance of n-alkane surface waxes is a feature already apparent in early land plants and could, with caution, be used as such. Furthermore, fossil terpenoids and their derivatives are valuable for reconstructing the evolution of major plant groups. The radiation of the phenylpropanoid pathway with for example, sporopollenin and lignin seems to be closely related to the evolution of land plants.

As for other disciplines occupied with unravelling past life and environment, organic geochemistry relies heavily on the paradigm that the present provides a key to the past. For organic geochemistry, the biosynthetic pathways of living organisms provide such a key. The biosynthetic differences between organisms provide insight into the evolution of biosynthetic pathways and this can be applied to, and calibrated against, the fossil record. Evolution also implies adaptation and, by linking species ecology to their biochemistry, the adaptive value of the biosynthetic pathways and the biomolecules produced may be resolved.
This biosynthetic link to environment can also be applied to the past. For instance, the notion that oxygen is required in steroid and non-hopanoid triterpenoid synthesis implies that analysis of steroids in ancient sediments may help to unravel the early evolution of the atmosphere (Summons et al. 2006). Analogous to the oxygenation of the Earth's atmosphere, terrestrialization also required major adaptations of the terrestrializing organisms to the living conditions on land. The conquest of land required the development or strengthening of supportive structures such as skeletons, stems and roots to withstand Earth's gravity without the support of water and to resist wind. It also demanded water-saving strategies such as arthropod and plant cuticles, vertebrate skin and cork to survive low humidity environments and sometimes fire, as well as the development of water-conducting tissues in plants (roots, tracheids). Finally, the higher exposure to harmful radiation required sunscreens such as pigments, aromatic and other substances as UV filters. Whereas the endo-and exoskeletal adaptations are intrinsically based on the formation of large biopolymers, water saving and UV protection may be also realized by means of smaller molecules.
Our working hypothesis is that the major changes in biosynthetic pathways which accompanied the terrestrialization of organisms are reflected in the molecular composition of the organic matter present in Palaeozoic sedimentary rocks. For instance, a minor change in the biosynthesis of sporopollenin may have led to lignin. In return, we can expect changes in the composition of the organic matter (OM) in Palaeozoic rocks to help trace evolutionary trends. The questions are to what extent our working hypothesis is valid and what it contributes to our understanding of terrestrialization. We restrict ourselves to primary producers and arthropods since they have a rich fossil record, both organic geochemically and as micro- and macroscopic remains. We exclude other terrestrial (heterotrophic) life forms (worms, snails, vertebrates, etc.) since they do not usually leave a chemically characteristic signature in the sediments.

Organic matter analysis: lipids and macromolecules
From an analytical point of view, bio- and geomolecules can be subdivided into two types of organic substances. The first type consists of relatively small molecules which dissolve in common organic solvents and form the lipids. These lipids can be analysed relatively easily using, for example, gas or liquid chromatography. Examples are archaeal membrane lipids, higher plant cuticular waxes, long-chain alkenones from Haptophyta and steroids such as dinosterol from dinoflagellates. Those lipids that have a relatively low biodegradability may fossilize as such and can be applied to reconstruct past environments and the (early) evolution of life (e.g. Brocks & Summons 2003).
Apart from this, lipids and lipid ratios can be used to reconstruct the environment such as for long chain alkenones (Marlowe 1984;Brassell et al. 1986;Prahl & Wakeham 1987;Conte et al. 2006) and of archaeal glycerol dibiphytanyl glycerol tetraether membrane lipids (De Rosa & Gambacorta 1988;Schouten et al. 2003;Kim et al. 2008). These biomolecules can also become diagenetically or thermally modified but, as long as the resulting products can be reliably related to their source organisms, they may still provide important clues on past life and environment. For example, triaromatic dinosteroids are derived from the thermal modification of the dinoflagellate steroid dinosterol. Their presence in Palaeozoic sediments has been used as an argument for a Palaeozoic rather than Mesozoic origin of the dinoflagellates (Moldowan & Talyzina 1998;Empt 2004).
The second type of molecules is of macromolecular nature and therefore insoluble in most solvents. In living organisms, the most abundant macromolecules are proteins and polysaccharides. The insoluble organic matter in the sediments, also termed 'kerogen', is poorly understood despite being by far the largest organic carbon pool on Earth (Berner 1989;Vandenbroucke & Largeau 2007) including all the particulate organic matter we see with the naked eye and through microscopes such as leaf and arthropod cuticles and palynomorphs.
This kind of material poses considerable analytical problems with respect to structural elucidation and quantification. Non-destructive methods provide important structural information at the atomic level, such as nuclear magnetic resonance (NMR) (Deshmukh et al. 2005), and at the level of functional groups, such as (micro) Fourier transform infra-red (FTIR) spectroscopy (Marshall et al. 2005;Versteegh et al. 2007). Destructive methods fragment the macromolecules, and these fragments also provide vital information needed to reconstruct the original macromolecule. Of these, chemical degradation applies a series of chemical treatments whereby each successive treatment is able to break stronger chemical bonds than the previous one (Hunt et al. 1986;Gelin 1996;Blokker et al. 1998). Pyrolysis breaks down the molecule thermally in an inert atmosphere (Maters et al. 1977;Nip et al. 1987). For the characterization of fossil macromolecular organic matter and its preservation pathways, however, it is essential to combine several of these techniques. Despite these problems, studying this material is worthwhile (e.g. Briggs et al. 2000).

The biochemical signal from terrestrialization
Desiccation management: long-chain aliphatics
Simple lipids: waxes. Protection against a lack of water is of prime importance for land organisms. This can be achieved by resisting desiccation to keep a positive water balance, for example, by erecting an evaporative barrier at the organism surface and/or by developing desiccation tolerance. It seems reasonable to assume that adaptations to desiccation developed prior to the terrestrialization of plants and algae. Such adaptations might be important for freshwater species, or for species that live in smaller enclosed and coastal habitats, since they enable them to resist periods of dryness and to spread from one watershed to another (e.g. by wind and animals).
The earliest photosynthetic organisms on land were probably cyanobacteria, possibly colonizing land as early as 2.6 Ga ago (Watanabe et al. 2000). They may have been present as single cells or filaments and, with time, colonized a variety of environments such as tidal zones, soils and desert crusts. Although they emerged much later, this also applies to photosynthetic microalgae such as Chlorophyta, Streptophyta, Bacillariophyta (diatoms), Eustigmatales and Dinoflagellata. Generally, these taxa survive dryness by tolerating dehydration as such, or by producing resting cysts (e.g. Zygnemataceae). Mostly they use sugars, lipids or proteins to stabilize the cell contents upon desiccation (Cardon et al. 2008). To our knowledge, no chemical fingerprint is known from which this strategy may be reconstructed in the fossil record. Stable carbon and hydrogen isotopes of lipids can perhaps help here.
The sister group of the Chlorophyta, the Streptophyta, gave rise to the Embryophytes (land plants), the only group which developed into macroscopic organisms with strategies to cope with the uneven and erratic water supply on land. The very first evidence of land plants is indirectly documented by cryptospores (Strother 2000). Taylor & Strother (2008) describe Middle Cambrian palynomorphs which morphologically and ultrastructurally resemble cryptospores in many ways. However, their affinity with terrestrial or algal organisms is still under discussion; the earliest accepted occurrence of cryptospores is of Llanvirn age (middle Ordovician) (Strother et al. 1996).
The earliest 'megafossils' are documented in the late Llandovery (Lower Silurian) (Wellman & Gray 2000;Edwards 2001) (Fig. 1). These plants were probably small (Wellman et al. 2003) and had no water-conducting organs such as tracheids or roots. In this respect, they were like the modern Bryophyta, which are the most primitive land plants living today. According to Proctor (2000), many of the Bryophyta have three water components: symplast water (in the cells), apoplast water (in the 'free space' of the cell walls) and external capillary water which, for much of the time, exceeds the symplast water by a large but variable quantity. For these species, water and nutrient transport is therefore largely via the outside of the plant. This implies that for these very primitive plants, resistance to desiccation is mostly achieved by drought tolerance (although this seems not to be the case for their spores).
However, not all bryophytes rely on external capillary water. Most thalloid liverworts and some erect growing mosses with waxy water repellent surfaces (e.g. many Polytrichaceae and Mniaceae) rely predominantly on internal water conduction (Proctor 2000), so that these organisms develop external coverings which function as water barriers.
With few exceptions, lipids associated with external coverings are the principal water barrier component of land plants and animals (Hadley 1989). For plants, these lipids consist primarily of unbranched, fully saturated linear hydrocarbon backbones with chain lengths of 20-40 carbon atoms. Typically these lipids are n-alkanes, n-ketones and secondary alcohols with a predominance of odd-numbered chain lengths and primary n-aldehydes, n-alcohols and n-fatty acids with a predominance of even-numbered chain lengths. Furthermore, a wide variety of C36-C60 wax-esters has been detected. These aliphatic coatings are responsible for >99% of the water barrier efficiency in plants (Schönherr 1976;Mérida et al. 1981;Jetter et al. 2006). It is therefore no surprise that pollen are also covered with surface waxes (Piffanelli et al. 1998;Schulz et al. 2000).
For arthropods (earliest evidence: late Cambrian; McNaughton et al. 2002) the strategy towards desiccation is similar to that of plants. Cuticular lipids are to a large extent identical but mono- and di-methyl alkanes also occur and may even dominate; for these organisms, they are also the primary passive barrier to evaporative water loss (e.g. Ramsay 1935;Hadley 1989;Gibbs 1998, 2002). Long-chain lipids are also synthesized by various algae, such as long-chain mid-chain diols by Eustigmatophyta and diatoms (de Leeuw et al. 1981;Versteegh et al. 2000;Sinninghe Damsté et al. 2003), long-chain alkenones and alkenoates by Haptophyta (de Leeuw et al. 1980;Volkman et al. 1980;Rontani et al. 2007) or long-chain polyunsaturated alkenes by dinoflagellates (Mansour et al. 1999). However, the odd predominance of n-alkanes is a typical feature of land plants, though some microalgae such as Tetraedron (Chlorophyta) or Nannochloropsis (Eustigmatophyta) also produce long-chain n-alkanes with a strong odd predominance (Gelpi et al. 1970;Gelin et al. 1997). Other algae also produce long-chain n-alkenes with odd predominance (Gelpi et al. 1970; see also Volkman et al. 1998) and it has been suggested that such alkadienes from the green alga Botryococcus braunii have given rise to an n-alkane odd-predominance in sediments (Lichtfouse et al. 1994;Riboulleau et al. 2007).
Since the odd predominance of long-chain n-alkanes already occurs in the cuticular waxes of primitive plants such as liverworts (Matsuo et al. 1974) and mosses (Nissinen & Sewón 1994), it is interesting to investigate how this feature developed through the Palaeozoic. A marked odd predominance of n-alkanes with a maximum in C23 or C25 is observed in the extracts of early mature to mature coals of Permian age from Brazil and Australia (Casareo et al. 1996;Silva & Kalkreuth 2005). In contrast, the n-alkane distribution in the extracts of many coals of Carboniferous age is not characterized by a strong predominance of odd-numbered compounds; this feature can be ascribed to the maturity of the studied coal samples. However, even early mature coal samples only show a moderate odd predominance in the C25-C31 range (e.g. Powell et al. 1976;Christiansen et al. 1989;Dzou et al. 1995;Fleck et al. 2001;Armstroff 2004).
An exception is the marked predominance in the C25-C33 range observed in the extracts of very immature early Carboniferous coals from the Moscow Basin (Armstroff 2004). The long-chain predominance is also visible in some Devonian samples, in particular when (mio)spores are present in significant amounts: C25-C33 (max C25) in the immature Famennian marls from Poland (Marynowski & Filipiak 2007); C23-C31 (max C27) in the middle-late Frasnian early mature cannel coals from Melville Island, Arctic Canada (Fowler et al. 1991); C23-C29 (max C25) in early mature Middle Devonian cutinite-rich humic coals from China containing Zosterophyllum remains (Sheng et al. 1992). These observations indicate that the characteristic signature of epicuticular waxes was already acquired in the Middle Devonian. However, if we do not consider maturity problems, there seems to be a temporal evolution in the maximum of the n-alkane distribution which could be related to plant evolution: from around C25 in the Devonian to C29 in the Carboniferous and C23-C25 during the Permian. Present-day higher plants are characterized by a maximum in C29-C31.
[Fig. 1 caption, beginning not recovered: ... Strother (2000) and in black from Strother et al. (1996) and Steemans et al. (1996). Occurrence of plant megafossils from Wellman & Gray (2000) and Edwards (2001). See text for further references and explanations. Figure created with TS Creator (www.stratigraphy.org).]
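The two quantities compared throughout this discussion, the strength of the odd predominance and the carbon number at which the n-alkane distribution peaks, can be computed from a table of homologue abundances. The following is a minimal sketch, not the procedure used in the cited studies: the function name, the default C23-C33 window and the example abundances are all hypothetical, and a plain odd/even ratio is used in place of the averaged-window indices (such as the carbon preference index) that are common in the literature.

```python
from typing import Dict, Tuple


def odd_even_summary(
    abundances: Dict[int, float], c_min: int = 23, c_max: int = 33
) -> Tuple[float, int]:
    """Return (odd/even abundance ratio, carbon number of the maximum).

    `abundances` maps n-alkane carbon number to a relative abundance.
    The window limits are illustrative assumptions, not values from the text.
    """
    # Keep only the homologues inside the chosen chain-length window.
    window = {c: a for c, a in abundances.items() if c_min <= c <= c_max}
    odd = sum(a for c, a in window.items() if c % 2 == 1)
    even = sum(a for c, a in window.items() if c % 2 == 0)
    if even == 0:
        raise ValueError("no even-numbered homologues in the selected range")
    # Carbon number with the highest abundance (the 'max' quoted per sample).
    modal_carbon = max(window, key=window.get)
    # Note: an odd-bounded window holds one more odd than even homologue,
    # so this simple ratio is slightly biased towards odd numbers.
    return odd / even, modal_carbon


# Hypothetical distribution with a maximum at C25 (values are made up).
example = {23: 4.0, 24: 2.0, 25: 10.0, 26: 3.0, 27: 8.0, 28: 2.0,
           29: 6.0, 30: 1.0, 31: 3.0, 32: 1.0, 33: 1.0}
ratio, modal_carbon = odd_even_summary(example)
print(f"odd/even ratio = {ratio:.2f}, maximum at C{modal_carbon}")
```

A ratio close to 1 would correspond to the weak odd predominance described for mature coals, whereas values well above 1 correspond to the marked predominance reported for immature samples. The sketch says nothing about the source of the alkanes, which may be algal as well as terrestrial.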
Most coals older than the Devonian are liptinite-rich coals, which may derive from spores or from algae. Their n-alkane signature may therefore not be terrestrially derived. Due to the occurrence of odd-numbered n-alkanes in some algae, the n-alkane distributions must be interpreted with care. This is particularly true for samples older than Devonian, where terrestrial debris is scarce. An odd-number n-alkane predominance has previously been observed in several Botryococcus-rich torbanites of Permian age and was mainly attributed to a higher plant contribution (Araujo et al. 2003;Dawson 2006). These odd-numbered n-alkanes could also derive from the saturation of Botryococcus lipids, however (for an overview on Botryococcus lipids see Metzger et al. 2007).
Slight odd predominances in the range C25-C31 were also observed in the extracts of a Cambrian sediment from Tarim Basin (China) (Zhang et al. 2000) as well as in the extracts of several Proterozoic, Cambrian and Ordovician sediments from different basins in East China (Li et al. 2002). In the case of East China, this distribution was assigned to a contribution from cyanobacteria of the Spirulina type (Li et al. 2002). Considering the stratigraphic distribution of microscopic higher plant remains (Fig. 1), a contribution from higher plant lipids is somewhat questionable in the Cambrian and Ordovician samples. Since land-plant derived n-alkanes are (as far as C3 plants are considered) usually more enriched in 12C relative to 13C than their algal counterparts, comparison of the stable carbon isotopic composition of these alkanes with the isotopic composition of typical marine/aquatic biomarkers may aid in the identification of a land-plant signal.
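The isotopic comparison suggested above can be illustrated with a short sketch. It assumes that compound-specific δ13C values (in per mil) are available both for the long-chain n-alkanes and for biomarkers of known aquatic origin in the same sample; a stronger enrichment in 12C appears as a more negative δ13C. The function names, the example values and the -2 per mil decision threshold are hypothetical and serve only to make the reasoning explicit.

```python
from statistics import mean
from typing import Sequence


def isotopic_offset(
    alkane_d13c: Sequence[float], aquatic_d13c: Sequence[float]
) -> float:
    """Mean d13C of the n-alkanes minus mean d13C of aquatic biomarkers.

    Both inputs are in per mil. A negative result means the n-alkanes are
    more enriched in 12C than the aquatic reference compounds.
    """
    return mean(alkane_d13c) - mean(aquatic_d13c)


def suggests_land_plant_input(offset: float, threshold: float = -2.0) -> bool:
    """True if the n-alkanes are depleted in 13C by at least |threshold|.

    The threshold is an illustrative assumption, not a published criterion.
    """
    return offset <= threshold


# Hypothetical measurements (per mil), made up for the example.
alkanes = [-31.0, -30.5, -31.5]
aquatic = [-27.0, -26.5, -27.5]
offset = isotopic_offset(alkanes, aquatic)
print(f"offset = {offset:.1f} per mil, "
      f"land-plant signal suggested: {suggests_land_plant_input(offset)}")
```

Such a comparison is only indicative, and any real threshold would have to be justified for the samples and time interval in question.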
Macromolecules. In addition to simple lipids, protection against desiccation could also be offered by resistant biomacromolecules made of long-chain lipids (see also reviews by van de Leeuw et al. 2006), although the exact role of these macromolecules is not entirely clear. Mono- to tetra-functionalized long-chain alcohols and acids form the building blocks of cutin, a biopolymer present in the cuticles of most land plants (Kolattukudy 1981;Deshmukh et al. 2003). Pyrolysis shows that aliphatic lipids are also major building blocks of cutan (Tegelaar et al. 1989a;Boom et al. 2005), a macromolecule which occurs in the cuticles of plants with crassulacean acid metabolism (CAM), where it may be an evolutionary solution to severe drought stress (Boom et al. 2005). Nuclear magnetic resonance (NMR) analysis suggests that this aliphatic biopolymer also contains aromatic units (Deshmukh et al. 2005). Cutin and cutan should therefore be considered 'terrestrial markers' of higher plants. The preservation potential of cutin is, however, very low: its building blocks are cross-linked by relatively weak ester bonds, whereas the building blocks of the more resistant cutan are connected by stronger ether bridges.
The production of aliphatic biopolymers was not invented by land plants but evolved much earlier.
The analysis of the cell walls of a wide range of algae shows that some algae produce a resistant aliphatic biopolymer made of cross-linked long carbon chains called algaenan (see review by Metzger et al. 2007). Algaenans resist harsh chemical treatment and they have a relatively high preservation potential which may account for the long fossil record of the Chlorophyta (Batten 1996;Batten & Grenfell 1996;Colbath 1996;Guy-Ohlson 1996;van Geel & Grenfell 1996;Wicander et al. 1996). Although the exact biological role of algaenan is not known, it has been suggested that the highly aliphatic (plastic-like) algaenan may function as a relatively waterproof layer; it is interesting to note that apart from the marine eustigmatophyte Nannochloropsis, the development of algaenan mostly occurs in freshwater algae, notably Chlorophyta (e.g. Tetraedron, Scenedesmus, Pediastrum), which probably have the largest risk of desiccation.
The similarities in structure and function between algaenan and the cutin and cutan of land plants (all aliphatic biopolymers based on ether- or ester-linked long-chain fatty acids; van ) may imply that they represent subsequent stages in the evolution of the same biosynthetic pathway. This could constitute an argument in favour of the freshwater algal origin of terrestrial plants. In sediments, cutan is difficult to discriminate from algaenan or from a highly aliphatic cutan-like geopolymer. The latter can be formed from common lipids by oxidative polymerization during early diagenesis, discussed below (Boom et al. 2005;Gupta et al. 2006a, 2007a;de Leeuw 2007). This implies that fossil aliphatic biopolymers do not provide evidence for a land-plant (cuticular) origin of the organic matter per se. In this case, stable carbon isotopic analysis of the aliphatic constituents may also be needed to obtain a conclusive answer.

UV protection, radiation damage
Terrestrialization also involved an increased exposure to harmful radiation. Ultraviolet-B (UVB) rapidly attenuates in water (penetration depth is of the order of millimetres) so protection to UVB had to increase considerably. The optical properties of plants (especially the effects of UV radiation on plants) have been intensively studied, partly in reaction to the recognition of the polar ozone holes in the atmosphere (Rozema et al. 2002a;de Bakker et al. 2005;Pfündel et al. 2006). In land plants, UV absorption is achieved by aromatic compounds such as the flavonoids and their derivatives (I -V) and other compounds produced via the phenylpropanoid pathway such as hydroxycinnamic acid (VI) and lignin. A wide diversity of flavonoids are already present in the Bryophyta. Flavonoids are absent in Hornworts (Stafford 1991;Rausher 2006) and have been found in one alga, the Charophyte Nitella (Markham & Porter 1969;Iwashina 2009). Flavonoids, or their derivatives in sediments, are of potential interest to elucidate the early terrestrialization of the embryophytes.
Flavonoids were widely used during the 1970s for plant systematics as well as evolutionary studies, in particular for angiosperms, based on the distinction of 'advanced' v. 'primitive' characters (Crawford 1978;Giannasi 1978;Stuessy & Crawford 1983) (Fig. 2). However, it appeared that the advanced v. primitive distinction in flavonoid composition was not straightforward, and that one given compound could be synthesized via different pathways. Flavonoids may therefore not be fully reliable indicators for phylogenetic studies at higher taxonomic levels (Crawford 1978;Giannasi 1978). Flavonoids such as kaempferol (III) are known from Tertiary sediments (Niklas & Giannasi 1977a, b) and their earliest record is of the biflavonoid 5-O-methylginkgetin (II) from Cretaceous Ginkgo fossils (Zhau et al. 2006). Accordingly, despite their potential interest for the terrestrialization process, it seems that they are not preserved long enough in sediments to provide further insight into early embryophyte evolution.

Suberin
Suberin (Tegelaar et al. 1995) is another ester-based macromolecule produced by plants. It seems to be primarily used to form barriers between compartments or with the exterior. Depending on the place of deposition in the plant, it protects against fire, desiccation and pathogens and limits ion transport and gas diffusion. It occurs in roots and tubers, bundle sheath cells of C4 plants and as cork in woody species that have secondary thickening (Enstone et al. 2002;Franke & Schreiber 2007).
The biopolymer suberin consists of an aromatic domain and an aliphatic domain. The aromatic building blocks are generated via the phenylpropanoid pathway, such as p-coumaric, ferulic and sinapic acids (VII-IX), which are also found in sporopollenin and, as alcohols, in lignin (see below). The aliphatic building blocks are similar to those of the plant cuticle (discussed above). The aliphatic component is considered to reduce transport whereas the aromatic part has been suggested to inhibit pathogen invasion (Kolattukudy 2001;Bernards 2002;Franke & Schreiber 2007). Like lignin, suberin is not known from the most primitive embryophytes. Being ester cross-linked, suberin probably has a low preservation potential. Due to its mixture of aliphatic and aromatic monomers it may be difficult, if not impossible, to deduce a specific suberin fingerprint from the fossil record. Suberan is a rather enigmatic, highly aliphatic, non-hydrolysable biopolymer. It has been described from bark (Tegelaar et al. 1995). Possibly, this polymer originates from oxidative polymerization of unsaturated lipids.

Signalling and warfare
Living on land also required a new way of transmitting signals between organisms. Previously, signalling was restricted to water soluble compounds; on land, volatile compounds had to be developed.
Here, nature has expanded in a myriad of molecules such as repellents, odours and pheromones. To be effective, these molecules need to provoke a reaction by the receiver, implying that the compounds must be biologically active. Often such compounds are already active at low concentrations. For warfare this is different; the toxins may remain on the organism outside or in the cells and tend to be lipid or water soluble. They need not be transportable by air. In this case, however, the compounds are also constructed to be biologically active and interfere with the physiology of the attacking organism.
Many of these compounds are produced via the phenylpropanoid pathway which experienced a huge radiation with the adaptation of plants to land (Cooper-Driver & Bhattacharya 1998; Lewis & Davin 1999). Compounds produced via this pathway function for example, as antioxidants, have antifungal or antimicrobial properties or are insecticides, nematocides, antifeedants and poisons. These include the flavonoids mentioned above (I, III, IV) and their dimer (II) to polymers [the latter based on flavan-3-ols and/or flavan-3,4-diols as building blocks (I)], the proanthocyanidins or condensed tannins (V) (He et al. 2008). In addition, these all have a strong influence on soil structure and composition by retarding organic matter breakdown and capturing nutrients and, through this, substantially modifying the global carbon cycle (Kraus et al. 2009).
Condensed tannins appear much later in evolution than the flavonoids (Fig. 2). They occur only in leptosporangiate ferns, gymnosperms and angiosperms (Popper & Fry 2004;Popper 2008), leaving a much smaller impact on the carbon budget for the early evolution of land plants. Such fossil tannins are only unequivocally known from brown coal (Wilson & Hatcher 1988).
Two other groups of tannins exist; neither is derived from flavonoids. These are the hydrolysable tannins, which are typical for angiosperms (Okuda et al. 2000) but also occur in the filamentous green alga Spirogyra (Nishizawa et al. 1985), and the phlorotannins, which occur in brown algae. Since neither group has direct relevance for terrestrialization, they will not be considered further.
Another common and diverse group of products of the phenylpropanoid pathway comprises the lignans (phenylpropanoid dimers), nor-lignans (with diphenylpentane carbon skeletons) and lignan oligomers which, in contrast to lignin (a polymer of monolignols), are non-structural components (Lewis & Davin 1999;Suzuki & Umezawa 2007). Lignans are known from bryophytes such as liverworts and hornworts (Lewis & Davin 1999) and are not known from algae (see also the lignin section below). Lignan remains may therefore have been preserved in the earliest land plants and could be used as tracers.
Another protective strategy is the production of resins. Plant resins are used typically for protection (Langenheim 1995). Resin can be exuded passively (when a plant is wounded) or actively. Often resin emerges from canals or resin cells. Resin provides a mechanical and chemical protection against pathogens. When present on leaves, resin also acts as a barrier against desiccation and UV damage.
Resins contain a complex mixture of nonvolatile compounds mainly consisting of di-or triterpenoids. In addition, resins contain volatile compounds, including mono-and sesqui-terpenoids, which can be dominant in fresh resins. These volatile compounds tend to disappear with time but a fraction can remain entrapped when the matrix polymerizes and becomes hard (Anderson & Crelling 1995). Resin-(and amber-) producing trees are present both among gymnosperms (e.g. conifers, cycads) and angiosperms. The most prolific genera all live in the tropical to subtropical region (Langenheim 1995).
Plant resins have a rich fossil record in the form of amber or resinite. Amber and resinite are more or less synonymous terms describing geological material evolved from plant resin. The main difference lies in the fact that amber generally describes macroscopic remains, while resinite describes microscopic remains observed petrographically (Anderson & Crelling 1995). This may explain why 'true' amber is rarely described before the Cretaceous while resinite has been described in coal samples as old as Carboniferous (and maybe older). Among the oldest recognized ambers are Late Triassic ambers from Italy (Roghi et al. 2006), although some reports of Carboniferous amber exist (Smith 1896). This observation fits well with anatomical evidence that the earliest plants showing resin channels or resin-filled cells in their anatomy are members of the earliest but now extinct gymnosperm group, the Pteridospermopsida (Rothwell & Taylor 1972;Millay & Taylor 1977;Dunn et al. 2003). Although Pteridospermopsida emerge in the Late Devonian, the representatives with resin channels are from the Carboniferous. From this perspective, resin production seems to have evolved in the early gymnosperms. Analysis of the molecular composition of resinite and amber may therefore provide valuable information on the evolution of gymnosperms and angiosperms but is unlikely to elucidate the early evolution of land plants.
Five chemical classes of ambers have been recognized (Anderson & Crelling 1995). A large majority of ambers is based on labdanoid diterpene (X) polymers (Class I ambers). Less abundant are ambers based on polymers of cadinenes (XI), which are sesquiterpenoids (Class II ambers) (Anderson et al. 1992;Anderson & Crelling 1995). Class III-V resins are relatively rare. While class I resins derive both from gymnosperms and angiosperms, class II resins only derive from angiosperms, in particular the Dipterocarpaceae (Anderson et al. 1992). From this observation, the oldest resins and, in particular, resinites from the Palaeozoic should belong to class I and derive from gymnosperms.
Amber or resinite samples older than the Cretaceous have often been studied by pyrolysis gas chromatography mass spectrometry (py-GC-MS) rather than by simple extraction. In the case of recent to Cretaceous resinous material, the py-GC-MS approach allows the structure and the class of the resin to be identified successfully (Anderson et al. 1992;van Aarssen & de Leeuw 1992;Anderson & Botto 1993). The method has unfortunately proven much less successful in the case of older ambers or resinites, because most compounds liberated upon pyrolysis were poorly informative.
In this way, Roghi et al. (2006) conclude that Triassic ambers from Italy have an affinity with class II or class I resins despite the fact that, due to their age, it is unlikely that these resins are class II. Similarly, the pyrolysates of Carboniferous resin-like material, that is, resinites isolated from coals (Crelling & Kruge 1998;Nip et al. 2009) and resin rodlets of pteridosperms (van Bergen et al. 1995), are dominated by alkane-alkene doublets, as well as alkylphenols and aromatics. This strongly contrasts with the pyrolysates of Cretaceous or younger resins. These results could reveal an 'unknown extinct type of resin' associated with Carboniferous plants (van Bergen et al. 1995). However, it seems highly plausible that, despite its reputation for excellent chemical preservation, amber also suffers from diagenetic aliphatization, like other macromolecular compounds (see below). As far as we know, no biomarker of taxonomic interest has yet been extracted from Palaeozoic resinites. Amber and resinite still have to demonstrate their suitability for the study of terrestrialization.
Elsewhere in plants (not only in plant resins), most signalling and warfare relies on a large family of compounds: the terpenoids. The smallest compounds (the monoterpenoids) are highly volatile and therefore rarely preserved in sediments, the major exception being in amber. The terpenoids of higher molecular weight (referred to as sesqui-, di- and tri-terpenoids) are frequently observed in ambers and sediments.
Monoterpenoids. Monoterpenoids mostly correspond to small odoriferous compounds incorporated in amber. Original monoterpenoids or their products of diagenetic transformation can therefore be observed. The extraction of amber frequently releases borneol, isoborneol and camphene (XII-XIV) (Armstrong et al. 1996; Czechowski et al. 1996). Monoterpenes found in amber could potentially be of taxonomic interest for plants of Carboniferous or younger age, but not for the earliest plants (see above). In addition, this approach would require that the plant which produced the resin is clearly identified.
Sesquiterpenoids. Among the sesquiterpenoids, eudesmane and cadinanes are the most important for the study of terrestrialization. The eudesmane skeleton (XV) is present in many terrestrial plants, in particular angiosperms, but also in liverworts. Although it has also been described in a few sponges (Gross & König 2006), eudesmane is generally considered as a marker of terrestrial origin for petroleum and sediments (Philp 1994). Eudesmane is absent from petroleum older than Devonian age (Alexander et al. 1984), confirming its association with non-flowering plants.
Often analysed with eudesmane is drimane (XV, XVI) which, however, very likely has a bacterial origin (Alexander et al. 1984) since it has been observed in Palaeoproterozoic sediments (Dutkiewicz et al. 2007). The presence of eudesmane in Palaeozoic coals is rarely reported. However, Dzou et al. (1995) observed significant amounts of eudesmane in Late Carboniferous coals from Pennsylvania. Although its presence was questioned by Borrego et al. (1999), eudesmane was also reported by del Río et al. (1994) in Late Carboniferous oil shales from Spain. These are the earliest reports of eudesmane in geological samples to our knowledge (Fig. 1), although we might expect an earlier appearance since this compound is present in some liverworts (Toyota & Asakawa 1990). The abundance of eudesmane in coal extracts decreases as maturity increases (Dzou et al. 1995), which might explain the rarity of reports of this compound in Palaeozoic sediments so far.
Cadinane (XVII) is thought to derive from cadinenes and cadinols which are ubiquitous in plants, bryophytes and fungi (Bordoloi et al. 1989) and from fragmentation of polycadinene resins (class II) produced by angiosperms (van Aarssen et al. 1990). Cadinane is particularly present in gymnosperm (class I) resins (Simoneit 1986;Grimalt et al. 1988). The dimers of cadinane, bicadinanes, are often observed in oils. They are generated by maturation of angiosperm (class II) resins (van Aarssen et al. 1990;Stout 1995). Cadinanes are therefore well-characterized higher plant biomarkers which are often observed in Cretaceous or younger sediments (van Aarssen et al. 1990). So far, fully saturated cadinanes have not been observed in Palaeozoic sediments.
Diterpenoids. Particularly abundant among conifers and their ambers are diterpenoids with abietane, pimarane, kaurane and podocarpane (XVIII-XX) skeletons. They are mostly produced by higher plants, though some marine algae also synthesize these compounds in a highly functionalized form (Simoneit 1986). According to the review of Alexander et al. (1987a), the different land plants can be recognized from their specific diterpenoid contribution. Bryophytes and pteridophytes differ from gymnosperms by their absence of abietane, beyerane and phyllocladane (XVIII, XXII, XXIII) skeletons. The appearance of these latter compounds in the sedimentary record could therefore document the transition from 'horizontal' to 'vertical' land plants. Phyllocladane, in particular, would be a specific biomarker for conifers.
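The screening logic described in this paragraph can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `flag_vertical_plant_markers` and the input format (a list of skeleton names detected in an extract) are hypothetical, and treating the ent- stereoisomers like their parent skeletons is a simplification. The rule merely encodes the statement that abietane, beyerane and phyllocladane skeletons are absent from bryophytes and pteridophytes, with phyllocladane taken as conifer-specific; it ignores diagenetic alteration and alternative sources.

```python
# Skeletons reported as absent from bryophytes and pteridophytes
# (as summarized in the text); names are lower-case strings.
GYMNOSPERM_TYPE = {"abietane", "beyerane", "phyllocladane"}
# Skeleton regarded in the text as a specific conifer biomarker.
CONIFER_SPECIFIC = {"phyllocladane"}


def _normalise(name):
    """Lower-case a skeleton name and drop a leading 'ent-' prefix.

    Ignoring the ent- prefix is an assumption made for this sketch.
    """
    name = name.strip().lower()
    return name[4:] if name.startswith("ent-") else name


def flag_vertical_plant_markers(detected):
    """Return the detected skeletons that point to gymnosperm-type input.

    `detected` is an iterable of skeleton names (hypothetical input format).
    """
    found = {_normalise(name) for name in detected}
    return {
        "gymnosperm_type": sorted(found & GYMNOSPERM_TYPE),
        "conifer_specific": sorted(found & CONIFER_SPECIFIC),
    }


# Example: an extract containing ent-kaurane, ent-beyerane and pimarane.
print(flag_vertical_plant_markers(["ent-kaurane", "ent-beyerane", "pimarane"]))
```

A non-empty `gymnosperm_type` list would only suggest, not prove, input from plants beyond the bryophyte and pteridophyte grade, since the source assignments themselves remain debated.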
The occurrence of the fully saturated diterpenoids in Permian and Carboniferous coals is relatively frequent (e.g. Noble et al. 1985;Schulze & Michaelis 1990;Fleck et al. 2001;Fabianska et al. 2003;Piedad-Sánchez et al. 2004;Izart et al. 2006); phyllocladane, ent-beyerane and ent-kaurane (XX -XXV) are often reported in particular. The presence of phyllocladane and ent-beyerane in these samples is consistent with the evolved flora which existed in the Late Carboniferous.
The study of Schulze & Michaelis (1990) on Carboniferous coals from Germany showed that ent-kaurane is more abundant in the Westphalian samples from the Ruhr, while ent-beyerane and phyllocladane are more abundant in the Westphalian and Stephanian coals from the Saar. The authors proposed that this change might be due to different inputs of higher plants, possibly related to the different sedimentary settings (limnic in the Ruhr v. paralic in the Saar). Changes in the pentacyclic terpenoids were also documented in these coals (Vliex et al. 1994; Auras et al. 2006) (see below). The presence of ent-beyerane, ent-kaurane and phyllocladane in Lower Carboniferous sediments has led to the suggestion that a precursor of the conifers already produced these compounds at that time (Disnar & Harouna 1994). Since these compounds do not occur in the Pinaceae, it has been suggested that the Pinaceae separated early from the other conifers (Armstroff et al. 2006), implying that conifers had already evolved in the Early Carboniferous. Sheng et al. (1992) described an abundance of tetracyclic diterpanes in Middle Devonian humic coals from China. Among the identified compounds are 17-norphyllocladanes, ent-beyerane and ent-kaurane. This corresponds to the earliest reported occurrence of ent-kaurane and ent-beyerane (Fig. 1). Palaeobotanical data indicate that these Middle Devonian coals mainly derive from pteridophytes (Sheng et al. 1992). These are plants which, according to the review of Alexander et al. (1987a), should contain neither phyllocladane nor beyerane. The absence of phyllocladane from the Middle Devonian coals therefore appears consistent with the absence of conifers during this period, while the presence of ent-beyerane questions either the origin of this compound in the Devonian coals or its absence from pteridophytes (Sheng et al. 1992).
As far as we know, the oldest reported occurrence of phyllocladane is Serpukhovian, that is, late Early Carboniferous (Fabianska et al. 2003; Izart et al. 2006) (Fig. 1).
Totally or partially aromatized compounds deriving from the tricyclic terpenoids are also frequently reported in sediments. The most common compounds are retene and simonellite (XXVI-XXVII) which are thought to derive from the aromatization of abietane. However, Alexander et al. (1987b) demonstrated in Miocene coals that retene and simonellite are more likely derived from phyllocladane and kaurane. Retene has been described in the extracts of numerous Carboniferous coals (del Río et al. 1994; Stefanova et al. 1995; Fabianska et al. 2003; Armstroff et al. 2006; Izart et al. 2006). Armstroff (2004) also describes the presence of retene in Frasnian cannel coals from Russia. To our knowledge, this is the earliest reported occurrence of retene which can be confidently associated with a terrestrial origin (Fig. 1). If, as proposed by Alexander et al. (1987b), retene from land plants derives from aromatization of kaurane, its presence in Devonian coals has no strong significance as kaurane is mostly associated with Bryophyta and Pteridophyta. Conversely, if it is demonstrated that retene only derives from abietane or from phyllocladane, its presence in Devonian sediments would document that at least the biosynthesis of abietic acid (if not conifers) already existed in the Famennian.
Retene, however, has also been observed in the extracts of several Lower Palaeozoic to Precambrian carbonates, where inputs from terrestrial plants are not likely (Jiang et al. 1995;Zhang et al. 1999). An algal and/or bacterial source was proposed by these authors. Consistent with this conclusion, retene was also observed (although in low amounts) in the pyrolysates of a green alga and cyanobacterium cultures (Wen et al. 2000). Care is therefore recommended in the interpretation of the presence of retene in sediments where inputs from higher plants are low and, most particularly, in Ordovician-Silurian sediments.
Triterpenoids. The triterpenoids are a very large family which comprises the well-known group of bacterial biomarkers, the hopanoids, as well as the hypersalinity biomarker gammacerane (XXVIII) (Simoneit 1986). Several higher plant biomarkers also belong to this family, the most famous compound being oleanane (XXIX) (Simoneit 1986).
According to a recent review of the distribution of pentacyclic triterpenes by Jacob (2003), some skeletons are particularly widespread among angiosperms such as oleanane, lupane, friedelane and ursane (XXIX -XXXII). At least one skeleton, the serratane (XXXIII), would be characteristic of gymnosperms, mosses, ferns, lycopodiophytes and bryophytes. Although it has been observed in several angiosperms (and in particular Poaceae), fernane (XXXIV) is in particular present among ferns (Jacob 2003). In relation to their abundance in angiosperms, the occurrence of oleanane, lupane, ursane and their derived compounds is mostly restricted to Cretaceous or younger oils and sediments (Peters et al. 2005). The presence of oleanane in Carboniferous sediments has however been described by Moldowan et al. (1994). This was followed by a long investigation in order to identify the source of this compound in Palaeozoic sediments. Recent studies identified Gigantopterids as the source of oleanane in Palaeozoic sediments which, added to morphological arguments, would place the appearance of the angiosperm lineage before the Permian (Taylor et al. 2006).
As a matter of fact, aromatic arborane/fernane derivatives have also been observed in several sediments where terrestrial inputs are seemingly absent (Hauke et al. 1992). According to Hauke et al. (1995), however, the compounds identified by Vliex et al. (1994) are true fernane derivatives. Recently, a detailed geochemical and botanical study of these Stephanian coals allowed Cordaites to be identified as the source of these fernane derivatives (Auras et al. 2006).

Skeletal materials
Organisms living on land lack the support of water to overcome gravity and therefore have to strengthen or develop supportive tissues. If we consider the present-day skeletal biopolymers which are potentially stable enough to survive in the fossil record on a regular basis, and thus potentially leave a fingerprint of the terrestrialization process, we often observe that these are located at the boundary between the inside of the cell or organism and its outside, and combine their structural function with protection (for example, cuticles). They consist of only four basic building blocks produced by ancient biosynthetic pathways common to Archaea, Bacteria and Eukarya (Kandler & König 1998). These building blocks are sugars, amino acids, long-chain lipids and aromatics.

Sugars, amino acids and their polymers - chitin.
Among the earliest such structures are probably the peptidoglycans (amino acid-sugar polymers) found in the cell walls of Bacteria and Archaea. Peptidoglycans are known to be resistant to degradation compared to proteins and are observed in recent sedimentary organic matter (Grutters et al. 2002; Nagata et al. 2003). However, they do not account for an important part of the organic matter (Veuger et al. 2006). Although sugars and amino acids may severely crosslink and form degradation-resistant polymers (Maillard 1912), they have not been used for evolutionary or environmental reconstruction. Probably, they are too omnipresent and taxon-unspecific for these purposes.
Polysaccharide synthesis is also ancient. Cellulose is produced by cyanobacteria and proteobacteria. It has been suggested that the ability of vastly unrelated eukaryotic species to produce cellulose has been acquired via endosymbiosis with these bacteria and lateral gene transfer (Niklas 2004). Despite a large chemical diversity in peptidoglycans and polysaccharides among organisms, most are degraded. This explains why there is so little evidence of fossil bacterial polymeric products, despite the bacterial omnipresence. Only cellulose and chitin appear to be relatively resistant to biodegradation and form a considerable fossil record. The refractory character of these macromolecules is clearly related to the exact composition of the monomers and their stereoconfiguration in the polymer. This is demonstrated by comparing the extremely low fossilization potential of starch (poly 1→4 α-D-glucose) with that of cellulose (poly 1→4 β-D-glucose) (Fig. 3).
The polysaccharide chitin (N-acetyl-D-glucosamine; Fig. 4) which is so abundant in arthropods and oomycetes today is not known from prokaryotes (Niklas 2004). The capability to synthesize chitin is therefore considered to have arisen much later in evolution compared to cellulose synthesis. Its presence in many marine organisms such as arthropods, molluscs and annelids places its evolution well before the evolution of land-adapted organisms, however.
Studies on chitin preservation provide another argument as to why this substance is not suitable for unravelling the terrestrialization process. Laboratory experiments show that chitin belongs to the most degradation-resistant parts of arthropods (Briggs et al. 1995). At first sight, this is not surprising since arthropod cuticles are also abundant in the fossil record. But are these cuticles still made of chitin?
The oldest known traces of the chitin marker D-glucosamine (XXXX) occur in extremely well-preserved weevil cuticles (up to 0.6% of the organic matter) present in the 25 Ma lacustrine sediments from Enspel (Stankiewicz et al. 1997; Flannery et al. 2001). Less well-preserved or older arthropod cuticles show no such traces of chitin (Stankiewicz et al. 1998; Gupta et al. 2007a). This alone probably explains why chitin could not be detected in chitinozoans (Voss-Foucart & Jeuniaux 1972; Jacob et al. 2007); the biological affinity of the chitinozoa therefore remains unresolved. This also implies that a reassessment of the presence of chitin in Palaeocene dinoflagellate cysts (Belayouni & Trichet 1980) is warranted.
Fig. 3. Structural formulas of (a) cellulose, which is made of (b) glucose, and (c) starch (which is also made of glucose).
Fossil cuticles release series of alkanes and alkenes upon pyrolysis, suggesting that the chitin has been replaced and/or transformed by an aliphatic geopolymer. Experimental evidence suggests that these aliphatic compounds may in fact be lipids that have become attached to the biomacromolecule. These lipids are likely to originate from the closest source available, the organism itself (Gupta et al. 2006, 2007b; de Leeuw 2007).
Aromatics and their polymers - lignin. Lignin (Fig. 5) is an aromatic polymer, as are at least some sporopollenins (Boom 2004). There are some enigmatic algal biomacromolecules with high preservation potential such as the wall material of dinoflagellate cysts. The wall material is currently referred to by the cryptic name 'dinosporin'. Although dinosporin has been suggested to be aromatic with the isoprenoid tocopherol as an aromatic building block (Kokinos et al. 1998), this view has been challenged by others (de Leeuw et al. 2006). The position of dinosporin in the scheme presented above therefore remains unclear. Apart from this single report of a possible aromatic signature in dinosporin and the increase of phenolic moieties in the algaenan of the Ordovician freshwater acritarch Gloeocapsamorpha prisca in relation to salinity increase (Derenne et al. 1992), the presence of aromatic moieties seems to be a feature of terrestrial biomacromolecular organic matter.
Lignin is a macromolecule resulting from the polymerization of three phenolic units synthesized via the phenylpropanoid pathway, namely the monolignols p-coumaryl, coniferyl and sinapyl alcohols (XXXXI-XXXXIII) (Raven 2000) (Fig. 4). The polymerization reaction has long been considered to be a random process but this concept appears to be wrong (see reviews of both Lewis 1999; Davin & Lewis 2005). The corresponding degradation products are coumaryl, guaiacyl (or vanillyl) and syringyl moieties (XXXXIV-XXXXVI), respectively (Hedges & Mann 1979).
Differences in the abundance of the structural lignin compounds are observed among higher plants: guaiacyl units dominate in gymnosperm wood, syringyl and guaiacyl units are dominant in woody tissues of dicotyledonous angiosperms, p-coumaryl and guaiacyl units dominate in woody tissues of monocotyledonous angiosperms, and non-woody tissues are generally dominated by p-coumaryl units (Hedges & Mann 1979; Logan & Thomas 1985). Pteridophyte lignins are derived from sinapyl alcohol (Barcelo et al. 2007).
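As a rough illustration of how these dominance patterns could be used for screening, the following Python sketch matches the set of dominant lignin units in a sample against the patterns listed above. The function name `candidate_sources`, the input format (relative abundances keyed by unit name) and the 25% dominance threshold are all hypothetical choices made for this example, not values taken from the cited studies. The sketch deliberately ignores the diagenetic and lignan-related complications that affect real samples.

```python
# Dominant lignin units per tissue type, as summarized in the text.
DOMINANT_UNITS = {
    "gymnosperm wood": {"guaiacyl"},
    "dicotyledonous angiosperm wood": {"syringyl", "guaiacyl"},
    "monocotyledonous angiosperm wood": {"p-coumaryl", "guaiacyl"},
    "non-woody tissue": {"p-coumaryl"},
}


def candidate_sources(abundances, threshold=0.25):
    """List tissue types whose dominant units match those of the sample.

    `abundances` maps unit names to relative amounts (hypothetical format).
    `threshold` is an arbitrary cut-off: a unit counts as dominant when it
    makes up at least this fraction of the summed abundances.
    """
    total = sum(abundances.values())
    if total <= 0:
        return []
    dominant = {unit for unit, amount in abundances.items()
                if amount / total >= threshold}
    return [source for source, units in DOMINANT_UNITS.items()
            if units == dominant]


# Example: a sample dominated by guaiacyl units.
sample = {"p-coumaryl": 0.05, "guaiacyl": 0.90, "syringyl": 0.05}
print(candidate_sources(sample))
```

An empty result simply means that the sample fits none of the four patterns at the chosen threshold.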
From these observations, the respective abundance of the three units in sedimentary organic matter should change in parallel with the evolution of land plants (Logan & Thomas 1987). Although structural motifs of syringyl peroxidases (an enzyme in lignin synthesis) have been identified in Bryophytes, lignin is absent in these plants (Lewis & Yamamoto 1990). Lignin has also been reported from a red alga, which is considered a case of parallel evolution of lignin synthesis (Martone et al. 2009). The results obtained by Logan & Thomas (1987) on different Carboniferous plants were consistent with this idea: guaiacyl oxidation products were mostly detected from Sigillaria ovata, a plant which contained woody tissues, while no guaiacyl units were obtained from Lepidodendron and Lepidophloios which were mostly non-woody plants. However, these results should be regarded with care since monolignols also are building blocks of lignans (Lewis & Davin 1999). As discussed above, these latter compounds are widespread among tracheophytes and are also present in bryophytes (Lewis & Davin 1999; Raven 2000). Consistently, the three lignin phenols were observed (although in low amounts) in the oxidation products of different bryophytes (Logan & Thomas 1985). A second problem with lignin phenols is that, during diagenesis, all three units degrade differently. The general order of resistance is p-coumaryl > guaiacyl > syringyl (Hedges et al. 1985; Logan & Thomas 1985; Orem et al. 1996). Among diagenetic/catagenetic transformations of wood are demethoxylations which also naturally lead to the diminution of syringyl and guaiacyl units, favouring p-coumaryl units in the remaining tissues (Orem et al. 1996; Hatcher & Clifford 1997). A significant contribution of p-coumaryl units was obtained by Logan & Thomas (1987) in the oxidation products of Carboniferous Sigillaria ovata.
This feature could reflect both the diagenetic increase of these units due to demethoxylation of lignin units and the contribution of lignan from bryophytes.
Several alkylphenols were observed in the flash pyrolysates of the Lower Devonian plants Renalia, Zosterophyllum and Psilophyton (Ewbank et al. 1996). Although these compounds may correspond to lignin pyrolysis products, in particular after diagenetic demethoxylation of lignin, their presence in the pyrolysates of Lower Devonian plants does not unequivocally attest to the presence of lignin in these early plants; they could also derive from the pyrolysis of condensed tannins (Ewbank et al. 1996). Despite its interest, the study of Ewbank et al. (1996) mostly demonstrated that, since pyrolysis products are poorly characteristic, flash pyrolysis is not suited to molecularly characterize the material of early land plants.
The development of spores (and pollen) is an important requisite for terrestrialization since it enables dispersion of gametophytes through air. The effective UVB absorbance of aromatic rings (Pfündel et al. 2006) may play a role in protecting airborne pollen; variations in the aromatic content of fossil pollen have been proposed as a UVB proxy (Rozema et al. 2001, 2002b). However, this proxy is based on pyrolysis of the (fossil) pollen. The coumaric and ferulic moieties produced in this way are probably derived from sporopollenin-type biopolymers in the pollen wall and not derived from compounds believed to regulate UV damage (de Leeuw et al. 2006).
The structure of sporopollenin, the wall polymer of pollen and spores, has long been a matter of debate and it seems likely that both aliphatic and aromatic sporopollenins occur (de Leeuw et al. 2006). Although the structure of the aliphatic sporopollenins is unclear, the aromatic sporopollenin consists of coumaric, ferulic and sinapic acids (VII-IX) as building blocks. These are the same building blocks as for lignin but with propyl-acids instead of propyl-alcohols (Boom et al. 2005). Biosynthetically, the lignols are formed from these carboxylic acids by reduction and it is interesting to note that, prior to the evolution of lignin synthesis, plants already were able to produce the biopolymer sporopollenin.
Recently, it has been demonstrated that one of the key enzymes in the phenylpropanoid pathway needed to convert the phenylpropanoid acids into their alcohols, 4-coumarate:CoA ligase (4CL) (Ferrer et al. 2008), already occurs in the bryophyte Physcomitrella patens (Silber et al. 2008). The presence of a similar enzyme, cinnamate:CoA ligase (ScCCL), in the bacterium Streptomyces coelicolor (Kaneko et al. 2003) suggests that this part of the pathway has a much longer history than previously expected. It would be interesting to know to what extent the phenylpropanoid pathway had to evolve in order to arrive at sporopollenin synthesis and further to lignin biosynthesis.
Another open question is why, later in evolution, lignin and not sporopollenin became a major structural element in vascular plants. Although apparently the earliest land plants already produced spores, it is not known where in evolution the synthesis of sporopollenin started. Sporopollenin has been claimed to be produced by Coleochaete (Delwiche et al. 1989), a member of the Characeae which is a sister group of Embryophytes (Waters 2003). However, due to a lack of insight in the nature of sporopollenin in the past and despite increasing insight into the nature of acid-(and acetolysis-) resistant algal walls (e.g. de Leeuw et al. 2006) this has not yet been resolved. One method of shedding light on sporopollenin evolution could be a systematic analysis of the structural diversity (if any) of sporopollenins of primitive land plants and their closest relatives.

Composition and preservation
An important step in investigating the relation between fossil organic matter and its source organisms from a chemical point of view is assessing the extent to which the biomolecules survive taphonomic processes. Knowledge of present degradative pathways also provides a key to the past, enabling the sedimentary organic molecules to be linked to their biological sources. Through this, the evolution of past life and environment can be reconstructed.
Biological tissues are made of different types of molecules which can be broadly classified according to their chemical functions into carbohydrates (simple sugars and polymers, among which cellulose), lipids, peptides (simple amino acids and their polymers, the proteins) and lignin, the principal component of wood. Although different from peptides, nucleic acids (such as DNA) have a fate which is similar to that of peptides during burial, and can therefore be grouped with peptides.
After the death of the organism and during burial in sediments, the organic matter is biologically or chemically degraded. This results in an important loss of the organic material, but also in chemical modifications of the biomolecules. After this stage, the original material can be either totally unrecognizable or recognizable to a certain degree. During this transformation of biomolecules to geomolecules (diagenesis), the fate of the different classes of compounds is very different: simple sugars and peptides are generally rapidly mineralized while lipids, complex sugars, sporopollenin and lignin are less easily degraded and therefore have a higher chance of being buried in sediments.
The sedimentary environment is clearly of prime importance. Burial rates, primary productivity, oxygen availability, water depth, organic matter concentration and mineral composition all influence organic matter preservation (Tyson 2001;Burdige 2007;Rothman & Forney 2007). In addition, the initial chemistry of the organic matter is also important (Middelburg 1989;Sinninghe Damsté et al. 2002;Versteegh & Zonneveld 2002;Prahl et al. 2003).
From the study of the organic matter deriving from a wide variety of sedimentary and diagenetic environments, a series of preservation pathways has been proposed (de Leeuw et al. 2006; de Leeuw 2007). The degradation-recondensation pathway (Tissot & Welte 1984) is based on the formation of macromolecular organic matter by random, post-mortem polymerization reactions of degradation residues. Because the organic matter involved in this pathway is generally highly degraded, the deduction of the biological affinities of the fossil organic matter preserved along this pathway is somewhat complicated.
In contrast to this, the selective preservation pathway (Philp & Calvin 1976; Tegelaar et al. 1989b) assumes that the biomolecules are preserved as they have been synthesized. This pathway concerns many lipids and a few specific biomacromolecules including lignin. Selective preservation of macromolecules is generally associated with the preservation of the morphology (Largeau et al. 1986), although the converse does not hold: excellent morphological preservation does not imply excellent chemical preservation (see review of de Leeuw et al. 2006; de Leeuw 2007; Gupta et al. 2007b). Biomolecules preserved through this pathway are highly recognizable, even after millions of years of burial (Derenne et al. 1988).
The natural sulphurization (Sinninghe Damsté & de Leeuw 1990) and oxidative polymerization pathways (Harvey et al. 1983; de Leeuw 2007; Gupta et al. 2007b) stress that free sulphur species and oxidizing agents cause condensation and crosslinking, respectively, of both lipids and macromolecules. This reduces the bioavailability of the material so that labile compounds that otherwise would have been mineralized may escape into the fossil record (Kok et al. 2000). Molecules preserved through this pathway can retain most of their original specificity, even after long periods of time (Versteegh et al. 2007). Clearly, due to the much higher availability of oxygen in air and much longer oxygen exposure times, the oxidative polymerization pathway is particularly likely to happen on land.
Burial to great depth, or for long periods of time, will also lead to the thermal modification of the organic matter (OM). This process, termed cracking, is the basis of petroleum and natural gas formation. The particulate organic matter becomes increasingly aromatic and cyclic by selective removal of the aliphatic components and by aromatization and cyclization of the residue. The more a compound is thermally degraded, the less recognizable its original structure will be.
Although maturation of organic matter may play an important role in relatively young sediments (provided temperature is sufficiently high), it is a clear issue for Palaeozoic and older material (Roberts et al. 1995; Yule et al. 2000) where slow transformation at mild temperature conditions is compensated for by long periods of time. The same is true for changes in the stable carbon and hydrogen isotope composition of the organic matter. This results in a 13C depletion of the released compounds and a 13C enrichment of the kerogen (Schimmelmann et al. 2001). For hydrogen, changes are larger and depend on the compounds considered. In particular, hydrogen on tertiary carbons (e.g. in isoprenoids) is subject to exchange with the surrounding water (Pedentchouk et al. 2006). Nevertheless, shifts are minor compared to the natural variations in the distributions of these stable isotopes in organic matter (Schimmelmann et al. 2001; Pedentchouk et al. 2006).

Aliphatization and related problems
Studies on the macromolecular nature of Palaeozoic and older acritarchs have shown both aliphatic and aromatic wall compositions (Kjellström 1968; Collinson et al. 1994; Arouri et al. 1999, 2000; Foster et al. 2002; Dutta et al. 2006). Others have concentrated on the biomarker lipids associated with the acritarchs (Moldowan & Talyzina 1998; Talyzina et al. 2000) and the host sediments (Meng et al. 2005), and have drawn conclusions on the biological affinities of the acritarchs. What are the consequences for the application of organic geochemistry to elucidating the terrestrialization of life? Considering both the diagenetic and catagenetic processes, attributing an aromatic or aliphatic contribution to an original biomacromolecular structure remains problematic. As long as the lipids which have become incorporated in the macromolecular matrix post mortem have been derived from the source organisms themselves, the approach of carefully releasing and analysing these lipids seems to be the more successful one.
Analogous to the transformation of chitinous biomolecules into aliphatic geomolecules (see above), sporopollenin and other biomacromolecules seem to transform chemically over time. Whereas fresh megaspores of Isoetes and Salvinia are purely aromatic, the fossil material consists of a mixture of aliphatic and aromatic moieties, again suggesting addition of long-chain aliphatic compounds (van Bergen et al. 1993;Boom 2004). Furthermore, the cyst walls of the recent dinoflagellate Lingulodinium polyedrum seem to be non-aliphatic (Kokinos et al. 1998) whereas fossil dinoflagellate cysts have been reported to contain mixed aromatic and aliphatic moieties (de Leeuw et al. 2006).
An extreme case of aliphatization by condensation of aliphatic lipids has been described for 'dinocasts' from the Eocene of Pakistan: the relatively solid to spongy dinoflagellate-shaped structures occurring in the sediments are believed to represent the oxidatively polymerized cell contents of motile dinoflagellates. Although addition of aliphatic components modifies the signature of several aromatic biomacromolecules (chitin, sporopollenin), such processes seem to be absent for fossil lignin. This may result from the fact that, in most cases, the membrane lipids are located very close to the biomacromolecule; in lignin there are no lipids around. As such, this may be an indirect and circumstantial piece of evidence for the oxidative polymerization pathway.
It is not only aliphatics which are subject to oxidative polymerization. This process also applies to the terpenoids in resins, leading to resin hardening and amber formation. One may wonder to what extent the oldest ambers, which are dominated by aliphatic moieties (van Bergen et al. 1995), were originally aliphatic or have become so by aliphatization.
For initially aliphatic biomacromolecules such as cutin, cutan and algaenan, the post mortem aliphatization is intrinsically much more difficult to detect. The incorporation of free lipids to the naturally resistant algaenan of Botryococcus race A by oxidative cross-linking has been clearly demonstrated in coorongite (Gatellier et al. 1993), a rubbery material derived from the accumulation of algal remains on the shores of lakes. As Botryococcus free lipids and algaenans were both aliphatic, the aliphaticity of coorongite was very similar to that of the algaenan; however, the signature upon pyrolysis was significantly different (Gatellier et al. 1993). For Botryococcus braunii race B, the algaenan walls also incorporate polyacetals of polymethylsqualanes. This may provide a clear marker for the presence of cell walls of this taxon in sediments (Metzger et al. 2007). However, in this case assessment of the degree of change of the original biopolymer by the post mortem oxidative polymerization of membrane and other associated free lipids also remains problematic.
One of the classical examples of the selective preservation pathway is the algaenan of fossil Tetraedron envelopes from the Messel Oil Shale. Apart from the envelopes being strikingly well preserved morphologically, their chemical fingerprint upon flash pyrolysis closely resembles that of their modern counterpart (Goth et al. 1988). But what difference would a contribution of aliphatic lipids from the organism have made? Similarly, oxidative polymerization of aliphatic lipids has been suggested to have played a role in the formation of the aliphatic algaenan of the Ordovician alga Gloeocapsamorpha prisca (Blokker et al. 2001), but it is difficult to ascertain to what extent this aliphatic material corresponds to the original cell walls.
An analogous problem is illustrated by cutin and cutan. The ether cross-linked cutan of CAM plants is, chemically speaking, more stable than the ester cross-linked cutin of most other higher plants: cutin is broken down into its original monomers upon base hydrolysis, while cutan resists this treatment. Since fossil plant cuticles also survive base hydrolysis, they have been considered to represent selectively preserved cutan (Tegelaar et al. 1989c). However, fossil non-hydrolysable cuticles are known from plants that do not produce cutan (Gupta et al. 2006b, 2007b; de Leeuw 2007). In fact, the depositional environment of CAM plants does not at all favour cutan preservation, whereas several cutin-producing plants occur in or near excellent preservational environments. Laboratory experiments using elevated temperature and pressure have recently demonstrated that, as with chitin, lipids may become incorporated in cutin in due course (Gupta et al. 2006a). It thus appears that most of the previously observed fossil cutans in fact correspond to cuticle lipids which were oxidatively linked during diagenesis.
Even relatively simple lipids are not always easy to relate to their source. Although structural modifications such as loss of functional groups or changes in stereochemistry do not usually prevent assignment of lipids to their source organisms (e.g. Sinninghe Moldowan & Talyzina 1998), the lipids may disappear from the analytical window. The corollary of aliphatization is that free lipids may become part of larger macromolecular structures, so that extra analytical steps are required for their detection and identification (e.g. Adam et al. 2006).
For the particulate organic matter, we may have visual information on the biological affinities of the fossils at hand, but to what extent is this matched by the chemical composition of the fossils? It seems that much of the aliphatization is brought about by lipids from the immediate surroundings of the original biomacromolecule, that is, derived from the source organism. Moreover, our present understanding of the natural sulphurization and oxidative polymerization pathways implies that these added substances survived relatively undamaged, both structurally and isotopically. This means that there should still be a fair chance of obtaining information on the nature of the source organisms, provided the individual products released upon chemical degradation or (offline) pyrolysis can be related to a single source and metabolic pathway (van Dongen et al. 2002; van Bergen & Poole 2002; Poole et al. 2004).
Aliphatization of macromolecular material from plants and animals therefore seems to be ineluctable, which complicates the identification of the molecular characteristics (and therefore the biosynthetic pathways) of very old organic matter.

Conclusion
Organic geochemistry plays an important role in the elucidation of the history of early life (Brocks et al. 1999; Brocks & Summons 2003; Summons et al. 2006). Similarly, it should play a role in understanding the terrestrialization process, in particular for plants. Numerous molecular biomarkers of terrestrial plants exist, deriving from structural tissues (e.g. lignin phenols), from epicuticular waxes or from the large class of terpenoids, and they are widely used in Tertiary and recent sediments.
The study of the terrestrialization process with organic geochemistry is associated with numerous difficulties, however, in particular in assigning Palaeozoic fossil organic matter to its source. Apart from the fact that the samples have often suffered from thermal alteration, the difficulties mostly arise from a lack of taxonomic precision of the molecular biomarkers. Other difficulties arise from the frequent chemical modification of the material (e.g. aliphatization), which occurs despite excellent morphological preservation.
All these difficulties readily explain the relatively large temporal gap which currently exists between the earliest microscopic plant remains, documented in the Middle Ordovician (Strother et al. 1996), and the earliest unambiguously documented terrestrial biomarkers, in the Middle Devonian (Sheng et al. 1992). Despite this, the set of currently identified molecules of terrestrial origin is already sufficient to discriminate changes in plant associations during the Carboniferous, revealing further information on the terrestrialization process.
It is additionally hoped that condensation processes, which remove lipids from the pool of bioavailable products, may conversely facilitate the survival of specific lipids over long periods of time and, as such, record in the sediments the biochemical evolution related to the terrestrialization. Advances in the assessment of the stable carbon and hydrogen isotopic compositions of lipids or (offline) pyrolysis products increasingly contribute to unravelling the evolution of biosynthetic pathways and diagenetic overprints. Great advances will also be made with the development of micro-scale techniques (microsampling, micro-extractions and nano-SIMS). These techniques will allow the study of fossils present in very low amounts, such as very early spores and cuticles, or the study of monospecific fossil associations. Another rapidly developing approach to resolving terrestrialization involves genomics: tracing the evolution of enzymes critical to the biosynthetic pathways involved in the terrestrialization process.
Terrestrialization and the earliest plants had previously failed to attract many organic geochemists. However, this is changing, as demonstrated by several recent studies and review papers (Armstroff et al. 2006; Auras et al. 2006). It is therefore likely that the right compounds have not yet been looked at in the right place and with the right techniques.
We thank M. Vecoli (UST Lille) and G. Clement (MNHN, Paris) for inviting us to the ECLIPSE II Workshop: Terrestrialization Influences on the Palaeozoic Geosphere-Biosphere. We also thank J. de Leeuw (Utrecht University) and M. Vecoli for constructive comments on the manuscript. H. Kerp (Münster University) is thanked for his help on the occurrence of resin in early gymnosperms. Financial support for GJMV by the USTL (Lille) is gratefully acknowledged.

Appendix
Structural formulae of the compounds mentioned in the text.