Biogeographical affinities of Jurassic and Cretaceous continental vertebrate assemblages from SE Asia

Abstract Over the last 25 years, rich vertebrate assemblages have been discovered in three distinct formations of the Late Jurassic and Early Cretaceous of Thailand. This work aims to compare the taxonomic assemblages of SE Asia within their palaeogeographical context in Asia. Occurrences of 477 taxa in 94 Regional Faunal Assemblages (RFA) have provided the raw material for producing a dissimilarity matrix based on the Raup & Crick index. These distances have been investigated statistically to infer relationships between the diverse faunal assemblages in space and time. Our results show that the Thai formations are more similar to each other than to any other formations, suggesting a strong provincialism. The relationship of SE Asian RFAs with other Asian RFAs is more influenced by the presence of freshwater or near-shore taxa than by strictly terrestrial ones. Our analysis shows that the faunal interchange between RFAs was rather low from the Late Jurassic to the end of the Early Cretaceous. However, faunal dispersals dramatically decreased during the mid-Early Cretaceous in Asia. The faunas show an overall stronger provincialism during the mid-Early Cretaceous, indicating the role of possible geographical barriers. This event is characterized by the absence of ornithischian dinosaurs in the Sao Khua Formation although they are present in the under- and overlying formations. Taxonomic diversity and exchanges between faunal assemblages recovered rapidly as early as the Aptian in Asia, but the fauna of SE Asia still retained a strong biogeographical signature.

For the last 25 years palaeontological field work on the Khorat Plateau of NE Thailand has yielded a large number of fossil vertebrates, providing a detailed picture of the faunal succession through the five non-marine formations of the Khorat Group, which spans at least the Early Cretaceous (and possibly the latest Jurassic). Such a relatively continuous record is known from very few places in the world in terms of stratigraphic completeness. The extraordinary richness of the Asian fossil record of non-marine Mesozoic vertebrates, especially in China, Mongolia and central Asia, allows comparison with Thai faunas and reveals some interesting similarities: for instance, the discovery of a psittacosaurid in Thailand (Buffetaut & Suteethorn 1992) has shown that this early ceratopsian was not restricted to northern and central Asia. However, other taxa from Thailand indicate a high degree of endemism, as exemplified by hybodont sharks (Cuny et al. 2005). Unexpected absences have also been noted, such as the apparent lack of ornithischian dinosaurs in the Early Cretaceous Sao Khua Formation, whereas this group flourished in other parts of Asia at that time. The coexistence in the Khorat Group of endemic and widespread taxa introduces difficulties in understanding the palaeogeographical relationships of the Thai faunas, and thus in determining the geological age of the formations of the Khorat Group. At the top of the Khorat Group, an Aptian age is well supported by biostratigraphical evidence for the Khok Kruat Formation; but the age of the older formations is much more uncertain (Racey et al. 1996). The main obstacle to more precise dating is the lack of marine intercalations in these fully continental formations (Meesook 2000).
The marginal location of Thailand, on the southeastern margin of Laurasia, since the collision of the Shan-Thai and Indochina blocks with mainland Asia more than 200 Ma ago (Metcalfe 1998), may explain some of the peculiar features of the SE Asian faunal assemblages. In addition, the complex topography of SE Asia, induced first by the Mesozoic Indosinian orogeny, may have contributed to the isolation of faunas during the Jurassic and Cretaceous, and then to the local diversification of several groups of land vertebrates.
In this context, biogeographical correlation between Thai assemblages and other Asian faunas could provide better age constraints for the formations of the Khorat Group. Studies of Mesozoic land vertebrate assemblages in Asia, and especially in China, have permitted the succession of faunas to be divided into different palaeobiogeographical provinces: for instance, the concept of an Early Cretaceous Psittacosaurus-pterosaur faunal complex was established and it is still well entrenched in the literature (e.g. Dong 1973, 1979, 1992, 1993, 1995; Zhen et al. 1985). Considering the lack of temporal precision of this complex, Jerzykiewicz & Russell (1991) suggested biochronological units based on more global land-vertebrate faunas. For Late Jurassic to Late Cretaceous times, they proposed Mongolian land-vertebrate ages (MOLVAs) based on formations and fossil vertebrate assemblages from Mongolia. Those MOLVAs were then recognized by Lucas (2001) in China. Lucas also suggested the use of the term 'land-vertebrate faunachrons' (LVFs) for time intervals recognized on the basis of vertebrate faunas. Although the power of LVFs for long-distance stratigraphic correlation can be contested, the Thai assemblages have not been compared with any kind of biochronological or biogeographical general scheme: comparisons were always made considering a specific clade of the entire assemblage or in an empirical way, based on similarity at the family level (see, e.g. Buffetaut & Suteethorn 1998; Buffetaut et al. 2006).
In this paper, we have two aims: (1) to place Thai assemblages in the global Asian context, using automatic classification considering all vertebrate taxa found in non-marine Late Jurassic and Cretaceous formations; and (2) to investigate whether this kind of comparison could improve the time scale of biogeographical models.

Temporal and spatial settings
The Khorat Group is now considered as comprising five continental formations ranging from the Late Jurassic-Early Cretaceous to the Aptian-Albian (Fig. 1; Racey et al. 1996; Carter & Bristow 2003). Triassic and Late Cretaceous sediments are separated from the group by a hiatus and/or unconformities. This temporal range was the basis of the study. Because of the absence of marine intercalations and of volcanic lava flows or ash layers, the age of these formations is still uncertain. Because our purpose was to compare the Thai fossil assemblages with other Asian assemblages, we took into account all the assemblages dated from the Oxfordian to the Santonian. Hence, the dataset used in this study consists of the lists of vertebrate genera occurrence in Asian continental formations during this time span. Because a significant number of species determinations are still uncertain or highly debated by palaeontologists, analyses have been performed at the genus level. Generic data, however, have been interpreted as providing a clear biodiversity signal (Sepkoski 1996, 1997) and are considered robust enough to have been used in several recent global-scale biodiversity studies (Foote 2000a, b, 2001; Kirchner & Weil 2000a, b; Brayard et al. 2006). Furthermore, because they cannot be clearly associated with a genus, information from ichnofossils and eggshells was not considered here. On the basis of these criteria, the Phra Wihan Formation and the Phu Phan Formation were not used in this study (Fig. 1). Thus, we studied only the Phu Kradung Formation, the Sao Khua Formation and the Khok Kruat Formation in the Khorat Plateau.
These formations have yielded a large number of fossils in several localities. However, we estimated the fossil assemblage of a formation (i.e. a time span) as the compilation of all genera found in the localities assigned to this formation. Indeed, our experience shows that any marked differences between localities are the result of different sedimentological conditions rather than of the geographical distance separating them. Hence the standard units analysed here were fossil assemblages from three geological formations from the Khorat Group, thus at the regional scale of the Khorat Plateau.
For comparative purposes, we have defined the other Asian assemblages according to similar criteria, as follows.
(1) The area representing the extension of the assemblage should be of a similar size to the Khorat Plateau. In most cases, administrative provinces of the various countries considered in this study have provided clearly delimited spatial areas of similar size, although this choice can be subjective.
(2) The assemblage should be representative of a specific environment and of a time span (the narrower the time span, the better the resolution for the palaeobiogeographical study). Hence the best way to constrain the resolution was to consider all the localities from the same geological formation as a single unit representing the palaeo-assemblage for a given place and a given moment.
Each administrative region was, thus, characterized by one or more regional faunal assemblages (RFA) considering the number of formations (i.e. number of time sequences we can distinguish) present in this area. The detailed list of those assemblages is available on request from the corresponding author. The dataset included 94 RFAs covering Asia from its southeastern part (mostly Thailand and Laos) to Central Asia (the list of RFAs is available as supplementary material). The compiled dataset consisted of all taxa published before May 2006. A total of 477 genera represented by 776 occurrences were included as presence or absence data (zero indicates absence and one indicates presence). Although abundance data are sometimes found to be more descriptive in palaeogeographical studies than incidence data (Johnson & McCormick 1999), they are very difficult to collect from a large area, such as that covered by the present study.

Dataset construction
The construction of the 94 RFAs was undertaken using the following protocol. First, data were taken mostly from reviews (Lillegraven et al. 1979; Dong 1992; Lucas 2001; Benton et al. 2002; Weishampel et al. 2004) and specialized papers (a full list is available on request from the corresponding author) related to the considered time span and area. Complementary information was extracted from the online Paleobiology Database (http://paleodb.org). Taxonomic data were homogenized from updated systematic revisions. When no taxonomic revisions were available, the original author's point of view was followed. Second, all the local faunas from each region were pooled into taxonomically standardized regional lists of genera. Third, all regional lists were brought together, leading to the construction of a presence-absence matrix (available on request from the corresponding author). In this matrix, indeterminate taxa at the supra-generic level were excluded.
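The third step of this protocol can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the RFA codes and the genus labels are hypothetical placeholders, not the authors' data or software. The optional filter mirrors the restriction, used later in the analysis, to genera found in at least two RFAs.

```python
def build_incidence_matrix(regional_lists):
    """Turn {rfa_id: [genus, ...]} into a presence-absence matrix.

    Returns (rfa_ids, genera, matrix) where matrix[i][j] is 1 when
    genus j is recorded in RFA i and 0 otherwise.
    """
    rfa_ids = sorted(regional_lists)
    genera = sorted({g for taxa in regional_lists.values() for g in taxa})
    column = {g: j for j, g in enumerate(genera)}
    matrix = [[0] * len(genera) for _ in rfa_ids]
    for i, rfa in enumerate(rfa_ids):
        for genus in regional_lists[rfa]:
            matrix[i][column[genus]] = 1
    return rfa_ids, genera, matrix


def keep_shared_genera(genera, matrix, min_rfas=2):
    """Drop genera recorded in fewer than `min_rfas` assemblages."""
    keep = [j for j in range(len(genera))
            if sum(row[j] for row in matrix) >= min_rfas]
    return [genera[j] for j in keep], [[row[j] for j in keep] for row in matrix]


# Hypothetical toy input (placeholder names, not real occurrences).
toy_lists = {
    "rfaA": ["GenusA", "GenusB"],
    "rfaB": ["GenusB", "GenusC"],
    "rfaC": ["GenusB", "GenusC", "GenusD"],
}
rfa_ids, genera, matrix = build_incidence_matrix(toy_lists)
shared_genera, shared_matrix = keep_shared_genera(genera, matrix)
```

Sorting the identifiers and genus names keeps the row and column order reproducible between runs, which matters when the matrix is later compared pair by pair.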

Processing method
Biogeographical comparisons made in this study were based on the computation of overall taxonomic similarity between regional taxonomic assemblages. The starting assumption of this 'numerical biogeography' was that each compared assemblage was characterized by a reasonably comprehensive and unbiased list of taxa. Nevertheless, few tools have yet been developed to control such quantitative estimates of biogeographical relatedness for the quality of the fossil record under analysis. As the sampling effort and underlying real biological diversity are largely unknown, but are likely to vary within and between the studied geographical areas, conventional incidence or abundance-based diversity measurements (e.g. taxonomic richness, Shannon's and Simpson's indices, etc.) were useless for this purpose. Based on these considerations, we first used taxonomic distinctness analysis to check the studied assemblages for overall comparability. Then we used the Raup & Crick (1979) index of taxonomic similarity coupled with two distinct modes of graphical display to extract and to visualize the patterns of biogeographical relatedness contained in the available data. Of the 94 original RFAs only 49 were suitable for the analysis, containing 81 different taxa found at least in two different RFAs.

Taxonomic distinctness analysis
Taxonomic distinctness analysis is a robust method of diversity analysis taking into account the taxonomic (hierarchical) structure of the studied assemblages (Warwick & Clarke 1995, 1998). Based on an incidence (presence or absence) matrix of taxonomic occurrence, two complementary indices were defined: the average taxonomic distinctness (AvTD) and the variability in taxonomic distinctness (VarTD). Both indices show highly robust statistical sampling properties, including a lack of dependence, in mean value, on sample size, sampling effort and taxonomic identification skills of different workers, a very appealing characteristic in the context of this study. The computation of these two indices first relies on the construction of a Linnean taxonomic tree as a reasonable proxy of the underlying phylogenetic history.
For a given assemblage made of n distinct taxa, AvTD is the average taxonomic path length measured for each of the n × (n − 1)/2 possible pairs of taxa in the taxonomic tree. Thus, one can consider AvTD as a taxonomic disparity index reflecting the level of phylogenetic heterogeneity of the assemblage: the higher the AvTD, the more the assemblage is made of several different phylogenetic groups; conversely, the lower the AvTD, the more the assemblage is dominated by a reduced number of higher rank taxonomic groups. For instance, a taxonomic assemblage of 10 genera from the same family has the same generic richness, but a lower taxonomic disparity than an assemblage with 10 species from 10 distinct families. Thus, unless we consider highly specialized, taxonomically impoverished assemblages corresponding to rather rare and atypical environmental conditions (see , taxonomic fossil assemblages (taphocoenosis) characterized by low AvTD values are likely to poorly represent their underlying life assemblages (biocoenosis).
VarTD is the variance associated with AvTD. A low VarTD value indicates that the n taxa of the assemblage tend to be 'taxonomically equidistant', whereas a high VarTD value illustrates a heterogeneous distribution of the pairwise taxonomic distances. Complementary to AvTD, VarTD can be viewed as a confidence index of the randomness of the fossil assemblage when compared with the underlying life assemblage: a low VarTD value, especially when associated with a low AvTD value, is likely to indicate a sampling or preservation bias, indicating that the analysed taphocoenosis is not a random sample of its parent biocoenosis (see Fig. 2 for examples). Therefore, and regardless of the origin of the bias, this taxonomic assemblage carries peculiar biogeographical information and must be taken with caution for the study.
For both indices, we applied a Monte-Carlo procedure (random resampling without replacement). It estimated the confidence funnels associated with the null hypothesis that a given observed assemblage is made of n taxa randomly sorted from the global pool of taxa recorded in the analysed dataset (see Clarke & Warwick 1998, for methodological details).
In this study, we used a partially resolved taxonomic tree made of four hierarchical Linnaean levels: genus, family, order and class. A simple linear weighting scheme was adopted, with a taxonomic path length of one when contrasting two genera from the same family; two for two genera from distinct families, but from the same order; three for two genera from distinct orders, but the same class; and four for two genera from distinct classes. The confidence funnels associated with AvTD and VarTD were estimated from 100 000 random samples. All the computations have been performed using the TDA.pro software (Escarguel & Legendre 2006).
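As an illustration of the two indices and of the confidence funnel, the following Python sketch applies the linear weighting scheme described above to a toy pool of taxa. It is not the TDA.pro implementation: the function names and the toy pool are hypothetical, VarTD is taken here as the plain variance of the pairwise path lengths around AvTD, and the funnel is read from the empirical quantiles of the random draws.

```python
import random
from itertools import combinations

# Each taxon is a (genus, family, order, class) tuple.
# Assumption: family, order and class names are unique across the tree.


def path_length(a, b):
    """Taxonomic path length between two genera (weights 1 to 4)."""
    if a[1] == b[1]:
        return 1  # same family
    if a[2] == b[2]:
        return 2  # distinct families, same order
    if a[3] == b[3]:
        return 3  # distinct orders, same class
    return 4      # distinct classes


def avtd_vartd(taxa):
    """Mean and variance of the path length over all n(n - 1)/2 pairs."""
    lengths = [path_length(a, b) for a, b in combinations(taxa, 2)]
    mean = sum(lengths) / len(lengths)
    var = sum((x - mean) ** 2 for x in lengths) / len(lengths)
    return mean, var


def avtd_funnel(pool, n, n_draws=1000, level=0.95, seed=0):
    """Empirical AvTD limits for n taxa drawn without replacement."""
    rng = random.Random(seed)
    values = sorted(avtd_vartd(rng.sample(pool, n))[0] for _ in range(n_draws))
    tail = int(round((1 - level) / 2 * n_draws))
    return values[tail], values[n_draws - 1 - tail]


# Hypothetical toy pool (placeholder names only).
pool = [
    ("g1", "famA", "ordA", "clsA"),
    ("g2", "famA", "ordA", "clsA"),
    ("g3", "famB", "ordA", "clsA"),
    ("g4", "famC", "ordB", "clsA"),
    ("g5", "famD", "ordC", "clsB"),
    ("g6", "famD", "ordC", "clsB"),
]
observed = avtd_vartd([pool[0], pool[1], pool[2]])
lower, upper = avtd_funnel(pool, 3)
```

An assemblage whose observed AvTD falls below `lower` would plot under the funnel, which is the situation interpreted in the text as dominance by a single higher-rank group.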

Cluster analysis of taxonomic similarity
We analysed the biogeographical matrix of generic occurrence using Raup & Crick's taxonomic similarity coefficient (RC; Raup & Crick 1979). This index is the confidence level associated with a unilateral randomization test estimating the probability that the observed number of taxa shared by two assemblages is only due to chance. More formally stated, the Raup & Crick coefficient is the 1 − p value associated with the significance test involving the following null and alternative hypotheses.
H0: the species observed in the two regions are distributed between them by random sorting from a common pool of species made up of all the taxa recorded in the biogeographical matrix. This hypothesis of independent random sprinkling of each taxon implies that the observed number of species common to both regions is only due to chance.
H1: the similarity observed between the two regions is higher than would be expected as the consequence of the random sorting from a common pool of taxa.
Hence, a pair of regions characterized by a very high RC value (say, RC > 0.95) shows a significant similarity between their studied taxonomic assemblages (they non-randomly share too many taxa); conversely, a pair of regions characterized by a very low RC value (say, RC < 0.05) shows a significant difference between their studied taxonomic assemblages (they non-randomly share too few taxa). For each pair of regions, the associated null hypothesis was estimated by generating 499 successive random samplings from the common pool of taxa without taking into account the observed probabilities of taxa occurrence (following remark 2 of Harper 1981).
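The randomization test can be sketched as follows in Python. This is a minimal illustration and not the authors' code: the function name is hypothetical, every taxon of the pool is drawn with equal probability as stated above, and ties are handled by counting only the random draws that share strictly fewer taxa than observed, which is one possible reading of the one-sided 1 − p definition.

```python
import random


def raup_crick(assemblage_a, assemblage_b, pool, n_draws=499, seed=0):
    """Monte-Carlo estimate of the Raup & Crick similarity.

    Two random assemblages with the observed richnesses are drawn from
    the common pool at each iteration. The returned value is the share
    of iterations with strictly fewer shared taxa than observed.
    """
    rng = random.Random(seed)
    a, b = set(assemblage_a), set(assemblage_b)
    taxa = sorted(pool)  # fixed order so that a given seed is reproducible
    observed = len(a & b)
    fewer = 0
    for _ in range(n_draws):
        random_a = set(rng.sample(taxa, len(a)))
        random_b = set(rng.sample(taxa, len(b)))
        if len(random_a & random_b) < observed:
            fewer += 1
    return fewer / n_draws


# Hypothetical pool of 50 placeholder taxa.
pool = [f"genus{i:02d}" for i in range(50)]
rc_identical = raup_crick(pool[:5], pool[:5], pool)
rc_disjoint = raup_crick(pool[:5], pool[5:10], pool)
```

With no shared taxon the estimate is 0 by construction, whereas two identical five-taxon lists drawn from a pool of 50 give a value close to 1, matching the interpretation of very low and very high RC values given above.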
Once computed, the resulting matrix of similarity (S) was converted into a Euclidean matrix of dissimilarity (D) using the transformation D = √(1 − S) (Gower & Legendre 1986, theorem 6). Then, D was clustered using the Neighbour-Joining (NJ) method of tree reconstruction (Saitou & Nei 1987; program NEIGHBOR from the PHYLIP v. 3.5 package, Felsenstein 1993). The NJ algorithm is a widely used distance-based heuristic method of phylogenetic inference (Felsenstein 2004). From a given dissimilarity matrix, it allows the computation of the shortest total length additive unrooted tree with the branch lengths estimated by unweighted least squares. We predicted that, if the regional faunal assemblages form clusters that have a geographical identity, these clusters reflect endemic areas having few exchanges with others; in contrast, if no geographical identity is found in the clustering pattern, the cluster could correspond to a stratigraphic assemblage based on evolutionary grounds. Finally, environment and taphonomy can also cause clustering. We should therefore keep in mind that ecological, geographical and stratigraphic signals can interfere.
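The two steps of this paragraph can be illustrated with a short Python sketch. It is not a substitute for the NEIGHBOR program used in the study: the function names are hypothetical, the dissimilarity is assumed to be the square root of 1 − S (consistent with the square-rooted p values mentioned in the text), and the NJ routine below returns only the unrooted topology as nested tuples, without branch lengths.

```python
import math


def to_dissimilarity(similarity):
    """Convert a similarity matrix S into D = sqrt(1 - S), entry by entry."""
    return [[math.sqrt(max(0.0, 1.0 - s)) for s in row] for row in similarity]


def neighbor_joining(dissimilarity, labels):
    """Topology-only Neighbour-Joining (Saitou & Nei 1987 criterion).

    Labels must be unique strings. The result is a tuple of the three
    subtrees attached to the last internal node of the unrooted tree.
    """
    clusters = list(labels)
    d = {(a, b): dissimilarity[i][j]
         for i, a in enumerate(labels) for j, b in enumerate(labels)}
    while len(clusters) > 3:
        n = len(clusters)
        # Net divergence of each cluster from all the others.
        r = {a: sum(d[a, b] for b in clusters if b != a) for a in clusters}
        pairs = [(a, b) for i, a in enumerate(clusters) for b in clusters[i + 1:]]
        # Join the pair minimising Q = (n - 2) d(a, b) - r(a) - r(b).
        a, b = min(pairs, key=lambda p: (n - 2) * d[p] - r[p[0]] - r[p[1]])
        new = (a, b)
        rest = [c for c in clusters if c not in (a, b)]
        for c in rest:
            d[new, c] = d[c, new] = 0.5 * (d[a, c] + d[b, c] - d[a, b])
        clusters = rest + [new]
    return tuple(clusters)


# Additive toy matrix for the unrooted tree ((A, B), (C, D)).
toy = [[0, 2, 3, 3],
       [2, 0, 3, 3],
       [3, 3, 0, 2],
       [3, 3, 2, 0]]
tree = neighbor_joining(toy, ["A", "B", "C", "D"])
```

On an additive matrix such as the toy example the procedure recovers the generating topology; on square-rooted p values, which are not additive, the output is only an approximation, which is why the tree is checked against the matrix afterwards.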
Because of the statistical nature of Raup & Crick's similarity index, the tree representation of the observed dissimilarity matrix D must be very cautiously considered. Indeed, D is made of square-rooted p values that are not additive (and even not a priori metric) quantities. Thus, we performed a quality analysis of the resulting NJ-tree by computing the topological criteria proposed by Guénoche & Garreta (2001). These criteria are based on the comparison of the topology of the quadruples (i.e. trees made of only four RFAs) implied by the observed dissimilarity matrix D and by the resulting NJ-tree T. Two overall indices are computed: (1) the overall rate of well-designed quadruples (Rq), defined as the percentage of quadruples having the same topology according to D and T; (2) the arboricity coefficient (Arb), defined as the percentage of quadruples of D for which the median sum involved by Buneman's quadruplet inequality is closer to the largest one than to the smallest one (see Guénoche & Garreta 2001, for details). These two criteria estimate the overall topological congruence of T and D: the higher Rq and Arb, the more T actually reflects the structural information contained in D. In addition, the percentage of well-designed quadruples was computed for each RFA, as well as the rate of elementary quadruples (Re) of D supporting each internal edge of T. 'Individual' Rq values allowed us to identify RFAs that were potentially ill-placed within the NJ-tree (low Rq values). Re values are reliability estimates of the bipartitions induced by each edge of T; they play the same role as bootstrap supports, whose computation is not straightforward in the case of the RC index. Thus, these 'individual' indices provided information on the confidence associated with the internal and external edges of the tree representation: a low Rq value indicates that its associated RFA is positioned in the tree with difficulty (e.g. because of a high degree of endemism of its components), or that this RFA clusters with another one by default (e.g. because they share different taxa with different assemblages); Re provides similar interpretation but for an internal edge separating two sets of RFAs.
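The two overall criteria can be sketched in Python as follows. This is an illustrative reading of the definitions given above rather than the procedure of Guénoche & Garreta (2001) in full: the function names are hypothetical, the tree T is assumed to be supplied as its matrix of path-length distances between leaves, and a quadruple whose smallest sum is tied is simply assigned the first minimal pairing.

```python
from itertools import combinations


def pair_sums(m, i, j, k, l):
    """The three sums of Buneman's four-point condition for one quadruple."""
    return (m[i][j] + m[k][l], m[i][k] + m[j][l], m[i][l] + m[j][k])


def quadruple_topology(m, quad):
    """Index (0, 1 or 2) of the pairing with the smallest sum."""
    sums = pair_sums(m, *quad)
    return sums.index(min(sums))


def rate_of_well_designed_quadruples(d, tree_distances):
    """Rq: percentage of quadruples with the same topology in D and T."""
    quads = list(combinations(range(len(d)), 4))
    same = sum(quadruple_topology(d, q) == quadruple_topology(tree_distances, q)
               for q in quads)
    return 100.0 * same / len(quads)


def arboricity(d):
    """Arb: percentage of quadruples whose median sum lies closer to the
    largest sum than to the smallest one."""
    quads = list(combinations(range(len(d)), 4))
    good = 0
    for q in quads:
        low, mid, high = sorted(pair_sums(d, *q))
        if high - mid < mid - low:
            good += 1
    return 100.0 * good / len(quads)


# Toy additive matrix for the tree ((0, 1), (2, 3)) and a star-like matrix.
additive = [[0, 2, 3, 3], [2, 0, 3, 3], [3, 3, 0, 2], [3, 3, 2, 0]]
star_like = [[0, 1, 1, 1], [1, 0, 1, 1], [1, 1, 0, 1], [1, 1, 1, 0]]
rq = rate_of_well_designed_quadruples(additive, additive)
arb_additive = arboricity(additive)
arb_star = arboricity(star_like)
```

Restricting the list of quadruples to those containing a given RFA would give the 'individual' Rq value discussed in the text.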

Representation of similarities on palaeogeographical maps during different time intervals
To understand the evolution of biogeographical patterns in time and space, we opted for a graphical representation of RFA similarities at different time intervals and on palaeogeographical maps. Contrary to the clustering method, the dataset was divided into three temporal units, each of these units including one of the Thai faunas. Because of the temporal uncertainty about the formations of the Khorat Group, these time intervals included a large number of RFAs considered to be possibly contemporaneous with the Thai ones (based on Fig. 1 and Racey et al. 1996). The first unit included the fauna from the Phu Kradung Formation (tha002), and covers the latest Jurassic to late Berriasian time span. Because the Phu Kradung Formation is not clearly defined as a Cretaceous formation, the first unit should encompass Late Jurassic RFAs. The second unit consisted of the fauna of the Sao Khua Formation (tha003) and all the RFAs ranging in age from the early Valanginian to the late Barremian. The youngest fauna of the Khorat Group is that of the Khok Kruat Formation (tha001), which is clearly assigned an Aptian age, so this third unit is the best constrained from a temporal point of view. As RFAs could show differences in composition because of factors other than the real occurrence of taxa (taphonomical and collecting biases), we focused our attention on the geographical distribution of localities that showed a rather high similarity rather than on those that were different. The objective was therefore not to find a connection of some kind between all assemblages but to focus on geographical and stratigraphical causes of resemblance, considering especially time intervals that correspond to deposition of SE Asian formations. High values of the RC index indicate a non-random similarity. We decided to connect assemblages with an RC index higher than 0.5 (meaning a 50% probability that the similarity is a non-random effect). We also emphasized values higher than 0.9 and 0.95, considered as the most well-supported connections, and indicated values between 0.8 and 0.9 with different symbols. If only some close localities are connected whereas localities are disconnected from more distant ones, this will illustrate the presence of provincialism, and possibly highlight the geographical position of barriers. In contrast, if a large proportion of localities are connected without respect to biogeographical distances, this indicates that our data do not contain a significant biogeographical pattern or homogeneity in the composition of fauna throughout Asia. The results were plotted on a palaeogeographical reconstruction from Schettino & Scotese (2001).
Fig. 3. Average taxonomic distinctness of the analysed RFAs as a function of the taxonomic richness. Dotted lines represent the 95% confidence interval associated with the null hypothesis that a given observed assemblage is made of n taxa randomly sorted from the global pool of taxa recorded in the analysed dataset.
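The connection rule used for the maps of this section can be summarized by a small Python sketch. The function names, the class labels and the toy values are hypothetical and serve only to show how the thresholds of 0.5, 0.8, 0.9 and 0.95 partition the pairs of RFAs within one temporal unit; the placement of a value exactly equal to a threshold is an assumption.

```python
def connection_class(rc):
    """Return the support class of an RC value, or None when RC <= 0.5."""
    if rc > 0.95:
        return "RC > 0.95"
    if rc > 0.9:
        return "0.90 < RC <= 0.95"
    if rc > 0.8:
        return "0.80 < RC <= 0.90"
    if rc > 0.5:
        return "0.50 < RC <= 0.80"
    return None  # too weak to be drawn on the map


def connections(labels, rc_matrix):
    """List the (RFA, RFA, class) links to draw for one temporal unit."""
    edges = []
    for i in range(len(labels)):
        for j in range(i + 1, len(labels)):
            support = connection_class(rc_matrix[i][j])
            if support is not None:
                edges.append((labels[i], labels[j], support))
    return edges


# Hypothetical toy values for three placeholder RFAs.
labels = ["rfaA", "rfaB", "rfaC"]
rc_matrix = [[1.0, 0.97, 0.30],
             [0.97, 1.0, 0.85],
             [0.30, 0.85, 1.0]]
edges = connections(labels, rc_matrix)
```

An RFA that appears in no edge of the resulting list is the kind of isolated assemblage that the maps are meant to reveal.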

Taxonomic distinctness analysis
The taxonomic distinctness analysis results are given in Figures 3 and 4. The average taxonomic distinctness (Fig. 3) shows a substantial proportion of values that are outside the funnel zone. The low values represent mainly assemblages dominated by a specific group. The variability in taxonomic distinctness (Fig. 4) also shows some of the assemblages outside the funnel zone, mostly with high values showing that too much dispersion is present in the assemblage. The combined result of VarTD and AvTD of each assemblage is shown in Figure 5. The assemblages involved in this study showed mainly VarTD values in the 95% confidence range. This signifies that assemblages are composed of several different groups of taxa, and are not represented by a specific group. Conversely, AvTD is mostly divided into values contained in the 95% confidence interval and low values (AvTD < mean AvTD − 2σ). These low values indicate assemblages dominated by a group such as dinosaurs or fish. The effect of these values on the cluster analysis is discussed below.
Fig. 4. Variability in taxonomic distinctness of the analysed RFAs as a function of the taxonomic richness. Dotted lines represent the 95% confidence interval associated with the null hypothesis that a given observed assemblage is made of n taxa randomly sorted from the global pool of taxa recorded in the analysed dataset.

Cluster analysis of taxonomic similarity
The resulting tree shows three main clusters corresponding to different time spans: Late Jurassic, early Early Cretaceous and mid-Cretaceous (Fig. 6, nodes a, b and c, respectively). Unfortunately, this study includes very few formations of Late Jurassic age, mostly because of insufficient knowledge of this period in Asia.
Among this set of 49 RFAs, four groups were defined. Cluster 'a' is mainly composed of Late Jurassic RFAs from Central China. It represents typical Late Jurassic faunas; in particular, those with the euhelopodid sauropods Mamenchisaurus and Omeisaurus, the goniopholidid crocodile Sunosuchus and the chelonians Xinjiangchelys, Sinaspideretes and Plesiochelys.
Cluster 'b' represents middle Cretaceous RFAs (late Albian to Turonian). This cluster is characterized at its base by eastern RFAs and the four SE Asian RFAs (tha001, tha002, tha003 and lao001), whereas a subcluster contains localities scattered from the eastern to the western parts of the range of the study. This group is mainly based on chelonians, especially adocids, nanhsiungchelyids, anosteirids, lindholmemydids and trionychids.
Cluster 'd' consists of RFAs ranging from Barremian to Albian in age. This area is the widest of the tree, with fossil assemblages from Central Asia and a majority of the RFAs, from the Gobi Desert. The ceratopsian Psittacosaurus is the taxon present in a majority of the RFAs in this group.
Cluster 'e' is dominated by Early Cretaceous East Asian assemblages in which freshwater taxa play an important part. Because of this, a large part of this group includes assemblages with a low AvTD, outside the 95% confidence 'funnel'.
The representation of AvTD v. VarTD values does not greatly affect the topology of the tree. Only node e is constrained by low AvTD values, caused by their dominance by freshwater assemblages and more specifically by fish taxa. The problematic assemblages from a representativity point of view have been removed: these were assemblages with a small number of taxa or containing genera with a single occurrence in the entire database. Figure 7 shows the validity of the position of each RFA in the tree using the criteria of Guénoche & Garreta (2001). It appears that three of the four basal nodes are significantly well supported (nodes a, d and e have a rate higher than 85%). The percentage of well-designed quadruples containing each RFA is globally high, with values ranging from 61% to 90.8% and 76% of the values higher than 80%. This means that the topology reflects the dissimilarity in this range. Thus the topology of the tree was considered suitable for study. SE Asian RFAs display rather low values (61% for tha001 and 74% for tha002); therefore, it may be necessary to consider other solutions in these cases.
Fig. 7. Comparison of the topology of the quadruples implied by the observed Raup & Crick dissimilarity matrix D and the resulting NJ-tree: percentage of ill-designed quadruples containing each RFA and rate of elementary quadruples supporting internal edges.

Representation of similarities on palaeogeographical maps during different time intervals
Unit 1: Oxfordian-Berriasian (Fig. 8). The resulting network suggests a good connection between the Late Jurassic RFAs from the mainland. There are few connections between Late Jurassic assemblages and Early Cretaceous ones. The Early Cretaceous assemblages have RC index values too low for them to be connected.
Unit 2: Valanginian-Barremian interval (Fig. 9). In this group of RFAs, only one connection is supported by the 95% threshold, between the two assemblages of the Jehol Group (Yixian Formation chi050 and Jiufotang Formation chi043). Otherwise, only very few RFAs, which were very close geographically, have similarity indicated by a >50% RC index.
Unit 3: Aptian-Albian interval (Fig. 10). The resulting network forms a cluster indicating good connections between all the RFAs from the northern part supported by the 80% threshold. SE Asian assemblages are isolated from this cluster.

Discussion
The different methods used in this study permit an understanding of the pattern of faunal evolution from the Late Jurassic to the Aptian-Albian, although not all RFAs could be included in the analyses (several assemblages are composed of taxa specific to their region and hence have too low a generic richness).
Looking at resemblances between localities of similar age, we can infer that the peripheral and mainland assemblages were more similar during the Late Jurassic-Berriasian and during the Aptian-Albian intervals (Figs 8 and 10) than during the Valanginian-Barremian interval (Fig. 9) when very few RFAs showed a significant similarity. This suggests that an Early Cretaceous event isolated all the various parts of Asia from each other.
The first phase highlighted by this study is the transition between the Late Jurassic and the Early Cretaceous. Both methods converge to suggest a relatively radical turnover for the fauna at the boundary between these two periods. The tree shows a cluster containing most of the Jurassic assemblages (Fig. 6) and the representation of the RC index on the palaeogeographical map indicates connections between Jurassic assemblages but not with Early Cretaceous ones (Fig. 8).
During the early Early Cretaceous, the situation seems to have changed to one in which faunal interchange was more difficult, at least for terrestrial taxa. The scenario, however, was not the same for terrestrial and for aquatic faunas, as aquatic-dominated assemblages are connected in the tree (Fig. 6, node e). However, the reason for this clustering is based on rather low RC values and is not indicated on the palaeogeographical representation (Fig. 9). Within these relatively isolated peripheral assemblages, only one strong connection is provided by a group of Japanese assemblages (the cluster is supported at 99% in Fig. 7). Clustering between mainland assemblages is random in the tree, and when they are linked to a peripheral assemblage this is likely to be because of co-occurrence of a limited number of terrestrial taxa. Furthermore, the edge supporting this cluster in the tree (Fig. 6, node g) is supported only at a level of 49% (Fig. 7). In contrast, similarities between peripheral RFAs are because of co-occurrence of aquatic faunas. This suggests that dispersal pathways in the peripheral parts of Asia were suitable for aquatic vertebrates but more difficult for terrestrial taxa during the Early Cretaceous interval.
The Aptian-Albian situation shows a change in dispersal patterns, as demonstrated by the multiple connections between RFAs suggesting good pathways for faunal exchange in all Asia (Fig. 10). The late Early Cretaceous assemblages are mostly clustered together in the tree (Fig. 6, node d). This group is statistically well supported with an Re index of 87%. From an empirical point of view, the occurrence of the genus Psittacosaurus almost everywhere in Asia supports this result (Lucas 2006).
The situation of the Thai assemblages is peculiar in this evolutionary pattern, and is strongly imprinted with endemism. Thailand constitutes a cluster separated from the others (Fig. 6, node h) as the edge supporting the other group with which it is clustered has an Re of 52%. Furthermore, on the palaeogeographical representation, the Thai assemblages are never linked to any other. The endemic situation of the Khorat Plateau is clearly represented by this quantitative analysis. However, Thailand is not completely isolated from the main continent, as there are some taxa similar to those in other regions: the presence of the anosteirid Kyzylkumemys and the adocid Shachemys in the Khok Kruat Formation and in the early Late Cretaceous of Central Asia suggests a connection between those two provinces at least from the Aptian onward. These turtles explain the position of the Khorat Group with other peripheral RFAs in the tree (Fig. 6, node b). The occurrence of the genus Psittacosaurus in the same formation suggests faunal interchange with NE China, where this genus was widespread. Nevertheless, this connection was perhaps just incipient or limited, as the endemic imprint still links the Khok Kruat assemblage to that from the Sao Khua Formation.
The changing biogeographical patterns outlined above show SE Asia occupying a peculiar position in Asian faunal evolution during the Cretaceous. At the boundary between the Late Jurassic and the Early Cretaceous, the Phu Kradung assemblage, at the base of the Khorat Group, is one of the latest occurrences of specific Jurassic faunal elements in Asia. At that time, or just before, Indochina was not isolated, permitting the dispersal of northern or central Chinese taxa such as the crocodilian Sunosuchus or euhelopodid dinosaurs. The Early Cretaceous assemblage of the Sao Khua Formation indicates a different biogeographical pattern, as it does not show close links with other RFAs. The only possible link is based on a hybodont shark, Heteroptychodus, which is also present in an unnamed formation of the Matsuo Group in Japan. However, this link is too weak to be shown in this representation. This suggests that Indochina was partly isolated from the Asian mainland. Only euryhaline taxa were thus able to disperse to other provinces along coastlines. The end of this isolation is highlighted by the occurrence of the genus Psittacosaurus in the Aptian Khok Kruat Formation. The history of the relationship between SE Asian and other Asian faunas is not easy to unravel because of correlation problems linked to discontinuities in the continental fossil record. Whether the Khorat Group really provides a continuous sedimentary record for the latest Jurassic to mid-Cretaceous interval is uncertain, and in any case the three faunal assemblages studied in this paper are separated by two formations (the Phra Wihan and Phu Phan Formations) that have yielded very few body fossils (although they contain fairly abundant dinosaur footprints; Le . This probably exaggerates the impression of rapid isolation of the Indochina block faunas at the time when the Sao Khua Formation was deposited.
Buffetaut et al. (2006) compared Chinese and Thai dinosaur assemblages on an empirical basis. They proposed Chinese counterparts for some of the Thai dinosaur assemblages. For the Phu Kradung assemblage, they recognized similarities to Late Jurassic Chinese assemblages, but not to those indicated by the present study. They considered the Upper Shaximiao Formation (chi089) and the Shishugou Formation (chi101) assemblages as possible counterparts. Thai assemblages are positioned between the mid-Cretaceous Central Asian cluster (Fig. 6, node f) and the Jurassic Eastern Asian one (node a). This reflects the similarity of the fauna from the base of the Khorat Group to Jurassic assemblages and of those from later parts of the group to mid-Cretaceous assemblages. Buffetaut et al. (2006) considered the assemblage from the Khok Kruat Formation, which is fairly well dated as Aptian, as relatively similar to that from the upper part of the late Early Cretaceous Xinminbao Group of Gansu, NW China (chi000). We did not notice a clear relationship in our analysis between these two groups. This may be because especially relevant components of the Khok Kruat fauna, such as early hadrosauroids, have not yet been described in detail and therefore were not taken into consideration in the present analysis. Buffetaut et al. (2006) also discussed the case of the abundant fauna from the Sao Khua Formation and noted that it is difficult to find a counterpart among the Early Cretaceous dinosaur faunas of China. The closest relationships seemed to be with the poorly known assemblage from the Napai Formation of Guangxi (chi021).
It is difficult to compare the results of the present analysis with those of Buffetaut et al. (2006) because they are not based on taxa of the same systematic rank. Nevertheless, both approaches indicate some isolation of the fauna of the Indochina block during the deposition of the Sao Khua Formation. Cuny et al. (2003, 2006) concluded that the hybodont shark fauna from the Sao Khua Formation appears to be much less endemic, at least at the generic level, than its dinosaur assemblage. On the basis of Maisey's (1989) work, Cuny et al. (2005) supposed that similarities between hybodont sharks from different freshwater systems are linked to their euryhaline abilities, which allowed them to travel in coastal marine waters at least for short distances. Such a model could explain why land vertebrates could not disperse in and out of SE Asia at that time, whereas freshwater taxa seem to have been able to do so.
The biostratigraphical units defined by Jerzykiewicz & Russell (1991) in Mongolia and then recognized in China by Lucas (2001) are not reflected in our study. However, it should be kept in mind that biostratigraphical and biogeographical units can be compared only to a certain extent, as they do not necessarily coincide. Our work certainly suggests that caution should be exercised when trying to recognize in SE Asia the LVFs established in Mongolia and northern China, presumably because of biogeographical differences between roughly coeval assemblages. The three units spanning the Early Cretaceous are, respectively, the Ningjiagouan, the Tsagantsabian and the Khukhtekian LVFs. The vertebrate fauna of the Mengyin Formation (chi075) is the basis of the Ningjiagouan LVF. The Phu Kradung assemblage might be associated with this faunachron considering its index fossils, notably a euhelopodid sauropod and an indeterminate stegosaur. However, this conclusion is not derived from our quantitative analyses, according to which the Mengyin assemblage is connected with Early Cretaceous assemblages (chi061 and chi057 in Fig. 6), although those edges are not well supported (0.35 and 0.58, respectively), permitting other positions for this assemblage.
The two other faunachrons extend from the Barremian to the late Albian. Following the current definition by Lucas (2006, p. 10), the Tsagantsabian faunachron ranges from the early Barremian to the mid-Aptian and includes chi021, chi050, chi057, chi069, chi098, chi103, kir003, mon040 and mon041 in our dataset. The Khukhtekian faunachron ranges from the mid-Aptian to the late Albian and is composed of the Xinminbao Group (chi000), chi055, chi76, mon011, mon019, rus004 and rus006. As shown in our tree (Fig. 6), we did not clearly recognize those faunachrons on the basis of a quantitative analysis. Only the Psittacosaurus biochron (Lucas 2006) is recognized in our tree; that is, the biogeographical unit that includes all the RFAs where the genus Psittacosaurus occurred (node d in Fig. 6). However, as mentioned by Lucas (2006), this unit corresponds to a relatively long time span (about 20 Ma), so that it does not provide a very precise basis for correlation.
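The way such a biochron is delimited, namely as the set of RFAs in which an index genus occurs, can be sketched as follows. This is only an illustration: the function name, the assemblage identifiers and the companion taxa are hypothetical placeholders and are not taken from our dataset.

```python
def biochron_members(assemblages, index_genus):
    """Return the sorted ids of assemblages whose taxon list includes index_genus."""
    return sorted(
        rfa for rfa, taxa in assemblages.items() if index_genus in taxa
    )


# Hypothetical occurrence lists (placeholders, not real data).
toy_assemblages = {
    "rfa_A": {"Psittacosaurus", "genus_1"},
    "rfa_B": {"genus_2", "genus_3"},
    "rfa_C": {"Psittacosaurus", "genus_3"},
}

members = biochron_members(toy_assemblages, "Psittacosaurus")
print(members)
```

Because membership depends on a single genus, the resulting unit is only as precise as the stratigraphic range of that genus, which is why a long-ranging index taxon gives a coarse basis for correlation.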

Conclusions
Our results indicate that the biogeographical history of SE Asia during the Early Cretaceous was complex. During the latest Jurassic, faunal interchange between China and Indochina was possible. We cannot conclude that this applies to the entire Asian continent because of insufficient information about other parts of Asia for that period. The situation seems to have changed drastically in the middle part of the Early Cretaceous, when SE Asia apparently became isolated from the rest of Asia, possibly by mountain ranges. This hypothesis is in agreement with the tectonic situation of Asia at that time, with collision between all the microblocks from the old Gondwana (Metcalfe 2006). One of the most striking peculiarities of the vertebrate assemblage from the Sao Khua Formation is the apparent absence of ornithischian dinosaurs, which were present in SE Asia both before and after that time interval. Only some freshwater taxa could disperse between SE Asia and the rest of the continent. The situation changed again during the Aptian, when faunal interchange with other parts of Asia again became possible. Whether this temporary isolation of SE Asia also affected other parts of Asia at that time cannot be determined because of an insufficient fossil record.
Fund joint project. This is publication isem 2007-167 (J.C.) and contribution UMR5125-08-003 (G.E.). Finally, comments from S. Lucas (New Mexico Museum of Natural History and Science) and M. Benton (University of Bristol) greatly improved the first version of this paper.