Version 3 2021-02-11, 16:23
Version 2 2020-12-24, 21:43
Version 1 2019-12-07, 11:01
dataset
posted on 2021-02-11, 16:23, authored by Daniel Richter, Romain Watteaux, Thomas Vannier, Jade Leconte, Paul Frémont, Gabriel Reygondeau, Nicolas Maillet, Nicolas Henry, Gaëtan Benoit, Da Silva Ophélie, tom delmont, Antonio Fernàndez-Guerra, Samir Suweis, Romain Narci, Cédric Berney, Damien Eveillard, Frederick Gavory, Lionel Guidi, Karine Labadie, Eric Mahieu, Julie Poulain, Sarah Romac, Simon Roux, Céline Dimier, Stefanie Kandels, Marc Picheral, Sarah Searson, Tara Oceans Coordinators, Stéphane Pesant, Jean-Marc Aury, Jennifer R. Brum, Claire Lemaitre, Eric Pelletier, Peer Bork, Shinichi Sunagawa, Fabien Lombard, Lee Karp-Boss, Chris Bowler, Matthew B. Sullivan, Eric Karsenti, Mahendra Mariadassou, Ian Probert, Pierre Peterlongo, Patrick Wincker, Colomban de Vargas, Maurizio Ribera d’Alcalà, Daniele Iudicone, Olivier Jaillon
Biogeographical studies have traditionally focused on readily visible organisms, but recent technological advances are enabling analyses of the large-scale distribution of microscopic organisms, whose biogeographical patterns have long been debated. Here we assessed the global structure of plankton geography and its relation to the biological, chemical and physical context of the ocean (the 'seascape') by analyzing metagenomes of plankton communities sampled across oceans during the Tara Oceans expedition, in light of environmental data and ocean current transport. Using a consistent approach across organismal sizes that provides unprecedented resolution to measure changes in genomic composition between communities, we report a pan-ocean, size-dependent plankton biogeography overlying regional heterogeneity. We found robust evidence for a basin-scale impact of transport by ocean currents on plankton biogeography, and on a characteristic timescale of community dynamics going beyond simple seasonality or life history transitions of plankton.
Supplementary Table 1. List of Tara Oceans samples sequenced with a metabarcoding (18S V9) approach and with a metagenomic approach, including identifiers for sequencing reads deposited in the DDBJ/ENA/GenBank Short Read Archives (SRA). [This Table is identical in version 2.]
Supplementary Table 2. Table of environmental parameters for each sample. [This Table is identical in version 2.]
Supplementary Table 3. Matrix of metagenomic dissimilarity for the 0-0.22 μm size fraction. [This Table is identical in version 2.]
Supplementary Table 4. Matrix of metagenomic dissimilarity for the 0.22-1.6/3 μm size fraction. [This Table is identical in version 2.]
Supplementary Table 5. Matrix of metagenomic dissimilarity for the 0.8-5 μm size fraction. [This Table is identical in version 2.]
Supplementary Table 6. Matrix of metagenomic dissimilarity for the 5-20 μm size fraction. [This Table is identical in version 2.]
Supplementary Table 7. Matrix of metagenomic dissimilarity for the 20-180 μm size fraction. [This Table is identical in version 2.]
Supplementary Table 8. Matrix of metagenomic dissimilarity for the 180-2000 μm size fraction. [This Table is identical in version 2.]
Supplementary Table 9. Matrix of OTU dissimilarity for the 0-0.22 μm size fraction. [This Table is identical in version 2.]
Supplementary Table 10. Matrix of OTU dissimilarity for the 0.22-1.6/3 μm size fraction. [This Table is identical in version 2.]
Supplementary Table 11. Matrix of OTU dissimilarity for the 0.8-5 μm size fraction. [This Table is identical in version 2.]
Supplementary Table 12. Matrix of OTU dissimilarity for the 5-20 μm size fraction. [This Table is identical in version 2.]
Supplementary Table 13. Matrix of OTU dissimilarity for the 20-180 μm size fraction. [This Table is identical in version 2.]
Supplementary Table 14. Matrix of OTU dissimilarity for the 180-2000 μm size fraction. [This Table is identical in version 2.]
Supplementary Table 15. Matrix of minimum travel time, in years. [This Table is identical in version 2.]
Supplementary Table 16. Matrix of minimum geographic distance (without traversing land), in kilometers. [This Table is identical in version 2.]
Supplementary Table 17. Matrix of imaging-based dissimilarity. [This Table is identical in version 2.]
Supplementary Table 18. Matrix of metagenome-assembled genome (MAG)-based dissimilarity for the 20-180 μm size fraction. [The filename of this Table was modified from version 2. The contents of the Table are identical.]
Supplementary Table 19. The cophenetic correlation coefficient for different methods of clustering metagenomic dissimilarity. [This Table is identical in version 2.]
Supplementary Table 20. Baker's Gamma index comparing clustering results within size fractions. [This Table is identical in version 2.]
Supplementary Table 21. Rand Index for K-means and spectral clustering, and multivariate ANOVA calculated by the adonis function. [This Table is identical in version 2.]
Dataset 1. Reference database (in FASTA format) used to perform taxonomic assignment of metabarcodes. The header line of each reference V9 rDNA barcode (starting with a > sign) contains a unique identifier derived from the GenBank accession number, followed by the taxonomic path associated with the reference barcode. [This Dataset is identical in version 2.]
Dataset 2. V9 rDNA abundance at the metabarcode level. md5sum = unique identifier; totab = total abundance across all samples; cid = identifier of the OTU to which the barcode belongs (see Dataset 3); pid = best percentage identity to a barcode in Dataset 1; refs = identifier(s) of the best matching barcode(s) in Dataset 1; lineage = taxonomic lineage of the best match in Dataset 1; taxogroup = high-level taxonomic grouping of the best match in Dataset 1; sequence = V9 rDNA sequence; TV9_XXX = barcode abundance by sample (see Supplementary Table 1 for sample identifiers). [This Dataset is identical in version 2.]
Dataset 3. V9 rDNA abundance at the OTU (operational taxonomic unit) level. cid = identifier of the OTU; md5sum = unique identifier of the most abundant barcode in the OTU; pid, refs, lineage, taxogroup, sequence = defined as in Dataset 2; rtotab = total abundance of the most abundant barcode in the OTU; ctotab = total abundance of all barcodes in the OTU; TV9_XXX = abundance by sample of all barcodes in the OTU (see Supplementary Table 1 for sample identifiers). [This Dataset is identical in version 2.]
Dataset 4. Relative abundances of metagenome-assembled genomes (MAGs) in metagenomic samples from the 20-180 μm size fraction. [This Dataset is new in version 3.]
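As an illustration of the Dataset 1 layout described above (a FASTA header holding a unique identifier followed by a taxonomic path), the following is a minimal Python sketch of how such records might be read. It is not the authors' code. It assumes that the identifier and the taxonomic path are separated by the first run of whitespace in the header, which the description does not state, and the example records, identifiers and taxonomic paths are hypothetical placeholders rather than entries from the dataset. The names `ReferenceBarcode` and `parse_reference_fasta` are likewise introduced here for illustration only.

```python
from typing import Iterable, Iterator, List, NamedTuple


class ReferenceBarcode(NamedTuple):
    identifier: str      # unique identifier (first token of the header line)
    taxonomic_path: str  # remainder of the header line, left unparsed
    sequence: str        # sequence joined across wrapped lines


def _build(header: str, chunks: List[str]) -> ReferenceBarcode:
    # ASSUMPTION: identifier and taxonomic path are separated by the first
    # run of whitespace; adjust if the real file uses another separator.
    parts = header.split(None, 1)
    identifier = parts[0] if parts else ""
    path = parts[1] if len(parts) > 1 else ""
    return ReferenceBarcode(identifier, path, "".join(chunks))


def parse_reference_fasta(lines: Iterable[str]) -> Iterator[ReferenceBarcode]:
    """Yield one ReferenceBarcode per FASTA record found in `lines`."""
    header = None
    chunks: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield _build(header, chunks)  # emit the previous record
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)  # sequence line belonging to current record
    if header is not None:
        yield _build(header, chunks)  # emit the final record


# Hypothetical records, for illustration only (not taken from Dataset 1).
EXAMPLE = [
    ">HYPOTHETICAL_ID_1 Eukaryota|GroupA|GenusA+speciesA",
    "ACGTACGT",
    "TTGA",
    ">HYPOTHETICAL_ID_2 Eukaryota|GroupB",
    "GGCC",
]

records = list(parse_reference_fasta(EXAMPLE))
for rec in records:
    print(rec.identifier, rec.taxonomic_path, len(rec.sequence))
```

To read the real file, the same function could be applied to an open file handle, for example `parse_reference_fasta(open(path))`, where `path` points to a local copy of Dataset 1. The taxonomic path is deliberately kept as a single string because its internal delimiter is not specified in the description.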