Main and Supplementary Tables and Figures
Dataset posted on 12.06.2018 by A. Murat Eren
Main and Supplementary Tables and Figures in Delmont et al.
Figure 1: Nexus between phylogeny and function of HBDs. (a) Phylogenomic analysis of 432 Proteobacteria MAGs and 43 Planctomycetes MAGs in the non-redundant genomic database (including the nine HBDs) using a collection of 37 phylogenetic marker gene families. Layers surrounding the phylogenomic tree indicate genome size and taxonomy of each MAG at the phylum and class level. (b) Functional network of the nine HBDs based upon a total of 5,912 identified gene functions. Size and colour of genomic nodes represent the number of detected functions and MAG taxonomy, respectively. The colour of functional nodes indicates their occurrence in the different HBDs.
Figure 2: Phylogeny of nitrogen fixation genes. Phylogenetic analysis of NifD (a) and NifH (b) occurring in the 15 nitrogen-fixing MAGs (including five redundant MAGs and Ca. A. talassum) we identified from TARA Oceans in relation to 252 and 316 reference proteins, respectively. MAGs are coloured based on their phylogenetic affiliation at the phylum level.
Figure 3: Abundance of nitrogen-fixing populations of Planctomycetes and Proteobacteria in the surface ocean. Boxplots in the top panel display the square-root-normalized cumulative relative distribution of the Planctomycetes (n=2) and Proteobacteria (n=7) HBDs in 93 metagenomes corresponding to 12 marine geographic regions (*: assuming that each litre in the surface ocean contains 0.5 billion archaeal and bacterial cells). Boxes show the first quartile, median and third quartile of distribution values, with whiskers of 1.5x the interquartile range. The maps in the bottom panel show the niche partitioning of HBD-06, HBD-07, HBD-08 and HBD-09 at the surface of four oceans and two seas (61 metagenomes from surface samples).
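The boxplot statistics named in the Figure 3 caption can be sketched as follows. This is only an illustration, not the authors' plotting code: the function name `boxplot_summary`, the input values, the inclusive quantile method, and the Tukey convention of drawing whiskers to the most extreme value within 1.5x the interquartile range are all assumptions. The input is taken to be one already-summed relative abundance per metagenome.

```python
import math
from statistics import quantiles


def boxplot_summary(relative_abundances):
    """Square-root-transform the values, then return quartiles and whisker ends.

    Assumption: whiskers follow the Tukey convention (most extreme data
    point lying within 1.5x the interquartile range of the box).
    """
    transformed = sorted(math.sqrt(v) for v in relative_abundances)
    # quantiles() with n=4 returns the three cut points Q1, median, Q3.
    q1, median, q3 = quantiles(transformed, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    inside = [v for v in transformed if low_fence <= v <= high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": inside[0],
        "whisker_high": inside[-1],
    }


# Hypothetical cumulative relative abundances, one value per metagenome.
summary = boxplot_summary([0, 1, 4, 9, 16])
```

The square-root transform compresses large values, so a few metagenomes with very high abundance do not dominate the plot.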
Figure 4: Relative abundance of the TARA Oceans nifH genes in the context of reference collections and amplicons. Violin plots summarize the average mean coverage, across 93 TARA Oceans metagenomes, of nifH genes retrieved in this study, nifH reference databases and nifH amplicon sequences from a large-scale survey, obtained using a competitive read recruitment strategy. The 18 nifH genes retrieved in this study were separated into two groups (‘HBD genomes’ and ‘Orphan genes’ for which we only have a scaffold) and compared to a database of nifH gene sequences. For each gene sequence, the coverage values were corrected by excluding nucleotide positions with coverage in the 1st and 4th quartiles to minimize the effect of non-specific mapping.
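The coverage correction described in the Figure 4 caption amounts to averaging only the positions whose coverage falls in the middle two quartiles. The sketch below is a minimal illustration under stated assumptions: the function name `interquartile_mean_coverage` and the sample values are hypothetical, and positions are trimmed by rank (lowest quarter and highest quarter), which may differ in detail from the procedure the authors used.

```python
from statistics import mean


def interquartile_mean_coverage(per_position_coverage):
    """Mean coverage after dropping the lowest and highest quarter of positions.

    Assumption: trimming is done by rank on the sorted per-nucleotide
    coverage values of a single gene in a single metagenome.
    """
    ranked = sorted(per_position_coverage)
    n = len(ranked)
    if n == 0:
        return 0.0
    if n < 4:
        # Too few positions to define quartiles; fall back to the plain mean.
        return mean(ranked)
    quarter = n // 4
    return mean(ranked[quarter:n - quarter])


# Hypothetical gene of 8 positions: two uncovered positions and one
# position inflated by non-specific mapping.
coverage = [0, 0, 3, 4, 4, 5, 5, 40]
corrected = interquartile_mean_coverage(coverage)
uncorrected = mean(coverage)
```

In this example the single position with coverage 40 pulls the plain mean well above the typical value, while the trimmed mean stays close to the coverage of most positions.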
Figure S1: Phylogenetic analysis of nifH genes. The figure describes nifH genes in 15 nitrogen-fixing MAGs (including five redundant MAGs) and nine orphan scaffolds we identified in this study, as well as 504 reference genomes.
Supplementary Table 1: Summary of the 93 metagenomes from TARA Oceans, and the twelve geographic regions they represent.
Supplementary Table 2: Summary of the co-assembly and binning outputs for each metagenomic set.
Supplementary Table 3: Genomic features of 957 MAGs from the non-redundant genomic database. A two-sided t-test was performed to compare the relative distribution of each MAG in the Pacific Ocean with that in all other locales.
Supplementary Table 4: The 16S rRNA gene sequence identified in HBD-09.
Supplementary Table 5: Genomic features, Pearson correlation (based on the relative distribution in 93 metagenomes) and average nucleotide identity of 1,077 MAGs from the redundant genomic database.
Supplementary Table 6: RAST subsystems and KEGG modules for the nine HBDs.
Supplementary Table 7: Digital droplet PCR assays targeting the nifH genes of HBD-08 and HBD-09 (phylum Planctomycetes) in DNA samples from Station ALOHA in the Pacific Ocean. The table lists the newly designed primers and summarizes detection levels in copies per litre.
Supplementary Table 8: Main characteristics of 18 nifH genes retrieved in this study, the similarity of reads they recruited across 93 TARA Oceans metagenomes, best matches against the NCBI non-redundant database, nifH reference databases and amplicon sequences from a large-scale survey, and compatibility with commonly used PCR primers. The table lists nucleotide sequences of the TARA Oceans nifH genes found in MAGs and orphan scaffolds, and mean coverage of the TARA Oceans nifH genes across the 93 metagenomes.
Supplementary Table 9: Corrected mean coverage of the non-redundant TARA Oceans nifH genes and all sequences from three reference collections (the FunGene database, the ‘Zehr database’, and the amplicon sequences, see Material and Methods section) that recruited any read across the 93 metagenomes, and BLAST results of nifH queries (see Methods).
Supplementary Table 10: Genomic features of 30,244 bins manually characterized from the 12 metagenomic sets. Completion and redundancy estimates are based on the average of four bacterial single-copy gene collections.
Supplementary Table 11: KEGG annotation for 1,077 MAGs.
Supplementary Table 12: Relative distribution of 1,077 MAGs across the 93 metagenomes.
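The Pearson correlation reported in Supplementary Table 5 is based on relative distributions of the kind listed in Supplementary Table 12. As a rough sketch of that calculation for one pair of MAGs, the following computes the coefficient from two equally long abundance profiles. The function name `pearson` and the example profiles are hypothetical, and nothing here reflects how the tables were actually produced.

```python
import math


def pearson(profile_a, profile_b):
    """Pearson correlation between two abundance profiles of equal length."""
    if len(profile_a) != len(profile_b) or len(profile_a) < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_a = sum(profile_a) / len(profile_a)
    mean_b = sum(profile_b) / len(profile_b)
    covariance = sum(
        (a - mean_a) * (b - mean_b) for a, b in zip(profile_a, profile_b)
    )
    spread_a = sum((a - mean_a) ** 2 for a in profile_a)
    spread_b = sum((b - mean_b) ** 2 for b in profile_b)
    if spread_a == 0 or spread_b == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return covariance / math.sqrt(spread_a * spread_b)


# Hypothetical relative distributions of two MAGs across three metagenomes.
r_same = pearson([1, 2, 3], [2, 4, 6])
r_opposite = pearson([1, 2, 3], [3, 2, 1])
```

Two MAGs whose abundances rise and fall together across metagenomes give a coefficient near 1, which is one line of evidence, alongside average nucleotide identity, that they may be redundant.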