Additional file 24 of Proteobacteria explain significant functional variability in the human gut microbiome

2017-03-23T05:00:00Z (GMT) by Patrick Bradley, Katherine Pollard
Source data for figures.
Figure 1, source data 1: matrix of read counts (after rarefaction) for every gene family in each sample included in the present study.
Figure 1, source data 2: matrix of average family lengths for every gene family in each sample included in the present study.
Figure 1, source data 3: log-RPKG abundances for every gene family mapped in the present study.
Figure 2, source data 1: residual log-RPKG abundances (i.e., after fitting the linear model) for every gene family mapped in the present study.
Figure 3, source data 1: counts of invariable, non-significant, and variable gene families per pathway. “Strong,” “medium,” and “weak” refer to FDR cutoffs of 0.05, 0.10, and 0.25, respectively.
Figure 3, source data 2: counts of invariable, non-significant, and variable gene families for ribosomes in each domain of life.
Figure 4, source data 1: residual log-RPKG scaled by the expected variance under the null model (see the “Methods” section).
Figure 6, source data 1: log10 phylogenetic distribution (PD), log10 residual variance statistics (residvar), significance at 5% FDR (invariable coded as “dn”, variable coded as “up”, non-significant coded as “ns”), presence in at least one bacterial/archaeal genome in KEGG, and annotations for all measured gene families.
Figure 6, source data 2: counts of significant associations of invariable, non-significant, and variable gene families with taxonomic summary statistics.
Figure 7, source data 1: counts of significant associations of invariable, non-significant, and variable gene families with phylum-level abundances.
Figure 8, source data 1: q values for gene families in the overall test.
Figure 8, source data 2: q values for gene families in phylum-specific tests.
Figure 8, source data 3: JSON-formatted lists of significantly (in)variable or non-significant gene families at 5% (“strong”), 10% (“med”), and 25% FDR (“weak”); overall test.
Figure 8, source data 4: JSON-formatted lists of significantly (in)variable or non-significant gene families at 5% (“strong”), 10% (“med”), and 25% FDR (“weak”); phylum-specific tests. (BZ 51464 kb)
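As an illustration of how the three FDR tiers relate to the q values, the following is a minimal Python sketch that groups gene families by the cutoffs named above (0.05, 0.10, 0.25) and serializes the result as JSON. It is not the authors' code and does not reproduce the actual layout of the source data files. The function name `tier_lists`, the family identifiers, and the q values are all hypothetical. The sketch also assumes that a q value equal to a cutoff counts as significant and that the tiers are cumulative, so every "strong" family also appears under "med" and "weak".

```python
import json

# FDR cutoffs and tier labels as given in the description above.
FDR_TIERS = [("strong", 0.05), ("med", 0.10), ("weak", 0.25)]


def tier_lists(q_values):
    """Map each tier label to the sorted family IDs with q <= that tier's cutoff.

    Assumption: cutoffs are inclusive and tiers are cumulative.
    """
    return {
        label: sorted(fam for fam, q in q_values.items() if q <= cutoff)
        for label, cutoff in FDR_TIERS
    }


# Hypothetical gene families and q values, for illustration only.
example_q = {"famA": 0.01, "famB": 0.08, "famC": 0.20, "famD": 0.60}

tiers = tier_lists(example_q)
print(json.dumps(tiers, indent=2))
```

With these toy values, famD exceeds every cutoff and so appears in none of the lists, which corresponds to a non-significant family at all three thresholds.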