Supplementary Tables and Figures for the following study:
https://doi.org/10.1101/2021.03.03.433801
Figure S1. Taxonomic relative abundance at the genus level based on ribosomal proteins S7 and S2. For the long-read metagenomes, we used HMMs to identify the two most abundant ribosomal proteins (S7 and S2) and assigned their taxonomy with the Genome Taxonomy Database (GTDB). For the short-read metagenomes, we used the GTDB taxonomy of the ribosomal protein S7 gene. We also performed 16S rRNA amplicon sequencing on an additional sample (Amplicon) and used Minimum Entropy Decomposition (MED) with SILVA (v132) taxonomic assignment at the genus level. Genera representing less than 1% of a sample were pooled as rare (light grey).
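The pooling of rare genera described in the Figure S1 caption can be illustrated with a minimal Python sketch. It is not the code used in the study: the function name `pool_rare_genera`, the input format (a mapping of genus name to count for one sample), and the label "Rare" are assumptions made for illustration. Only the 1% cutoff comes from the caption.

```python
from collections import defaultdict


def pool_rare_genera(counts, threshold=0.01, rare_label="Rare"):
    """Convert per-genus counts for one sample into relative abundances,
    pooling every genus below `threshold` into a single `rare_label` bin.

    `counts` is a hypothetical input format: {genus_name: count}.
    """
    total = sum(counts.values())
    if total == 0:
        return {}

    pooled = defaultdict(float)
    for genus, count in counts.items():
        fraction = count / total
        # Genera representing less than 1% of the sample are pooled.
        key = genus if fraction >= threshold else rare_label
        pooled[key] += fraction
    return dict(pooled)


# Made-up counts, for illustration only.
example = {"GenusA": 990, "GenusB": 5, "GenusC": 5}
abundances = pool_rare_genera(example)
```

Here `GenusB` and `GenusC` each fall below the 1% cutoff, so they are summed into the single "Rare" category.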
Table S1. Read counts and cumulative length. (a) Number of sequences and cumulative length of the MinION outputs, before and after quality filtering with a minimum Q-score of 7. (b) Microbial read size distribution metrics. All percentages are relative to the total number of reads (or total length) of the sequencing output after quality filtering (minimum Q-score of 7) but before removal of human contamination. (c) Impact of size selection on read size distribution.
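The kind of before/after summary reported in Table S1(a) can be sketched in Python as follows. This is an illustrative sketch, not the pipeline used in the study: the function name `summarize_quality_filter` and the input format (pairs of read length and per-read mean Q-score) are assumptions, and treating a Q-score of exactly 7 as passing is also an assumption.

```python
def summarize_quality_filter(reads, min_qscore=7.0):
    """Count reads and cumulative length before and after quality filtering.

    `reads` is a hypothetical input: an iterable of
    (length_in_bases, mean_qscore) pairs, one per read.
    """
    reads = list(reads)
    # Assumption: a read passes when its mean Q-score is >= the threshold.
    kept = [(length, q) for length, q in reads if q >= min_qscore]
    return {
        "reads_before": len(reads),
        "bases_before": sum(length for length, _ in reads),
        "reads_after": len(kept),
        "bases_after": sum(length for length, _ in kept),
    }


# Made-up reads, for illustration only.
summary = summarize_quality_filter([(1000, 9.5), (500, 6.9), (2000, 7.0)])
```

The "after" totals from such a summary would then serve as the denominators for the percentages described in part (b), since those are computed after quality filtering but before removal of human reads.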
Table S2. Oligotypes identified by Minimum Entropy Decomposition (MED). (a) Count matrix, (b) representative sequences, and (c) associated SILVA taxonomy.
Table S3. Distribution of ribosomal proteins in assembled long reads across extraction strategies.