Tree in Fig.2 - Archaeal and bacterial genomes used by Hug et al. in 2016 to construct a microbial tree of life; MAGs isolated from two hot springs in the Uzon Caldera, Kamchatka, Russia; and all genomes of taxa analyzed in Burgess et al. (2012) with one representative genome on NCBI (Newick file)

Taxonomy of our MAGs was refined by placing MAGs in a phylogenetic context using PhyloSift v. 1.0.1 with the updated PhyloSift markers database (version 4; 2018-02-12; https://figshare.com/articles/PhyloSift_markers_database/5755404/4). For this purpose, the MAGs, all taxa previously identified by Burgess et al. (2012) with complete genomes available on NCBI (downloaded 2017-09-06), and all archaeal and bacterial genomes previously used in Hug et al. (2016) were placed in a phylogenetic tree. All genomes used in this tree and a mapping file can be found on figshare (https://doi.org/10.6084/m9.figshare.6863594.v1; https://doi.org/10.6084/m9.figshare.6863744.v2; https://doi.org/10.6084/m9.figshare.6863798.v1; https://doi.org/10.6084/m9.figshare.6863813.v1). For more information about the available core marker gene sets, see Darling et al. (2014) (last updated 2018-02-12). We used 37 of these single-copy marker genes (ribosomal proteins S2 rpsB, S10 rpsJ, L1 rplA, L22, L4/L1e rplD, L2 rplB, S9 rpsI, L3 rplC, L14b/L23e rplN, S5, S19 rpsS, S7, L16/L10E rplP, S13 rpsM, L15, L25/L23, L6 rplF, L11 rplK, L5 rplE, S12/S23, L29, S3 rpsC, S11 rpsK, L10, S8, L18P/L5E, S15P/S13e, S17, L13 rplM, L24; and translation initiation factor IF-2, metalloendopeptidase, phenylalanyl-tRNA synthetase beta subunit, phenylalanyl-tRNA synthetase alpha subunit, tRNA pseudouridine synthase B, porphobilinogen deaminase, and ribonuclease HII; i.e., PhyloSift markers DNGNGWU00001 to DNGNGWU00040, excluding DNGNGWU00004, DNGNGWU00008, and DNGNGWU00038). The amino acid alignment of these 37 concatenated genes was trimmed using trimAl v. 1.2. Columns with gaps in more than 5% of the sequences were removed, as were taxa with less than 75% of the concatenated sequence. MAGs from ARK and ZAV that did not meet this threshold were manually kept in the alignment. The final alignment comprised 3,240 taxa and 5,459 amino acid positions.
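The two trimming thresholds described above can be illustrated with a minimal Python sketch. This is not trimAl's implementation and not necessarily the exact procedure used here: the function name `trim_alignment`, the `always_keep` parameter (standing in for the manually retained ARK and ZAV MAGs), and the order of operations (columns first, then taxa) are all assumptions made for illustration.

```python
GAP = "-"

def trim_alignment(seqs, max_gap_frac=0.05, min_coverage=0.75, always_keep=()):
    """Illustrative gap-based trimming (hypothetical helper, not trimAl).

    seqs: dict mapping taxon name -> aligned amino acid string.
    Step 1 drops columns whose gap fraction exceeds max_gap_frac.
    Step 2 drops taxa whose non-gap fraction in the trimmed alignment
    is below min_coverage, unless the taxon is listed in always_keep.
    """
    if not seqs:
        return {}
    lengths = {len(s) for s in seqs.values()}
    if len(lengths) != 1:
        raise ValueError("all sequences must have the same aligned length")
    n_cols = lengths.pop()
    rows = list(seqs.values())
    n_taxa = len(rows)

    # Step 1: keep columns with an acceptable fraction of gaps.
    kept_cols = [
        i for i in range(n_cols)
        if sum(row[i] == GAP for row in rows) / n_taxa <= max_gap_frac
    ]
    trimmed = {name: "".join(s[i] for i in kept_cols) for name, s in seqs.items()}

    # Step 2: keep taxa with enough non-gap positions (assumed order of steps).
    result = {}
    for name, s in trimmed.items():
        coverage = (1 - s.count(GAP) / len(s)) if s else 0.0
        if coverage >= min_coverage or name in always_keep:
            result[name] = s
    return result
```

With the defaults, the thresholds match the 5% and 75% values stated above; passing MAG names through `always_keep` mimics retaining them by hand despite low coverage.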
This alignment was then used to build a new phylogenetic tree in RAxML v. 8.2.10 on the CIPRES Science Gateway web server. First, we searched for the best protein substitution model for the alignment with its empirical base frequencies using the Bayesian Information Criterion (BIC) within RAxML. Then, the selected substitution model, i.e., LG plus CAT (after Le and Gascuel), was used to infer a phylogenetic tree. We chose the rapid bootstrapping algorithm (flag -f a) to find the best-scoring maximum likelihood tree with 10 starting trees in one run, with the number of bootstraps determined automatically (MRE-based bootstopping criterion). In total, 150 bootstrap replicates were conducted. The full tree inference required 2,236 computational hours on the CIPRES supercomputer.
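For readers who want to run a comparable analysis locally, the settings described above correspond roughly to the RAxML 8 command lines assembled in the sketch below. The analysis itself was run through the CIPRES web interface, so the binary name `raxmlHPC`, the file and run names, and the random seeds are assumptions, and the "10 starting trees" setting is not represented. The model strings `PROTCATAUTO` and `PROTCATLG` are our reading of "LG plus CAT"; a variant with empirical frequencies may have been used instead.

```python
def raxml_model_search(alignment, run_name, seed=12345, binary="raxmlHPC"):
    """Argument list for automatic protein model selection by BIC.

    File names, seed, and binary name are hypothetical placeholders.
    """
    return [
        binary,
        "-m", "PROTCATAUTO",   # let RAxML pick the protein matrix
        "--auto-prot=bic",     # selection criterion: BIC
        "-p", str(seed),       # parsimony random seed
        "-s", alignment,
        "-n", run_name,
    ]

def raxml_rapid_bootstrap(alignment, run_name, model="PROTCATLG",
                          seed=12345, binary="raxmlHPC"):
    """Argument list for rapid bootstrapping plus best ML tree search."""
    return [
        binary,
        "-f", "a",             # rapid bootstrap + best-scoring ML tree in one run
        "-m", model,           # LG matrix with the CAT rate approximation
        "-p", str(seed),       # parsimony random seed
        "-x", str(seed),       # rapid bootstrap random seed
        "-#", "autoMRE",       # MRE-based bootstopping criterion
        "-s", alignment,
        "-n", run_name,
    ]

if __name__ == "__main__":
    # Print the commands rather than executing them.
    print(" ".join(raxml_model_search("alignment.phy", "model_search")))
    print(" ".join(raxml_rapid_bootstrap("alignment.phy", "uzon_tree")))
```

The lists can be passed to `subprocess.run` if RAxML is installed; they are printed here so the sketch runs without it.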

Note: Bootstrap support values were added later; please check the latest version of this file.

License

CC BY 4.0