Deciphering ancient microbes with modern population genomic databases

posted on 20.09.2018, 14:36 by Nabil Alikhan, Zhemin Zhou

Metagenomics reveals the unprecedented genetic variation of microbial communities, including those from ancient human remains. The analysis of metagenomic data begins with taxonomic prediction of all microbes in the sample. Recent evaluation studies (1) demonstrate that current methods for taxonomic prediction either lack sufficient sensitivity for species-level assignments or suffer from false positives, overestimating the number of species in the metagenome. Both problems are especially acute when identifying endogenous pathogens at low abundance, as is common in ancient metagenomic samples. In addition, the reference genomes used for prediction are limited and biased towards pathogens over environmental species. As a result, reads from unknown sources, e.g. unknown environmental strains, can be mapped erroneously onto distantly related pathogens.

We designed a new method, SPARSE, which improves the taxonomic predictions of metagenomic data. SPARSE normalizes existing biased databases by grouping reference genomes into similarity-based hierarchical clusters (Fig. 1). SPARSE also filters out reads from unknown sources using a probabilistic model, hence avoiding over-enthusiastic matches to known pathogens. Our evaluation using both simulations and real ancient samples demonstrated SPARSE’s improved precision in comparison to other methods.
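To make the idea of similarity-based hierarchical clusters concrete, the following is a minimal sketch and not SPARSE's actual implementation. It assumes a precomputed table of pairwise genome similarities (the genome names and values are invented for illustration), uses single-linkage grouping at two assumed thresholds, and names each cluster after its alphabetically first member. All function names are hypothetical.

```python
from itertools import combinations


def cluster_at(names, sim, threshold):
    """Single-linkage clusters: two genomes are joined when similarity >= threshold.

    Returns a dict mapping each genome to its cluster id, which is the
    alphabetically smallest genome name in that cluster.
    """
    parent = {n: n for n in names}

    def find(x):
        # Follow parent links to the root, compressing the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(names, 2):
        # Missing pairs are treated as unrelated (similarity 0.0).
        s = sim.get((a, b), sim.get((b, a), 0.0))
        if s >= threshold:
            ra, rb = find(a), find(b)
            if ra != rb:
                # Keep the smaller name as root so cluster ids are deterministic.
                parent[max(ra, rb)] = min(ra, rb)
    return {n: find(n) for n in names}


def hierarchical_labels(names, sim, thresholds):
    """Label each genome with one cluster id per threshold, loosest level first."""
    levels = [cluster_at(names, sim, t) for t in sorted(thresholds)]
    return {n: tuple(level[n] for level in levels) for n in names}


def representatives(labels, level):
    """One representative genome per cluster at the given level."""
    return sorted({lab[level] for lab in labels.values()})


# Toy data: names and similarity values are illustrative assumptions only.
genomes = ["ecoli_A", "pestis_A", "pestis_B", "pseudotb_A"]
similarity = {
    ("pestis_A", "pestis_B"): 0.999,
    ("pestis_A", "pseudotb_A"): 0.97,
    ("pestis_B", "pseudotb_A"): 0.97,
    ("ecoli_A", "pestis_A"): 0.80,
}

labels = hierarchical_labels(genomes, similarity, thresholds=[0.95, 0.99])
for genome in genomes:
    print(genome, labels[genome])
print("representatives at loosest level:", representatives(labels, 0))
```

Keeping one representative per cluster at a chosen level is one simple way to stop a heavily sequenced pathogen from dominating the reference set; the thresholds that suit a real database would need to be chosen empirically.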

We have also integrated SPARSE into EnteroBase, a centralized database that gives users free access, through a graphical web interface, to the genomes and molecular typing data of ≥200K bacterial strains from several important pathogens. EnteroBase includes automated pipelines that characterize bacterial strains from short reads, either drawn from public databases or uploaded by registered users.

Here we demonstrate the utility of SPARSE in EnteroBase using 22 previously published ancient plague samples (2-7). SPARSE extracted the Yersinia pestis-specific reads, which were then compared with 714 modern relatives in the EnteroBase Yersinia database (Fig. 2). The combination of SPARSE and EnteroBase allows reliable placement of aDNA within the entire evolutionary history of Y. pestis.

1. A. Sczyrba et al., bioRxiv (2017).

2. K. I. Bos et al., Nature 478, 506 (2011).

3. K. I. Bos et al., eLife 5 (2016).

4. M. A. Spyrou et al., Cell Host Microbe 19, 874 (2016).

5. A. Andrades Valtueña et al., Curr. Biol. 27, 3683 (2017).

6. M. Feldman et al., Mol. Biol. Evol. 33, 2911 (2016).

7. S. Rasmussen et al., Cell 163, 571 (2015).


BBSRC BB/L020319/1; Wellcome Trust 202792/Z/16/Z