Poster_Ponsero_Draft2_BH.pdf (323.66 kB)

Poster_Ponsero_ViralMetagenomics.pdf

Version 2 2020-04-14, 21:07
Version 1 2020-04-14, 21:06
poster
posted on 2020-04-14, 21:07, authored by Alise Ponsero
Modern ‘omics methods allow in situ exploration of the relationship between phages and their hosts and provide new insights into the impact of viral populations on numerous biological systems. Retrieving viral sequences from bacterial metagenomes is critical to understanding host-virus interactions in a given ecosystem. Homology to known viral genes is a primary method for retrieving viral contigs from complex metagenomes. However, this approach limits the discovery of novel viral sequences with no similarity to previously known viruses. Recently, VirFinder, a novel machine learning tool for detecting viral sequences in bacterial metagenomes, was released [Ren et al. 2017]. This method distinguishes viral from bacterial sequences based on their k-mer signatures rather than through homology-based searches against viral genes. However, because VirFinder relies on a model trained on viral and bacterial genomes from the RefSeq database, the tool is biased toward detecting the viral groups that are most abundant in reference databases [Ren et al. 2017].
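To make the notion of a k-mer signature concrete, the following minimal Python sketch computes the normalized frequency of every k-mer in a DNA sequence. It is an illustration only and is not taken from VirFinder or from the poster: the function name `kmer_profile`, the default `k=4`, and the choice to skip windows containing ambiguous bases are all assumptions made for this example.

```python
from collections import Counter
from itertools import product

ALPHABET = "ACGT"


def kmer_profile(sequence, k=4):
    """Return a dict mapping every possible k-mer to its relative frequency.

    Hypothetical helper for illustration; windows containing characters
    other than A, C, G, T (for example N) are skipped.
    """
    sequence = sequence.upper()
    counts = Counter()
    for i in range(len(sequence) - k + 1):
        kmer = sequence[i:i + k]
        if all(base in ALPHABET for base in kmer):
            counts[kmer] += 1
    total = sum(counts.values())
    all_kmers = ("".join(p) for p in product(ALPHABET, repeat=k))
    # Every possible k-mer gets an entry, so profiles from different
    # sequences share the same keys and can be compared directly.
    return {m: (counts[m] / total if total else 0.0) for m in all_kmers}


profile = kmer_profile("ACGTACGT", k=2)
```

A profile of this kind is a fixed-length numeric description of a contig, which is what allows a classifier to compare sequences without any alignment to known genes.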

Viromes represent a large collection of viral sequences that are unbiased by cultivation methods and cover a wide variety of ecosystems. These sequences are a vast and valuable source of information about viral k-mer signatures. We present a scalable computational framework to train machine learning models directly on curated, ecosystem-specific metagenomic contigs from aquatic ecosystems. This novel tool aims to ensure reliable viral sequence detection even in less-studied ecosystems, where fewer phages have been isolated and sequenced, and to provide the user with ecosystem-specific predictions. Finally, our approach takes into account the possibility of eukaryotic contamination, which is an important consideration in various environments.
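The abstract does not specify which model the framework trains, so the sketch below substitutes a deliberately simple technique, a nearest-centroid classifier over k-mer frequency vectors, to show the general shape of training on labelled contigs with a separate eukaryotic class. All function names, the value `k=3`, and the training strings are hypothetical; the strings are invented toy sequences with no biological meaning.

```python
import math
from collections import Counter
from itertools import product

ALPHABET = "ACGT"


def kmer_vector(sequence, k=3):
    """Relative frequency of each k-mer, in a fixed lexicographic order."""
    sequence = sequence.upper()
    kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    total = sum(counts[m] for m in kmers)
    return [counts[m] / total if total else 0.0 for m in kmers]


def train_centroids(labelled_contigs, k=3):
    """Average the k-mer vectors of each class: {label: mean vector}."""
    sums, n = {}, Counter()
    for seq, label in labelled_contigs:
        vec = kmer_vector(seq, k)
        previous = sums.get(label, [0.0] * len(vec))
        sums[label] = [a + b for a, b in zip(previous, vec)]
        n[label] += 1
    return {label: [v / n[label] for v in s] for label, s in sums.items()}


def predict(sequence, centroids, k=3):
    """Return the label whose centroid is closest in Euclidean distance."""
    vec = kmer_vector(sequence, k)
    return min(centroids, key=lambda label: math.dist(vec, centroids[label]))


# Invented toy sequences, used only to exercise the code path.
toy_training_set = [
    ("ATATATATATAT", "viral"),
    ("GCGCGCGCGCGC", "bacterial"),
    ("AAAAAATTTTTT", "eukaryotic"),
]
centroids = train_centroids(toy_training_set)
label = predict("ATATATAT", centroids)
```

In an ecosystem-specific setting, one set of centroids (or any other model) would be fitted per ecosystem from that ecosystem's curated contigs; a real framework would also need held-out evaluation and a far larger training set than this toy example.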
