Database for kraken2 KpSC plasmid classifier

This dataset provides the plasmid-free chromosomal sequences and complete plasmid sequences used to construct a Kraken2 database for the Klebsiella pneumoniae species complex (KpSC). It contains 2,487 complete KpSC chromosomal sequences, representing all seven KpSC taxa, and 30,095 complete Enterobacteriaceae plasmid sequences. All sequences are provided in the kraken2-sequences.tar.gz file.

The original database was developed by Gomi et al. (2021) and has been expanded here to include more chromosomal and plasmid sequences, selected using similar inclusion criteria. Accessions and metadata for all sequences in this database are available in the kraken2-classifier-metadata.xlsx file. The database is designed for use with the kraken2 software developed by Wood et al. (2019); example code for building it is given in the build-kraken2-database.sh file.
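As a rough illustration of what building a custom database involves, the sketch below assembles the kraken2-build invocations that add FASTA files to a library and then build the database. It is not a reproduction of build-kraken2-database.sh: the helper name, file names, database directory and thread count are hypothetical, and it assumes the database directory already contains the taxonomy files and that each FASTA header carries a taxonomy ID that Kraken 2 can resolve.

```python
from typing import List


def kraken2_build_commands(
    fasta_files: List[str], db_dir: str, threads: int = 4
) -> List[List[str]]:
    """Return kraken2-build command lines for a custom database.

    Assumes db_dir already holds a taxonomy/ directory and that every
    FASTA header carries a taxonomy ID that Kraken 2 can resolve.
    """
    commands = []
    # Add each FASTA file to the database library.
    for fasta in fasta_files:
        commands.append(
            ["kraken2-build", "--add-to-library", fasta, "--db", db_dir]
        )
    # Build the database once all sequences have been added.
    commands.append(
        ["kraken2-build", "--build", "--db", db_dir, "--threads", str(threads)]
    )
    return commands


if __name__ == "__main__":
    # Hypothetical file and directory names, for illustration only.
    example = kraken2_build_commands(
        ["chromosomes.fasta", "plasmids.fasta"], "kpsc_plasmid_db"
    )
    for cmd in example:
        print(" ".join(cmd))
```

Each returned list can be passed to subprocess.run once kraken2 is installed; the build-kraken2-database.sh file remains the reference for the steps actually used for this database.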

References:

  • Gomi, R., Wyres, K.L., and Holt, K.E. (2021). Detection of plasmid contigs in draft genome assemblies using customized Kraken databases. Microb. Genom. 7. doi:10.1099/mgen.0.000550.
  • Wood, D.E., Lu, J., and Langmead, B. (2019). Improved metagenomic analysis with Kraken 2. Genome Biol. 20, 257.

