figshare

ActDES – a Curated Actinobacterial Database for Evolutionary Studies

dataset
posted on 2020-04-21, 20:33 authored by Jana Schniete, Nelly Selem, Anna Birke, Pablo Cruz-Morales, Iain S. Hunter, Francisco Barona-Gómez, Paul A Hoskisson

The Actinobacteria are a large, diverse phylum of bacteria, often with large genomes with a high G+C content. Sequence quality, equivalence of annotation and phylogenetic representation vary greatly across the sequence databases, which can make evolutionary and phylogenetic studies challenging. To address this, we have assembled a curated, high-level, taxon-specific, non-redundant database to aid detailed comparative analysis of Actinobacteria. ActDES constitutes a novel resource for the community of Actinobacterial researchers that will be useful primarily for two types of analyses: (i) comparative genomic studies, facilitated by reliable identification of orthologs across a set of defined, phylogenetically representative genomes, and (ii) phylogenomic studies, which will be improved by identification of gene subsets at a specified taxonomic level. These studies can then act as a springboard for studying the evolution of virulence genes and of metabolism, and for identifying metabolic engineering targets.


Data summary

All genome sequences used in this study can be found in the NCBI Taxonomy Browser (https://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi) and are summarised, along with accession numbers, in Table S1.


1. All other data is available on Figshare.


a. Perl script files


b. List of genomes from NCBI (Actinobacteria database.xlsx) Table S1


c. CVS genome annotation files including the FASTA files of nucleotide and amino acid sequences (612 individual .cvs files – folder cvs)


d. BLAST nucleotide database (.fasta file)


e. BLAST protein database (.fasta file)


f. Table S2 Expansion table genus level (Expansion table.xlsx Tab Genus level)


g. Table S2 Expansion table species level (Expansion table.xlsx Tab species level)


h. All data for GlcP and Glk – BLAST hits from the ActDES database, MUSCLE alignment files and .nwk tree files
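
The BLAST databases in items d and e are distributed as FASTA files. As a minimal sketch, assuming those files follow the standard FASTA layout (a header line starting with ">" followed by one or more sequence lines), the Python below reads records and reports basic counts as a quick sanity check after download. The function names and the example records are illustrative only and are not part of ActDES; the header format used in the actual files is not described here.

```python
from typing import Dict, Iterable, Iterator, List, Tuple


def parse_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header = None
    chunks: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            # Sequence lines may be wrapped; collect them and join later.
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def summarise(records: Iterable[Tuple[str, str]]) -> Dict[str, int]:
    """Return the record count, total residues and longest sequence length."""
    lengths = [len(seq) for _, seq in records]
    return {
        "count": len(lengths),
        "total_length": sum(lengths),
        "longest": max(lengths, default=0),
    }


# Made-up records for illustration; these are NOT sequences from ActDES.
example = """>hypothetical_protein_1
MKT
LLV
>hypothetical_protein_2
MA
""".splitlines()

records = list(parse_fasta(example))
summary = summarise(records)
print(summary)
```

To run the same check on a downloaded database, pass an open file handle to parse_fasta in place of the example lines; a file object iterates line by line, so the whole file is never held in memory at once.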

Funding

NERC (grant NE/M001415/1)

BBSRC (grants BB/N023544/1 and BB/T001038/1)

BBSRC/NPRONET (grant NPRONET POC045)

Royal Society Newton Advanced Fellowship (NAF\R2\18063)
