gff3toembl-1.1.0.tar.gz (3.64 MB)

GFF3toEMBL: Preparing annotated assemblies for submission to EMBL

software
posted on 2016-10-06, 06:49, authored by Andrew Page, Sascha Steinbiss, Ben Taylor, Torsten Seemann, Jacqueline A. Keane
GFF3toEMBL has been published in JOSS and this is the version of the code used for the paper. 

An essential part of open, reproducible research in genomics is the deposition of annotated de novo assembled genomes in public archives such as EMBL/GenBank. The interfaces provided by the major archives do not allow data to be submitted easily on a large scale without substantial prior knowledge on the part of the submitter. This has led to a situation where fewer than 15% of all sequenced bacteria have corresponding public assemblies. We address this by providing GFF3toEMBL, which converts the output of the most commonly used automatic annotation tool, Prokka, to a format suitable for submission to EMBL. Built on the GenomeTools annotation processing library, GFF3toEMBL is robust, fast, memory efficient and well tested, and has been used to submit more than 30% of all annotated genomes in EMBL/GenBank. It is a small but essential missing step in making genomic research more open and reproducible.
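To give a feel for the kind of conversion described above, here is a minimal Python sketch that turns a single GFF3 feature line into EMBL feature-table lines. It is an illustration only and not the GFF3toEMBL implementation, which is built on GenomeTools and handles far more (headers, sequence, qualifier validation). The function name `gff3_line_to_embl` and the sample line are hypothetical, and the sketch assumes a simple single-interval feature with `key=value` attributes.

```python
from urllib.parse import unquote


def gff3_line_to_embl(line):
    """Convert one GFF3 feature line into EMBL feature-table lines.

    Illustrative sketch only: handles a single-interval feature and copies
    every attribute except ID/Parent through as a qualifier.
    """
    cols = line.rstrip("\n").split("\t")
    if len(cols) != 9:
        raise ValueError("a GFF3 feature line has exactly 9 tab-separated columns")
    _seqid, _source, ftype, start, end, _score, strand, _phase, attrs = cols

    # EMBL writes reverse-strand features as complement(start..end).
    location = f"{start}..{end}"
    if strand == "-":
        location = f"complement({location})"

    # In the EMBL feature table the feature key starts in column 6
    # and the location (or qualifier) starts in column 22.
    out = ["FT   " + ftype.ljust(16) + location]
    for pair in attrs.split(";"):
        if not pair:
            continue
        key, _, value = pair.partition("=")
        if key in ("ID", "Parent"):
            continue  # GFF3-internal identifiers, not EMBL qualifiers
        # GFF3 percent-encodes reserved characters in attribute values.
        out.append("FT" + " " * 19 + f'/{key}="{unquote(value)}"')
    return out


# Hypothetical Prokka-style feature line, for demonstration only.
example = (
    "contig1\tProkka\tCDS\t10\t99\t.\t-\t0\t"
    "ID=abc_1;locus_tag=abc_1;product=hypothetical%20protein"
)
print("\n".join(gff3_line_to_embl(example)))
```

A real submission also needs the EMBL header lines (project accession, organism, taxon ID and so on), which this sketch deliberately leaves out.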

Funding

This work was supported by the Wellcome Trust (grant WT 098051).
