
Arctic char transcriptome annotations

dataset
posted on 2017-12-13, 15:54 authored by Jenni Prokkola

Two annotation tables for a liver transcriptome of Arctic char (Salvelinus alpinus) (Prokkola et al.). The de novo assembly is available at DDBJ/EMBL/GenBank under the accession GEKT00000000. Open reading frame (ORF) peptide sequences were obtained for transcripts in the final assembly using TransDecoder. The predicted ORFs were annotated against four databases using the Basic Local Alignment Search Tool for proteins (BLASTp v.2.2.31). Predicted zebrafish proteins (downloaded Oct 24 2015 from Ensembl) and salmon proteins (NCBI Salmo salar Annotation Release 100) were searched using a reciprocal best hits approach with an e-value cutoff of 1 × 10⁻⁵. Additionally, the ORFs were annotated with the NCBI non-redundant protein database (downloaded Nov 25th 2015), with an e-value cutoff of 1 × 10⁻⁵ and a requirement that the query sequence match the target sequence over >50 % of the protein length, and with human peptides.

The first file ("Annotations_Salp...") contains results that were prioritized in the order zebrafish > salmon > NCBI nr. When available, gene descriptions were retrieved from Ensembl using biomaRt in R. Gene symbols were retrieved for zebrafish Ensembl IDs, salmon RefSeq IDs and NCBI gene names using the Biological DataBase Network (https://biodbnet-abcc.ncifcrf.gov). Annotations were retrieved for 9,491, 4,037 and 4,117 genes with zebrafish peptides, Atlantic salmon predicted peptides and the NCBI nr database, respectively. In total, 20,394 out of 44,784 ORFs in the assembly (45.5%) were annotated with 18,013 unique protein IDs.

The second table contains all gene symbols found for the above-mentioned fish or for human peptide sequences using BLASTp v.2.4.0 with an E-value threshold of 10⁻⁵. After identifying human orthologs, we supplemented the annotation with the previously obtained gene symbols for genes that were missing an annotation. In total, 18,232 genes were annotated using this approach with 9,577 unique gene symbols.
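The reciprocal best hits step and the zebrafish > salmon > NCBI nr prioritization described above can be illustrated with a minimal Python sketch. This is not the author's pipeline. The `Hit` record and the functions `best_hits`, `reciprocal_best_hits` and `prioritize` are hypothetical names introduced here, and ranking hits by bitscore is an assumption, since the description does not say how the best hit was chosen.

```python
from typing import Dict, Iterable, NamedTuple, Optional, Tuple


class Hit(NamedTuple):
    """One BLASTp alignment (hypothetical record; fields mirror tabular output)."""
    query: str
    subject: str
    evalue: float
    bitscore: float


def best_hits(hits: Iterable[Hit], max_evalue: float = 1e-5) -> Dict[str, Hit]:
    """Per query, keep the highest-bitscore hit passing the e-value cutoff.

    Assumption: "best" means highest bitscore; the source does not specify.
    """
    best: Dict[str, Hit] = {}
    for hit in hits:
        if hit.evalue > max_evalue:
            continue  # fails the 1e-5 cutoff
        current = best.get(hit.query)
        if current is None or hit.bitscore > current.bitscore:
            best[hit.query] = hit
    return best


def reciprocal_best_hits(
    forward: Iterable[Hit], reverse: Iterable[Hit], max_evalue: float = 1e-5
) -> Dict[str, str]:
    """Return {orf: protein} pairs that are each other's best hit."""
    fwd = best_hits(forward, max_evalue)  # ORF -> reference protein
    rev = best_hits(reverse, max_evalue)  # reference protein -> ORF
    return {
        orf: hit.subject
        for orf, hit in fwd.items()
        if hit.subject in rev and rev[hit.subject].subject == orf
    }


def prioritize(
    orf_id: str,
    zebrafish: Dict[str, str],
    salmon: Dict[str, str],
    nr: Dict[str, str],
) -> Optional[Tuple[str, str]]:
    """Pick one annotation per ORF in the order zebrafish > salmon > NCBI nr."""
    for source, table in (("zebrafish", zebrafish), ("salmon", salmon), ("nr", nr)):
        if orf_id in table:
            return source, table[orf_id]
    return None  # ORF stays unannotated


# Toy data (invented identifiers, not taken from the dataset).
forward = [
    Hit("orf1", "zf_A", 1e-30, 200.0),
    Hit("orf1", "zf_B", 1e-10, 90.0),
    Hit("orf2", "zf_B", 1e-3, 40.0),  # dropped: e-value above cutoff
]
reverse = [
    Hit("zf_A", "orf1", 1e-28, 195.0),
    Hit("zf_B", "orf1", 1e-9, 85.0),
]
zebrafish_rbh = reciprocal_best_hits(forward, reverse)
salmon_rbh = {"orf1": "ss_X", "orf3": "ss_Y"}
nr_hits = {"orf3": "nr_Z"}
```

With this toy input, `orf1` keeps its zebrafish match even though a salmon match exists, while `orf3`, which has no zebrafish match, falls back to the salmon annotation ahead of nr.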


Funding

Academy of Finland
