Commun_Biol_aphelid_datasets

2018-11-14T16:45:50Z (GMT) by Guifré Torruella
Transcriptome assembly
Metatranscriptome of Paraphelidium tribonemae (PRJNA402032) assembled with Trinity, cleaned of non-eukaryotic sequences using BlobTools with the RefSeq taxonomic affiliation, and cleaned of eukaryotic contamination with blastp against a custom database of fungi and stramenopiles. A total of 10,669 cleaned peptides were predicted with TransDecoder and annotated with eggNOG-mapper.

Version 1.5 contains 10,439 peptides, after removal of 230 peptides showing high identity (97% to 100%) to Tribonema gayanum RNA-seq data (unpublished).
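
The identity-based removal described above could be sketched as follows. This is only an illustration, not the procedure actually used: it assumes the comparison produced BLAST-style tab-separated hits with the query ID in column 1 and percent identity in column 3, and the function names, peptide IDs and sequences are hypothetical toy values.

```python
def ids_to_remove(hit_lines, min_identity=97.0):
    """Return query IDs having at least one hit at or above min_identity (percent)."""
    flagged = set()
    for line in hit_lines:
        # Skip blank lines and comment lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        # Assumed layout: column 1 = query ID, column 3 = percent identity.
        query_id, identity = fields[0], float(fields[2])
        if identity >= min_identity:
            flagged.add(query_id)
    return flagged


def filter_peptides(peptides, flagged):
    """Keep only peptides whose ID was not flagged as likely host-derived."""
    return {pid: seq for pid, seq in peptides.items() if pid not in flagged}


# Toy data (hypothetical IDs and sequences, for illustration only).
peptides = {"pep1": "MKT", "pep2": "MLV", "pep3": "MAA"}
hits = [
    "pep1\ttrib_001\t99.2",
    "pep2\ttrib_002\t85.0",
    "pep3\ttrib_003\t97.0",
]

flagged = ids_to_remove(hits)
kept = filter_peptides(peptides, flagged)
print(sorted(flagged), sorted(kept))
```

With the counts given above, the same logic takes 10,669 peptides down to 10,439 once the 230 flagged peptides are dropped.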

Phylogenomics
Three protein datasets were used to infer the position of Paraphelidium tribonemae. All were previously used for phylogenomic analyses of Opisthokonta and, more specifically, Microsporidia. They are:
- SCPD: 93 single-copy protein domains (SCPD) from Torruella et al. 2015 Curr Biol.
- BMC: 53 proteins updated from Capella-Gutiérrez et al. 2012 BMC Biol.
- GBE: 259 proteins updated from Mikhailov et al. 2017 Genome Biol Evol.

Here you can find the concatenated supermatrices 49sp and 36sp, with and without long-branch microsporidians, respectively.
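
As a rough illustration of how a concatenated supermatrix relates to its per-marker alignments, and how a reduced taxon set can be derived from a full one, here is a minimal sketch. It is not the pipeline used to build these files: the function names, taxon labels and sequences are hypothetical, and it assumes each marker alignment is a mapping from taxon name to an aligned sequence of equal length, with absent taxa padded by gaps.

```python
def concatenate(alignments, taxa):
    """Join per-marker alignments into one supermatrix, gap-padding absent taxa."""
    parts = {taxon: [] for taxon in taxa}
    for alignment in alignments:
        lengths = {len(seq) for seq in alignment.values()}
        if len(lengths) != 1:
            raise ValueError("sequences in one alignment must have equal length")
        length = lengths.pop()
        for taxon in taxa:
            # A taxon missing from this marker contributes only gaps.
            parts[taxon].append(alignment.get(taxon, "-" * length))
    return {taxon: "".join(chunks) for taxon, chunks in parts.items()}


def drop_taxa(supermatrix, excluded):
    """Return a copy of the supermatrix without the excluded taxa."""
    return {taxon: seq for taxon, seq in supermatrix.items() if taxon not in excluded}


# Toy data (hypothetical taxon labels and sequences).
marker_a = {"Paraphelidium": "MKT", "Microsporidian_A": "MRT", "Fungus_B": "MKS"}
marker_b = {"Paraphelidium": "LLV", "Fungus_B": "LIV"}
taxa = ["Paraphelidium", "Microsporidian_A", "Fungus_B"]

full = concatenate([marker_a, marker_b], taxa)
reduced = drop_taxa(full, {"Microsporidian_A"})
print(full)
print(sorted(reduced))
```

Dropping taxa from an existing matrix, as done here, leaves the remaining rows unchanged. Whether the 36sp matrix was derived this way or rebuilt from realigned markers is not stated in this record.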