Grayling draft genome dataset
Dataset posted on 22.06.2017 by Srinidhi Varadharajan, Lex Nederbragt
Tthymallus_scaffolds.fasta - Assembled scaffold sequences
Tthymallus_RepeatLibrary_deNovo.fasta - De novo repeat library sequences
Tthymallus_transcriptome_DeNovo.fasta - De novo transcriptome assembly using Trinity (followed by RSEM-based filtering)
Tthymallus_transcriptome_ReferenceBased.fasta - Reference-based transcriptome using the STAR-Cufflinks-TransDecoder pipeline (followed by filtering based on homology to known zebrafish and stickleback proteins)
Tthymallus_ScaffoldAnnotation.gff3 - MAKER pipeline-based annotation of the scaffolds
Tthymallus_proteins.fasta - Grayling protein sequences used for inferring orthologous groups (based on MAKER annotations)
Tthymallus_maker_fullOutput.gff - Full output from MAKER
Tthymallus_CPMcounts.txt - Expression counts for grayling
OrthologousGroups.txt - Inferred orthologous groups using Orthofinder
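
As a quick sanity check after downloading, the FASTA files listed above can be summarized with a few lines of Python. The following is a minimal sketch, not part of the dataset: the helper names `read_fasta` and `summarize` are hypothetical, and it assumes the files are uncompressed plain-text FASTA. It runs on a small in-memory toy record so that it works without the data.

```python
import io
from typing import Dict, Iterable, Iterator, List, Optional, Tuple


def read_fasta(handle: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (identifier, sequence) pairs from an open FASTA text handle."""
    name: Optional[str] = None
    chunks: List[str] = []
    for raw in handle:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            # Keep only the first whitespace-delimited token as the identifier.
            tokens = line[1:].split()
            name = tokens[0] if tokens else ""
            chunks = []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def summarize(records: Iterable[Tuple[str, str]]) -> Dict[str, int]:
    """Count sequences and report total and maximum sequence length."""
    lengths = [len(seq) for _, seq in records]
    return {
        "sequences": len(lengths),
        "total_length": sum(lengths),
        "longest": max(lengths, default=0),
    }


# Toy input standing in for a real file (assumption: same plain FASTA layout).
example = io.StringIO(">scaffold_1 toy record\nACGT\nAC\n>scaffold_2\nGGG\n")
print(summarize(read_fasta(example)))
```

To apply the same sketch to a downloaded file, open it in text mode, for example `open("Tthymallus_scaffolds.fasta")`, and pass the handle to `read_fasta` in place of the toy input.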