figshare

Assemblies_and_data.zip

Download (4.28 GB)
dataset
posted on 2025-01-13, 11:49 authored by Monika Wisniewska, Jiří Kyslík, Gema Alama-Bermejo, Alena Lövy, Martin Kolisko, Astrid S. Holzer, Anush Kosakyan
<h3><b>All datasets used for the assemblies, annotations, differential expression analysis, enrichment analysis, and variant surface protein (VSP) identification.</b></h3><h3><br></h3>
<h4>The Assemblies directory contains: raw Trinity assemblies (genome-guided: Trinity_genome_guided_assembly.fasta, <i>de novo</i>: Trinity_de_novo_assembly.fasta), assemblies cleaned based on taxonomy (genome-guided: Trinity_genome_guided_assembly_cleaned.fasta, <i>de novo</i>: Trinity_de_novo_assembly_cleaned.fasta), a file listing the original and simplified headers of the genome-guided assembly (Trinity_genome_guided_assembly_headers.map), predicted proteins for the genome-guided assembly (Trinity_genome_guided_assembly_transdecoder.pep), and an Excel spreadsheet with the combined annotations for the cleaned genome-guided assembly (Genome_guided_Trinity_assembly_annotations_summary.xlsx).</h4>
<h4>The cleaned_reads directory contains: the reads used for the genome-guided and <i>de novo</i> assemblies.</h4>
<h4>The DE directory contains: the count matrix computed using featureCounts of the Subread R package (featureCounts_counts_matrix_for_DE_analysis.txt) and used for the differential expression analysis; a table of the differentially expressed genes with their fold-change values, annotations with assigned categories, and an indication of whether each gene is specific to <i>S. molnari</i> (DEGs_annotations_categories.xlsx); and a table (pathogenicity_related_transcripts.pdf) with the corresponding FASTA file (pathogenicity_related_transcripts.fas) of the genes classified as pathogenicity-related.</h4>
<h4>The S_molnari_unique_genes directory contains: an Excel spreadsheet (S_molnari_unique_genes.xlsx) with the identifiers and sequences of the genes identified as unique to <i>S. molnari</i>.</h4>
<h4>The VSPs directory contains: the initial FASTA file with the selected putative VSP protein sequences (VSPs.faa), the alignment computed using mafft-linsi (VSPs.aln), and the manually trimmed alignment showing the transmembrane domain and the N-terminus motif (VSPs.trim).</h4><p><br></p>
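The FASTA files listed above (for example VSPs.faa or pathogenicity_related_transcripts.fas) can be inspected without any bioinformatics toolkit. The following is a minimal sketch, not part of the deposited data or the authors' pipeline: the function names `read_fasta` and `length_summary` are hypothetical, and it assumes the files follow the plain FASTA convention of `>` header lines followed by sequence lines.

```python
from typing import Dict, Iterable


def read_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Parse FASTA-formatted lines into a {identifier: sequence} mapping.

    Hypothetical helper: the identifier is taken as the first
    whitespace-delimited token of each '>' header line.
    """
    records: Dict[str, str] = {}
    current_id = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            parts = line[1:].split()
            current_id = parts[0] if parts else ""
            records[current_id] = ""
        elif current_id is not None:
            # sequence lines are concatenated under the latest header
            records[current_id] += line
    return records


def length_summary(records: Dict[str, str]) -> Dict[str, int]:
    """Return the number of sequences and their shortest/longest length."""
    lengths = [len(seq) for seq in records.values()]
    return {
        "n": len(lengths),
        "min": min(lengths, default=0),
        "max": max(lengths, default=0),
    }
```

Because `read_fasta` accepts any iterable of lines, an open file handle can be passed directly, e.g. `with open("VSPs/VSPs.faa") as fh: records = read_fasta(fh)`. The path shown is an assumption about how the archive unpacks and may need adjusting.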

Funding

Czech Science Foundation number 20-30321Y

Czech Science Foundation number 19-25536Y

Czech Science Foundation number 19-28399X

Centre for Research of Pathogenicity and Virulence of Parasites number CZ.02.1.01/0.0/0.0/16_019/0000759
