This archive contains all datasets used for the assemblies, annotations, differential expression analysis, enrichment analysis, and identification of variant surface proteins (VSPs).
The Assemblies directory contains: the raw Trinity assemblies (genome-guided: Trinity_genome_guided_assembly.fasta, de novo: Trinity_de_novo_assembly.fasta); the assemblies cleaned based on taxonomy (genome-guided: Trinity_genome_guided_assembly_cleaned.fasta, de novo: Trinity_de_novo_assembly_cleaned.fasta); a file listing the original and simplified headers of the genome-guided assembly (Trinity_genome_guided_assembly_headers.map); the predicted proteins for the genome-guided assembly (Trinity_genome_guided_assembly_transdecoder.pep); and an Excel spreadsheet with the combined annotations for the cleaned genome-guided assembly (Genome_guided_Trinity_assembly_annotations_summary.xlsx).
The cleaned_reads directory contains the reads used for the genome-guided and de novo assemblies.
The DE directory contains: the count matrix computed using featureCounts of the Subread R package (featureCounts_counts_matrix_for_DE_analysis.txt) and used for the differential expression analysis; a table with the differentially expressed genes, their fold-change values, their annotations with assigned categories, and an indication of whether each gene is specific to S. molnari (DEGs_annotations_categories.xlsx); and a table (pathogenicity_related_transcripts.pdf) and the corresponding FASTA file (pathogenicity_related_transcripts.fas) listing the genes classified as pathogenicity-related.
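The count matrix can be read without special tooling. The sketch below is only an illustration and assumes the file follows the standard featureCounts layout: an optional comment line starting with "#", a tab-separated header, the gene identifier in the first column, the annotation columns Chr, Start, End, Strand and Length, and one integer column per sample. The deposited file may have been trimmed or renamed, so check its header first. The function name read_counts and the toy data are hypothetical.

```python
import csv
import io
from typing import Dict, List, TextIO, Tuple

# Annotation columns of standard featureCounts output (assumed layout);
# every other column after the first is treated as a sample.
ANNOTATION_COLUMNS = {"Geneid", "Chr", "Start", "End", "Strand", "Length"}


def read_counts(handle: TextIO) -> Tuple[List[str], Dict[str, List[int]]]:
    """Return (sample names, {gene id: counts per sample})."""
    # Skip the leading "#" comment line that featureCounts writes.
    data_lines = (line for line in handle if not line.startswith("#"))
    reader = csv.reader(data_lines, delimiter="\t")
    header = next(reader)
    # Column 0 is taken to be the gene identifier.
    sample_idx = [
        i for i, name in enumerate(header)
        if i > 0 and name not in ANNOTATION_COLUMNS
    ]
    samples = [header[i] for i in sample_idx]
    counts: Dict[str, List[int]] = {}
    for row in reader:
        if not row:
            continue
        counts[row[0]] = [int(row[i]) for i in sample_idx]
    return samples, counts


# Toy input standing in for the real file; all values are invented.
toy = io.StringIO(
    "# Program:featureCounts (toy example)\n"
    "Geneid\tChr\tStart\tEnd\tStrand\tLength\tsample_A\tsample_B\n"
    "gene_1\tchr1\t1\t100\t+\t100\t12\t0\n"
    "gene_2\tchr1\t200\t400\t-\t201\t3\t45\n"
)
samples, counts = read_counts(toy)
# Per-sample library size: the column sums of the matrix.
totals = [sum(col) for col in zip(*counts.values())]
print(samples, totals)
```

For the real data, pass an open file handle for featureCounts_counts_matrix_for_DE_analysis.txt instead of the toy StringIO object.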
The S_molnari_unique_genes directory contains an Excel spreadsheet (S_molnari_unique_genes.xlsx) with the identifiers and sequences of the genes identified as unique to S. molnari.
The VSPs directory contains: the initial FASTA file with the selected putative VSP protein sequences (VSPs.faa), the alignment computed using mafft-linsi (VSPs.aln), and the manually trimmed alignment showing the transmembrane domain and the N-terminus motif (VSPs.trim).
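The FASTA files in this archive (for example VSPs.faa) can be parsed with a few lines of code. The sketch below is a minimal reader using only the Python standard library; the function name read_fasta and the toy records are hypothetical and do not come from the dataset. It assumes plain FASTA input, where each record starts with a ">" header line, and it makes no assumption about the format of VSPs.aln or VSPs.trim.

```python
from typing import Dict, Iterable, Optional


def read_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Parse FASTA-formatted lines into {identifier: sequence}."""
    records: Dict[str, str] = {}
    current_id: Optional[str] = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            # The identifier is the first whitespace-delimited token
            # after ">"; the rest of the header is ignored here.
            tokens = line[1:].split()
            current_id = tokens[0] if tokens else ""
            records[current_id] = ""
        elif current_id is not None:
            # Sequence lines may be wrapped, so concatenate them.
            records[current_id] += line
    return records


# Toy input standing in for VSPs.faa; names and sequences are invented.
example = """>vsp_example_1 hypothetical protein
MKTLLA
CCGG
>vsp_example_2
MSSC
""".splitlines()

records = read_fasta(example)
for name, seq in records.items():
    print(name, len(seq))
```

To read the deposited file, open it and pass the file handle to read_fasta in place of the toy list of lines.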
Funding
Czech Science Foundation number 20-30321Y
Czech Science Foundation number 19-25536Y
Czech Science Foundation number 19-28399X
Centre for Research of Pathogenicity and Virulence of Parasites number CZ.02.1.01/0.0/0.0/16_019/0000759