Supplementary data for Abdel-Glil M et al. (Scientific Reports 2021)
Comparative in silico genome analysis of Clostridium perfringens unravels stable phylogroups with different genome characteristics and pathogenic potential
File Descriptions:
Data set 1: Detailed reports on the de novo genome assembly for PacBio sequence data of 23 NCTC Clostridium perfringens strains.
Data set 2: The folder includes Prokka-annotated "*.gbk" files of the 206 analysed Clostridium perfringens strains (A), as well as the RAST annotation of the chromosome from 34 circularized genomes (B).
Data set 3: (A) Core genome alignment constructed for the 206 Clostridium perfringens genomes using Parsnp. (B) A list of the 63,036 core genome SNPs identified in the 206 strains with their distribution across and impact on the reference ATCC 13124 genome, including 61,236 nonsynonymous, 15,489 synonymous, 4,395 in other features and 16,500 SNPs in non-coding regions.
Data set 4: (A) Gene sequences comprising the pangenome of the 206 Clostridium perfringens strains as calculated with Roary (options -i 90 -s). (B) Gene presence-absence matrix of the 14,942 genes comprising the pangenome of the 206 Clostridium perfringens strains. The total number of genes predicted per strain is shown (Right). (C) The pangenome frequency in the investigated 206 strains. (D) A pie chart showing the pangenome compartments.
Data set 5: Gene sequences of virulence factors and (putative) iron uptake systems used for the BLAST analysis of the 206 Clostridium perfringens genomes.
Data set 6: (A) Visualisation of a large deletion of the colA gene in strain CBA7123 (bottom) compared to the colA gene in strain 13 (top). (C) FASTA alignment of the colA genes of 206 C. perfringens strains. The colA genes with a large deletion mutation were artificially fragmented in the alignment file.
Data set 7: (A) Visualisation of a large deletion of the mu toxin gene (nagH) as observed in phylogroup I strains. As examples, the nagH region of strain SM101 (A) and strain NCTC 8081 (B) were compared to that in strain 13 (top). Strain NCTC 10240 was the only strain in phylogroup I that had an intact nagH gene. (B) FASTA alignment of the nagH genes of 206 C. perfringens strains. The nagH genes with a large deletion mutation were artificially fragmented in the alignment file.
Data set 8: (A) Phylogenetic tree based on the deduced amino acid sequences of the pfoA gene and the pfoA-like sequence (pfoA variant) in strain JP838. Representative members of the cytolysin family were included. (B) Protein alignment of the pfoA gene (top) and the pfoA-like sequence (bottom) in strain JP838. (C) FASTA nucleotide alignment of the pfoA gene and the pfoA-like sequence in strain JP838.
Data set 9: (A) RAST annotation of the NCTC_8081 plasmid with the cpe gene, the tcp locus and the toxin homolog highlighted. (B) Phylogenetic tree based on the deduced amino acid sequences of representative members of the Leukocidin/Hemolysin superfamily and the homolog (locus tag 02938) of the Darmbrand NCTC 8081 strain.
Data set 10: Amino acid sequences of identified putative virulence genes in the 206 Clostridium perfringens strains.