The main 7z archive [7z.org/Wikipedia] includes all data used (non-curated gene bank harvest, curated alignment) and the analysis files in standard phylogenetic data formats (FASTA, NEXUS, NEWICK, Splits-NEXUS). For labelling conventions and archive content, see ReadMe.txt.
New in Version 2: Fully annotated genotype spreadsheet, CoV2Genotyping (XLSX), of the CoV-2 subsample included in our original harvest, tabulating particular and general mutation patterns. The main archive has been updated.
New in Version 3: Archive Hack-and-Fish.7z including analysis files for the experiment described in this post:
MLTreeStrictGrCons—Maximum likelihood tree based on strict group consensus sequences and branch support established via non-parametric bootstrapping
NNetCPlusRecomb—Uncorrected p-distance (Hamming) based planar phylogenetic network based on the (strict) group consensus data. Coloured lines refer to shared sequence patterns as visible from the alignment (likely recombination events)
NNetPlusSupport—Uncorrected p-distance (Hamming) based planar phylogenetic networks based on the non-consensed (original) data (in total, 291 near-complete virus genomes) used to define the major groups for the consensing approach. Bottom right: bootstrap consensus network for the same data.
MutationPatterns1, ...2—Visualisation of mutation patterns that are either the consequence of homoplasy, i.e. convergent mutation from C to U in independent CoV-2 sublineages, or recombination. See 2nd post (Grimm, 2020a) for further details.
HnS.All.sumCNet, HnS.All.sumBSNet—Consensus networks of nine bit-wise ML trees (...CNet, strict) and of the corresponding pooled ML bootstrap pseudoreplicate trees (...BSNet, only splits with a frequency ≥ 20%). See 5th post (Grimm, 2020b) for explanations.
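For readers new to the distance measure named in the network files above, here is a minimal Python sketch of an uncorrected p-distance (Hamming distance scaled by the number of compared sites) between two aligned sequences. It is an illustration only, not the procedure used to generate the files in the archive: the function name p_distance and the choice to skip columns containing a gap, "?" or "N" are assumptions, and network software may treat gaps and ambiguity codes differently.

```python
def p_distance(seq_a, seq_b, skip_chars="-?N"):
    """Uncorrected p-distance between two aligned sequences.

    Returns the proportion of compared alignment columns at which the two
    sequences differ. Columns where either sequence has a character listed
    in skip_chars are ignored (an assumption made for this sketch).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")

    compared = 0   # columns that enter the comparison
    differing = 0  # compared columns with different states

    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in skip_chars or b in skip_chars:
            continue
        compared += 1
        if a != b:
            differing += 1

    if compared == 0:
        raise ValueError("no comparable columns between the two sequences")
    return differing / compared


if __name__ == "__main__":
    # Toy sequences, not taken from the archive.
    print(p_distance("ACGUACGU", "ACGUACGC"))
```

Applied to every pair of sequences in an alignment, such a function yields the pairwise distance matrix that a distance-based network method takes as input.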
Referenced and further related posts are linked below.