figshare
8 files

Datasets, matrices and supplementary figure for QMaker

Version 12 2020-12-24, 09:06
Version 11 2020-11-01, 16:04
Version 10 2020-11-01, 16:01
Version 9 2020-10-09, 04:41
Version 8 2020-07-17, 05:51
Version 7 2020-01-31, 07:55
Version 6 2020-01-31, 05:37
Version 5 2020-01-31, 02:22
Version 4 2020-01-31, 02:11
Version 3 2020-01-31, 02:07
Version 2 2020-01-31, 02:06
Version 1 2019-09-04, 16:40
dataset
posted on 2020-12-24, 09:06, authored by Cuong Dang
Pfam_datasets.zip: Pfam training and test sets which were used with QMaker.

05_clades.zip: Training and test sets of five clades which were used with QMaker.

Matrices-normalized.zip: Eight output matrices for LG, Pfam, Pfam-gb, Bird, Insect, Mammal, Plant, and Yeast datasets.

Fig_A1.tiff: Performance of four matrices (Q.pfam, JTT, LG, and WAG) on the Pfam, Bird, Plant, Insect, Yeast, and Mammal datasets.

Fig_A2.tiff: Bubble plot showing the relative differences between amino acid exchangeability rates in Q.pfam and Q.yeast. The plot is read in the same way as Figure 4.

Table S1 (Correlations).docx: Correlation values (1000x) between six new matrices and 20 existing matrices. The upper half gives correlations of frequencies; the lower half gives correlations of exchangeabilities.
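
To illustrate what the two halves of such a table contain, here is a minimal Python sketch of how a frequency correlation and an exchangeability correlation between two matrices could be computed. It assumes each matrix is available as plain text in the common PAML-style layout (190 lower-triangle exchangeabilities followed by 20 amino acid frequencies, with no comments); that layout, the use of Pearson correlation, the reading of "1000x" as a scaling factor, and the function names `parse_paml`, `pearson`, and `correlations_x1000` are assumptions for illustration, not the authors' documented procedure.

```python
import math


def parse_paml(text):
    """Split PAML-style text into (exchangeabilities, frequencies).

    Assumption: the text holds only numbers, 190 lower-triangle
    exchangeabilities first and 20 frequencies after them.
    """
    nums = [float(tok) for tok in text.split()]
    if len(nums) < 210:
        raise ValueError("expected at least 210 numbers, got %d" % len(nums))
    return nums[:190], nums[190:210]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def correlations_x1000(text_a, text_b):
    """Return (frequency correlation, exchangeability correlation),
    each multiplied by 1000 and rounded to an integer."""
    exch_a, freq_a = parse_paml(text_a)
    exch_b, freq_b = parse_paml(text_b)
    return (round(1000 * pearson(freq_a, freq_b)),
            round(1000 * pearson(exch_a, exch_b)))
```

A call such as `correlations_x1000(open("a.paml").read(), open("b.paml").read())` would return the pair of values for two files; the file names here are placeholders. Constant inputs, which have zero variance, are not handled.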

sample_training_10alignments.zip,
sample_training_10genes.zip: Two small datasets with training scripts. One has 10 alignments extracted from the Pfam dataset; these alignments do not share species and should be trained with option -S. The other has 10 genes that share the same species, extracted from the Plant dataset, and should be trained with option -p. Each dataset needs about 30 minutes of training time on a 10-core machine.
