figshare
22 files

TGD POInT datasets and models

Version 5 2020-02-24, 19:05
Version 4 2020-02-20, 15:04
Version 3 2020-02-05, 19:13
Version 2 2020-02-05, 18:33
Version 1 2019-12-04, 16:56
dataset
posted on 2020-02-24, 19:05, authored by Gavin Conant
Datasets for analysis of the teleost-specific genome duplication (TGD) using POInT (Polyploidy Orthology Inference Tool; https://github.com/gconant0/POInT). Eight genome order files (*POInT_gene_orders.txt) are included, one for each of the eight genomes studied. An optimized ancestral genome order (TGD_OptOrder.txt), a WGD gene loss model (WGD_fix_biasconverg_singlerate_model.txt) and the assumed phylogeny (TGD_Topo.tre) are also included. All of these files are POInT inputs. POInT's optimal orthology inferences for each duplicated locus are also provided (Optimal_orthology_inferences.txt).
Additionally, I provide the conditional probabilities of all possible state transitions along each branch at every duplicated locus (TGD_WGDbcnbnf_OptOrder_condprobs.xlsx). I also provide a list of all inferred ohnolog pairs in zebrafish (AllDrerio_WGD_duplpairs_wtandem_merged.txt). This list includes a number of ohnolog pairs that were omitted from the main dataset because homologs of the pair were lacking in at least one other teleost genome. The corresponding list of single-copy genes is AllDrerio_WGD_singls_WTANDEMS.txt.
All four supplemental figures from the manuscript (Supplemental Figures 1-4) are included as fully scalable PDF files. Finally, the underlying data for Figures 2-4 are provided as MS Excel files. (Figure 1 can be constructed from the posterior probability file provided.)
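For readers sorting the downloaded files, the following is a minimal Python sketch that maps each file name mentioned in the description to its stated role. The file names and patterns come from the description above; the group labels, the `classify` helper and the example file name `Example_POInT_gene_orders.txt` are assumptions made for illustration and are not part of the dataset or of POInT.

```python
from fnmatch import fnmatchcase

# (pattern, group, role). Patterns and roles follow the dataset description;
# the group labels are this sketch's own shorthand.
FILE_ROLES = [
    ("*POInT_gene_orders.txt", "POInT input", "genome order file"),
    ("TGD_OptOrder.txt", "POInT input", "optimized ancestral genome order"),
    ("WGD_fix_biasconverg_singlerate_model.txt", "POInT input", "WGD gene loss model"),
    ("TGD_Topo.tre", "POInT input", "assumed phylogeny"),
    ("Optimal_orthology_inferences.txt", "POInT inference",
     "optimal orthology inferences for each duplicated locus"),
    ("TGD_WGDbcnbnf_OptOrder_condprobs.xlsx", "POInT inference",
     "conditional probabilities of state transitions"),
    ("AllDrerio_WGD_duplpairs_wtandem_merged.txt", "zebrafish gene list",
     "inferred ohnolog pairs"),
    ("AllDrerio_WGD_singls_WTANDEMS.txt", "zebrafish gene list",
     "single-copy genes"),
]


def classify(filename):
    """Return (group, role) for a file name, matching case-sensitively."""
    for pattern, group, role in FILE_ROLES:
        if fnmatchcase(filename, pattern):
            return group, role
    # Figures and per-figure Excel files are not named individually
    # in the description, so they fall through to this default.
    return "unclassified", "not named in the description"


if __name__ == "__main__":
    # "Example_POInT_gene_orders.txt" is a hypothetical name used only
    # to exercise the wildcard pattern.
    for name in ["Example_POInT_gene_orders.txt", "TGD_Topo.tre",
                 "AllDrerio_WGD_singls_WTANDEMS.txt"]:
        print(name, "->", classify(name))
```

Matching uses `fnmatchcase` so that results do not depend on the operating system's case rules.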

Funding

NSF-IOS-1339156

NSF-CCF-1421765

History