Data from: Towards an Effective Sample Size of tree topologies from Bayesian phylogenetic analyses.

Version 2 2016-06-16, 22:43

Version 1 2016-03-22, 04:49

dataset

posted on 2016-06-16, 22:43, authored by Robert Lanfear

This repository contains all of the scripts and data necessary to reproduce the analyses and figures in Lanfear, Hua, and Warren (2016): Assessing tree topologies from Bayesian phylogenetic analyses: autocorrelation plots and the approximate Effective Sample Size.

QuickStart

----------

1. Unzip the tree_ESS.zip file

2. Run the /R/analysis.r script (you will first need to change the setwd() call, the number of processors, and the various hard-coded directory paths in the script)

Note that this will run a lot of simulations and may take many hours.
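The two steps above can also be driven from an R session. The following is a minimal sketch and not part of the archive: `find_analysis_script` is a hypothetical helper, and `zip_path` and `project_dir` are assumed locations that you will likely need to change. It accepts either spelling of the script name (analysis.r or analyses.r), since both appear in this README.

```r
# Hypothetical helper (not part of tree_ESS.zip): locate the main analysis
# script anywhere under an unzipped copy of the archive.
find_analysis_script <- function(project_dir) {
  hits <- list.files(
    project_dir,
    pattern = "^analys[ei]s\\.r$",  # matches analysis.r or analyses.r
    recursive = TRUE,
    full.names = TRUE,
    ignore.case = TRUE
  )
  if (length(hits) == 0) {
    stop("No analysis script found under ", project_dir)
  }
  hits[1]
}

zip_path    <- "tree_ESS.zip"        # assumed location of the downloaded archive
project_dir <- "tree_ESS_unzipped"   # assumed directory to extract into

if (file.exists(zip_path)) {
  unzip(zip_path, exdir = project_dir)           # step 1: unzip the archive
  script <- find_analysis_script(project_dir)
  message("Edit the paths and processor count in: ", script)
  # source(script)  # step 2: uncomment once edited; this may take many hours
}
```

The `source()` call is left commented out on purpose, so that nothing long-running starts before the paths in the script have been edited.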

Contents of tree_ESS.zip

------------------------

# /R folder

This contains the scripts needed to run the simulations and data analyses, and so to reproduce our analyses in full.

The easiest way to run an R script is to open it in R and then click "Execute" on the "Edit" menu.

## functions.r

Contains functions used in analysis.r. You should not need to change anything in this file, and you do not need to run it (analysis.r just looks here for the functions).

## analyses.r

Performs the simulations of trees, calculates the ESS values and the data for the autocorrelation plots from the simulated and empirical datasets, and draws all the figures (which are saved to /Figures). Data from all of these analyses are written to the /output folder. Note that the /output folder already contains the data we generated from running this script. Your data may not be identical (because the simulations are stochastic), but they should be comparable.

You will need to set a couple of things at the top of this script, as well as changing some calls to specific directories within the script itself. Please check all 'read' and 'write' commands and edit directories appropriately for your machine.
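One way to make those directory edits less error-prone is to define a single base path and build every other path from it. The sketch below assumes only the folder names described in this README. `base_dir`, `n_cores`, and `project_paths` are hypothetical names for illustration, not variables taken from the script.

```r
# Hypothetical settings; none of these names come from the original script.
base_dir <- "~/tree_ESS"   # assumed location of the unzipped archive
n_cores  <- 2              # assumed number of processors to use

# Build the folder paths described in this README from one base directory.
project_paths <- function(base_dir) {
  folders <- c("R", "output", "empirical_datasets", "Figures")
  paths <- file.path(path.expand(base_dir), folders)
  names(paths) <- folders
  paths
}

paths <- project_paths(base_dir)

# Report any folder that is missing before starting a long run.
missing_dirs <- paths[!dir.exists(paths)]
if (length(missing_dirs) > 0) {
  message("Missing folders: ", paste(missing_dirs, collapse = ", "))
}

# Example of building a file path instead of hard-coding it.
fig5_file <- file.path(paths[["output"]], "fig5.csv")
```

Checking the folders up front means a wrong path shows up immediately rather than partway through the simulations.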

# /output folder

This contains .csv files with the output from the /R/analysis.r script. It also contains nexus files of the simulated datasets from analysis 1, but not from analysis 3, because there would be too many of them.

Of note in here are the fig5.csv and fig7.csv files, which contain the approximate and pseudo-ESS values calculated on all of the simulated datasets. These data form figures 5-7. I didn't store the simulated tree files, because they are big and numerous. Thus, you may not get identical results if you re-simulate the trees using the analyses.r script, but your results should certainly be qualitatively very similar.
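To inspect the stored ESS values without re-running the simulations, the .csv files can be read directly. This is a minimal sketch: the column names are not documented in this README, so the code only reports the structure of whatever it finds. `output_dir` is an assumed path and `read_ess_results` is a hypothetical helper.

```r
output_dir <- "output"   # assumed path to the /output folder

# Hypothetical helper: read one results file and return it as a data frame.
read_ess_results <- function(file_name, dir = output_dir) {
  path <- file.path(dir, file_name)
  if (!file.exists(path)) {
    stop("File not found: ", path)
  }
  read.csv(path, stringsAsFactors = FALSE)
}

for (f in c("fig5.csv", "fig7.csv")) {
  if (file.exists(file.path(output_dir, f))) {
    results <- read_ess_results(f)
    cat(f, ":", nrow(results), "rows,", ncol(results), "columns\n")
    str(results)   # shows the column names and types actually present
  }
}
```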

# /empirical_datasets folder

This contains the trees from the DataDryad repo here: http://datadryad.org/resource/doi:10.5061/dryad.r1hk5

Specifically, this folder contains a single folder (/Scantlebury_2013), in which there is a subfolder of tree files from BEAST. This subfolder can be obtained directly from dryad, by unzipping the file from this link: http://datadryad.org/bitstream/handle/10255/dryad.50848/rawtrees.zip?sequence=1.

The files in this folder are used in the analysis.r script.

# /Figures folder

This contains all of our edited figures for the manuscript. If you run the analysis.r script, it will recreate the figures in R and save them in this folder with '_raw' appended to the filename.