Hufford et al. 2012
Files and data from
Hufford, M.B.*, X. Xun*, J. van Heerwaarden*, T. Pyhäjärvi*, J-M. Chia, R.A. Cartwright, R.J. Elshire, J.C. Glaubitz, K.E. Guill, S. Kaeppler, J. Lai, P.L. Morrell, L.M. Shannon, C. Song, N.M. Springer, R.A. Swanson-Wagner, P. Tiffin, J. Wang, G. Zhang, J. Doebley, M.D. McMullen, D. Ware, E.S. Buckler, S. Yang, J. Ross-Ibarra. 2012. Comparative population genomics of maize domestication and improvement. Nature Genetics 44:808-811
55K Data
55K SNP data in Hapmap_55K.zip
Recombination
Files (rho.LR, rho.teo, rho.mz) are estimates of rho using Hudson's maxhap for teosinte, landraces, and maize from Hufford et al. Nat. Gen. 2012 (though these data were not published with the paper). The files are a bit redundant, but each line looks like:
1 100 141 100 14 0.000098
The fields are chromosome, window, and number of SNPs, repeated twice, followed by the MLE of rho. So the line above would be the 100th 10kb window on chromosome 1 (on reference genome AGPv1), with rho = 0.000098. I would be hesitant to trust the values from windows with low S, i.e. few SNPs (definitely <10, and probably <20 or even <30), as those probably reflect noise more than anything else.
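A minimal Python sketch of reading these lines and dropping low-S windows may help. It assumes whitespace-separated fields and relies only on the first three fields (chromosome, window, number of SNPs) and the last field (rho), so the redundant middle fields are ignored. The names `RhoWindow`, `parse_rho_line`, and `filter_by_snps`, and the default cutoff of 30 SNPs, are illustrative choices and not part of the original files or the paper.

```python
from typing import Iterable, List, NamedTuple


class RhoWindow(NamedTuple):
    chrom: str    # chromosome label (first field)
    window: int   # index of the 10kb window on that chromosome (second field)
    n_snps: int   # number of SNPs in the window (third field)
    rho: float    # MLE of rho (last field)


def parse_rho_line(line: str) -> RhoWindow:
    """Parse one line of a rho file, assuming whitespace-separated fields."""
    fields = line.split()
    if len(fields) < 4:
        raise ValueError(f"expected at least 4 fields, got {len(fields)}: {line!r}")
    # Only the first three fields and the last one are used; the
    # repeated fields in between are skipped.
    return RhoWindow(
        chrom=fields[0],
        window=int(fields[1]),
        n_snps=int(fields[2]),
        rho=float(fields[-1]),
    )


def filter_by_snps(lines: Iterable[str], min_snps: int = 30) -> List[RhoWindow]:
    """Keep only windows with at least min_snps SNPs (cutoff is an assumption)."""
    windows = (parse_rho_line(line) for line in lines if line.strip())
    return [w for w in windows if w.n_snps >= min_snps]


# The example line from above, plus a made-up low-S line for illustration.
example_lines = [
    "1 100 141 100 14 0.000098",
    "1 101 5 101 5 0.5",
]
kept = filter_by_snps(example_lines)
```

With the two lines above, only the first window survives the filter. A stricter or looser cutoff can be passed as `min_snps`.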
Popgen Stats
Summary statistics for 10kb windows genome-wide and for genes in the maize v2 filtered gene set.
There is one gene-level file each for teosinte (teo), landraces (LR), and maize, as above, plus a single file of 10kb windows covering all 3 taxa.
See details in the paper for criteria for calling SNPs, data used for statistics, etc.
Columns are:
locus: GRM name of gene in the filtered gene set
S: number of Segregating sites
ThetaW: Watterson's estimate of theta (per locus)
ThetaPi: nucleotide diversity (per locus)
ThetaH: Fay and Wu (2000) estimator (per locus)
TajD: Tajima's D
seqbp: number of bp sequenced. This should be used as the denominator to calculate per-bp values of the above statistics.
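As a sketch of the per-bp calculation, the Python snippet below divides the per-locus theta estimators by seqbp. It assumes each record has already been read into a mapping keyed by the column names above (the file delimiter is not specified here, so parsing is left out). The function name `per_bp`, the `_per_bp` suffix, and the handling of missing or zero values are illustrative assumptions. TajD is left alone because Tajima's D is already a standardized statistic.

```python
from typing import Dict, Mapping, Optional

# Per-locus estimators that get rescaled by seqbp in this sketch.
THETA_COLUMNS = ("ThetaW", "ThetaPi", "ThetaH")


def _to_float(value: str) -> Optional[float]:
    """Convert a field to float; return None for non-numeric entries such as 'NA'."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return None


def per_bp(row: Mapping[str, str]) -> Dict[str, Optional[float]]:
    """Return per-bp theta values for one locus, using seqbp as the denominator."""
    seqbp = _to_float(row["seqbp"])
    result: Dict[str, Optional[float]] = {}
    for col in THETA_COLUMNS:
        value = _to_float(row[col])
        if value is None or seqbp is None or seqbp <= 0:
            # No usable numerator or denominator: report a missing value.
            result[col + "_per_bp"] = None
        else:
            result[col + "_per_bp"] = value / seqbp
    return result


# Hypothetical record; the locus name and numbers are invented for illustration.
example_row = {
    "locus": "geneA", "S": "5", "ThetaW": "2.0", "ThetaPi": "1.0",
    "ThetaH": "0.5", "TajD": "-1.2", "seqbp": "1000",
}
example_per_bp = per_bp(example_row)
```

Loci with seqbp of zero or with missing values come back as `None` and are not dropped, so they can be filtered explicitly downstream.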
src_NA
The modified version of the XP-CLR code used in the paper.