A method to scan genomes for introgression in a secondary contact model

<p>This poster was presented at the 2014 meeting of the Society for Molecular Biology &amp; Evolution, in San Juan, PR.</p> <p><strong>Abstract:</strong> Secondary contact between divergent populations, or incipient species, may result in the exchange and introgression of genomic material. We develop a simple DNA sequence measure, called <em>G</em>min, that is designed to identify genomic regions experiencing introgression in a secondary contact model. <em>G</em>min is defined as the ratio of the minimum between-population number of nucleotide differences to the average number of between-population differences. The utility of <em>G</em>min is that it is computationally inexpensive relative to likelihood-based methods for detecting gene flow and it has better sensitivity and specificity than traditional measures, such as <em>F</em>st. We present extensive evaluations of the sensitivity and specificity of <em>G</em>min and suggest a simple statistical testing procedure for identifying genomic outliers. Finally, we present genome-level scans of <em>G</em>min for three different data sets. The first data set evaluates cosmopolitan and African populations of <em>Drosophila melanogaster</em>. In this case, <em>G</em>min identifies all of the regions of the X chromosome that were identified by a previous, more computationally intensive method, in addition to one region that was not previously identified. Second, we perform genome scans of <em>G</em>min for two species of the <em>Drosophila simulans</em> clade. In this case, we identify large tracts of introgression between species whose hybrid males are sterile. Lastly, we apply the <em>G</em>min statistic to an analysis of African and non-African populations of domesticated rice. Although these populations were previously shown to have a history of recent genetic exchange, scans of <em>G</em>min show no evidence of introgression of domestication genes between them.
<em>G</em>min is a biologically straightforward, yet powerful, alternative to <em>F</em>st, as well as to more computationally intensive model-based methods for detecting gene flow.</p>
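<p>The definition of <em>G</em>min given in the abstract (minimum between-population number of nucleotide differences divided by the average number of between-population differences) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names <code>hamming</code> and <code>gmin</code> are hypothetical, and the sketch assumes that each population is a list of aligned, equal-length sequences for a single genomic window, with no missing data.</p>

```python
from itertools import product


def hamming(a, b):
    """Count positions at which two aligned sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


def gmin(pop1, pop2):
    """Ratio of the minimum to the mean between-population pairwise differences.

    Assumption (not from the poster): every sequence from pop1 is compared
    with every sequence from pop2, and missing data are not handled.
    """
    diffs = [hamming(a, b) for a, b in product(pop1, pop2)]
    if not diffs:
        raise ValueError("both populations need at least one sequence")
    mean_diff = sum(diffs) / len(diffs)
    if mean_diff == 0:
        # No between-population differences: the ratio is undefined.
        return float("nan")
    return min(diffs) / mean_diff


# Toy window: one pop2 haplotype is nearly identical to a pop1 haplotype,
# which pulls the minimum well below the mean.
pop1 = ["AAAAAAAA", "AAAAAAAT"]
pop2 = ["AAAACCCC", "AAAAAATT"]
value = gmin(pop1, pop2)  # pairwise differences are 4, 2, 4, 1
```

<p>In this toy window the minimum difference is 1 and the mean is 2.75, so the ratio is about 0.36. Values well below 1 indicate that at least one pair of sequences from the two populations is much more similar than the typical pair. A genome scan would repeat this calculation across windows; the window size and the outlier test are not specified here and are left open.</p>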