Code and metadata: Genetic architecture and selective sweeps after polygenic adaptation to distant trait optima

Understanding the genetic basis of phenotypic adaptation to changing environments is an essential goal of population and quantitative genetics. While technological advances now allow interrogation of genome-wide genotyping data in large panels, our theoretical understanding of the process of polygenic adaptation is still quite limited. To address this limitation, we use extensive forward-time simulation to explore the impacts of variation in demography, trait genetics, and selection on the rate and mode of adaptation and the resulting genetic architecture. We simulate a population adapting to an optimum shift, modeling sequence variation at 20 QTL under each of 12 demographies for 100 different traits that vary in the effect size distribution of new mutations, the strength of stabilizing selection, and the contribution of the genomic background. We then use random forest regression to learn the relative importance of input parameters for statistics of interest such as the speed of adaptation, the relative frequency of hard sweeps and sweeps from standing variation, or the final genetic architecture of the trait. We find that selective sweeps occur even for traits under relatively weak selection and where the genetic background explains most of the variation. Though most sweeps occur from variation segregating in the ancestral population, new mutations can be important for traits under strong stabilizing selection that undergo a large optimum shift. Additionally, we find that deleterious mutations are more strongly influenced by the strength of stabilizing selection. We also show that population bottlenecks and expansion affect overall genetic variation as well as the relative importance of sweeps from standing variation and the speed with which adaptation can occur.
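As a rough illustration of the optimum-shift scenario described above, the following toy sketch simulates a diploid Wright-Fisher population whose additive trait is controlled by 20 biallelic loci. It is not the simulation code of this study. The Gaussian fitness function, free recombination, absence of new mutations, constant population size, and every parameter value and name (`simulate_optimum_shift`, `effect_sd`, `vs`, `shift`) are assumptions made only for illustration.

```python
import math
import random


def simulate_optimum_shift(n_ind=200, n_loci=20, effect_sd=0.2, vs=1.0,
                           shift=1.0, generations=50, seed=1):
    """Toy forward-time simulation; returns (optimum, mean-phenotype trajectory)."""
    rng = random.Random(seed)
    # Assumed: additive effect of the '1' allele at each locus is Gaussian.
    effects = [rng.gauss(0.0, effect_sd) for _ in range(n_loci)]
    # Each individual is a pair of haplotypes; alleles start at frequency ~0.5.
    pop = [([rng.randint(0, 1) for _ in range(n_loci)],
            [rng.randint(0, 1) for _ in range(n_loci)])
           for _ in range(n_ind)]

    def phenotype(ind):
        # Additive trait: sum of allelic effects over both haplotypes.
        return sum(e * (a + b) for e, a, b in zip(effects, ind[0], ind[1]))

    def gamete(ind):
        # Free recombination: each locus is inherited independently.
        return [rng.choice(pair) for pair in zip(ind[0], ind[1])]

    z = [phenotype(ind) for ind in pop]
    start_mean = sum(z) / n_ind
    optimum = start_mean + shift  # sudden shift away from the starting mean
    trajectory = [start_mean]

    for _ in range(generations):
        # Assumed Gaussian stabilizing selection around the new optimum;
        # a larger vs means weaker selection.
        weights = [math.exp(-((zi - optimum) ** 2) / (2.0 * vs)) for zi in z]
        mothers = rng.choices(pop, weights=weights, k=n_ind)
        fathers = rng.choices(pop, weights=weights, k=n_ind)
        pop = [(gamete(m), gamete(f)) for m, f in zip(mothers, fathers)]
        z = [phenotype(ind) for ind in pop]
        trajectory.append(sum(z) / n_ind)

    return optimum, trajectory


if __name__ == "__main__":
    optimum, trajectory = simulate_optimum_shift()
    print("optimum:", round(optimum, 3))
    print("mean phenotype, first and last generation:",
          round(trajectory[0], 3), round(trajectory[-1], 3))
```

Because this sketch has no new mutations, any adaptation it shows comes from standing variation only. Modeling hard sweeps from new mutations, demographic change, or a genomic background would require extending it.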
We then use the matrix of effect sizes and allele frequencies in each population as a target for machine learning and find that demography and the effect size of new mutations have the largest influence on present-day genetic architecture. Because a variety of parameter combinations can result in relatively similar genetic architectures, we conclude that it is not straightforward to infer much about the process of adaptation from the genetic architecture alone. Overall, our results underscore the complex population genetics of individual loci in even relatively simple quantitative trait models but provide a glimpse into the factors that drive this complexity.
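The general idea of ranking input parameters by importance with random forest regression can be sketched as follows, using scikit-learn. The data here are synthetic and generated on the spot; they are not results from this study. The parameter names and the response function are hypothetical placeholders, and the study's actual features, targets, and model settings may differ.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
n_runs = 500

# Hypothetical simulation input parameters (placeholder names).
param_names = [
    "effect_size_sd",
    "stabilizing_selection_vs",
    "background_fraction",
    "bottleneck_severity",
]
X = rng.uniform(0.0, 1.0, size=(n_runs, len(param_names)))

# Synthetic summary statistic (illustration only): depends strongly on the
# first parameter, weakly on the second, and not at all on the others.
y = 5.0 * X[:, 0] + 1.0 * X[:, 1] + rng.normal(0.0, 0.1, size=n_runs)

model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(X, y)

# Impurity-based importances; scikit-learn normalizes them to sum to 1.
importances = dict(zip(param_names, model.feature_importances_))

if __name__ == "__main__":
    for name, value in sorted(importances.items(), key=lambda kv: -kv[1]):
        print(f"{name}: {value:.3f}")
```

Impurity-based importances can be biased when inputs are correlated or differ greatly in cardinality. `sklearn.inspection.permutation_importance` is a common cross-check.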