10.6084/m9.figshare.5405317.v1
Sahir Bhatnagar
Sahir
Bhatnagar
Karim Oualkacha
Karim
Oualkacha
Yi Yang
Yi
Yang
Marie Forest
Marie
Forest
Celia MT Greenwood
Celia
MT Greenwood
Estimation for High-Dimensional Multivariate Linear Mixed Models in Structured Populations
figshare
2017
linear mixed models
lmm
kinship matrix
penalized regression
GWAS
Biostatistics
2017-09-13 20:49:34
Poster
https://figshare.com/articles/poster/Estimation_for_High-Dimensional_Multivariate_Linear_Mixed_Models_in_Structured_Populations/5405317
<p>Complex
traits are thought to be influenced by a combination of environmental
factors and rare and common genetic variants. However, detection of
such multivariate associations can be compromised by low statistical
power and confounding by population structure. Linear mixed-effects
models (LMMs) can account for correlations due to relatedness but are
not applicable in high-dimensional (HD) settings, where the number of
predictors greatly exceeds the number of samples. False negatives can
result from two-stage approaches, where the residuals estimated from
a null model adjusted for the subjects’ relationship structure are
subsequently used as the response in a standard penalized regression
model. To overcome these challenges, we develop a general penalized
LMM framework that simultaneously selects variables and estimates their effects for
structured populations in one step. Our method can accommodate
several sparsity-inducing penalties, such as the lasso and elastic
net, and readily handles prior annotation information in the
form of weights. Our algorithm is computationally efficient and scales
to HD settings, and we prove mathematically that it converges to a
stationary point. Through simulations, we show that when there are
several correlated causal variants with small effects, our method has
higher power than the two-stage approach. We apply our method to
identify SNPs that predict blood pressure in 20 large Mexican
American pedigrees from the Genetic Analysis Workshop 18 data. This
approach can also be used to generate genetic risk scores, which can be
useful for risk stratification and clinical decision-making. Our
algorithms are available in an R package
(https://github.com/sahirbhatnagar/penfam). </p>
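<p>To illustrate how a lasso or elastic net penalty with per-variable weights acts on a single coefficient, the following is a minimal sketch of the standard coordinate-wise soft-thresholding update for a standardized predictor. It is not taken from the penfam package and ignores the variance components and kinship structure of the LMM; the function names, the argument <code>z</code> (the coordinate's average partial-residual correlation), and the convention that the weight scales the whole penalty for that coordinate are assumptions made for illustration.</p>

```python
def soft_threshold(z, t):
    """Soft-thresholding operator S(z, t) = sign(z) * max(|z| - t, 0)."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def weighted_enet_update(z, lam, alpha, weight=1.0):
    """One coordinate update under a weighted elastic net penalty.

    Assumes the predictor is standardized (mean 0, unit variance), so the
    update has a closed form. Hypothetical arguments:
      z      -- average product of the predictor with the partial residual
      lam    -- overall penalty strength (lambda >= 0)
      alpha  -- mixing parameter: 1.0 gives the lasso, 0.0 gives ridge
      weight -- prior annotation weight; 0.0 leaves the variable unpenalized
    """
    l1 = lam * weight * alpha           # lasso part of the penalty
    l2 = lam * weight * (1.0 - alpha)   # ridge part of the penalty
    return soft_threshold(z, l1) / (1.0 + l2)
```

<p>In this sketch, a larger weight makes a variable harder to select, while a weight of zero returns <code>z</code> unchanged, which is one simple way prior annotation could favor some SNPs over others.</p>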