Supplementary Material for: A Non-Parametric Method for Building Predictive Genetic Tests on High-Dimensional Data

<i>Objective:</i> Predictive tests that capitalize on emerging genetic findings hold great promise for enhanced personalized healthcare. With the emergence of a large amount of data from genome-wide association studies (GWAS), interest has shifted towards high-dimensional risk prediction. <i>Methods:</i> To form predictive genetic tests on high-dimensional data, we propose a non-parametric method, called the ‘forward ROC method’. The method adopts a computationally efficient algorithm to search among environmental risk factors, genetic predictors across the entire genome, and their possible interactions for an optimal risk prediction model, without relying on prior knowledge of known risk factors. An efficient yet powerful procedure is also incorporated into the method to handle missing data. <i>Results:</i> Through simulations and real data applications, we found that our proposed method outperformed existing approaches. We applied the new method to the Wellcome Trust rheumatoid arthritis GWAS dataset with a total of 460,547 markers. The results from the risk prediction analysis suggested important roles of <i>HLA-DRB1</i> and <i>PTPN22</i> in predicting rheumatoid arthritis. <i>Conclusion:</i> We proposed a powerful and robust approach for high-dimensional risk prediction. The new method will facilitate future risk prediction that considers a large number of predictors and their interactions for improved performance.
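
The abstract does not spell out the search algorithm, so the following is only a minimal sketch of one way a greedy forward search guided by the area under the ROC curve (AUC) could look. It is not the authors' implementation. Everything in it is an assumption made for illustration: the function names (`auc`, `lr_scores`, `forward_select`), the scoring of each subject by the case/control frequency ratio of its joint genotype cell, the 0.5 pseudo-count, and the `min_gain` stopping rule. It also evaluates the AUC on the training data and does not address missing data, both of which a real analysis would have to handle.

```python
from collections import Counter


def auc(scores, labels):
    """Empirical AUC: fraction of case/control pairs ranked correctly (ties count 1/2)."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((c > d) + 0.5 * (c == d) for c in cases for d in controls)
    return wins / (len(cases) * len(controls))


def lr_scores(rows, labels, markers):
    """Score each subject by the case/control frequency ratio of its joint genotype cell."""
    cells = [tuple(row[m] for m in markers) for row in rows]
    case_n = Counter(c for c, y in zip(cells, labels) if y == 1)
    ctrl_n = Counter(c for c, y in zip(cells, labels) if y == 0)
    n_case = sum(labels)
    n_ctrl = len(labels) - n_case
    # Assumption: a 0.5 pseudo-count keeps the ratio finite for empty cells.
    return [
        ((case_n[c] + 0.5) / (n_case + 0.5)) / ((ctrl_n[c] + 0.5) / (n_ctrl + 0.5))
        for c in cells
    ]


def forward_select(rows, labels, min_gain=0.01):
    """Greedily add the predictor that raises the AUC most; stop when the gain is small."""
    selected, best = [], 0.5  # 0.5 is the AUC of a non-informative test
    remaining = set(range(len(rows[0])))
    while remaining:
        trial = {m: auc(lr_scores(rows, labels, selected + [m]), labels) for m in remaining}
        m_best = max(sorted(trial), key=trial.get)  # lowest index wins ties
        if trial[m_best] - best < min_gain:
            break
        selected.append(m_best)
        best = trial[m_best]
        remaining.remove(m_best)
    return selected, best
```

Here `rows` is a list of subjects, each a list of discrete predictor values (for example genotypes coded 0, 1, 2), and `labels` holds 1 for cases and 0 for controls. Because each step scores subjects on the joint cell of all selected predictors, interactions among them are captured without specifying a parametric model, which is the general idea a non-parametric forward search relies on.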