Supplementary Material for: A Novel Kernel for Correcting Size Bias in the Logistic Kernel Machine Test with an Application to Rheumatoid Arthritis

<b><i>Objectives:</i></b> The logistic kernel machine test (LKMT) is a testing procedure tailored to high-dimensional genetic data. Its use in pathway analyses of case-control genome-wide association studies stems from its computational efficiency and its flexibility in incorporating additional information via the kernel. The kernel can be any positive definite function; however, its form strongly influences the test's power and bias. Most authors have recommended the use of a simple linear kernel. We demonstrate via simulation that, when a simple linear kernel is applied, the probability of falsely rejecting the null hypothesis of no association increases with the number of SNPs or genes in the pathway. <b><i>Methods:</i></b> We propose a novel kernel that includes an appropriate standardization to protect against inflation of false-positive results. In addition, the novel kernel incorporates information on the gene membership of SNPs in the pathway. <b><i>Results:</i></b> When applying the novel kernel to data from the North American Rheumatoid Arthritis Consortium, we find that even this basic genomic structure can improve the ability of the LKMT to identify meaningful associations. We also demonstrate that the standardization effectively eliminates problems of size bias. <b><i>Conclusion:</i></b> We recommend using our standardized kernel and urge caution when using non-adjusted kernels in the LKMT for pathway analyses.