Supplementary Material for: A Network-Based Kernel Machine Test for the Identification of Risk Pathways in Genome-Wide Association Studies

Biological pathways provide rich information and biological context on the genetic causes of complex diseases. The logistic kernel machine test integrates prior knowledge on pathways in order to analyze data from genome-wide association studies (GWAS). In this test, the kernel converts the genomic information of two individuals into a quantitative value reflecting their genetic similarity. By selecting a kernel, one implicitly chooses a genetic effect model. As with many other pathway methods, none of the available kernels accounts for the topological structure of the pathway or for gene-gene interaction types. However, evidence indicates that the connectivity and neighborhood of genes are crucial in the context of GWAS, because genes associated with a disease often interact. Thus, we propose a novel kernel that incorporates the topology of pathways and information on interactions. Using simulation studies, we demonstrate that the proposed method controls the type I error rate and can be more effective at identifying disease-associated pathways than non-network-based methods. We apply our approach to genome-wide association case-control data on lung cancer and rheumatoid arthritis and identify promising new pathways associated with these diseases, which may improve our current understanding of their genetic mechanisms.
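
To make the role of the kernel concrete, the following minimal sketch contrasts a plain linear kernel with one possible network-informed kernel. It is an illustration under stated assumptions, not the exact kernel defined in the main article: the genotype matrix Z, the SNP-to-gene annotation A, the gene-gene network matrix N, and the form K = Z A N A' Z' are all hypothetical choices made for this toy example. N is assumed to be symmetric and positive semidefinite so that the result is a valid kernel.

```python
# Hypothetical toy data: 3 individuals, 4 SNPs coded as minor-allele counts (0, 1, 2).
Z = [
    [0, 1, 2, 0],
    [1, 1, 0, 2],
    [0, 1, 2, 0],
]

# Assumed SNP-to-gene annotation: SNPs 0-1 map to gene 0, SNPs 2-3 map to gene 1.
A = [
    [1, 0],
    [1, 0],
    [0, 1],
    [0, 1],
]

# Assumed gene-gene network matrix: 1 on the diagonal, 0.5 for an interaction.
# It is symmetric and positive semidefinite (eigenvalues 1.5 and 0.5).
N = [
    [1.0, 0.5],
    [0.5, 1.0],
]


def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def transpose(X):
    """Transpose a matrix given as a list of rows."""
    return [list(col) for col in zip(*X)]


def linear_kernel(Z):
    """Similarity of two individuals as the inner product of their genotypes."""
    return matmul(Z, transpose(Z))


def network_kernel(Z, A, N):
    """Aggregate SNPs to genes, then weight gene pairs by the network matrix."""
    G = matmul(Z, A)  # per-gene allele counts for each individual
    return matmul(matmul(G, N), transpose(G))


K_lin = linear_kernel(Z)
K_net = network_kernel(Z, A, N)
```

In this toy setting, entry (i, j) of either matrix is the similarity value for individuals i and j. The network-informed version additionally credits allele sharing across interacting genes, which is the kind of information a purely SNP-wise kernel ignores.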