
Accurate and Efficient P-value Calculation Via Gaussian Approximation: A Novel Monte-Carlo Method

Version 2 2018-06-28, 20:09
Version 1 2018-01-15, 13:19
journal contribution
posted on 2018-06-28, 20:09 authored by Yaowu Liu, Jun Xie

Testing the significance of a set of covariates is of fundamental interest in statistics. For example, in genome-wide association studies, a joint null hypothesis of no genetic effect is tested for a set of multiple genetic variants. The minimum p-value method, higher criticism, and Berk–Jones tests are particularly effective when the covariates with nonzero effects are sparse. However, correlations among the covariates and a non-Gaussian response distribution make p-value calculation for these three tests challenging. In practice, permutation is commonly used to obtain accurate p-values, but it is computationally intensive, especially when a large number of hypothesis tests must be conducted. In this paper, we propose a Gaussian approximation method based on a Monte Carlo scheme, which is computationally more efficient than permutation while achieving similar accuracy. We derive nonasymptotic approximation error bounds that can vanish in the limit even when the number of covariates is much larger than the sample size. Through real-genotype-based simulations and data analysis of a genome-wide association study of Crohn’s disease, we compare the accuracy and computational cost of our proposed method, permutation, and the method based on the asymptotic distribution. Supplementary materials for this article are available online.
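
To make the general idea concrete, the following is a minimal Python sketch of a Monte Carlo p-value for a minimum p-value (maximum absolute marginal statistic) test under a Gaussian approximation. It is an illustration only and is not taken from the paper: the function name `min_p_gaussian_mc`, the use of standardized marginal correlation statistics, and the parameters `n_draws` and `seed` are all assumptions made for this example. The authors' actual statistics and algorithm may differ.

```python
import numpy as np

def min_p_gaussian_mc(X, y, n_draws=2000, seed=0):
    """Hypothetical sketch: Monte Carlo p-value for the max |marginal statistic|.

    Assumes no column of X is constant and y is not constant.
    """
    rng = np.random.default_rng(seed)
    n, _ = X.shape
    # Standardize covariates so that Xs.T @ Xs / n is their correlation matrix R.
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    # Standardize the response.
    r = (y - y.mean()) / y.std()
    # Marginal statistics; under the null they are treated as approximately N(0, R).
    z_obs = Xs.T @ r / np.sqrt(n)
    t_obs = np.max(np.abs(z_obs))
    # Replace the response by independent standard normal vectors, so that each
    # simulated statistic vector is exactly N(0, R) given X.
    G = rng.standard_normal((n, n_draws))
    t_null = np.max(np.abs(Xs.T @ G / np.sqrt(n)), axis=0)
    # Add-one correction keeps the estimated p-value strictly positive.
    return (1 + np.sum(t_null >= t_obs)) / (n_draws + 1)
```

In this sketch each simulated draw costs one matrix product and no model refitting, and the draws reproduce the correlation among the covariates without forming the correlation matrix explicitly. Whether this matches the efficiency gains reported by the authors is not something the sketch can establish.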

Funding

This work is supported by the National Institutes of Health Grant R21GM101504.
