uasa_a_1498347_sm4347.zip (434.63 kB)

Selection-Corrected Statistical Inference for Region Detection With High-Throughput Assays

Version 3 2019-10-25, 13:12
Version 2 2018-11-13, 16:25
Version 1 2018-07-18, 16:07
dataset
posted on 2019-10-25, 13:12 authored by Yuval Benjamini, Jonathan Taylor, Rafael A. Irizarry

Scientists use high-dimensional measurement assays to detect and prioritize regions of strong signal in spatially organized domains. Examples include finding methylation-enriched genomic regions using microarrays, and active cortical areas using brain imaging. The most common procedure for detecting potential regions is to group neighboring sites where the signal passes a threshold. However, one must account for the selection bias induced by this procedure; otherwise the estimated effects diminish when generalizing to a population. This article introduces pin-down inference, a model and an inference framework that permit population inference for these detected regions. Pin-down inference provides nonasymptotic point and confidence interval estimators for the mean effect in the region that account for local selection bias. Our estimators accommodate nonstationary covariances that are typical of these data, allowing researchers to better compare regions of different sizes and correlation structures. Inference is provided within a conditional one-parameter exponential family per region, with truncations that match the selection constraints. A secondary screening-and-adjustment step allows pruning the set of detected regions, while controlling the false-coverage rate over the reported regions. We apply the method to genomic regions with differing DNA-methylation rates across tissues. Our method provides superior power compared to other conditional and nonparametric approaches. Supplementary materials for this article are available online.
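
The detection step described in the abstract (grouping neighboring sites whose signal exceeds a threshold) can be illustrated with a minimal Python sketch. This is not the authors' code and does not implement pin-down inference; the function names, the toy signal, and the threshold value are all hypothetical. It only shows how regions are formed and why the naive mean over a selected region is computed from values that were already required to exceed the threshold.

```python
from typing import List, Tuple


def detect_regions(signal: List[float], threshold: float) -> List[Tuple[int, int]]:
    """Group neighboring sites whose signal exceeds `threshold`.

    Returns half-open index ranges (start, stop), one per maximal run
    of consecutive above-threshold sites.
    """
    regions = []
    start = None
    for i, value in enumerate(signal):
        if value > threshold:
            if start is None:
                start = i  # a new region opens at this site
        elif start is not None:
            regions.append((start, i))  # the run ended at the previous site
            start = None
    if start is not None:
        regions.append((start, len(signal)))  # run extends to the last site
    return regions


def naive_region_means(signal: List[float], regions: List[Tuple[int, int]]) -> List[float]:
    """Uncorrected mean of the observed signal inside each detected region."""
    return [sum(signal[a:b]) / (b - a) for a, b in regions]


# Hypothetical toy data: one measurement per site, in spatial order.
signal = [0.1, 1.2, 1.5, 0.3, -0.2, 2.0, 1.1, 1.4, 0.0]
regions = detect_regions(signal, threshold=1.0)
means = naive_region_means(signal, regions)
```

Because every site in a detected region passed the threshold by construction, these naive means are never below the threshold, whatever the true effect is. That is the local selection bias that the conditional, truncated inference described in the abstract is designed to correct.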

Funding

R.A.I. and Y.B. are supported by NIH grant R01GM083084. Y.B. would also like to thank Giovanni Parmigiani and the Stein Fellowship at Stanford for sponsoring a summer stay at the Biostatistics Department at Dana-Farber that initiated this research.

    Journal of the American Statistical Association