gnst_a_1508678_sm5027.pdf (206.44 kB)

Decentralized nonparametric multiple testing

journal contribution
posted on 2018-08-10, 06:35 authored by Subhadeep Mukhopadhyay

Consider a big data multiple testing task in which, due to storage and computational bottlenecks, a very large collection of p-values is split into manageable chunks and distributed over thousands of computer nodes. This paper is concerned with the following question: how can we find the full-data multiple testing solution by operating completely independently on individual machines in parallel, without any data exchange between nodes? This version of the problem arises naturally in a wide range of data-intensive science and industry applications, yet its methodological solution has not appeared in the literature to date; we therefore feel it is necessary to undertake such an analysis. Based on the nonparametric functional statistical viewpoint of large-scale inference, introduced in Mukhopadhyay, S. [(2016), ‘Large Scale Signal Detection: A Unifying View’, Biometrics, 72, 325–334], this paper furnishes a new computing model that brings unexpected simplicity to the design of an algorithm that might otherwise seem daunting using the classical approach and notation.
