figshare
Browse

Nonparametric Fusion Learning for Multiparameters: Synthesize Inferences From Diverse Sources Using Data Depth and Confidence Distribution

Download (202.84 kB)
Version 2 2021-04-30, 14:33
Version 1 2021-03-15, 16:40
journal contribution
posted on 2021-04-30, 14:33 authored by Dungang Liu, Regina Y. Liu, Min-ge Xie

Fusion learning refers to synthesizing inferences from multiple sources or studies to make a more effective inference and prediction than from any individual source or study alone. Most existing methods for synthesizing inferences rely on parametric model assumptions, such as normality, which often do not hold in practice. We propose a general nonparametric fusion learning framework for synthesizing inferences for multiparameters from different studies. The main tool underlying the proposed framework is the new notion of depth confidence distribution (depth-CD), which is developed by combining data depth and confidence distribution. Broadly speaking, a depth-CD is a data-driven nonparametric summary distribution of the available inferential information for a target parameter. We show that a depth-CD is a powerful inferential tool and, moreover, is an omnibus form of confidence regions, whose contours of level sets shrink toward the true parameter value. The proposed fusion learning approach combines depth-CDs from the individual studies, with each depth-CD constructed by nonparametric bootstrap and data depth. The approach is shown to be efficient, general and robust. Specifically, it achieves high-order accuracy and Bahadur efficiency under suitably chosen combining elements. It allows the model or inference structure to be different among individual studies. And, it readily adapts to heterogeneous studies with a broad range of complex and irregular settings. This last property enables the approach to use indirect evidence from incomplete studies to gain efficiency for the overall inference. We develop the theoretical support for the proposed approach, and we also illustrate the approach in making combined inference for the common mean vector and correlation coefficient from several studies. The numerical results from simulated studies show the approach to be less biased and more efficient than the traditional approaches in nonnormal settings. The advantages of the approach are also demonstrated in a Federal Aviation Administration study of aircraft landing performance. Supplementary materials for this article are available online.

Funding

Their research is supported in part by the NSF grants DMS1737857, DMS1812048, DMS2015373, and DMS2027855.

History