
LIC criterion for optimal subset selection in distributed interval estimation

journal contribution
posted on 2022-03-24, 12:00 authored by Guangbao Guo, Yue Sun, Guoqi Qian, Qian Wang

Distributed interval estimation in linear regression may be computationally infeasible in the presence of big data, which are normally stored across different computer servers or in the cloud. A further challenge is that the results of distributed estimation may still contain redundant information about the population characteristics of the data. To tackle this computing challenge, we develop an optimization procedure that selects the best subset from the collection of data subsets and then performs interval estimation on it in the context of linear regression. The procedure is derived by minimizing the length of the final interval estimator while maximizing the information retained in the selected data subset, and is therefore named the LIC criterion. The theoretical performance of the LIC criterion is studied in this paper, together with a simulation study and a real data analysis.
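The general idea of choosing one data subset by trading off interval length against retained information can be illustrated with a small sketch. This is not the LIC criterion as defined in the paper, whose exact form is not given in the abstract. The sketch assumes ordinary least squares on each subset, uses the average length of normal-approximation 95% confidence intervals for the coefficients as the length measure, uses the log-determinant of X'X as a stand-in information measure, and combines the two by a simple rank sum. The function names `subset_scores` and `select_subset` and the simulated data are hypothetical.

```python
import numpy as np


def subset_scores(X, y, z=1.96):
    """Return (mean CI length, log-det information) for one data subset.

    Assumptions: OLS fit, normal-approximation intervals with quantile z,
    and log det(X'X) as an illustrative information measure.
    """
    n, p = X.shape
    gram = X.T @ X
    beta = np.linalg.solve(gram, X.T @ y)       # OLS coefficients
    resid = y - X @ beta
    sigma2 = resid @ resid / (n - p)            # residual variance estimate
    se = np.sqrt(sigma2 * np.diag(np.linalg.inv(gram)))
    ci_length = float(np.mean(2 * z * se))      # average interval length
    _, logdet = np.linalg.slogdet(gram)
    return ci_length, float(logdet)


def select_subset(subsets, z=1.96):
    """Pick the subset with the best combined rank (illustrative rule only)."""
    scores = [subset_scores(X, y, z) for X, y in subsets]
    lengths = np.array([s[0] for s in scores])
    infos = np.array([s[1] for s in scores])
    length_rank = lengths.argsort().argsort()   # rank 0 = shortest interval
    info_rank = (-infos).argsort().argsort()    # rank 0 = most information
    best = int(np.argmin(length_rank + info_rank))
    return best, scores


# Simulated data: five subsets held on five hypothetical servers.
rng = np.random.default_rng(0)
true_beta = np.array([1.0, -2.0, 0.5])
sizes = [200, 200, 400, 200, 200]
noise_sd = [2.0, 2.0, 0.1, 2.0, 2.0]

subsets = []
for n, sd in zip(sizes, noise_sd):
    X = rng.normal(size=(n, 3))
    y = X @ true_beta + rng.normal(scale=sd, size=n)
    subsets.append((X, y))

best, scores = select_subset(subsets)
print("selected subset:", best)
print("its (CI length, log-det):", scores[best])
```

In this toy setup the third subset is both larger and less noisy than the others, so any reasonable combination of the two measures selects it. With real data the two measures can disagree, and how they are combined is exactly what a formal criterion has to specify.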

Funding

This work was supported by a grant from the Natural Science Foundation of Shandong Province under project IDs ZR2020MA022 and 2020KJI003.
