The Sparse MLE for Ultrahigh-Dimensional Feature Screening
Feature selection is fundamental to modeling high-dimensional data, where the number of features can be much larger than the sample size. Because the feature space is so large, many traditional procedures become numerically infeasible. It is therefore essential to first remove most of the apparently noninfluential features before any elaborate analysis. Several procedures have recently been developed for this purpose, among which sure independence screening (SIS) is widely used. To gain computational efficiency, SIS screens features based on their individual predictive power. In this article, we propose a new screening method based on the sparsity-restricted maximum likelihood estimator (SMLE). The new method naturally accounts for the joint effects of features in the screening process, which gives it an edge and the potential to outperform existing methods. This conjecture is supported by simulation studies under a number of modeling settings. We show that the proposed method is screening consistent in the context of ultrahigh-dimensional generalized linear models. Supplementary materials for this article are available online.
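To make the contrast between marginal and joint screening concrete, the following is a minimal sketch, not the article's implementation. It assumes a Gaussian linear model, so that maximizing the likelihood reduces to least squares, and it approximates the sparsity-restricted estimator (the maximizer of the likelihood subject to at most k nonzero coefficients) by iterative hard thresholding, which is one possible choice of algorithm. The function names `marginal_screen`, `hard_threshold`, and `sparse_ls_iht`, the step size, and the iteration count are all illustrative assumptions.

```python
import numpy as np


def marginal_screen(X, y, k):
    """SIS-style screening: keep the k features whose standardized
    columns have the largest absolute inner product with centered y."""
    Xc = (X - X.mean(axis=0)) / X.std(axis=0)  # assumes no constant column
    yc = y - y.mean()
    score = np.abs(Xc.T @ yc)
    return np.sort(np.argsort(score)[-k:])


def hard_threshold(beta, k):
    """Zero out all but the k largest-magnitude entries of beta."""
    out = np.zeros_like(beta)
    keep = np.argsort(np.abs(beta))[-k:]
    out[keep] = beta[keep]
    return out


def sparse_ls_iht(X, y, k, n_iter=200):
    """Joint screening sketch: approximate the least-squares fit with at
    most k nonzero coefficients by iterative hard thresholding, and
    return the indices of the retained features."""
    p = X.shape[1]
    # Step size 1/L, where L is the largest eigenvalue of X'X (assumed choice).
    step = 1.0 / np.linalg.norm(X, 2) ** 2
    beta = np.zeros(p)
    for _ in range(n_iter):
        grad = X.T @ (X @ beta - y)  # gradient of 0.5 * ||y - X beta||^2
        beta = hard_threshold(beta - step * grad, k)
    return np.sort(np.flatnonzero(beta))
```

Both functions return a sorted array of retained feature indices, so their output can be compared directly on simulated data. The marginal version scores each feature in isolation, whereas the thresholded fit updates all coefficients together, which is the sense in which a joint method can retain a feature that is weak marginally but useful in combination with others.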