A Robust Model-Free Feature Screening Method for Ultrahigh-Dimensional Data

2017-05-23T15:54:47Z (GMT) by Jingnan Xue Faming Liang
<p>Feature screening plays an important role in dimension reduction for ultrahigh-dimensional data. In this article, we introduce a new feature screening method and establish its sure independence screening property under the ultrahigh-dimensional setting. The proposed method works based on the nonparanormal transformation and Henze–Zirkler’s test, that is, it first transforms the response variable and features to Gaussian random variables using the nonparanormal transformation and then tests the dependence between the response variable and features using the Henze–Zirkler’s test. The proposed method enjoys at least two merits. First, it is model-free, which avoids the specification of a particular model structure. Second, it is condition-free, which does not require any extra conditions except for some regularity conditions for high-dimensional feature screening. The numerical results indicate that, compared to the existing methods, the proposed method is more robust to the data generated from heavy-tailed distributions and/or complex models with interaction variables. The proposed method is applied to screening of anticancer drug response genes. Supplementary material for this article is available online.</p>