A Cluster-Based Outlier Detection Scheme for Multivariate Data
Detection power of the squared Mahalanobis distance statistic is significantly reduced when several outliers exist within a multivariate dataset of interest. To overcome this masking effect, we propose a computer-intensive cluster-based approach that incorporates a reweighted version of Rousseeuw’s minimum covariance determinant method with a multi-step cluster-based algorithm that initially filters out potential masking points. Compared to the most robust procedures, simulation studies show that our new method is better for outlier detection. Additional real data comparisons are given. Supplementary materials for this article are available online.