False Discovery Versus Familywise Error Rate Approaches to Outlier Detection

2016-06-02T19:17:07Z (GMT) by Yihuan Xu, Boris Iglewicz

Outliers, in general, are observations that deviate sufficiently from a base distribution. This study deals with outlier detection approaches for large samples from continuous univariate distributions. We investigate the properties of a practical, newer outlier detection approach that uses a false discovery rate (FDR) method in conjunction with a robustly estimated Tukey g-and-h base distribution. We compare this FDR approach with a boxplot-type outlier detection approach that controls the familywise error rate (FWER). The two options are compared in terms of error rates and the effects of moving the outliers gradually farther from the center of the base distribution, with the FDR or FWER set at 5% and 1%. Two microarray datasets serve as examples in which the assumed null distributions do not fit the data well. In such cases, the proposed approach, based on an estimated Tukey g-and-h null distribution, leads to superior outlier detection performance. Supplementary materials for this article are available online.
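The contrast between FWER and FDR control can be illustrated with a minimal sketch. It is not the procedure studied in the article. It assumes a normal null distribution that is centered and scaled robustly by the median and MAD, where the article uses a robustly estimated Tukey g-and-h distribution. It uses a Bonferroni cutoff as a stand-in for the boxplot-type FWER rule and the Benjamini-Hochberg step-up procedure as the FDR method. All function names and the synthetic sample are hypothetical.

```python
import math
import numpy as np


def two_sided_normal_pvalues(x):
    """Two-sided p-values under an ASSUMED normal null (not Tukey g-and-h),
    centered at the median and scaled by the MAD."""
    x = np.asarray(x, dtype=float)
    center = np.median(x)
    # 1.4826 makes the MAD consistent for the normal standard deviation.
    scale = 1.4826 * np.median(np.abs(x - center))
    z = (x - center) / scale
    # P(|Z| >= |z|) = erfc(|z| / sqrt(2)) for a standard normal Z.
    return np.array([math.erfc(abs(v) / math.sqrt(2.0)) for v in z])


def bonferroni_flags(pvals, alpha):
    """FWER control: flag observations with p <= alpha / n."""
    pvals = np.asarray(pvals, dtype=float)
    return pvals <= alpha / pvals.size


def benjamini_hochberg_flags(pvals, alpha):
    """FDR control: flag the k smallest p-values, where k is the largest
    rank satisfying p_(k) <= alpha * k / n."""
    pvals = np.asarray(pvals, dtype=float)
    n = pvals.size
    order = np.argsort(pvals)
    thresholds = alpha * np.arange(1, n + 1) / n
    passing = np.nonzero(pvals[order] <= thresholds)[0]
    flags = np.zeros(n, dtype=bool)
    if passing.size:
        flags[order[: passing[-1] + 1]] = True
    return flags


# Synthetic illustration only: 2000 standard normal values plus 20 planted
# observations at 4.0. These are not data from the article.
rng = np.random.default_rng(0)
sample = np.concatenate([rng.normal(size=2000), np.full(20, 4.0)])
pvals = two_sided_normal_pvalues(sample)
fwer_flags = bonferroni_flags(pvals, alpha=0.05)
fdr_flags = benjamini_hochberg_flags(pvals, alpha=0.05)
print("flagged under FWER control:", int(fwer_flags.sum()))
print("flagged under FDR control: ", int(fdr_flags.sum()))
```

At the same nominal level, every observation flagged by the Bonferroni rule is also flagged by the Benjamini-Hochberg rule, so the FDR approach flags at least as many points. Replacing the normal null in `two_sided_normal_pvalues` with a fitted g-and-h distribution would bring the sketch closer to the approach the abstract describes, but that fitting step is not shown here.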