Additional file 1: of 2D–EM clustering approach for high-dimensional data through folding feature vectors

Sharma, Alok; Kamola, Piotr; Tsunoda, Tatsuhiko

doi:10.6084/m9.figshare.5742591.v1

12859_2017_1970_MOESM1_ESM.docx (25.83 kB)

journal contribution

posted on 2017-12-28, 05:00 authored by Alok Sharma, Piotr Kamola, Tatsuhiko Tsunoda

This file analyzes the bias introduced by the filtering process. We examine the effect of applying the filter used for the 2D–EM algorithm to other clustering algorithms. The data are preprocessed to retain the top m² features. The m² values for all datasets at the 0.01 cut-off were as follows: 1156 (SRBCT), 529 (ALL), 6084 (MLL), 1444 (ALL subtype), 15,129 (GCM) and 5625 (Lung Cancer). The clustering algorithms are then applied to measure the difference in performance, in both Rand score and adjusted Rand index. Table S1 and Table S2 show the Rand score and adjusted Rand index when the filtering step is applied. Table S3 and Table S4 show the change in Rand score and adjusted Rand index after filtering compared with before filtering. (DOCX 25 kb)
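For readers unfamiliar with the two metrics reported in the tables, the following is a minimal sketch of how a Rand score and an adjusted Rand index can be computed from a reference labeling and a clustering result. It uses the standard pair-counting definitions and is not taken from the authors' code; the function name `rand_scores` and the toy labels are assumptions made for illustration only.

```python
from collections import Counter
from math import comb


def rand_scores(labels_true, labels_pred):
    """Return (Rand score, adjusted Rand index) for two labelings.

    Hypothetical helper for illustration; uses the standard
    pair-counting definitions of both metrics.
    """
    n = len(labels_true)
    if n != len(labels_pred):
        raise ValueError("labelings must have the same length")
    if n < 2:
        raise ValueError("at least two samples are required")

    total_pairs = comb(n, 2)

    # Pairs placed in the same group by both labelings.
    joint = Counter(zip(labels_true, labels_pred))
    same_both = sum(comb(c, 2) for c in joint.values())

    # Pairs placed in the same group by each labeling on its own.
    same_true = sum(comb(c, 2) for c in Counter(labels_true).values())
    same_pred = sum(comb(c, 2) for c in Counter(labels_pred).values())

    # Pairs separated by both labelings (inclusion-exclusion).
    apart_both = total_pairs - same_true - same_pred + same_both

    # Rand score: fraction of pairs on which the labelings agree.
    rand = (same_both + apart_both) / total_pairs

    # Adjusted Rand index: chance-corrected version of the same count.
    expected = same_true * same_pred / total_pairs
    max_index = (same_true + same_pred) / 2
    if max_index == expected:
        ari = 1.0  # degenerate case, e.g. a single cluster in both labelings
    else:
        ari = (same_both - expected) / (max_index - expected)
    return rand, ari


# Toy labels (made up for illustration, not data from the study).
rand, ari = rand_scores([0, 0, 1, 1], [0, 0, 1, 2])
print(f"Rand score: {rand:.4f}, adjusted Rand index: {ari:.4f}")
```

Under this sketch, a change of the kind reported in Tables S3 and S4 would be the score obtained after filtering minus the score obtained before filtering, computed separately for each metric.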