Additional file 1 of 2D–EM clustering approach for high-dimensional data through folding feature vectors

This file analyzes the bias introduced by the filtering step. Specifically, we examine the effect of applying the filter used for the 2D–EM algorithm to other clustering algorithms. The data are first preprocessed to retain the top m² features. At the 0.01 cut-off, the m² values for the datasets were: 1156 (SRBCT), 529 (ALL), 6084 (MLL), 1444 (ALL subtype), 15,129 (GCM) and 5625 (Lung Cancer). The clustering algorithms are then applied to the filtered data to measure the change in performance, in terms of both Rand score and adjusted Rand index. Table S1 and Table S2 show the Rand score and adjusted Rand index when the filtering step is applied. Table S3 and Table S4 show the change in Rand score and adjusted Rand index after filtering relative to before filtering. (DOCX 25 kb)
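The two evaluation measures and the feature count can be illustrated with a minimal Python sketch. It is not the authors' code. The function names (`largest_square`, `rand_score`, `adjusted_rand_index`) and the toy labels are hypothetical. The rule "take the largest perfect square not exceeding the number of features that pass the filter" is an assumption; the text only states that the top m² features are retained, and every listed value is a perfect square (for example, 1156 = 34²). The Rand score and adjusted Rand index follow the standard pair-counting definitions.

```python
from collections import Counter
from math import comb, isqrt


def largest_square(n_passing):
    """Return (m, m*m) for the largest m with m*m <= n_passing.

    Assumption: this is how m is chosen from the filtered feature count.
    """
    m = isqrt(n_passing)
    return m, m * m


def _pair_counts(labels_true, labels_pred):
    """Count sample pairs; assumes at least two samples."""
    total = comb(len(labels_true), 2)
    joint = Counter(zip(labels_true, labels_pred))
    # Pairs placed together in both partitions.
    same_both = sum(comb(c, 2) for c in joint.values())
    # Pairs placed together in the reference partition.
    same_true = sum(comb(c, 2) for c in Counter(labels_true).values())
    # Pairs placed together in the predicted partition.
    same_pred = sum(comb(c, 2) for c in Counter(labels_pred).values())
    return total, same_both, same_true, same_pred


def rand_score(labels_true, labels_pred):
    """Fraction of sample pairs on which the two partitions agree."""
    total, same_both, same_true, same_pred = _pair_counts(labels_true, labels_pred)
    # Agreements = pairs together in both + pairs apart in both.
    agreements = total + 2 * same_both - same_true - same_pred
    return agreements / total


def adjusted_rand_index(labels_true, labels_pred):
    """Rand index corrected for chance agreement (pair-counting form)."""
    total, same_both, same_true, same_pred = _pair_counts(labels_true, labels_pred)
    expected = same_true * same_pred / total
    max_index = (same_true + same_pred) / 2
    if max_index == expected:  # degenerate case, e.g. a single cluster in both
        return 1.0
    return (same_both - expected) / (max_index - expected)


# Toy labels, for illustration only (not data from the paper).
truth = [0, 0, 0, 1, 1, 1]
before = [0, 1, 0, 1, 0, 1]  # hypothetical clustering without filtering
after = [0, 0, 1, 1, 1, 1]   # hypothetical clustering with filtering

# Change in each score after filtering relative to before filtering.
delta_rand = rand_score(truth, after) - rand_score(truth, before)
delta_ari = adjusted_rand_index(truth, after) - adjusted_rand_index(truth, before)
```

A positive `delta_rand` or `delta_ari` means the clustering agrees more closely with the reference labels after filtering; a negative value means it agrees less.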