figshare
Browse
1/3
59 files

FGMD: A novel approach for functional gene module detection in cancer

dataset
posted on 2017-12-15, 18:28 authored by Daeyong Jin, Hyunju Lee

With the increasing availability of multi-dimensional biological datasets for the same samples (i.e., gene expression, microRNAs, copy numbers, mutations, methylations), it has now become possible to systematically understand the regulatory mechanisms operating in a cancer cell. For this task, it is important to discover a set of co-expressed genes with functions, representing a so-called functional gene module, because co-expressed genes tend to be co-regulated by the same regulators, including transcription factors, microRNAs, and copy number aberrations. Several algorithms have been used to identify such gene modules, including hierarchical clustering and non-negative matrix factorization. Although these algorithms have been applied to many microarray datasets, only a few systematic analyses of these algorithms have been performed for RNA-sequencing (RNA-Seq) data to date. Although gene expression levels determined based on microarray and RNA-Seq datasets tend to be highly correlated, the expression levels of some genes differ depending on the platforms used for analysis, which may result in the construction of different gene modules for the same samples. Here, we compare several module detection algorithms applied to both microarray and RNA-seq datasets. We further propose a new functional gene module detection algorithm (FGMD), which is based on a hierarchical clustering algorithm that was modified to reflect actual biological observations, including the fact that a single gene may be involved in multiple biological pathways. Application of existing algorithms and the new FGMD algorithm to breast cancer and ovarian cancer datasets from The Cancer Genome Atlas showed that the FGMD algorithm had the best performance for most of the functional pathway enrichment tests and in the transcription factor enrichment test. We expect that the FGMD algorithm will contribute to improving the identification of functional gene modules related to cancer.

History