NGS14-accepted-poster 71.pdf (722.35 kB)

Using exome sequence data and Random Forest analysis to identify functional mutation signatures of 5 cancer differentiation subtypes

poster
posted on 2014-06-26, 20:25 authored by Russel Sutherland, Salvador J. Diaz-Cano, Jane Moorhead, Richard Dobson

Introduction
The Pan-Cancer Analysis Project [1] aims to identify the genomic changes present in 12 different cancer types from The Cancer Genome Atlas (TCGA) data set [2]. Cancer is a morphologically and genetically heterogeneous disease, so we aimed to identify predictors of the 5 main differentiation subtypes in the Pan-Cancer Analysis [1] based on differences in their patterns of functional mutation. Whole exome sequencing was performed on tumour and normal tissue samples from 3129 patients, enabling the identification of cancer-related mutations in each patient. Clinical data, including gender and age, were also collected for all patients. We used a Random Forest machine learning approach to compare the 5 differentiation subtypes in a pairwise fashion. The results presented here show that we were able to discriminate between Bladder Urothelial Cancer and Acute Myeloid Leukemia in unseen samples with 87.8% accuracy (95% CI: 78.71% to 93.99%).
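As an illustration of how an accuracy figure with a 95% confidence interval of this kind can be derived, the sketch below computes an exact (Clopper-Pearson) binomial interval from the number of correctly classified held-out samples. The poster does not state which interval method was used, so the choice of Clopper-Pearson is an assumption, and the function names and the counts in the usage note are hypothetical.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _solve(f, target, increasing, iters=100):
    """Bisection on [0, 1] for f(p) == target, where f is monotone in p."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if (f(mid) < target) == increasing:
            lo = mid  # root lies above mid
        else:
            hi = mid  # root lies below mid
    return (lo + hi) / 2


def clopper_pearson(correct, total, alpha=0.05):
    """Exact two-sided binomial interval for an accuracy of correct/total.

    Assumption: the poster's interval method is not stated; this is one
    common choice, not necessarily the one the authors used.
    """
    if correct == 0:
        lower = 0.0
    else:
        # Lower bound: P(X >= correct | p) = alpha / 2 (increasing in p).
        lower = _solve(
            lambda p: 1 - binom_cdf(correct - 1, total, p), alpha / 2, True
        )
    if correct == total:
        upper = 1.0
    else:
        # Upper bound: P(X <= correct | p) = alpha / 2 (decreasing in p).
        upper = _solve(lambda p: binom_cdf(correct, total, p), alpha / 2, False)
    return lower, upper
```

For example, `clopper_pearson(90, 100)` returns the interval for a hypothetical test set in which 90 of 100 unseen samples were classified correctly; these counts are illustrative only and are not the poster's test-set size.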

Analysis Pipeline and Methods
[Figure: analysis pipeline. Recoverable labels: Tumour; Germline; Fisher’s exact test reduces gene list from 20775 to ...]
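The gene-filtering step named in the pipeline can be sketched as a per-gene Fisher's exact test on a 2x2 table of mutated versus non-mutated samples in two subtypes. This is a minimal sketch under stated assumptions: the poster does not give the contingency table layout, the significance threshold, or any multiple-testing correction, so the table layout, the 0.05 cut-off, and the names `fisher_exact_two_sided` and `filter_genes` are all hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def pmf(x):
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    p_obs = pmf(a)
    x_min, x_max = max(0, col1 - row2), min(row1, col1)
    p = sum(
        pmf(x)
        for x in range(x_min, x_max + 1)
        if pmf(x) <= p_obs * (1 + 1e-7)  # tolerance for float ties
    )
    return min(1.0, p)


def filter_genes(mutated_counts, n_a, n_b, threshold=0.05):
    """Keep genes whose mutation frequency differs between two subtypes.

    mutated_counts maps gene -> (samples mutated in subtype A,
    samples mutated in subtype B); n_a and n_b are the subtype sizes.
    Assumption: an uncorrected p-value threshold; the poster's actual
    criterion is not stated.
    """
    kept = []
    for gene, (mut_a, mut_b) in mutated_counts.items():
        p = fisher_exact_two_sided(mut_a, n_a - mut_a, mut_b, n_b - mut_b)
        if p < threshold:
            kept.append(gene)
    return kept
```

With the hypothetical input `filter_genes({"GENE_A": (9, 1), "GENE_B": (5, 5)}, n_a=10, n_b=10)`, only the gene whose mutation counts differ strongly between the two groups is retained, which is how a filter of this kind would shrink a list of candidate genes before Random Forest training.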
