3 files

Biomarker Benchmark - GSE10320

Version 5 2016-03-17, 22:20
Version 4 2016-03-16, 16:31
Version 3 2016-02-23, 23:40
Version 2 2016-02-04, 22:41
Version 1 2016-02-02, 21:57
dataset
posted on 2016-03-17, 22:20 authored by Anna Guyer, Stephen Piccolo

[NOTICE: This data set has been deprecated. Please see our new version of the data (and additional data sets) here: https://osf.io/mhk93 ]


"The gene expression patterns of favorable histology Wilms tumors (FHWT) that relapsed were compared with those that did not relapse using oligonucleotide arrays.

Description: 250 FHWT of all stages enriched for relapses treated on National Wilms Tumor Study 5 passed quality parameters and were suitable for analysis using oligonucleotide arrays. Relapse risk stratification utilized Support Vector Machine; two and ten fold cross-validation was applied. The number of genes associated with relapse was less than that predicted by chance alone for 106 patients (32 relapses) with stages I and II FHWT and no further analyses were performed. This number was greater than expected by chance for 76 local stage III patients. Cross validation including an additional 68 local stage III patients (total 144 patients, 53 relapses) demonstrated that classifiers for relapse composed of 50 genes were associated with a median sensitivity of 47%, specificity 70%, and total error rate of 38%. Analysis of genes differentially expressed in relapse patients revealed apoptosis, Wnt signaling, IGF pathway, and epigenetic modification to be mechanisms important in relapse. Potential therapeutic targets include FRAP/MTOR and CD40.
Keywords: Classification by microarray analysis"
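
As a rough consistency check on the figures quoted above, the total error rate can be approximated from the reported sensitivity, specificity, and class counts. This is only a sketch: it assumes the median sensitivity and specificity can be combined as if they came from a single confusion matrix, which the summary does not state. The function name is ours, not part of any package mentioned here.

```python
def total_error_rate(n_total, n_positive, sensitivity, specificity):
    """Fraction of samples misclassified, given class counts and per-class accuracy."""
    n_negative = n_total - n_positive
    false_negatives = (1.0 - sensitivity) * n_positive  # relapses that were missed
    false_positives = (1.0 - specificity) * n_negative  # non-relapses that were flagged
    return (false_negatives + false_positives) / n_total

# Figures from the quoted summary: 144 local stage III patients, 53 relapses,
# median sensitivity 47%, specificity 70%.
error = total_error_rate(n_total=144, n_positive=53, sensitivity=0.47, specificity=0.70)
print(f"approximate total error rate: {error:.1%}")
```

Under that assumption the result is about 38%, in line with the total error rate given in the summary.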

http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE10320

We have included the gene-expression data, the outcome (class) being predicted, and any clinical covariates. Where gene-expression data were processed in multiple batches, we have provided batch information. Each data set is organized as a file set containing all pertinent files for that data set. The gene-expression files were normalized with both the SCAN and UPC methods, using the SCAN.UPC package in Bioconductor (https://www.bioconductor.org/packages/release/bioc/html/SCAN.UPC.html). We summarized the data at the gene level using the BrainArray resource (http://brainarray.mbni.med.umich.edu/Brainarray/Database/CustomCDF/20.0.0/ensg.asp) and used Ensembl identifiers. The class, clinical, and batch data were hand-curated to ensure consistency ("tidy data" formatting). In addition, the data files are formatted for easy import into the ML-Flex machine learning package (http://mlflex.sourceforge.net/).
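
To illustrate how a file set of this kind might be combined for analysis, here is a minimal sketch that joins gene-expression values with class labels by sample. Everything specific in it is an assumption for illustration: the tab-delimited layout, the `SampleID` and `Class` column names, the sample identifiers, and the toy expression values are hypothetical and are not taken from the actual files.

```python
import csv
import io

# Hypothetical stand-ins for an expression file and a class file.
# The real file names, column names, and values may differ.
EXPRESSION_TSV = (
    "SampleID\tENSG00000000003\tENSG00000000005\n"
    "S1\t0.12\t-0.40\n"
    "S2\t0.55\t0.08\n"
)
CLASS_TSV = "SampleID\tClass\nS1\tRelapse\nS2\tNoRelapse\n"

def read_tsv(text, key="SampleID"):
    """Parse tab-delimited text into {sample_id: {column: value}}."""
    rows = {}
    for row in csv.DictReader(io.StringIO(text), delimiter="\t"):
        sample_id = row.pop(key)
        rows[sample_id] = row
    return rows

def join_expression_with_class(expression, classes):
    """Keep samples present in both tables; return {sample_id: (features, label)}."""
    joined = {}
    for sample_id, genes in expression.items():
        if sample_id in classes:
            features = {gene: float(value) for gene, value in genes.items()}
            joined[sample_id] = (features, classes[sample_id]["Class"])
    return joined

dataset = join_expression_with_class(read_tsv(EXPRESSION_TSV), read_tsv(CLASS_TSV))
for sample_id, (features, label) in dataset.items():
    print(sample_id, label, len(features))
```

Clinical covariates or batch information could be joined the same way, keyed on the same sample identifier, if the files share one.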
