figshare

milchevskaya_bc2_poster.pdf

poster
posted on 2017-09-08, 10:36 authored by Vladislava Milchevskaya
Title: "A pipeline to build up-to-date annotations for Affymetrix GeneChips"
Abstract: Rapid and continuous improvement of genomic information affects gene and transcript definitions, which in turn affects genome-wide expression profiling measured by Affymetrix GeneChips, a platform widely used in such assays. The original probe groupings used to measure transcript abundance on the GeneChips often become outdated as genome annotations are refined: some probes no longer match the target sequence or match non-specifically, and multiple probe sets are assigned to the same gene. These issues lead to inconsistent results and make the outcome depend on the strategy used to aggregate redundant expression measurements into a gene-level value. Here we show that variability in the aggregation methods used in the literature may affect the results of data analysis, both for the outcome of differential expression tests and for downstream analysis. The genes assigned to multiple probe sets in the original GeneChip annotations are those most affected by the difference in processing. Mapping the probe sequences to the most recent genomes for a number of Affymetrix platforms also confirmed that accurate re-grouping of the probes is necessary. As a solution, we have developed a pipeline that generates a novel probe grouping for an Affymetrix GeneChip based on the genome reference provided by the user, and builds an annotation package compatible with standard Bioconductor libraries for further processing. The designed probe sets are gene- or transcript-specific and contain only probes with specific mappings. Thus, erroneous and redundant measurements frequently present in the original annotations are eliminated, and the gene coverage of GeneChips is often increased. Moreover, using the same gene definition facilitates cross-platform analysis and comparative studies involving different Affymetrix platforms as well as RNA-seq data.
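The dependence on the aggregation strategy described in the abstract can be illustrated with a toy example. The sketch below is not the pipeline from the poster: the gene name, probe set names, expression values, and the three aggregation strategies are all hypothetical and were chosen only to show how one gene with two probe sets can yield different gene-level fold changes.

```python
from statistics import mean

# Hypothetical log2 expression values: gene -> probe set -> one value per sample.
# Samples 0-2 are "control", samples 3-5 are "treated". All numbers are made up.
PROBE_SETS = {
    "GENE_A": {
        "ps_1": [7.0, 7.2, 6.8, 9.0, 9.2, 8.8],  # responds to treatment
        "ps_2": [5.0, 5.1, 4.9, 5.0, 4.9, 5.1],  # flat, e.g. non-specific probes
    },
}


def aggregate(gene_probe_sets, strategy):
    """Collapse several probe sets into one gene-level value per sample."""
    per_sample = list(zip(*gene_probe_sets.values()))  # one tuple per sample
    if strategy == "mean":
        return [mean(values) for values in per_sample]
    if strategy == "max":
        return [max(values) for values in per_sample]
    if strategy == "highest_mean_probe_set":
        # Keep only the probe set with the highest average signal.
        return list(max(gene_probe_sets.values(), key=mean))
    raise ValueError(f"unknown strategy: {strategy}")


def log_fold_change(values, n_control=3):
    """Difference of group means on the log2 scale (treated minus control)."""
    return mean(values[n_control:]) - mean(values[:n_control])


def compare_strategies(gene):
    """Gene-level log2 fold change under each illustrative strategy."""
    strategies = ("mean", "max", "highest_mean_probe_set")
    return {
        s: round(log_fold_change(aggregate(PROBE_SETS[gene], s)), 3)
        for s in strategies
    }


if __name__ == "__main__":
    for strategy, lfc in compare_strategies("GENE_A").items():
        print(f"{strategy}: log2 fold change = {lfc}")
```

In this toy data, averaging the two probe sets halves the apparent fold change compared with taking the per-sample maximum, which is the kind of discrepancy that gene-specific probe sets with only specifically mapping probes are meant to avoid.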

Funding

EMBL International PhD Programme
