A Workflow for the Analysis of DNA Microarray Time-Course Data

Posted on 26.10.2012, 12:00, authored by Robert M Flight

The past two decades have witnessed the increasing use of high-throughput
measurement technologies in biology and the advent of the –omics fields, including genomics,
transcriptomics, proteomics, and metabolomics. These new measurement platforms have
motivated the development of novel data-analysis methods and workflows. Nowhere is this
more true than in transcriptomics, where DNA microarrays are widely used to measure gene
expression. One area that has suffered from a lack of development of new analysis tools is the
application of DNA microarrays to time-course data. The use of DNA microarrays to follow
temporal changes in biological systems is particularly important, allowing the measurement of
dynamic changes in gene expression and providing valuable insight into cellular regulation.
However, there are many challenges to analyzing this type of DNA microarray data that are
distinct from other gene-expression experiments, thus necessitating the development of novel
analysis methods.
This thesis reports the development of a workflow for the analysis of DNA microarray
time-course data. Particular emphasis is placed on the estimation and incorporation of
measurement uncertainties at each step, methods for data visualization and normalization, and
the decomposition of data using biologically meaningful models. The emphasis on measurement
uncertainties led to a study of operator effects (gridding, flagging) on expression ratios, as well
as the validation of a bootstrap method to estimate measurement uncertainties in microarray
data. The application of correlation heat maps to array-level time-course data allowed the
visualization and interpretation of transcriptome-wide changes in gene expression, providing
preliminary insights into the data. Microarray normalization was also investigated in the context
of time-course experiments, with a comparison of traditional and novel data normalization
methods. Finally, the application and analysis of multivariate curve resolution using weighted
alternating least squares (MCR-WALS) to time-course data are considered, with the extraction of
biological information using the Gene Ontology. The biological systems investigated in this work
include S. cerevisiae (yeast; cell cycle and exit from stationary phase), P. falciparum (malaria
parasite; intraerythrocytic developmental cycle) and D. melanogaster (fruit fly; life cycle).
Through the implementation of the workflow described in this thesis, putative regulatory
profiles were extracted for each of these systems that were ontologically consistent with the
known biology.
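
As an illustration of the array-level correlation heat maps mentioned above, the following is a minimal sketch and not the thesis's own implementation. It assumes a genes-by-time-points matrix of log expression ratios, uses synthetic periodic data in place of real microarray measurements, and computes the Pearson correlation between every pair of arrays. The function name `array_correlation` and the simulated data are hypothetical.

```python
import numpy as np


def array_correlation(expr):
    """Pearson correlation between arrays (columns) of a genes x time-points matrix.

    Hypothetical helper for illustration; genes with any missing value are
    dropped so that every pairwise correlation uses the same set of genes.
    """
    expr = np.asarray(expr, dtype=float)
    complete = ~np.isnan(expr).any(axis=1)
    # rowvar=False treats each column (array / time point) as a variable.
    return np.corrcoef(expr[complete], rowvar=False)


# Synthetic stand-in for a periodic time course (an assumption, not real data):
# each gene follows a sine wave with its own phase, plus measurement noise.
rng = np.random.default_rng(0)
n_genes, n_times = 200, 8
phase = rng.uniform(0.0, 2.0 * np.pi, size=(n_genes, 1))
t = np.linspace(0.0, 2.0 * np.pi, n_times, endpoint=False)
log_ratios = np.sin(t[None, :] + phase) + 0.1 * rng.normal(size=(n_genes, n_times))

corr = array_correlation(log_ratios)  # n_times x n_times matrix
```

The resulting square matrix can be passed to any image-plotting routine (for example matplotlib's `imshow`) to draw the heat map. In this synthetic example, neighbouring time points correlate more strongly than time points half a period apart, which is the kind of transcriptome-wide pattern such a plot is meant to expose.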