The statistical analysis of spatially clustered genes under the maximum gap criterion.

journal contribution
posted on 2005-10-01, authored by Rose Hoberman, David Sankoff, Dannie Durand

Statistical validation of gene clusters is imperative for many important applications in comparative genomics which depend on the identification of genomic regions that are historically and/or functionally related. We develop the first rigorous statistical treatment of max-gap clusters, a cluster definition frequently used in empirical studies. We present exact expressions for the probability of observing an individual cluster of a set of marked genes in one genome, as well as upper and lower bounds on the probability of observing a cluster of h homologs in a pairwise whole-genome comparison. We demonstrate the utility of our approach by applying it to a whole-genome comparison of E. coli and B. subtilis. Code for statistical tests is available at.
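To make the max-gap criterion concrete, here is a minimal sketch of the cluster definition for a single genome. It is an illustration only, not the authors' statistical test code. It assumes genes are indexed by integer position along one chromosome, that the gap between two marked genes is the number of unmarked genes lying between them, and that a set of marked genes forms a max-gap cluster when no gap between adjacent marked genes exceeds a threshold g. The function names and the parameter g are hypothetical.

```python
def is_max_gap_cluster(positions, g):
    """Return True if the marked genes at `positions` form a single
    max-gap cluster, i.e. every gap between adjacent marked genes is <= g.

    Assumption: positions are integer gene indices on one chromosome, and
    a gap is the count of intervening unmarked genes.
    """
    ordered = sorted(set(positions))
    return all(b - a - 1 <= g for a, b in zip(ordered, ordered[1:]))


def max_gap_clusters(positions, g):
    """Split marked gene positions into maximal runs whose adjacent gaps
    are all <= g. Each run is returned as a sorted list of positions."""
    clusters = []
    for p in sorted(set(positions)):
        # Extend the current run if the gap to its last gene is small enough.
        if clusters and p - clusters[-1][-1] - 1 <= g:
            clusters[-1].append(p)
        else:
            clusters.append([p])
    return clusters


if __name__ == "__main__":
    marked = [1, 2, 5, 10, 11]
    print(is_max_gap_cluster(marked, 2))
    print(max_gap_clusters(marked, 2))
```

This covers only the definition in one genome. The pairwise whole-genome case described in the abstract additionally requires the homologs to satisfy the gap constraint in both genomes, and the probability calculations themselves are not reproduced here.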

History

Publisher Statement

This is a copy of an article published in the Journal of Computational Biology © 2005 Mary Ann Liebert, Inc.; Journal of Computational Biology is available online at: http://online.liebertpub.com.

Date

2005-10-01