The statistical analysis of spatially clustered genes under the maximum gap criterion.

Hoberman, Rose; Sankoff, David; Durand, Dannie

doi:10.1184/R1/6104504.v1

biology-1036.pdf (147.68 kB)

journal contribution

Posted on 2005-10-01. Authored by Rose Hoberman, David Sankoff, Dannie Durand.

Statistical validation of gene clusters is imperative for many important applications in comparative genomics which depend on the identification of genomic regions that are historically and/or functionally related. We develop the first rigorous statistical treatment of max-gap clusters, a cluster definition frequently used in empirical studies. We present exact expressions for the probability of observing an individual cluster of a set of marked genes in one genome, as well as upper and lower bounds on the probability of observing a cluster of h homologs in a pairwise whole-genome comparison. We demonstrate the utility of our approach by applying it to a whole-genome comparison of E. coli and B. subtilis. Code for statistical tests is available at.
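To make the cluster definition concrete, the following is a minimal sketch, not the authors' code. It assumes a max-gap cluster is a maximal run of marked genes in which adjacent marked genes are separated by at most `max_gap` unmarked genes, and it assumes genes are indexed by integer position along a single linear chromosome. The function names and parameters are hypothetical. The second function is a plain Monte Carlo estimate under uniformly random gene placement; it stands in for, and is not, the exact expressions described in the abstract.

```python
import random


def max_gap_clusters(marked_positions, max_gap):
    """Split marked gene positions into maximal max-gap clusters.

    Assumption: the gap between two marked genes is the number of
    unmarked genes strictly between them (pos - prev - 1).
    """
    clusters = []
    for pos in sorted(set(marked_positions)):
        # Extend the current cluster if the gap to its last gene is small enough.
        if clusters and pos - clusters[-1][-1] - 1 <= max_gap:
            clusters[-1].append(pos)
        else:
            clusters.append([pos])
    return clusters


def estimate_single_cluster_probability(n_genes, n_marked, max_gap,
                                        trials=10_000, seed=None):
    """Monte Carlo estimate of the chance that all marked genes form one
    max-gap cluster when placed uniformly at random among n_genes slots.

    This is a simulation sketch only, not an exact calculation.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        placement = rng.sample(range(n_genes), n_marked)
        if len(max_gap_clusters(placement, max_gap)) == 1:
            hits += 1
    return hits / trials
```

For example, `max_gap_clusters([2, 3, 6, 12, 13], max_gap=2)` returns `[[2, 3, 6], [12, 13]]`, because five unmarked genes lie between positions 6 and 12. The simulation is useful mainly as a sanity check on small inputs, since its accuracy depends on the number of trials.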

Date

2005-10-01

Keywords

Bacillus subtilis; Chromosome Mapping; Cluster Analysis; Computational Biology; Data Interpretation, Statistical; Escherichia coli; Evolution, Molecular; Genes; Genome; Genomics; Models, Genetic; Multigene Family; Probability; Sequence Alignment; Sequence Analysis, DNA

Licence

In Copyright
