Bioinformatics Advance Access originally published online on August 16, 2005
Bioinformatics 2005 21(20):3832-3839; doi:10.1093/bioinformatics/bti628
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Combining phylogenetic motif discovery and motif clustering to predict co-regulated genes


1Department of Statistics, The Wharton School, University of Pennsylvania PA, USA
2Department of Statistics, Harvard University Cambridge, MA, USA
*To whom correspondence should be addressed.
Motivation: We present a sequence-based framework and algorithm PHYLOCLUS for predicting co-regulated genes. In our approach, de novo discovery methods are used to find motifs conserved by evolution and then a Bayesian hierarchical clustering model is used to cluster these motifs, thereby grouping together genes that are putatively co-regulated. Our clustering procedure allows both the number of clusters and the motif width within each cluster to be unknown.
Results: We use our framework to predict co-regulated genes in the bacterium Bacillus subtilis using six other closely related bacterial species. Our predicted motifs and gene clusters are validated using several external sources and significant clusters are examined in detail. An extension to the discovery and clustering of two-block motifs can be used for inference about synergistic binding relationships between transcription factors.
Availability: Software and Supplementary Materials can be downloaded at http://stat.wharton.upenn.edu/~stjensen/research/phyloclus.html or http://www.fas.harvard.edu/~junliu/phyloclus.html
Contact: stjensen{at}wharton.upenn.edu
Received on April 29, 2005; revised on July 22, 2005; accepted on August 11, 2005
This article has been cited by other articles:
![]() |
J. Liu, X. Xu, and G. D. Stormo The cis-regulatory map of Shewanella genomes Nucleic Acids Res., September 1, 2008; 36(16): 5376 - 5390. [Abstract] [Full Text] [PDF] |
||||
![]() |
L. A. Newberg, W. A. Thompson, S. Conlan, T. M. Smith, L. A. McCue, and C. E. Lawrence A phylogenetic Gibbs sampler that yields centroid solutions for cis-regulatory site prediction Bioinformatics, July 15, 2007; 23(14): 1718 - 1727. [Abstract] [Full Text] [PDF] |
||||
![]() |
K. A. Romer, G.-R. Kayombya, and E. Fraenkel WebMOTIFS: automated discovery, filtering and scoring of DNA sequence motifs using multiple programs and Bayesian approaches Nucleic Acids Res., July 13, 2007; 35(suppl_2): W217 - W220. [Abstract] [Full Text] [PDF] |
||||
![]() |
Z. S. Qin Clustering microarray gene expression data using weighted Chinese restaurant process Bioinformatics, August 15, 2006; 22(16): 1988 - 1997. [Abstract] [Full Text] [PDF] |
||||
![]() |
Z. Wei and S. T. Jensen GAME: detecting cis-regulatory elements using a genetic algorithm Bioinformatics, July 1, 2006; 22(13): 1577 - 1584. [Abstract] [Full Text] [PDF] |
||||

