Bioinformatics Advance Access published online on August 16, 2005
Bioinformatics, doi:10.1093/bioinformatics/bti628
1 Department of Statistics, The Wharton School, University of Pennsylvania
* To whom correspondence should be addressed.
Motivation: We present a sequence-based framework and algorithm, PHYLOCLUS, for predicting co-regulated genes. In our approach, de novo discovery methods are used to find motifs conserved by evolution, and a Bayesian hierarchical clustering model is then used to cluster these motifs, thereby grouping together genes that are putatively co-regulated. Our clustering procedure allows both the number of clusters and the motif width within each cluster to be unknown.
Results: We use our framework to predict co-regulated genes in the bacterium Bacillus subtilis using six other closely related bacterial species. Our predicted motifs and gene clusters are validated using several external sources, and significant clusters are examined in detail. An extension to the discovery and clustering of two-block motifs can be used for inference about synergistic binding relationships between transcription factors.
Availability: Software and supplementary materials can be downloaded at http://stat.wharton.upenn.edu/~stjensen/research/phyloclus.html or http://www.fas.harvard.edu/~junliu/phyloclus.html.
Received April 29, 2005
Revised July 22, 2005
Accepted August 11, 2005
Article
Combining phylogenetic motif discovery and motif clustering to predict co-regulated genes
2 Department of Statistics, Harvard University
Shane T. Jensen, E-mail: stjensen{at}wharton.upenn.edu
*These authors contributed equally to this work.

