Bioinformatics Advance Access originally published online on May 8, 2007
Bioinformatics 2007 23(14):1718-1727; doi:10.1093/bioinformatics/btm241
A phylogenetic Gibbs sampler that yields centroid solutions for cis-regulatory site prediction


1The Wadsworth Center, New York State Department of Health, Albany, NY 12201, 2Department of Computer Science, Rensselaer Polytechnic Institute, Troy, NY 12180, 3Center for Computational Molecular Biology and the Division of Applied Mathematics, Brown University, Providence, RI 02912 and 4Pacific Northwest National Laboratory, Richland, WA 99352, USA
*To whom correspondence should be addressed.
| Abstract |
|---|
Motivation: Identification of functionally conserved regulatory elements in sequence data from closely related organisms is becoming feasible, due to the rapid growth of public sequence databases. Closely related organisms are most likely to have common regulatory motifs; however, the recent speciation of such organisms results in a high degree of correlation in their genome sequences, confounding the detection of functional elements. Additionally, alignment algorithms that use optimization techniques are limited to the detection of a single alignment that may not be representative. Comparative-genomics studies must be able to address the phylogenetic correlation in the data and efficiently explore the alignment space, in order to make specific and biologically relevant predictions.
Results: We describe here a Gibbs sampler that employs a full phylogenetic model and reports an ensemble centroid solution. We describe regulatory motif detection using both simulated and real data, and demonstrate that this approach achieves improved specificity, sensitivity, and positive predictive value over non-phylogenetic algorithms, and over phylogenetic algorithms that report a maximum likelihood solution.
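As a rough illustration of what an ensemble centroid solution means, the sketch below computes a centroid from a handful of sampled solutions. It is not the authors' implementation. It assumes, for illustration only, that each sample is a binary vector marking which sequence positions are predicted sites, and that the centroid is the unconstrained binary vector with the smallest total Hamming distance to all samples. Under that assumption the centroid keeps exactly the positions marked in more than half of the samples. The names `centroid`, `total_hamming` and the toy `samples` are hypothetical.

```python
from typing import List, Sequence


def centroid(samples: Sequence[Sequence[int]]) -> List[int]:
    """Binary vector minimizing the total Hamming distance to all samples.

    Assumption: each sample is a 0/1 vector over sequence positions,
    with 1 meaning "this position is a predicted site" in that sample.
    """
    if not samples:
        raise ValueError("need at least one sample")
    n = len(samples)
    length = len(samples[0])
    if any(len(s) != length for s in samples):
        raise ValueError("all samples must have the same length")
    # Count how often each position is marked as a site across the samples.
    counts = [sum(s[i] for s in samples) for i in range(length)]
    # Keep a position only if it is marked in more than half of the samples.
    return [1 if 2 * c > n else 0 for c in counts]


def total_hamming(candidate: Sequence[int], samples: Sequence[Sequence[int]]) -> int:
    """Sum of Hamming distances from the candidate to every sample."""
    return sum(sum(a != b for a, b in zip(candidate, s)) for s in samples)


# Hypothetical toy samples (made-up data, five positions, four samples).
samples = [
    [0, 1, 1, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 1, 0, 0, 1],
    [1, 1, 1, 0, 0],
]

best = centroid(samples)
print(best, total_hamming(best, samples))
```

In this toy case positions 1 and 2 are marked in most samples and are retained, while positions marked only once are dropped, which is how a centroid differs from reporting the single highest-scoring sample.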
Availability: The software is freely available at http://bayesweb.wadsworth.org/gibbs/gibbs.html
Contact: William_Thompson_1{at}brown.edu
Supplementary information: Supplementary data are available at Bioinformatics online.
Associate Editor: Dmitrij Frishman
The authors wish it to be known that, in their opinion, the first two authors should be regarded as joint First Authors.
Received on January 10, 2007; revised on April 27, 2007; accepted on April 28, 2007
