Bioinformatics Advance Access originally published online on February 12, 2004
Bioinformatics 20(10) © Oxford University Press 2004; all rights reserved.
BioOptimizer: a Bayesian scoring function approach to motif discovery
Department of Statistics, Harvard University, Cambridge, MA 02138-2901, USA
Received on September 21, 2003; revised on December 10, 2003; accepted on January 3, 2004
Advance Access Publication February 12, 2004
Motivation: Transcription factors (TFs) bind directly to short segments on the genome, often within hundreds to thousands of base pairs upstream of gene transcription start sites, to regulate gene expression. The experimental determination of TF binding sites is expensive and time-consuming. Many motif-finding programs have been developed, but no program is clearly superior in all situations. Practitioners often find it difficult to judge which of the motifs predicted by these algorithms are more likely to be biologically relevant.
Results: We derive a comprehensive scoring function based on a full Bayesian model that can handle unknown site abundance, unknown motif width and two-block motifs with variable-length gaps. An algorithm called BioOptimizer is proposed to optimize this scoring function so as to reduce noise in the motif signal found by any motif-finding program. The accuracy of BioOptimizer, which can be used in conjunction with several existing programs, is shown to be superior to using any of these motif-finding programs alone when evaluated by both simulation studies and application to sets of co-regulated genes in bacteria. In addition, this scoring function formulation enables us to objectively compare different predicted motifs and select the optimal ones, effectively combining the strengths of existing programs.
Availability: BioOptimizer is available for download at www.fas.harvard.edu/~junliu/BioOptimizer/
Contact: jensen{at}stat.harvard.edu
* To whom correspondence should be addressed.
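The idea in the Results paragraph, taking the sites reported by a motif finder and adjusting them to improve a scoring function, can be illustrated with a minimal sketch. This is not the BioOptimizer scoring function, which the abstract does not spell out. It assumes a simplified log-odds score with a fixed motif width, at most one site per sequence, and a hypothetical per-site penalty (`site_penalty`) standing in for a prior on site abundance. The names `motif_score` and `refine` are made up for this illustration.

```python
import math
from collections import Counter

ALPHABET = "ACGT"


def motif_score(seqs, sites, width, site_penalty=2.0, pseudo=0.5):
    """Simplified log-odds score of aligned sites against the background.

    `sites` maps a sequence index to the start position of its site.
    `site_penalty` is an assumed stand-in for a prior on site abundance;
    it is not taken from the paper.
    """
    if not sites:
        return 0.0
    # Background base frequencies estimated from all input sequences.
    bg_counts = Counter("".join(seqs))
    total = sum(bg_counts[b] for b in ALPHABET)
    bg = {b: bg_counts[b] / total for b in ALPHABET}
    n = len(sites)
    score = -site_penalty * n
    for j in range(width):
        # Base counts in column j of the current site alignment.
        col = Counter(seqs[i][p + j] for i, p in sites.items())
        for b in ALPHABET:
            if col[b]:
                freq = (col[b] + pseudo) / (n + 4 * pseudo)
                score += col[b] * math.log(freq / bg[b])
    return score


def refine(seqs, sites, width, site_penalty=2.0, max_sweeps=20):
    """Greedy coordinate-wise search starting from predicted sites.

    For each sequence in turn, try dropping its site or moving it to any
    valid position, and keep a change only if the score strictly improves.
    """
    sites = dict(sites)
    best = motif_score(seqs, sites, width, site_penalty)
    for _ in range(max_sweeps):
        improved = False
        for i, seq in enumerate(seqs):
            candidates = [None] + list(range(len(seq) - width + 1))
            for pos in candidates:
                trial = dict(sites)
                if pos is None:
                    trial.pop(i, None)  # option: no site in this sequence
                else:
                    trial[i] = pos
                s = motif_score(seqs, trial, width, site_penalty)
                if s > best + 1e-12:
                    best, sites, improved = s, trial, True
        if not improved:
            break  # local optimum reached
    return sites, best


if __name__ == "__main__":
    # Toy input: hypothetical sequences and slightly misplaced initial sites.
    toy = ["AAAATTGACAAAA", "CCTTGACACCCCC", "GGGGGGTTGACAG", "ACGTACGTACGTA"]
    start = {0: 3, 1: 1, 2: 5}
    print(refine(toy, start, width=5))
```

Because a change is accepted only when it raises the score, the refined score is never lower than the score of the starting prediction. The search can still stop at a local optimum, so the starting sites supplied by the motif finder matter. The same score can also be evaluated on the outputs of several programs to rank their predictions against one another.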