Bioinformatics Advance Access originally published online on January 18, 2007
Bioinformatics 2007 23(5):531-537; doi:10.1093/bioinformatics/btl662
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Computing exact P-values for DNA motifs
1State Key Laboratory of Intelligent Technology & System, Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China, 2Bioinformatics Division, TNLIST and Department of Antomation, Tsinghua University, Beijing 100084, China, 3School of Computer Science, University of Waterloo, Waterloo, Ontario N2L 3G1, Canada, 4CWI, P.O. Box 94079, 1090 GB Amsterdam, The Netherlands and 5Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11274, USA
*To whom correspondence should be addressed.
| Abstract |
|---|
Motivation: Many heuristic algorithms have been designed to approximate P-values of DNA motifs described by position weight matrices, for evaluating their statistical significance. They often significantly deviate from the true P-value by orders of magnitude. Exact P-value computation is needed for ranking the motifs. Furthermore, surprisingly, the complexity of the problem is unknown.
Results: We show the problem to be NP-hard, and present MotifRank, software based on dynamic programming, to calculate exact P-values of motifs. We define the exact P-value on a general and more precise model. Asymptotically, MotifRank is faster than the best exact P-value computing algorithm, and is in fact practical. Our experiments clearly demonstrate that MotifRank significantly improves the accuracy of existing approximation algorithms.
Availability: MotifRank is available from http://bio.dlg.cn
Contact: mzhang{at}cshl.edu mli{at}uwaterloo.ca
Supplementary information: Supplementary data are available at Bioinformatics online.
Received on July 7, 2006; revised on December 1, 2006; accepted on December 22, 2006
This article has been cited by other articles:
![]() |
U. J. Pape, S. Rahmann, and M. Vingron Natural similarity measures between position frequency matrices with an application to clustering Bioinformatics, February 1, 2008; 24(3): 350 - 357. [Abstract] [Full Text] [PDF] |
||||
