Bioinformatics Vol. 19 Suppl. 2 2003
pages ii206-ii214
© 2003 Oxford University Press

Finding optimal degenerate patterns in DNA sequences

Daisuke Shinozaki 1, Tatsuya Akutsu 2 and Osamu Maruyama 3,*

1 Graduate School of Mathematics, Kyushu University, Fukuoka 812-8581, Japan
2 Bioinformatics Center, Institute for Chemical Research, Kyoto University, Uji, Kyoto 611-0011, Japan
3 Faculty of Mathematics, Kyushu University, Fukuoka 812-8581, Japan

Received on March 17, 2003; accepted on June 9, 2003

Motivation: Finding transcription factor binding sites in the upstream regions of given genes is an algorithmically interesting and challenging problem in computational biology. A degenerate pattern over a finite alphabet Σ is a sequence of subsets of Σ. A string over the IUPAC nucleic acid codes is a degenerate pattern over Σ = {A, C, G, T}, and such strings are one of the major pattern classes used to model transcription factor binding sites in the upstream regions of genes. However, the problem of finding a degenerate pattern consistent with both positive and negative string sets is known to be NP-complete in general. Our aim is to devise a heuristic algorithm that finds a degenerate pattern that is optimal for the positive and negative string sets with respect to a given score function.
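
To make the definition concrete, the following minimal Python sketch represents a degenerate pattern as a list of subsets of Σ = {A, C, G, T} and tests whether it occurs in a string. It is an illustration only, not the SUPERPOSITION script: the names `to_degenerate`, `occurs_in` and `is_consistent` are hypothetical, and "consistent" is assumed here to mean that the pattern occurs in every positive string and in no negative string.

```python
# Standard IUPAC nucleic acid codes, each mapped to the subset of
# {A, C, G, T} it denotes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def to_degenerate(iupac_string):
    """Turn an IUPAC string into a degenerate pattern (a list of subsets)."""
    return [frozenset(IUPAC[code]) for code in iupac_string]


def occurs_in(pattern, text):
    """Return True if some window of `text` matches `pattern` position by position."""
    m = len(pattern)
    return any(
        all(text[i + j] in pattern[j] for j in range(m))
        for i in range(len(text) - m + 1)
    )


def is_consistent(pattern, positives, negatives):
    """Assumed notion of consistency: occurs in all positives and in no negative."""
    return (all(occurs_in(pattern, s) for s in positives)
            and not any(occurs_in(pattern, s) for s in negatives))


pattern = to_degenerate("TGAYW")  # Y = {C, T}, W = {A, T}
print(occurs_in(pattern, "CCTGACTGG"))
print(is_consistent(pattern, ["CCTGACTGG", "TGATAAA"], ["CCTGAGTGG"]))
```

This naive check takes time proportional to the pattern length times the string length for each string; it verifies a given pattern but does not address the harder problem of searching for one.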

Results: We have proposed an enumerative algorithm with a pruning technique, called SUPERPOSITION, for finding optimal degenerate patterns; it works with almost all reasonable score functions. The performance score of the algorithm has been compared with those of the popular motif-finding algorithms YMF, MEME and AlignACE on various sets of co-regulated yeast genes. In the computational experiments, SUPERPOSITION outperformed the others on several gene sets.

Availability: The Python script SUPERPOSITION is available at http://www.math.kyushu-u.ac.jp/~om/softwares.html

Contact: om{at}math.kyushu-u.ac.jp

* To whom correspondence should be addressed.

