Bioinformatics Vol. 19 no. 5 2003
Pages 607-617
© 2003 Oxford University Press

Greedy mixture learning for multiple motif discovery in biological sequences

Konstantinos Blekas *, Dimitrios I. Fotiadis and Aristidis Likas

Department of Computer Science, University of Ioannina, 45110 Ioannina, Greece and Biomedical Research Institute, Foundation for Research and Technology, Hellas, 45110 Ioannina, Greece

Received on January 24, 2002; revised on April 20, 2002 and June 20, 2002; accepted on October 7, 2002

Motivation: This paper studies the problem of discovering subsequences, known as motifs, that are common to a given collection of related biosequences. It proposes a greedy algorithm that learns a mixture-of-motifs model through likelihood maximization. The approach sequentially adds a new motif to the mixture model, using a combined scheme of global and local search to initialize its parameters appropriately. In addition, a hierarchical partitioning scheme based on kd-trees is presented for partitioning the input dataset in order to speed up the global search procedure. The proposed method compares favorably with the well-known MEME approach and successfully addresses several of its drawbacks.

Results: Experimental results indicate that the algorithm is advantageous in identifying larger groups of motifs characteristic of biological families with significant conservation. In addition, it offers better diagnostic capabilities by building more powerful statistical motif-models with improved classification accuracy.

Availability: Source code in Matlab is available at http://www.cs.uoi.gr/~kblekas/greedy/GreedyEM.html

Contact: kblekas{at}cs.uoi.gr

* To whom correspondence should be addressed.
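
The greedy scheme summarized in the Motivation paragraph can be illustrated with a minimal Python sketch. It is not the authors' Matlab implementation, and every name in it (`windows`, `em`, `greedy_motifs`, `init_weight`, `smooth`) is hypothetical. It assumes DNA sequences over the alphabet ACGT, a fixed motif width, motif components modeled as position weight matrices, and a fixed background component. The global search here simply tries every distinct window as a seed for the new component; the kd-tree partitioning that the paper uses to speed up this step is omitted. The local search is a plain EM loop with pseudocounts.

```python
import numpy as np

ALPHABET = "ACGT"  # assumption: DNA input; the paper treats biosequences generally

def windows(seqs, w):
    """Return every length-w substring and its one-hot encoding (n, w, 4)."""
    idx = {c: i for i, c in enumerate(ALPHABET)}
    subs = [s[i:i + w] for s in seqs for i in range(len(s) - w + 1)]
    x = np.zeros((len(subs), w, len(ALPHABET)))
    for n, sub in enumerate(subs):
        for j, c in enumerate(sub):
            x[n, j, idx[c]] = 1.0
    return subs, x

def component_logs(x, weights, pwms):
    """Log of (mixing weight * window probability) for each component."""
    return np.stack([np.log(p) + np.einsum("nwa,wa->n", x, np.log(m))
                     for p, m in zip(weights, pwms)])

def log_lik(x, weights, pwms):
    """Log-likelihood of all windows under the mixture."""
    return np.logaddexp.reduce(component_logs(x, weights, pwms), axis=0).sum()

def em(x, weights, pwms, n_iter=30, pseudo=0.1):
    """Local search: EM updates; the background PWM (index 0) stays fixed."""
    weights, pwms = np.array(weights), [m.copy() for m in pwms]
    for _ in range(n_iter):
        comp = component_logs(x, weights, pwms)
        resp = np.exp(comp - np.logaddexp.reduce(comp, axis=0))  # E-step
        weights = np.maximum(resp.mean(axis=1), 1e-12)           # M-step
        weights = weights / weights.sum()
        for k in range(1, len(pwms)):
            counts = np.einsum("n,nwa->wa", resp[k], x) + pseudo
            pwms[k] = counts / counts.sum(axis=1, keepdims=True)
    return weights, pwms

def greedy_motifs(seqs, w, n_motifs, init_weight=0.05, smooth=0.7):
    """Add motif components one at a time to a background-only mixture."""
    subs, x = windows(seqs, w)
    letters = x.sum(axis=(0, 1))
    freq = (letters + 1.0) / (letters.sum() + len(ALPHABET))
    weights, pwms = np.array([1.0]), [np.tile(freq, (w, 1))]
    seeds = {s: x[n] for n, s in enumerate(subs)}  # one seed per distinct window
    for _ in range(n_motifs):
        best_score, best_pwm = -np.inf, None
        cand_weights = np.append(weights * (1.0 - init_weight), init_weight)
        # Global search: score each seed as the initial PWM of the new component.
        for onehot in seeds.values():
            cand = smooth * onehot + (1.0 - smooth) * freq
            score = log_lik(x, cand_weights, pwms + [cand])
            if score > best_score:
                best_score, best_pwm = score, cand
        # Local search: refine the enlarged mixture with EM.
        weights, pwms = em(x, cand_weights, pwms + [best_pwm])
    return weights, pwms

def consensus(pwm):
    """Most probable letter at each motif position."""
    return "".join(ALPHABET[i] for i in pwm.argmax(axis=1))
```

Calling `greedy_motifs(seqs, w=6, n_motifs=2)` on a list of DNA strings returns the mixing weights and the list of position weight matrices, with the background at index 0; `consensus(pwms[k])` gives a readable summary of motif `k`. The exhaustive seed loop is quadratic in the number of windows, which is the cost the paper's partitioning scheme is meant to reduce.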




