

Bioinformatics Advance Access originally published online on December 15, 2005
Bioinformatics 2006 22(4):445-452; doi:10.1093/bioinformatics/btk008

© The Author 2005. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oxfordjournals.org

CMfinder—a covariance model based RNA motif finding algorithm

Zizhen Yao 1,*, Zasha Weinberg 1 and Walter L. Ruzzo 1,2

1Department of Computer Science and Engineering, University of Washington, Seattle, WA 98195-2350, USA
2Department of Genome Sciences, University of Washington, Seattle, WA 98195-2350, USA

*To whom correspondence should be addressed.

Motivation: The recent discoveries of large numbers of non-coding RNAs and computational advances in genome-scale RNA search create a need for tools for automatic, high quality identification and characterization of conserved RNA motifs that can be readily used for database search. Previous tools fall short of this goal.

Results: CMfinder is a new tool to predict RNA motifs in unaligned sequences. It is an expectation maximization algorithm using covariance models for motif description, featuring novel integration of multiple techniques for effective search of motif space, and a Bayesian framework that blends mutual information-based and folding energy-based approaches to predict structure in a principled way.

Extensive tests show that our method works well on datasets with either low or high sequence similarity, is robust to inclusion of lengthy extraneous flanking sequence and/or completely unrelated sequences, and is reasonably fast and scalable. In testing on 19 known ncRNA families, including some difficult cases with poor sequence conservation and large indels, our method demonstrates excellent average per-base-pair accuracy: 79%, compared with at most 60% for alternative methods. More importantly, the resulting probabilistic model can be directly used for homology search, allowing iterative refinement of structural models based on additional homologs. We have used this approach to obtain highly accurate covariance models of known RNA motifs based on small numbers of related sequences, which identified homologs in deeply diverged species.
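
As an illustration of the mutual information signal mentioned in the Results, the following is a minimal sketch of how covariation between two alignment columns can be scored. It is not CMfinder's implementation: the function name `column_mutual_information`, the skipping of gapped positions, and the use of plain empirical frequencies (no priors, no folding-energy term) are assumptions made for this example only.

```python
import math
from collections import Counter

NUCLEOTIDES = set("ACGU")

def column_mutual_information(col_i, col_j):
    """Mutual information (in bits) between two alignment columns.

    Each column is a string with one character per aligned sequence.
    Rows where either column holds a gap or non-ACGU symbol are skipped
    (an assumption of this sketch).
    """
    if len(col_i) != len(col_j):
        raise ValueError("columns must cover the same sequences")

    pairs = [
        (a, b)
        for a, b in zip(col_i.upper(), col_j.upper())
        if a in NUCLEOTIDES and b in NUCLEOTIDES
    ]
    n = len(pairs)
    if n == 0:
        return 0.0

    pair_counts = Counter(pairs)
    left_counts = Counter(a for a, _ in pairs)
    right_counts = Counter(b for _, b in pairs)

    mi = 0.0
    for (a, b), count in pair_counts.items():
        p_ab = count / n
        # p_ab / (p_a * p_b), written with raw counts
        ratio = count * n / (left_counts[a] * right_counts[b])
        mi += p_ab * math.log2(ratio)
    return mi


# Two columns that always form Watson-Crick pairs but vary across sequences
covarying = column_mutual_information("GCAU", "CGUA")
# Two perfectly conserved columns carry no covariation signal
conserved = column_mutual_information("GGGG", "CCCC")
```

Columns that change together while preserving base pairing score high, whereas fully conserved columns score zero. That gap is one reason a method might combine mutual information with folding energy, as the abstract describes.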

Availability: Results and web server version are available at http://bio.cs.washington.edu/yzizhen/CMfinder/

Contact: yzizhen{at}cs.washington.edu

Supplementary information: Supplementary technical details are available at http://bio.cs.washington.edu/yzizhen/CMfinder/


Received on June 9, 2005; revised on December 12, 2005; accepted on December 13, 2005



