Bioinformatics 2006 22(14):e90-e98; doi:10.1093/bioinformatics/btl246
© The Author 2006. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oxfordjournals.org
The online version of this article has been published under an open access model. Users are entitled to use, reproduce, disseminate, or display the open access version of this article for non-commercial purposes provided that: the original authorship is properly and fully attributed; the Journal and Oxford University Press are attributed as the original place of publication with the correct citation details given; if an article is subsequently reproduced or disseminated not in its entirety but only in part or as a derivative work this must be clearly indicated. For commercial re-use, please contact journals.permissions@oxfordjournals.org

CONTRAfold: RNA secondary structure prediction without physics-based models

Chuong B. Do 1,*, Daniel A. Woods 1 and Serafim Batzoglou 1

1 Computer Science Department, Stanford University, Stanford, CA 94305, USA

*To whom correspondence should be addressed.

Motivation: For several decades, free energy minimization methods have been the dominant strategy for single-sequence RNA secondary structure prediction. More recently, stochastic context-free grammars (SCFGs) have emerged as an alternative probabilistic methodology for modeling RNA structure. Unlike physics-based methods, which rely on thousands of experimentally measured thermodynamic parameters, SCFGs use fully automated statistical learning algorithms to derive model parameters. Despite this advantage, probabilistic methods have not replaced free energy minimization methods as the tool of choice for secondary structure prediction, as the accuracies of the best current SCFGs have yet to match those of the best physics-based models.

Results: In this paper, we present CONTRAfold, a novel secondary structure prediction method based on conditional log-linear models (CLLMs), a flexible class of probabilistic models that generalize SCFGs by using discriminative training and feature-rich scoring. In a series of cross-validation experiments, we show that grammar-based secondary structure prediction methods formulated as CLLMs consistently outperform their SCFG analogs. Furthermore, CONTRAfold, a CLLM incorporating most of the features found in typical thermodynamic models, achieves the highest single-sequence prediction accuracies to date, outperforming currently available probabilistic and physics-based techniques. Our result thus closes the gap between probabilistic and thermodynamic models, demonstrating that statistical learning procedures provide an effective alternative to empirical measurement of thermodynamic parameters for RNA secondary structure prediction.
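As a rough illustration of what a conditional log-linear model computes, the sketch below scores each candidate structure y for a sequence x by a weighted sum of feature counts and normalizes, so that P(y | x) is proportional to exp(w · F(x, y)). Everything concrete here is an assumption made for illustration: the feature names (`pair_CG`, `unpaired`), the weights, and the hand-listed candidate structures are hypothetical and are not CONTRAfold's feature set or parameters. The normalizer is also computed by brute-force enumeration over the listed candidates, which is only workable for toy inputs; it stands in for the dynamic programming a real predictor would need.

```python
import math
from collections import Counter

def features(seq, pairs):
    """Count toy features F(x, y): one per base-pair type, plus unpaired bases."""
    counts = Counter()
    paired = set()
    for i, j in pairs:
        # Name the feature by the two paired bases in sorted order, e.g. "pair_CG".
        counts["pair_" + "".join(sorted(seq[i] + seq[j]))] += 1
        paired.update((i, j))
    counts["unpaired"] = len(seq) - len(paired)
    return counts

def score(weights, feats):
    """Linear score w . F(x, y); features without a weight contribute zero."""
    return sum(weights.get(name, 0.0) * value for name, value in feats.items())

def conditional_probs(seq, candidates, weights):
    """P(y | x) over an explicit candidate list, via a numerically stable softmax."""
    scores = [score(weights, features(seq, c)) for c in candidates]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

# Hypothetical toy input: three nested candidate structures for a 7-base hairpin.
seq = "GGAAACC"
candidates = [(), ((0, 6),), ((0, 6), (1, 5))]
weights = {"pair_CG": 1.0, "unpaired": -0.1}  # illustrative values, not learned

probs = conditional_probs(seq, candidates, weights)
for cand, p in zip(candidates, probs):
    print(cand, round(p, 3))
```

Discriminative training, in this picture, would adjust `weights` to raise the conditional probability of known structures given their sequences, rather than fitting a joint model of sequence and structure as an SCFG does.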

Availability: Source code for CONTRAfold is available at http://contra.stanford.edu/contrafold/.

Contact: chuongdo@cs.stanford.edu



