Bioinformatics Advance Access published online on May 30, 2007
Bioinformatics, doi:10.1093/bioinformatics/btm272
RNA Sampler: A new sampling based algorithm for common RNA secondary structure prediction and structural alignment
Department of Genetics, Washington University, School of Medicine, St. Louis, MO 63110, USA.
*To whom correspondence should be addressed. Xing Xu, E-mail: stormo{at}genetics.wustl.edu, xingxu{at}genetics.wustl.edu, yji{at}genetics.wustl.edu
Abstract
Motivation: Non-coding RNA genes and RNA structural regulatory motifs play important roles in gene regulation and other cellular functions. They are often characterized by specific secondary structures that are critical to their functions and are often conserved in phylogenetically or functionally related sequences. Predicting common RNA secondary structures in multiple unaligned sequences remains a challenge in bioinformatics research.
Methods and Results: We present a new sampling-based algorithm to predict common RNA secondary structures in multiple unaligned sequences. For a pair of sequences, the algorithm finds common structures by probabilistically sampling aligned stems according to their conservation, which is calculated from intrasequence base pairing probabilities and intersequence base alignment probabilities. It then updates these probabilities from the sampled structures and recalculates stem conservation with the updated values, repeating until the sampled structures converge. We extend the algorithm to multiple sequences with a consistency-based method that iteratively incorporates and reinforces consistent structure information from pairwise comparisons into consensus structures. The algorithm places no restriction on the prediction of pseudoknots. In extensive testing on real sequence data, it outperformed other leading RNA structure prediction methods in both sensitivity and specificity while running at a reasonable speed. It also generated better structural alignments than other programs on sequences spanning a wide range of identities; these alignments more accurately represent the conservation of RNA secondary structure.
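The pairwise sampling step described above can be illustrated with a minimal Python sketch. It is not the RNA Sampler implementation (which is a C program) and the abstract does not give the scoring formula, so everything here is an assumption made for illustration: the function names `stem_conservation` and `sample_aligned_stem`, the dictionary-based probability tables, and the choice to average the product of pairing and alignment probabilities over the base pairs of a stem. The iterative probability update and the convergence test are omitted.

```python
import random


def stem_conservation(stem_a, stem_b, pair_a, pair_b, align):
    """Toy conservation score for a stem in sequence A aligned to a stem in B.

    stem_a, stem_b: equal-length lists of (i, j) base pairs.
    pair_a, pair_b: dicts mapping (i, j) -> intrasequence pairing probability.
    align: dict mapping (pos_in_a, pos_in_b) -> intersequence alignment probability.
    The product-then-average form is an assumption, not the published formula.
    """
    if len(stem_a) != len(stem_b) or not stem_a:
        return 0.0
    total = 0.0
    for (i, j), (k, l) in zip(stem_a, stem_b):
        # Both base pairs must be likely within their own sequence ...
        pairing = pair_a.get((i, j), 0.0) * pair_b.get((k, l), 0.0)
        # ... and both ends of the pair must be likely to align across sequences.
        aligned = align.get((i, k), 0.0) * align.get((j, l), 0.0)
        total += pairing * aligned
    return total / len(stem_a)


def sample_aligned_stem(candidates, scores, rng):
    """Draw one candidate with probability proportional to its score.

    Returns None when no candidate has a positive score.
    """
    if sum(scores) <= 0:
        return None
    return rng.choices(candidates, weights=scores, k=1)[0]


# Hypothetical input: one two-base-pair stem in each sequence.
pair_a = {(0, 9): 0.8, (1, 8): 0.6}
pair_b = {(2, 11): 0.5, (3, 10): 0.5}
align = {(0, 2): 1.0, (9, 11): 1.0, (1, 3): 0.5, (8, 10): 1.0}
score = stem_conservation([(0, 9), (1, 8)], [(2, 11), (3, 10)],
                          pair_a, pair_b, align)
picked = sample_aligned_stem(["stem_1", "stem_2"], [score, 0.0],
                             random.Random(0))
```

Passing an explicit `random.Random` instance keeps the sampling reproducible, which makes it easier to check whether repeated runs settle on the same stems.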
Availability: The algorithm is implemented in a C program, RNA Sampler, which is available at http://ural.wustl.edu/software.html
Supplementary information: Supplementary data are available at Bioinformatics online.
Associate Editor: Prof. John Quackenbush
1 Present address: Rosetta Inpharmatics LLC, a wholly owned subsidiary of Merck & Co., Inc., Seattle, WA 98109, USA.
Received on February 9, 2007; revised on May 10, 2007; accepted on May 13, 2007




