Bioinformatics Advance Access originally published online on April 25, 2007
Bioinformatics 2007 23(13):1588-1598; doi:10.1093/bioinformatics/btm146
Murlet: a practical multiple alignment tool for structural RNA sequences
¹Computational Biology Research Center, National Institute of Advanced Industrial Science and Technology (AIST), 2-42 Aomi, Koto-ku, Tokyo 135-0064, ²Graduate School of Information Sciences, Nara Institute of Science and Technology, 8916-5 Takayama-cho, Ikoma, Nara 630-0192 and ³Department of Computational Biology, Faculty of Frontier Science, The University of Tokyo, 5-1-5 Kashiwanoha, Kashiwa, Chiba 277-8561, Japan
*To whom correspondence should be addressed.
Abstract
Motivation: Structural RNA genes exhibit unique evolutionary patterns that are designed to conserve their secondary structures; these patterns should be taken into account while constructing accurate multiple alignments of RNA genes. The Sankoff algorithm is a natural alignment algorithm that includes the effect of base-pair covariation in the alignment model. However, the extremely high computational cost of the Sankoff algorithm precludes its application to most RNA sequences.
Results: We propose an efficient algorithm for the multiple alignment of structural RNA sequences. Our algorithm is a variant of the Sankoff algorithm, and it uses an efficient scoring system that considerably reduces the time and space requirements without compromising alignment quality. First, our algorithm computes the match probability matrix, which measures the alignability of each position pair between sequences, as well as the base pairing probability matrix for each sequence. These probabilities are then combined to score the alignment using the Sankoff algorithm. By itself, our algorithm does not predict the consensus secondary structure of the alignment but uses external programs for the prediction. We demonstrate that both the alignment quality and the accuracy of the consensus secondary structure prediction from our alignment are the highest among the programs examined. We also demonstrate that our algorithm can align relatively long RNA sequences such as the eukaryotic-type signal recognition particle RNA, which is approximately 300 nt in length; multiple alignment of such sequences has not been possible using other Sankoff-based algorithms. The algorithm is implemented in the software named Murlet.
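To illustrate how match probabilities and base pairing probabilities might be combined into a single alignment score, the following is a minimal Python sketch. It is not Murlet's scoring function: the function name `alignment_score`, the weight `alpha`, the sparse dictionary representation, and the product form of the structure term are all assumptions made for illustration. The sketch only scores a fixed pairwise alignment; it does not perform the Sankoff dynamic programming or select a nested consensus structure.

```python
from typing import Dict, List, Tuple

# Sparse probability matrix: (row, column) -> probability. Missing entries count as 0.
Matrix = Dict[Tuple[int, int], float]


def alignment_score(
    columns: List[Tuple[int, int]],
    match_prob: Matrix,
    bp_a: Matrix,
    bp_b: Matrix,
    alpha: float = 1.0,
) -> float:
    """Score a fixed pairwise alignment (hypothetical scoring, for illustration).

    columns    -- aligned position pairs (i, k): position i of sequence A is
                  aligned to position k of sequence B, in increasing order.
    match_prob -- match probability for each position pair (i, k).
    bp_a, bp_b -- base pairing probabilities within A and within B.
    alpha      -- assumed weight of the structure term relative to the match term.
    """
    # Sequence term: how alignable each matched position pair is.
    match_term = sum(match_prob.get((i, k), 0.0) for i, k in columns)

    # Structure term: reward two aligned columns when the corresponding
    # positions are likely to be base paired in both sequences.
    pair_term = 0.0
    for x, (i, k) in enumerate(columns):
        for j, l in columns[x + 1:]:
            pair_term += bp_a.get((i, j), 0.0) * bp_b.get((k, l), 0.0)

    return match_term + alpha * pair_term


# Toy input: three aligned columns, with one likely base pair (0, 2) in each sequence.
example_columns = [(0, 0), (1, 1), (2, 2)]
example_match = {(0, 0): 0.9, (1, 1): 0.8, (2, 2): 0.7}
example_bp_a = {(0, 2): 0.5}
example_bp_b = {(0, 2): 0.4}
example_score = alignment_score(example_columns, example_match, example_bp_a, example_bp_b)
```

In this toy setting, raising `alpha` increases the reward for alignments that place likely base pairs of both sequences in the same pair of columns, which is the intuition behind including base-pair covariation in the alignment model.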
Availability: The C++ source code of the Murlet software and the test dataset used in this study are available at http://www.ncrna.org/papers/Murlet/
Contact: kiryu-h{at}aist.go.jp
Supplementary information: Supplementary data are available at Bioinformatics online.
Associate Editor: Martin Bishop
Received on January 24, 2007; revised on March 19, 2007; accepted on April 10, 2007


