Bioinformatics Advance Access originally published online on January 18, 2005
Bioinformatics 2005 21(9):1815-1824; doi:10.1093/bioinformatics/bti279
Pairwise local structural alignment of RNA sequences with sequence similarity less than 40%
1 Center for Bioinformatics and Division of Genetics, IBHV, The Royal Veterinary and Agricultural University, Grønnegårdsvej 3, DK-1870 Frederiksberg C, Denmark
2 Department of Statistics, Oxford University, 1 South Parks Road, Oxford, OX1 3TG, UK
3 Department of Genetics, Washington University School of Medicine, Campus Box 8232, 4566 Scott Avenue, St Louis, MO 63110, USA
*To whom correspondence should be addressed.
Motivation: Searching for non-coding RNA (ncRNA) genes and structural RNA elements (eleRNA) is a major challenge in gene finding today, as these are often conserved in structure rather than in sequence. Even though the number of available methods is growing, it remains of interest to detect, by pairwise comparison, two genes with low sequence similarity when the genes are part of a larger genomic region.
Results: Here we present such an approach for pairwise local alignment, based on FOLDALIGN and the Sankoff algorithm for simultaneous structural alignment of multiple sequences. We include the ability to conduct mutual scans of two sequences of arbitrary length while searching for common local structural motifs of some maximum length, which drastically reduces the complexity of the algorithm. The scoring scheme includes structural parameters corresponding to those available for free energy, as well as substitution matrices similar to RIBOSUM. The new FOLDALIGN implementation is tested on a dataset where the ncRNAs and eleRNAs have sequence similarity <40% and are energetically indistinguishable from the surrounding genomic sequence context. The method is tested in two ways: (1) for its ability to find the common structure between the genes only and (2) for its ability to locate ncRNAs and eleRNAs in a genomic context. In case (1), a comparison with methods such as Dynalign is appropriate: the performances are very similar, but FOLDALIGN is substantially faster. The structure prediction performance for a family is typically around 0.7, as measured by the Matthews correlation coefficient. In case (2), the algorithm is successful at locating RNA families, with an average sensitivity of 0.8 and a positive predictive value of 0.9 using a BLAST-like hit selection scheme.
Availability: The program is available online at http://foldalign.kvl.dk/
Contact: gorodkin{at}bioinf.kvl.dk
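The evaluation measures quoted in the Results (sensitivity, positive predictive value and Matthews correlation coefficient) can be illustrated with a minimal Python sketch. It is not taken from FOLDALIGN. The function name `pair_metrics`, the representation of a structure as a set of base-pair index tuples, and the convention that every position pair i < j counts as a candidate (used only to obtain the true negatives) are assumptions made for illustration; the counting scheme used in the article itself may differ.

```python
from math import sqrt


def pair_metrics(predicted, reference, seq_len):
    """Return (sensitivity, PPV, MCC) for predicted vs. reference base pairs.

    predicted, reference: sets of (i, j) tuples with i < j (hypothetical format).
    seq_len: sequence length. Assumption of this sketch: all i < j pairs are
    candidates, which fixes the number of true negatives.
    """
    tp = len(predicted & reference)   # pairs found in both structures
    fp = len(predicted - reference)   # predicted but not in the reference
    fn = len(reference - predicted)   # reference pairs that were missed
    total = seq_len * (seq_len - 1) // 2
    tn = total - tp - fp - fn

    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    ppv = tp / (tp + fp) if tp + fp else 0.0
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return sensitivity, ppv, mcc


# Toy example: a 4-pair reference stem, of which 3 pairs are predicted correctly.
reference = {(1, 20), (2, 19), (3, 18), (4, 17)}
predicted = {(1, 20), (2, 19), (3, 18), (5, 16)}
sens, ppv, mcc = pair_metrics(predicted, reference, seq_len=20)
print(f"sensitivity={sens:.2f} PPV={ppv:.2f} MCC={mcc:.2f}")
```

Because true negatives vastly outnumber the other counts for base-pair data, the MCC computed this way lies close to the geometric mean of sensitivity and PPV.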