Bioinformatics Advance Access originally published online on November 2, 2005
Bioinformatics 2006 22(1):35-39; doi:10.1093/bioinformatics/bti743
Sequence-based heuristics for faster annotation of non-coding RNA families
1Department of Computer Science & Engineering, Seattle, WA 98195, USA
2Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA
*To whom correspondence should be addressed.
Motivation: Non-coding RNAs (ncRNAs) are functional RNA molecules that do not code for proteins. Covariance Models (CMs) are a useful statistical tool for finding new members of an ncRNA gene family in a large genome database, using both sequence and, importantly, RNA secondary structure information. Unfortunately, CM searches are extremely slow. Previously, we created rigorous filters, which provably sacrifice none of a CM's accuracy while making searches significantly faster for virtually all ncRNA families. However, searches with these rigorous filters are still slower than searches with heuristic filters could be.
Results: In this paper we introduce profile HMM-based heuristic filters. We show that their accuracy is usually superior to that of heuristics based on BLAST. Moreover, we compared our heuristics with those used in tRNAscan-SE, whose heuristics incorporate a significant amount of work specific to tRNAs, whereas our heuristics are generic to any ncRNA. Performance was roughly comparable, so we expect that our heuristics provide a high-quality solution that, unlike family-specific solutions, can scale to hundreds of ncRNA families.
Availability: The source code is available under GNU Public License at the supplementary web site.
Contact: zasha{at}cs.washington.edu
Supplementary information: http://bio.cs.washington.edu/supplements/zasha-HeurHmm-2004/ (Technical details, results, C++ code)
Received on December 13, 2004; revised on October 13, 2005; accepted on October 22, 2005








