Bioinformatics Advance Access published online on July 12, 2006
Bioinformatics, doi:10.1093/bioinformatics/btl368
1 Computational Biology Unit, Bergen Centre for Computational Sciences, University of Bergen
* To whom correspondence should be addressed.
Motivation: Repeat sequences in ESTs are a source of problems, in particular for clustering. ESTs are therefore commonly masked against a library of known repeats. High quality repeat libraries are available for the widely studied organisms, but for most other organisms the lack of such libraries is likely to compromise the quality of EST analysis.
Results: We present a fast, flexible, and library-less method for masking repeats in EST sequences, based on match statistics within the EST collection. The method is not linked to a particular clustering algorithm. Extensive testing on data sets using different clustering methods and a genomic mapping as reference shows that this method gives results that are better than or as good as those obtained using RepeatMasker with a repeat library.
Availability: The implementation of RBR is available under the terms of the GPL from http://www.ii.uib.no/~ketil/bioinformatics.
Received April 19, 2006
Revised June 29, 2006
Accepted July 3, 2006
Article
RBR: library-less repeat detection for ESTs
Ketil Malde 1 *, Korbinian Schneeberger 2, Eivind Coward 3, and Inge Jonassen 4
2 Genome-Oriented Bioinformatics, Wissenschaftszentrum Weihenstephan, Technische Universität München
3 Department of Informatics, University of Bergen
4 Computational Biology Unit, Bergen Centre for Computational Sciences, University of Bergen; Department of Informatics, University of Bergen
Ketil Malde, E-mail: ketil.malde{at}bccs.uib.no
Abstract
Associate Editor: Alex Bateman