Bioinformatics Vol. 19 no. 10 2003
Pages 1221-1226
© 2003 Oxford University Press
Fast sequence clustering using a suffix array algorithm
Department of Informatics, University of Bergen, HIB, N5020 Norway
Received on September 19, 2002; revised on November 15, 2002 and January 16, 2003; accepted on January 21, 2003
Motivation: Efficient clustering is important for handling the large number of available EST sequences. Most contemporary methods are based on some kind of all-against-all comparison, which results in quadratic time complexity. A different approach is needed to keep up with the rapid growth of EST data.
Results: A new, fast EST clustering algorithm is presented. Sub-quadratic time complexity is achieved by using an algorithm based on suffix arrays. A prototype implementation has been developed and run on a benchmark data set. The resulting clusterings are validated by comparing them to clusterings produced by other methods, and the results are promising.
Availability: The source code for the prototype implementation is available under a GPL license from http://www.ii.uib.no/~ketil/bio/
Contact: ketil{at}ii.uib.no
* To whom correspondence should be addressed.
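To make the suffix-array idea in the Results concrete, the following is a minimal illustrative sketch, not the authors' prototype. It assumes that two sequences belong to the same cluster when they share an exact substring of at least `k` characters; this criterion, the function names, the `#` separator, and the parameter `k` are all assumptions made for illustration. In a sorted suffix array, all suffixes that begin with the same `k` characters sit next to each other, so comparing only adjacent entries is enough to link every pair that shares such a substring, without an all-against-all comparison.

```python
from typing import Dict, List


def build_suffix_array(text: str) -> List[int]:
    """Return suffix start positions in lexicographic order of the suffixes.

    Naive construction for readability only; a practical tool would use a
    faster suffix array construction algorithm.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def cluster_by_shared_substring(seqs: List[str], k: int) -> List[List[int]]:
    """Group sequence indices that are linked by exact shared k-length substrings.

    Assumption for this sketch: sharing one exact substring of length >= k
    is the clustering criterion. '#' must not occur in any sequence.
    """
    if k < 1:
        raise ValueError("k must be at least 1")

    sep = "#"
    text = sep.join(seqs)

    # For each position in the concatenated text, record which sequence it
    # belongs to and how many characters remain before that sequence ends.
    owner: List[int] = []
    remaining: List[int] = []
    for idx, s in enumerate(seqs):
        if idx > 0:
            owner.append(-1)      # separator position
            remaining.append(0)
        for off in range(len(s)):
            owner.append(idx)
            remaining.append(len(s) - off)

    # Keep only suffixes whose first k characters lie inside one sequence,
    # so a match can never run across a separator.
    sa = [p for p in build_suffix_array(text) if remaining[p] >= k]

    # Union-find over sequence indices.
    parent = list(range(len(seqs)))

    def find(x: int) -> int:
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Suffixes sharing a k-prefix are contiguous in sorted order, so checking
    # neighbours links every member of such a run.
    for a, b in zip(sa, sa[1:]):
        if text[a:a + k] == text[b:b + k]:
            ra, rb = find(owner[a]), find(owner[b])
            if ra != rb:
                parent[rb] = ra

    groups: Dict[int, List[int]] = {}
    for i in range(len(seqs)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())
```

For example, `cluster_by_shared_substring(["ACGTACGTAA", "GTACGTAATT", "CCCCCGGGGG", "GGGGGTTTTT"], 5)` groups the first two and the last two sequences. The sketch ignores reverse complements and sequencing errors, both of which matter for real EST data.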