Bioinformatics Advance Access published online on October 20, 2008
Bioinformatics, doi:10.1093/bioinformatics/btn547
Effective Cluster-Based Seed Design for Cross-Species Sequence Comparisons
Department of Computer Science, George Washington University, Washington DC 20052
*To whom correspondence should be addressed. Leming Zhou, E-mail: lmzhou{at}gwu.edu
*To whom correspondence should be addressed. Prof Liliana Florea, E-mail: florea{at}gwu.edu
Abstract
Summary: To annotate newly sequenced organisms, cross-species sequence comparison algorithms can be applied to align gene sequences to the genome of a related species. To improve alignment accuracy, spaced seeds must be optimized for each comparison. As the number and diversity of genomes increase, an efficient alternative is to cluster pairwise comparisons into groups and identify seeds for each group rather than for each individual comparison. Here we investigate a measure of comparison closeness and identify classes of comparisons that show similar seed behavior and can therefore employ the same seed.
Availability: Source code is freely available at http://dna.cs.gwu.edu and from Bioinformatics Online.
Contact: lmzhou{at}gwu.edu, florea{at}gwu.edu
Supplementary information: http://dna.cs.gwu.edu and Bioinformatics Online.
Associate Editor: Prof. John Quackenbush
Received on August 5, 2008; revised on October 10, 2008; accepted on October 17, 2008
This article has been cited by other articles:
L. Zhou, M. Pertea, A. L. Delcher, and L. Florea. Sim4cc: a cross-species spliced alignment program. Nucleic Acids Res., June 1, 2009; 37(11): e80.
