Bioinformatics Advance Access originally published online on May 14, 2004
Bioinformatics 2004 20(16):2676-2684; doi:10.1093/bioinformatics/bth308
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Bioinformatics vol. 20 issue 16 © Oxford University Press 2004; all rights reserved.
An efficient algorithm for optimizing whole genome alignment with noise
Department of Computer Science, University of Hong Kong, Hong Kong
Received on October 20, 2003; revised on April 13, 2004; accepted on April 28, 2004
Advance Access Publication May 14, 2004
Motivation: This paper is concerned with algorithms for aligning two whole genomes so as to identify regions that possibly contain conserved genes. Motivated by existing heuristic-based software tools, we initiate the study of an optimization problem that attempts to uncover conserved genes with a global concern. Another interesting feature in our formulation is the tolerance of noise, which also complicates the optimization problem. A brute-force approach takes time exponential in the noise level.
Results: We show how an insight into the optimization structure can lead to a drastic improvement in the time and space requirement [precisely, to O(k2n2) and O(k2n), respectively, where n is the size of the input and k is the noise level]. The reduced space requirement allows us to implement the new algorithm, called MaxMinCluster, on a PC. It is exciting to see that when tested with different real data sets, MaxMinCluster consistently uncovers a high percentage of conserved genes that have been published by GenBank. Its performance is indeed favorably compared to MUMmer (perhaps the most popular software tool for uncovering conserved genes in a whole-genome scale).
Availability: The source code is available from the website http://www.csis.hku.hk/~colly/maxmincluster/. Detailed proof of the propositions can also be found there.
Contact: whwong{at}cs.hku.hk
* To whom correspondence should be addressed.
![]()
CiteULike
Connotea
Del.icio.us What's this?
This article has been cited by other articles:
![]() |
H. L. Chan, T. W. Lam, W. K. Sung, P. W. H. Wong, S. M. Yiu, and X. Fan The mutated subsequence problem and locating conserved genes Bioinformatics, May 15, 2005; 21(10): 2271 - 2278. [Abstract] [Full Text] [PDF] |
||||
