Efficient computation of close lower and upper bounds on the minimum number of recombinations in biological sequence evolution
Department of Computer Science, University of California Davis, CA 95616, USA
*To whom correspondence should be addressed.
Motivation: We are interested in studying the evolution of DNA single nucleotide polymorphism sequences which have undergone (meiotic) recombination. For a given set of sequences, computing the minimum number of recombinations needed to explain the sequences (with one mutation per site) is a standard question of interest, but it has been shown to be NP-hard, and previous algorithms that compute it exactly work either only on very small datasets or on problems with special structure.
Results: In this paper, we present efficient, practical methods for computing both upper and lower bounds on the minimum number of needed recombinations, and for constructing evolutionary histories that explain the input sequences. We study in detail the efficiency and accuracy of these algorithms on both simulated and real data sets. The algorithms produce very close upper and lower bounds, which match exactly in a surprisingly wide range of data. Thus, with the use of new, very effective lower bounding methods and an efficient algorithm for computing upper bounds, this approach allows the efficient, exact computation of the minimum number of needed recombinations, with high frequency in a large range of data. When upper and lower bounds match, evolutionary histories found by our algorithm correspond to the most parsimonious histories.
Availability: HapBound and SHRUB, programs implementing the new algorithms discussed in this paper, are available at http://wwwcsif.cs.ucdavis.edu/~gusfield/lu.html
Contact: yssong{at}cs.ucdavis.edu
Received on January 15, 2005; accepted on March 27, 2005
This article has been cited by other articles:
![]() |
D. L. Aylor, E. W. Price, and I. Carbone SNAP: Combine and Map modules for multilocus population genetic analysis Bioinformatics, June 1, 2006; 22(11): 1399 - 1401. [Abstract] [Full Text] [PDF] |
||||
![]() |
J. Maydt and T. Lengauer Recco: recombination analysis using cost optimization Bioinformatics, May 1, 2006; 22(9): 1064 - 1071. [Abstract] [Full Text] [PDF] |
||||
