Bioinformatics Vol. 18 no. 12 2002
Pages 1673-1680
© 2002 Oxford University Press
A hierarchical approach to aligning collinear regions of genomes
1 Institute of Mathematical Problems in Biology,
Pushchino, Moscow Region 142290, Russia
2 National Center for Biotechnology Information,
NIH, 45 Center Drive, Bethesda, MD 20892-6510, USA
Received on February 4, 2002; revised on May 9, 2002; accepted on May 21, 2002
Motivation: As a first approximation, similarity between two long orthologous regions of genomes can be represented by a chain of local similarities. Within such a chain, pairs of successive similarities are collinear (non-conflicting), i.e. the segments involved in the nth similarity precede, in both sequences, the segments involved in the (n+1)th similarity. However, when all similarities between two long sequences are considered, many of them usually conflict with one another. Although some conflicts can be avoided by masking transposons or low-complexity sequences, selecting only those similarities that reflect orthology and, thus, belong to the evolutionarily true chain is not trivial.
Results: We propose a simple, hierarchical algorithm for finding the true chain of local similarities. Starting from similarities with low P-values, we resolve each pairwise conflict by deleting the similarity with the higher P-value. This greedy approach constructs a chain of similarities faster than methods that seek a chain optimal with respect to some global criterion, and makes more sense biologically.
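
The greedy conflict resolution described above can be illustrated with a minimal Python sketch. It is only one plausible reading of the abstract, not the authors' Algorithm Chain or the OWEN implementation: the `Similarity` record, its half-open coordinate convention, and the function names `collinear` and `greedy_chain` are assumptions introduced here for illustration, and the simple quadratic loop ignores whatever hierarchy or data structures the real tool uses.

```python
from typing import List, NamedTuple


class Similarity(NamedTuple):
    """Hypothetical record for one local similarity (half-open intervals)."""
    start1: int     # start of the segment in sequence 1 (inclusive)
    end1: int       # end of the segment in sequence 1 (exclusive)
    start2: int     # start of the segment in sequence 2 (inclusive)
    end2: int       # end of the segment in sequence 2 (exclusive)
    p_value: float  # statistical significance; lower is stronger


def collinear(a: Similarity, b: Similarity) -> bool:
    """True if one similarity precedes the other in BOTH sequences."""
    a_before_b = a.end1 <= b.start1 and a.end2 <= b.start2
    b_before_a = b.end1 <= a.start1 and b.end2 <= a.start2
    return a_before_b or b_before_a


def greedy_chain(sims: List[Similarity]) -> List[Similarity]:
    """Keep similarities in order of increasing P-value, dropping any
    similarity that conflicts with one already kept (which necessarily
    has a P-value that is lower or equal)."""
    chain: List[Similarity] = []
    for s in sorted(sims, key=lambda x: x.p_value):
        if all(collinear(s, kept) for kept in chain):
            chain.append(s)
    # Report the chain in sequence order rather than P-value order.
    return sorted(chain, key=lambda x: x.start1)


if __name__ == "__main__":
    a = Similarity(0, 100, 0, 100, 1e-30)
    b = Similarity(150, 250, 400, 500, 1e-5)   # conflicts with c
    c = Similarity(300, 400, 200, 300, 1e-20)
    for s in greedy_chain([a, b, c]):
        print(s)
```

In this toy input, `b` and `c` cross each other (`b` comes first in sequence 1 but last in sequence 2), so the sketch keeps `c`, which has the lower P-value, and discards `b`. No global score is optimized: each conflict is settled locally by comparing two P-values.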
Availability: A software tool OWEN based on the proposed approach is described in the accompanying note and is freely available at ftp://ftp.ncbi.nih.gov/pub/kondrashov/owen
Contact: kondrashov{at}ncbi.nlm.nih.gov
Supplementary information: Algorithm Chain and examples of chains of local similarities are available at ftp://ftp.ncbi.nih.gov/pub/kondrashov/owen/extra
* To whom correspondence should be addressed.