Bioinformatics Vol. 17 no. 4 2001
Pages 327-337
© 2001 Oxford University Press
Original Paper |
A new approach to sequence comparison: normalized sequence alignment
ecio
lu 1
1 Department of Computer Science, University
of California, Santa Barbara, Santa Barbara, CA 93106, USA
2 Department of Computer Science and
Engineering, University of California, San Diego, San Diego, CA
92093, USA
Received on August 11, 2000
; revised on November 20, 2000
; accepted on November 22, 2000
The SmithWaterman algorithm for local sequence
alignment is one of the most important techniques in computational
molecular biology. This ingenious dynamic programming approach was
designed to reveal the highly conserved fragments by discarding
poorly conserved initial and terminal segments. However, the existing
notion of local similarity has a serious flaw: it does not discard
poorly conserved intermediate segments. The SmithWaterman
algorithm finds the local alignment with maximal score but
it is unable to find local alignment with maximum degree of
similarity (e.g. maximal percent of matches). Moreover, there is
still no efficient algorithm that answers the following natural
question: do two sequences share a (sufficiently long) fragment with
more than 70% of similarity? As a result, the local alignment
sometimes produces a mosaic of well-conserved fragments artificially
connected by poorly-conserved or even unrelated fragments. This may
lead to problems in comparison of long genomic sequences and
comparative gene prediction as recently pointed out by Zhang et
al. (Bioinformatics , 15, 10121019,
1999). In this paper we propose a new sequence comparison algorithm
(normalized local alignment ) that reports the regions with
maximum degree of similarity. The algorithm is based on fractional
programming and its running time is
. In practice, normalized local
alignment is only 35 times slower than the standard
SmithWaterman algorithm.
Contact: {arslan,omer}@cs.ucsb.edu; ppevzner{at}cs.ucsd.edu
![]()
CiteULike
Connotea
Del.icio.us What's this?
This article has been cited by other articles:
![]() |
Y.-F. Chang, Y.-L. Huang, and C. L. Lu SARSA: a web tool for structural alignment of RNA using a structural alphabet Nucleic Acids Res., July 1, 2008; 36(suppl_2): W19 - W24. [Abstract] [Full Text] [PDF] |
||||
![]() |
R. Szklarczyk and J. Heringa AuberGene--a sensitive genome alignment tool Bioinformatics, June 15, 2006; 22(12): 1431 - 1436. [Abstract] [Full Text] [PDF] |
||||
![]() |
Y. Nozaki and M. Bellgard Statistical evaluation and comparison of a pairwise alignment algorithm that a priori assigns the number of gaps rather than employing gap penalties Bioinformatics, April 15, 2005; 21(8): 1421 - 1428. [Abstract] [Full Text] [PDF] |
||||
![]() |
B. Chevreux, T. Pfisterer, B. Drescher, A. J. Driesel, W. E.G. Muller, T. Wetter, and S. Suhai Using the miraEST Assembler for Reliable and Automated mRNA Transcript Assembly and SNP Detection in Sequenced ESTs Genome Res., June 1, 2004; 14(6): 1147 - 1159. [Abstract] [Full Text] [PDF] |
||||
![]() |
A. N. Arslan and O. Egecioglu Dynamic Programming Based Approximation Algorithms for Sequence Alignment with Constraints INFORMS Journal on Computing, January 1, 2004; 16(4): 441 - 458. [Abstract] [PDF] |
||||
![]() |
A. J. Mackey, T. A. J. Haystead, and W. R. Pearson Getting More from Less: Algorithms for Rapid Protein Identification with Multiple Short Peptide Sequences Mol. Cell. Proteomics, February 1, 2002; 1(2): 139 - 147. [Abstract] [Full Text] [PDF] |
||||




