© IRL Press at Oxford University Press
Improved sensitivity of biological sequence database searches
Department of Biochemistry, Beckman Center, Stanford University School of Medicine, Stanford, CA 94305
¹IntelliGenetics Inc., 700 East El Camino Real, Mountain View, CA 94040, USA
We have increased the sensitivity of DNA and protein sequence database searches by allowing similar but non-identical amino acids or nucleotides to match. In addition, to speed the search, one can match k-tuples (words) instead of individual residues. A matching matrix specifies which k-tuples match each other. The matching matrix can be calculated from a similarity matrix of amino acids and a threshold of similarity required for matching. This permits amino acid similarity matrices or replacement matrices (PAM matrices) to be used in the first step of a sequence comparison rather than in a secondary scoring phase. The concept of matching non-identical k-tuples also increases the power of DNA database searches. For example, a matrix specifying that any 3-tuple in a DNA sequence can match any other 3-tuple encoding the same amino acid permits a DNA database to be searched with a DNA query sequence for regions that would encode a similar amino acid sequence.
Received on October 10, 1989; accepted on May 1, 1990
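The matching-matrix idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. Several things here are assumptions made for illustration: the four-letter alphabet and its `TOY_SIMILARITY` scores (hypothetical values, not a real PAM matrix), the rule that two k-tuples are scored by summing per-position similarities, and all function names. The codon example uses the standard genetic code.

```python
from itertools import product

# Hypothetical toy scores over a reduced alphabet (NOT a real PAM matrix).
ALPHABET = "ILKR"
TOY_SIMILARITY = {
    ("I", "I"): 4, ("L", "L"): 4, ("K", "K"): 4, ("R", "R"): 4,
    ("I", "L"): 2, ("K", "R"): 2,
    ("I", "K"): -2, ("I", "R"): -2, ("L", "K"): -2, ("L", "R"): -2,
}

def similarity(a, b):
    """Symmetric lookup in the toy similarity table."""
    if (a, b) in TOY_SIMILARITY:
        return TOY_SIMILARITY[(a, b)]
    return TOY_SIMILARITY[(b, a)]

def tuple_score(u, v):
    """Assumption: a k-tuple pair is scored by summing per-position scores."""
    return sum(similarity(a, b) for a, b in zip(u, v))

def build_matching_matrix(k, threshold):
    """Map each k-tuple to the set of k-tuples it is allowed to match."""
    words = ["".join(p) for p in product(ALPHABET, repeat=k)]
    return {u: {v for v in words if tuple_score(u, v) >= threshold}
            for u in words}

def find_hits(query, subject, matching, k):
    """Return (query_pos, subject_pos) pairs whose k-tuples match."""
    hits = []
    for i in range(len(query) - k + 1):
        allowed = matching[query[i:i + k]]
        for j in range(len(subject) - k + 1):
            if subject[j:j + k] in allowed:
                hits.append((i, j))
    return hits

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa
               for c, aa in zip(product(BASES, repeat=3), AMINO)}

def codons_match(c1, c2):
    """DNA 3-tuples match when they encode the same amino acid."""
    return CODON_TABLE[c1] == CODON_TABLE[c2]

if __name__ == "__main__":
    matching = build_matching_matrix(k=2, threshold=6)
    print(sorted(matching["IK"]))
    print(find_hits("IK", "LRLK", matching, k=2))
    print(codons_match("CTG", "TTA"), codons_match("ATG", "ATA"))
```

Because the matching sets are computed once, before the search, the similarity matrix influences which words are considered hits in the first pass rather than only in a later scoring phase. Raising `threshold` toward the identity score reduces the sketch to exact k-tuple matching.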