Rapid protein fragment search using hash functions based on the Fourier transform
1 Human Genome Center, Institute of Medical Science, University of Tokyo, 4-6-1 Shirokanedai, Minato-ku, Tokyo 108, Japan
2 Information and Network Research Laboratory, Matsushita Research Institute Tokyo, Inc., Higashimita, Tama-ku, Kawasaki 214, Japan
3 Meiji University, 1-1 Kanda-Surugadai, Chiyoda-ku, Tokyo 101, Japan
4 To whom correspondence should be addressed
MOTIVATION: Since the protein structure database has been growing very rapidly in recent years, the development of efficient methods for searching for similar structures is very important.
RESULTS: This paper presents a novel method for searching for similar fragments of proteins. In this method, a hash vector (a vector of real numbers) is associated with each fixed-length fragment of three-dimensional protein structure. Each vector consists of the low-frequency components of a Fourier-like spectrum of the distances between the Cα atoms and the centroid. The similarity between fragments can then be analyzed by evaluating the difference between their hash vectors. The novel aspect of the method is that the following property is proved theoretically: if the root mean square distance between two fragments is small, then the distance between their hash vectors is small. Several variants of this method were compared with a naive method and a previous method using PDB data. The results show that the fastest of the variants is 18-80 times faster than the naive method and 3-10 times faster than the previous method.
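The hash-vector idea described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the exact "Fourier-like" transform, so a plain discrete Fourier transform of the Cα-to-centroid distance profile is assumed here, and the names `hash_vector`, `hash_distance`, and `n_components`, as well as the normalization by fragment length, are hypothetical choices made for illustration.

```python
import math

def centroid(coords):
    """Arithmetic mean of a list of (x, y, z) points."""
    n = len(coords)
    return tuple(sum(p[i] for p in coords) / n for i in range(3))

def centroid_distances(coords):
    """Distance from each atom to the fragment centroid, in sequence order."""
    c = centroid(coords)
    return [math.dist(p, c) for p in coords]

def hash_vector(coords, n_components=3):
    """Real-valued hash vector for one fixed-length fragment.

    ASSUMPTION: a plain DFT stands in for the paper's Fourier-like
    spectrum; only the n_components lowest frequencies are kept.
    Each frequency contributes a cosine and a sine coefficient
    (the sine coefficient of frequency 0 is always 0).
    """
    d = centroid_distances(coords)
    n = len(d)
    vec = []
    for k in range(n_components):
        re = sum(d[j] * math.cos(2 * math.pi * k * j / n) for j in range(n)) / n
        im = sum(d[j] * math.sin(2 * math.pi * k * j / n) for j in range(n)) / n
        vec.extend([re, im])
    return vec

def hash_distance(u, v):
    """Euclidean distance between two hash vectors."""
    return math.dist(u, v)

# Illustrative coordinates only (not taken from any PDB entry).
fragment = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (5.0, 3.6, 0.0),
            (8.0, 4.0, 2.0), (9.0, 7.0, 3.0), (12.0, 8.0, 3.5)]
# A rigidly translated copy has the same centroid distances.
shifted = [(x + 10.0, y - 5.0, z + 2.0) for x, y, z in fragment]

gap = hash_distance(hash_vector(fragment), hash_vector(shifted))
print(gap < 1e-9)
```

Because the vector is built from distances to the centroid, it does not change when a fragment is translated or rotated as a rigid body. Given the property stated in the abstract, a search could compare the short hash vectors first and compute the root mean square distance only for fragment pairs whose hash vectors are close; the cutoff used for that filtering would have to be chosen separately and is not given here.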
CONTACT: E-mail: takutsu{at}ims.u-tokyo.ac.jp
Received on October 8, 1996; accepted on February 5, 1997
