Bioinformatics Vol. 19 Suppl. 1 2003
Pages i81-i83
© 2003 Oxford University Press
PSI: indexing protein structures for fast similarity search
Department of Computer Science, University of California, Santa Barbara, CA 93106, USA
Received on January 6, 2003; accepted on February 20, 2003
Motivation: We consider the problem of finding similarities in protein structure databases. Current techniques sequentially compare the given query protein to every protein in the database. Therefore, the cost of a similarity query increases linearly with the size of the database. As databases of experimentally determined and theoretically estimated protein structures grow, there is a need for scalable search techniques.
Results: Our techniques extract feature vectors from triplets of secondary structure elements (SSEs). These feature vectors are then indexed using a multidimensional index structure. For a given query protein, the index is used to quickly prune unpromising proteins from the database. The remaining proteins are then aligned using a popular alignment tool such as VAST. We also develop a novel statistical model that uses the SSEs to estimate the goodness of a match. Experimental results show that our techniques improve the pruning time of VAST by a factor of 3 to 3.5 while maintaining similar sensitivity.
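The index-then-prune idea can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: each SSE is approximated by a 3D line segment, the feature vector of a triplet consists of the three pairwise midpoint distances and the three pairwise axis angles, a uniform grid hash stands in for the multidimensional index structure, and a simple vote threshold stands in for the pruning criterion. All function names and parameters are hypothetical.

```python
import math
from collections import defaultdict
from itertools import combinations

# An SSE is approximated by a segment: ((x, y, z) start, (x, y, z) end).

def _midpoint(seg):
    a, b = seg
    return tuple((x + y) / 2 for x, y in zip(a, b))

def _direction(seg):
    a, b = seg
    return tuple(y - x for x, y in zip(a, b))

def _dist(p, q):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(p, q)))

def _angle(u, v):
    # Angle between two non-zero vectors, in radians.
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return math.acos(max(-1.0, min(1.0, dot / norm)))

def triplet_features(sses):
    """Yield one 6-d vector per SSE triplet (assumed feature choice):
    three midpoint distances followed by three axis angles."""
    for a, b, c in combinations(sses, 3):
        pairs = [(a, b), (b, c), (a, c)]
        dists = [_dist(_midpoint(s), _midpoint(t)) for s, t in pairs]
        angles = [_angle(_direction(s), _direction(t)) for s, t in pairs]
        yield tuple(dists + angles)

def grid_key(vec, dist_cell=2.0, angle_cell=math.pi / 6):
    # Quantize the vector into a grid cell; a stand-in for a real
    # multidimensional index. Near-boundary matches can be missed.
    return (tuple(int(d // dist_cell) for d in vec[:3])
            + tuple(int(a // angle_cell) for a in vec[3:]))

def build_index(database):
    """Map each grid cell to the set of proteins with a triplet in it."""
    index = defaultdict(set)
    for name, sses in database.items():
        for vec in triplet_features(sses):
            index[grid_key(vec)].add(name)
    return index

def candidate_proteins(query_sses, index, min_votes=1):
    """Return proteins hit by at least min_votes query triplets."""
    votes = defaultdict(int)
    for vec in triplet_features(query_sses):
        for name in index.get(grid_key(vec), ()):
            votes[name] += 1
    return sorted(n for n, v in votes.items() if v >= min_votes)
```

Only the proteins returned by `candidate_proteins` would then be passed to a full structural alignment tool; proteins that collect too few votes are pruned without ever being aligned.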
Contact: orhan{at}cs.ucsb.edu; tamer{at}cs.ucsb.edu; ambuj{at}cs.ucsb.edu.
Keywords: Protein structures, feature vectors, indexing
* To whom correspondence should be addressed.