Bioinformatics Vol. 18 no. 2 2002
Pages 362-367
© 2002 Oxford University Press
Structure motif discovery and mining the PDB
1 Department of Informatics, University of Bergen, HIB, N5020 Bergen, Norway
2 ZymoGenetics Inc., 1201 Eastlake Avenue East, Seattle, WA 98102, USA
3 Division of Mathematical Biology, National Institute of Medical Research, Mill Hill, London, UK
Received on March 2, 2001; revised on September 6, 2001; accepted on September 6, 2001
Motivation: Many of the most interesting functional and evolutionary relationships among proteins are so ancient that they cannot be reliably detected through sequence analysis and are apparent only through a comparison of the tertiary structures. The conserved features can often be described as structural motifs consisting of a few single residues or Secondary Structure (SS) elements. Confidence in such motifs is greatly boosted when they are found in more than a pair of proteins.
Results: We describe an algorithm for the automatic discovery of recurring patterns in protein structures. The patterns consist of individual residues that have a defined order along the protein's backbone, come close together in the structure, and have similar spatial conformations. The residues in a pattern need not be close in the protein's sequence. This work builds on a previously reported algorithm for motif discovery and describes a significant improvement that makes the algorithm very efficient. The improved efficiency allows us to use it for unsupervised learning of patterns that occur in small subsets of a large set of structures: a non-redundant subset of the Protein Data Bank (PDB) database of all known protein structures.
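The three conditions on a pattern (backbone order, spatial proximity, similar conformation) can be illustrated with a minimal sketch. This is not the algorithm of the paper: the abstract does not specify how similarity is measured, so the sketch assumes one representative coordinate per residue and compares internal pairwise distances. The function names, the 10 Å proximity cutoff, and the 1 Å similarity threshold are all hypothetical choices made for illustration.

```python
import math
from itertools import combinations

# Illustrative sketch only; names, cutoffs, and the distance-based
# similarity measure are assumptions, not taken from the paper.

def is_ordered(idxs):
    """True if the residue indices are strictly increasing along the backbone."""
    return all(a < b for a, b in zip(idxs, idxs[1:]))

def is_compact(coords, idxs, cutoff):
    """True if every pair of selected residues lies within `cutoff` of each other."""
    return all(math.dist(coords[i], coords[j]) <= cutoff
               for i, j in combinations(idxs, 2))

def distance_rms(coords_a, idxs_a, coords_b, idxs_b):
    """RMS difference between corresponding pairwise distances in two structures."""
    if len(idxs_a) != len(idxs_b) or len(idxs_a) < 2:
        raise ValueError("need two equally sized selections of at least 2 residues")
    pairs = list(combinations(range(len(idxs_a)), 2))
    total = 0.0
    for m, n in pairs:
        d_a = math.dist(coords_a[idxs_a[m]], coords_a[idxs_a[n]])
        d_b = math.dist(coords_b[idxs_b[m]], coords_b[idxs_b[n]])
        total += (d_a - d_b) ** 2
    return math.sqrt(total / len(pairs))

def shares_pattern(coords_a, idxs_a, coords_b, idxs_b,
                   cutoff=10.0, max_rms=1.0):
    """Check the three conditions for one candidate occurrence in each structure."""
    return (is_ordered(idxs_a) and is_ordered(idxs_b)
            and is_compact(coords_a, idxs_a, cutoff)
            and is_compact(coords_b, idxs_b, cutoff)
            and distance_rms(coords_a, idxs_a, coords_b, idxs_b) <= max_rms)

# Toy coordinates: the same 3-4-5 triangle at different sequence positions.
structure_a = [(0, 0, 0), (3, 0, 0), (0, 4, 0), (50, 50, 50)]
structure_b = [(10, 10, 10), (1, 1, 1), (4, 1, 1), (1, 5, 1)]
found = shares_pattern(structure_a, (0, 1, 2), structure_b, (1, 2, 3))
```

Comparing internal distances avoids an explicit superposition, but it cannot tell a conformation from its mirror image. The sketch also only checks a given candidate; enumerating candidates efficiently across many structures is the hard part that the paper addresses.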
Availability: The program is freely available to academia; requests can be sent to Inge.Jonassen{at}ii.uib.no.
Contact: Inge.Jonassen{at}ii.uib.no
* To whom correspondence should be addressed.