Bioinformatics Advance Access published online on January 29, 2004
Bioinformatics, doi:10.1093/bioinformatics/bth031
Bioinformatics © Oxford University Press 2004; all rights reserved
1 Automated Scheduling, Optimisation and Planning Group University of Nottingham, Nottingham, NG8 1BB, UK
* To whom correspondence should be addressed. E-mail: dpelta{at}ugr.es.
Motivation: As an increasing number of protein structures become available, the need for algorithms that can quantify the similarity between protein structures increases as well. Thus, the comparison of protein structures, and their clustering according to a given similarity measure, is at the core of today's biomedical research. In this paper, we show how a Universal Similarity Metric inspired by algorithmic information theory can be used to calculate similarities between protein pairs. The method, besides being theoretically supported, is surprisingly simple to implement and computationally efficient.

Results: Structural similarity between proteins in four different data sets was measured using the Universal Similarity Metric. The sample employed represented alpha, beta, alpha-beta, TIM-barrel, globin and serpin protein types. The use of the proposed metric allows for a correct measurement of similarity and classification of the proteins in the four data sets.

Availability: All the scripts and programs used for the preparation of this paper are available at http://www.cs.nott.ac.uk/~nxk/USM/protocol.html. On that web page the reader will find a brief description of how to use the various scripts and programs.

Supplementary Information: The protein data sets used are collected at http://www.cs.nott.ac.uk/~nxk/USM/datasets.html. The calculated similarity values for the proteins used in this paper can be found at http://www.cs.nott.ac.uk/~nxk/USM/similar.html. The clustering of the data sets based on these similarity values can be found at http://www.cs.nott.ac.uk/~nxk/USM/clustering.html.
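The abstract does not state how the metric is computed in practice. As a rough illustration only, the sketch below assumes the widely used compression-based approximation of an information distance (the normalized compression distance), in which Kolmogorov complexity is replaced by the size of a compressed string. The function names `compressed_size` and `ncd`, the choice of `zlib` as compressor, and the encoding of a structure as a byte string are all assumptions made for this example; they are not taken from the authors' scripts.

```python
import random
import zlib


def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed input (a stand-in for complexity)."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance between two byte strings.

    Values near 0 suggest the strings share most of their information;
    values near 1 suggest they share very little.
    """
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


# Hypothetical inputs: a toy repetitive string standing in for an encoded
# structure, and an unrelated pseudo-random string of the same length.
rng = random.Random(0)
structure_a = b"HHHHEEEELLLL" * 50
unrelated = bytes(rng.randrange(256) for _ in range(len(structure_a)))

d_same = ncd(structure_a, structure_a)
d_diff = ncd(structure_a, unrelated)
print(f"identical inputs: {d_same:.3f}, unrelated inputs: {d_diff:.3f}")
```

In this toy setting the distance between identical inputs should come out much smaller than the distance between unrelated inputs. How protein structures are actually turned into strings, and which compressor is used, is described in the protocol page listed under Availability.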
Article
Measuring the similarity of protein structures by means of the Universal Similarity Metric
2 Department of Computer Science and Artificial Intelligence E.T.S.I. Informatica, Universidad de Granada, 18071, Granada, Spain
This article has been cited by other articles:
A. Kocsor, A. Kertesz-Farkas, L. Kajan, and S. Pongor. Application of compression-based distance measures to protein sequence classification: a methodological study. Bioinformatics, February 15, 2006; 22(4): 407-412.
J. Handl, J. Knowles, and D. B. Kell. Computational cluster validation in post-genomic data analysis. Bioinformatics, August 1, 2005; 21(15): 3201-3212.