Bioinformatics Advance Access published online on February 5, 2004
Bioinformatics, doi:10.1093/bioinformatics/bth054
Bioinformatics © Oxford University Press 2004; all rights reserved
1 Biomolecular Sciences Department, UMIST, P.O. Box 88, Manchester, M60 1QD, UK
* To whom correspondence should be addressed. E-mail: y.cai{at}umist.ac.uk.
Motivation: The localization of a protein in a cell is closely correlated with its biological function. With the number of sequences entering databanks rapidly increasing, the importance of developing a powerful high-throughput tool to determine protein subcellular location has become self-evident. In view of this, the Nearest Neighbour Algorithm was developed for predicting protein subcellular location by hybridizing the information derived from recent developments in gene ontology with that from the functional domain composition [Chou, K.C. and Cai, Y.D. (2002) J. Biol. Chem. 277, 45765-45769] as well as the pseudo amino acid composition [Chou, K.C. (2001) Proteins Struct. Funct. Genet. 43, 246-255; Erratum: ibid. (2001) 44, 60].

Results: As a showcase, the same plant and non-plant protein datasets investigated by previous investigators [Emanuelsson, O., Nielsen, H., Brunak, S., and von Heijne, G. (2000) J. Mol. Biol. 300, 1005-1016] were used for demonstration. The overall success rate by the jackknife test for the plant protein dataset was 86%, and that for the non-plant protein dataset was 91.2%. These are so far the highest success rates achieved for the two datasets under a rigorous cross-validation test procedure, suggesting that such a hybrid approach (particularly by incorporating the knowledge of gene ontology) may become a very useful high-throughput tool in bioinformatics, proteomics, and molecular cell biology.

Availability: The software is available by sending a request to the authors.
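For readers unfamiliar with the evaluation procedure named in the abstract, the following is a minimal sketch of a nearest-neighbour classifier scored by a jackknife (leave-one-out) test. It is not the authors' software: the Euclidean distance, the two-dimensional toy vectors, the location labels, and the function names `nearest_neighbour_label` and `jackknife_success_rate` are all assumptions made for illustration. In the paper's setting, each vector would instead encode gene ontology, functional domain composition, or pseudo amino acid composition features.

```python
import math

def nearest_neighbour_label(query, vectors, labels, skip=None):
    """Return the label of the stored vector closest to `query`.

    `skip` is the index of a stored vector to ignore, which lets the
    jackknife test leave the query protein itself out of the search.
    Euclidean distance is an assumption, not the paper's stated measure.
    """
    best_label, best_dist = None, math.inf
    for i, (vec, label) in enumerate(zip(vectors, labels)):
        if i == skip:
            continue
        dist = math.dist(query, vec)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label

def jackknife_success_rate(vectors, labels):
    """Leave each sample out in turn, predict it from the rest, and
    return the fraction of samples predicted correctly."""
    correct = sum(
        nearest_neighbour_label(vec, vectors, labels, skip=i) == label
        for i, (vec, label) in enumerate(zip(vectors, labels))
    )
    return correct / len(vectors)

# Hypothetical toy feature vectors and illustrative location labels.
vectors = [(0.0, 0.1), (0.1, 0.0), (0.2, 0.1),
           (1.0, 0.9), (0.9, 1.0), (1.1, 1.0)]
labels = ["chloroplast"] * 3 + ["mitochondrion"] * 3

rate = jackknife_success_rate(vectors, labels)
print(f"Jackknife success rate: {rate:.1%}")
```

Because every protein is predicted from a training set that excludes it, the jackknife rate cannot be inflated by a sample matching itself, which is why the abstract describes the procedure as rigorous.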
Accepted December 5, 2003
Predicting subcellular localization of proteins in a hybridization space
2 Gordon Life Science Institute, San Diego, CA 92130, USA; Tianjin Research Institute of Bioinformatics and Drug Discovery (TRIBD), Tianjin, China
This article has been cited by other articles:
Q.-B. Gao and Z.-Z. Wang. Classification of G-protein coupled receptors at four levels. Protein Eng. Des. Sel., November 1, 2006; 19(11): 511-516.
A. Hoglund, P. Donnes, T. Blum, H.-W. Adolph, and O. Kohlbacher. MultiLoc: prediction of protein subcellular localization using N-terminal targeting sequences, sequence motifs and amino acid composition. Bioinformatics, May 15, 2006; 22(10): 1158-1165.