Bioinformatics Vol. 19 no. 13 2003
Pages 1656-1663
© 2003 Oxford University Press
Prediction of protein subcellular locations by support vector machines using compositions of amino acids and amino acid pairs
Bioinformatics Center, Institute for Chemical Research, Kyoto University, Uji, Kyoto 611-0011, Japan
Received on November 6, 2002; revised on January 24, 2003 and March 17, 2003
Motivation: The subcellular location of a protein is closely correlated with its function. Thus, computational prediction of subcellular locations from amino acid sequence information would help the annotation and functional prediction of protein-coding genes in complete genomes. We have developed a method based on support vector machines (SVMs).
Results: We considered 12 subcellular locations in eukaryotic cells: chloroplast, cytoplasm, cytoskeleton, endoplasmic reticulum, extracellular medium, Golgi apparatus, lysosome, mitochondrion, nucleus, peroxisome, plasma membrane, and vacuole. We constructed a data set of proteins with known locations from the SWISS-PROT database. A set of SVMs was trained to predict the subcellular location of a given protein based on its amino acid, amino acid pair, and gapped amino acid pair compositions. The predictors based on these different compositions were then combined using a voting scheme. Results obtained through 5-fold cross-validation tests showed an improvement in prediction accuracy over the algorithm based on the amino acid composition only. This prediction method is available via the Internet.
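As a rough illustration of the three kinds of composition features named in the Results, and of combining predictors by voting, a minimal Python sketch could look like the following. The function names, the reading of a "gapped" pair as two residues separated by a fixed number of intervening positions, the skipping of non-standard residues, and the plain majority vote are all assumptions made for this sketch; the abstract does not specify these details, and SVM training itself is not shown.

```python
from collections import Counter
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def amino_acid_composition(seq):
    """Fraction of each standard residue in seq (20 values)."""
    residues = [c for c in seq.upper() if c in AMINO_ACIDS]
    counts = Counter(residues)
    n = len(residues)
    return [counts[a] / n if n else 0.0 for a in AMINO_ACIDS]


def pair_composition(seq, gap=0):
    """Fraction of each ordered residue pair (400 values).

    gap=0 counts adjacent pairs; gap=k counts pairs with k residues
    between them (an assumed definition of a "gapped" pair).
    Pairs containing a non-standard residue are skipped.
    """
    seq = seq.upper()
    pairs = Counter(
        (a, b)
        for a, b in zip(seq, seq[gap + 1:])
        if a in AMINO_ACIDS and b in AMINO_ACIDS
    )
    total = sum(pairs.values())
    return [
        pairs[(a, b)] / total if total else 0.0
        for a, b in product(AMINO_ACIDS, repeat=2)
    ]


def majority_vote(predictions):
    """Most frequent label; ties go to the label seen first.

    A plain majority vote is an assumption, not the paper's scheme.
    """
    return Counter(predictions).most_common(1)[0][0]
```

In such a setup, each feature vector (20 values for single residues, 400 for each pair type) would feed its own classifier, and `majority_vote` would be applied to the list of labels those classifiers return for one protein.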
Availability: http://www.genome.ad.jp/SIT/ploc.html
Supplementary information: http://web.kuicr.kyoto-u.ac.jp/~park/Seqdata/
Contact: kanehisa{at}kuicr.kyoto-u.ac.jp
* To whom correspondence should be addressed.