Bioinformatics Advance Access published online on January 22, 2004
Bioinformatics, doi:10.1093/bioinformatics/btg447
Bioinformatics © Oxford University Press 2004; all rights reserved
1 Department of Computing Science, University of Alberta, Edmonton, AB, Canada, T6G 2E8
Motivation: Identifying the destination or localization of proteins is key to understanding their function and facilitating their purification. A number of existing computational prediction methods are based on sequence analysis. However, these methods are limited in scope, accuracy and most particularly breadth of coverage. Rather than using sequence information alone, we have explored the use of database text annotations from homologs and machine learning to substantially improve the prediction of subcellular location.

Results: We have constructed five machine-learning classifiers for predicting subcellular localization of proteins from animals, plants, fungi, Gram-negative bacteria and Gram-positive bacteria, which are 81% accurate for fungi and 92% to 94% accurate for the other four categories. These are the most accurate subcellular predictors across the widest set of organisms ever published. Our predictors are part of the Proteome Analyst (PA) web-service.

Availability: http://www.cs.ualberta.ca/~bioinfo/PA/Sub http://www.cs.ualberta.ca/~bioinfo/PA

Supplementary Information: http://www.cs.ualberta.ca/~bioinfo/PA/Subcellular
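The general idea of classifying a protein from text annotations of its homologs can be sketched in a few lines. The example below is illustrative only and is not taken from the article: it assumes a multinomial naive Bayes classifier with add-one smoothing, and the annotation tokens, location labels, and the function names `train` and `predict` are all hypothetical. The abstract does not state which learning algorithm or feature set the authors used.

```python
import math
from collections import Counter, defaultdict


def train(examples):
    """Fit a naive Bayes model from (annotation_tokens, location_label) pairs."""
    label_counts = Counter()             # number of proteins per location
    token_counts = defaultdict(Counter)  # per-location token frequencies
    vocab = set()
    for tokens, label in examples:
        label_counts[label] += 1
        token_counts[label].update(tokens)
        vocab.update(tokens)
    return {"labels": label_counts, "tokens": token_counts, "vocab": vocab}


def predict(model, tokens):
    """Return the location with the highest smoothed log-posterior."""
    total = sum(model["labels"].values())
    vocab_size = len(model["vocab"])
    best_label, best_score = None, -math.inf
    for label, n in model["labels"].items():
        counts = model["tokens"][label]
        denom = sum(counts.values()) + vocab_size  # add-one smoothing
        score = math.log(n / total)                # log prior
        for t in tokens:
            if t in model["vocab"]:                # ignore unseen tokens
                score += math.log((counts[t] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy data: tokens stand in for annotation terms that would be
# gathered from the database entries of a query protein's homologs.
TOY = [
    (["transcription", "dna-binding", "nuclear"], "nucleus"),
    (["dna-binding", "repressor"], "nucleus"),
    (["signal", "glycoprotein", "secreted"], "extracellular"),
    (["signal", "hormone"], "extracellular"),
]

model = train(TOY)
print(predict(model, ["dna-binding", "transcription"]))
```

In a real setting the token lists would come from a homology search followed by annotation lookup, and one such model would be trained per organism category; both steps are outside this sketch.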
Accepted September 25, 2003
Predicting subcellular localization of proteins using machine-learned classifiers











