Bioinformatics Advance Access originally published online on March 28, 2007
Bioinformatics 2007 23(11):1410-1417; doi:10.1093/bioinformatics/btm115
SherLoc: high-accuracy prediction of protein subcellular localization by integrating text and protein sequence data
1School of Computing, Queen's University, Kingston, Ontario, Canada and 2Division for Simulation of Biological Systems, ZBIT/WSI, University of Tübingen, Germany
*To whom correspondence should be addressed.
Abstract
Motivation: Knowing the localization of a protein within the cell helps elucidate its role in biological processes, its function and its potential as a drug target. Thus, subcellular localization prediction is an active research area. Numerous localization prediction systems are described in the literature; some focus on specific localizations or organisms, while others attempt to cover a wide range of localizations.
Results: We introduce SherLoc, a new comprehensive system for predicting the localization of eukaryotic proteins. It integrates several types of sequence and text-based features. While it applies the widely used support vector machines (SVMs), SherLoc's main novelty lies in the way it selects its text sources and features and integrates them with sequence-based features. We test SherLoc on previously used datasets, as well as on a new set devised specifically to test its predictive power, and show that SherLoc consistently improves on previously reported results. We also report the results of applying SherLoc to a large set of yet-unlocalized proteins.
Availability: SherLoc, along with Supplementary Information, is available at: http://www-bs.informatik.uni-tuebingen.de/Services/SherLoc/
Contact: shatkay{at}cs.queensu.ca
Supplementary information: Supplementary data are available at Bioinformatics online.
Associate Editor: Alfonso Valencia
Received on September 11, 2006; revised on March 17, 2007; accepted on March 17, 2007





