Bioinformatics Vol. 18 no. 90001 2002
Pages S78-S86
© 2002 Oxford University Press

Inferring sub-cellular localization through automated lexical analysis

Rajesh Nair 1,2 and Burkhard Rost 1,3,*

1 CUBIC, Department of Biochemistry and Molecular Biophysics, Columbia University, 650 West 168th Street BB217, New York, NY 10032, USA
2 Department of Physics, Columbia University, 538 West 120th Street, New York, NY 10027, USA
3 Columbia University Center for Computational Biology and Bioinformatics (C2B2), Russ Berrie Pavilion, 1150 St. Nicholas Avenue, New York, NY 10032, USA

Received on January 24, 2002; revised on March 29, 2002; accepted on March 29, 2002

Motivation: The SWISS-PROT sequence database contains keywords of functional annotations for many proteins. In contrast, information about the sub-cellular localization is available for only a few proteins. Experts can often infer localization from keywords describing protein function. We developed LOCkey, a fully automated method for lexical analysis of SWISS-PROT keywords that assigns sub-cellular localization. With the rapid growth in sequence data, the biochemical characterisation of sequences has been falling behind. Our method may be a useful tool for supplementing functional information already automatically available.

Results: The method reached a level of more than 82% accuracy in a full cross-validation test. Due to a lack of functional annotations, we could infer localization for fewer than half of all proteins in SWISS-PROT. We applied LOCkey to annotate five entirely sequenced proteomes, namely Saccharomyces cerevisiae (yeast), Caenorhabditis elegans (worm), Drosophila melanogaster (fly), Arabidopsis thaliana (plant) and a subset of all human proteins. LOCkey found about 8000 new annotations of sub-cellular localization for these eukaryotes.

Availability: Annotations of localization for eukaryotes at: http://cubic.bioc.columbia.edu/services/LOCkey

Contact: nair{at}cubic.bioc.columbia.edu; rost{at}columbia.edu

Keywords: genome sequence analysis; predicting sub-cellular localization; protein function; lexical analysis.

* To whom correspondence should be addressed.
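
Illustration (not part of the published abstract): the abstract describes inferring localization from SWISS-PROT keywords but does not give LOCkey's actual rules. The sketch below only illustrates the general idea under stated assumptions. It reads the `KW` lines of a SWISS-PROT flat-file entry and looks each keyword up in a small hand-written table. The table `KEYWORD_HINTS`, the simple voting scheme, the function names and the example entry are all hypothetical and are not taken from LOCkey.

```python
from collections import Counter
from typing import List, Optional

# Hypothetical keyword-to-compartment table, for illustration only.
# These are NOT the rules used by LOCkey.
KEYWORD_HINTS = {
    "nuclear protein": "nucleus",
    "dna-binding": "nucleus",
    "mitochondrion": "mitochondrion",
    "chloroplast": "chloroplast",
}

# Made-up entry fragment in SWISS-PROT flat-file style.
EXAMPLE_ENTRY = """\
ID   EXAMPLE_YEAST   STANDARD;   PRT;   100 AA.
KW   Transcription regulation; DNA-binding; Nuclear protein;
KW   Zinc-finger.
"""


def parse_keywords(entry: str) -> List[str]:
    """Collect the semicolon-separated keywords from all KW lines."""
    parts = []
    for line in entry.splitlines():
        if line.startswith("KW   "):
            parts.append(line[5:].strip())
    # KW lines may wrap; the keyword list ends with a period.
    joined = " ".join(parts).rstrip(".")
    return [kw.strip() for kw in joined.split(";") if kw.strip()]


def infer_localization(keywords: List[str]) -> Optional[str]:
    """Return the compartment with the most keyword votes.

    Returns None when no keyword is in the table or the top vote is tied.
    """
    votes = Counter(
        KEYWORD_HINTS[kw.lower()] for kw in keywords if kw.lower() in KEYWORD_HINTS
    )
    if not votes:
        return None
    ranked = votes.most_common(2)
    if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
        return None
    return ranked[0][0]


if __name__ == "__main__":
    keywords = parse_keywords(EXAMPLE_ENTRY)
    print(keywords)
    print(infer_localization(keywords))
```

Returning `None` when no keyword matches mirrors the abstract's remark that localization can only be inferred for proteins that carry suitable functional annotations.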




