Bioinformatics Advance Access originally published online on April 6, 2005
Bioinformatics 2005 21(11):2759-2765; doi:10.1093/bioinformatics/bti390
Literature mining and database annotation of protein phosphorylation using a rule-based system
1Department of Biochemistry and Molecular Biology, Georgetown University Medical Center, Washington, DC 20057, USA
2AU-KBC Research Centre, Anna University, Chennai 600044, India
3Department of Computer and Information Sciences, University of Delaware, Newark, DE 19716, USA
*To whom correspondence should be addressed.
Motivation: A large volume of experimental data on protein phosphorylation is buried in the fast-growing PubMed literature. While of great value, such information is limited in databases owing to the laborious process of literature-based curation. Computational literature mining holds promise to facilitate database curation.
Results: A rule-based system, RLIMS-P (Rule-based LIterature Mining System for Protein Phosphorylation), was used to extract protein phosphorylation information from MEDLINE abstracts. An annotation-tagged literature corpus developed at PIR was used to evaluate the system on two tasks: finding phosphorylation papers, and extracting phosphorylation objects (kinases, substrates and sites) from abstracts. RLIMS-P achieved a precision of 91.4% and a recall of 96.4% for paper retrieval, and a precision of 97.9% and a recall of 88.0% for extraction of substrates and sites. By coupling high recall for paper retrieval with high precision for information extraction, RLIMS-P facilitates literature mining and database annotation of protein phosphorylation.
Availability: The program is available on request from the authors. The phosphorylation patterns and datasets used in this study are available at http://pir.georgetown.edu/iprolink/
Contact: zh9{at}georgetown.edu
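
To make the evaluation terms in the abstract concrete, the following is a minimal sketch of pattern-based extraction of (kinase, substrate, site) triples and of how precision and recall are computed against a tagged gold standard. The regular expression, the function names `extract` and `precision_recall`, and the example sentence are illustrative assumptions; they are not the RLIMS-P rules, which are not reproduced here.

```python
import re

# Hypothetical toy pattern, NOT one of the RLIMS-P rules. It matches
# "<kinase> phosphorylates <substrate>" with an optional "at/on <site>".
PATTERN = re.compile(
    r"(?P<kinase>[A-Za-z0-9-]+) phosphorylates (?P<substrate>[A-Za-z0-9-]+)"
    r"(?: (?:at|on) (?P<site>(?:Ser|Thr|Tyr)-?\d+))?"
)


def extract(sentence):
    """Return a (kinase, substrate, site) tuple, or None if nothing matches.

    The site element is None when the sentence names no site.
    """
    match = PATTERN.search(sentence)
    if match is None:
        return None
    return (match.group("kinase"), match.group("substrate"), match.group("site"))


def precision_recall(predicted, gold):
    """Compute precision and recall of predicted items against gold items."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall
```

In this sketch, precision is the fraction of extracted items that appear in the gold standard, and recall is the fraction of gold-standard items that were extracted. A real system would need many more patterns, plus handling of passive and nominalized forms such as "phosphorylation of X by Y".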