Bioinformatics Advance Access originally published online on July 29, 2004
Bioinformatics 2004 20(18):3604-3612; doi:10.1093/bioinformatics/bth451
Bioinformatics vol. 20 issue 18 © Oxford University Press 2004; all rights reserved.
Discovering patterns to extract protein–protein interactions from full texts
1 State Key Laboratory of Intelligent Technology and Systems (LITS), Department of Computer Science and Technology, University of Tsinghua, Beijing, 100084, China, 2 Rigel Pharmaceuticals Inc, 1180 Veterans Blvd, South San Francisco, CA 94080, USA and 3 Bioinformatics Laboratory, School of Computer Science, University of Waterloo, N2L 3G1, Ontario, Canada
Received on April 2, 2004; revised on June 22, 2004; accepted on July 7, 2004
Motivation: Although several databases store protein–protein interactions, most such data still exist only in the scientific literature, scattered across texts written in natural language and defying data mining efforts. Much time and labor must be spent on extracting protein pathways from the literature. Our aim is to develop a robust and powerful methodology to mine protein–protein interactions from biomedical texts.
Results: We present a novel and robust approach for extracting protein–protein interactions from the literature. Our method uses a dynamic programming algorithm to compute distinguishing patterns by aligning relevant sentences and key verbs that describe protein interactions. A matching algorithm then uses these patterns to extract the interactions between proteins. Equipped only with a dictionary of protein names, our system achieves a recall of 80.0% and a precision of 80.5%.
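To illustrate the general idea of deriving a pattern by aligning sentences with dynamic programming, here is a minimal sketch. It is not the authors' algorithm: it uses a plain global (Needleman-Wunsch style) alignment over word tokens, and the function name `align_pattern`, the wildcard symbol `*`, the `PROTEIN` placeholder token, and the scoring values are all assumptions made for this example.

```python
from typing import List, Tuple

WILDCARD = "*"  # hypothetical symbol for positions where the sentences differ


def align_pattern(a: List[str], b: List[str],
                  match: int = 1, mismatch: int = -1,
                  gap: int = -1) -> Tuple[int, List[str]]:
    """Globally align two token sequences and return (score, pattern).

    The pattern keeps tokens shared by both sequences and replaces
    mismatches and gaps with a single wildcard.
    """
    n, m = len(a), len(b)
    # score[i][j] = best alignment score of a[:i] against b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back from the bottom-right cell to recover one optimal alignment.
    pattern: List[str] = []
    i, j = n, m
    while i > 0 and j > 0:
        sub = match if a[i - 1] == b[j - 1] else mismatch
        if score[i][j] == score[i - 1][j - 1] + sub:
            pattern.append(a[i - 1] if a[i - 1] == b[j - 1] else WILDCARD)
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            pattern.append(WILDCARD)
            i -= 1
        else:
            pattern.append(WILDCARD)
            j -= 1
    pattern.extend([WILDCARD] * (i + j))  # leftover prefix of either sequence
    pattern.reverse()

    # Collapse runs of wildcards into one.
    collapsed: List[str] = []
    for token in pattern:
        if token == WILDCARD and collapsed and collapsed[-1] == WILDCARD:
            continue
        collapsed.append(token)
    return score[n][m], collapsed


s1 = ["PROTEIN", "interacts", "with", "PROTEIN"]
s2 = ["PROTEIN", "directly", "interacts", "with", "PROTEIN"]
best_score, pattern = align_pattern(s1, s2)
print(best_score, pattern)
```

In this toy example the two sentences share the key verb and both protein slots, so the extra adverb in the second sentence becomes the only wildcard position in the resulting pattern.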
Availability: The program is available on request from the authors.
Contact: zxy-dcs{at}tsinghua.edu.cn; mli{at}uwaterloo.ca
* To whom correspondence should be addressed.