Bioinformatics Advance Access originally published online on November 14, 2007
Bioinformatics 2008 24(1):118-126; doi:10.1093/bioinformatics/btm544
Kernel approaches for genic interaction extraction
1Department of Computer Science, Sogang University and 2Daumsoft Inc., Se-Ah Venture Tower, Seoul, Korea
*To whom correspondence should be addressed.
Abstract
Motivation: Automatic knowledge discovery and efficient information access, including named entity recognition and relation extraction between entities, have recently become critical issues for the biomedical literature. However, the inherent difficulty of the relation extraction task, mainly caused by the diversity of natural language, is further compounded in the biomedical domain because biomedical sentences are commonly long and complex. In addition, relation extraction often involves modeling long-range dependencies, discontiguous word patterns and semantic relations, to which the pattern-based methodology is not directly applicable.
Results: In this article, we shift the focus of biomedical relation extraction from the problem of pattern extraction to the problem of kernel construction. We propose four kernels (predicate, walk, dependency and hybrid) that encapsulate the information required to predict a relation from the sentential structures connecting two entities. For this purpose, we view the dependency structure of a sentence as a graph, which allows the system to isolate the essential part of a complex syntactic structure by finding the shortest path between the entities. The kernels are augmented gradually, from flat feature descriptions to structural descriptions of the shortest paths. The walk kernel yields a very promising result, an F-score of 77.5 on the Language Learning in Logic (LLL) 05 genic interaction shared task.
Availability: The algorithms used are free for academic research and are available from our Web site http://mllab.sogang.ac.kr/
shkim/LLL05.tar.gz.
Contact: shkim{at}lex.yonsei.ac.kr
Associate Editor: John Quackenbush
Received on May 14, 2007; revised on September 21, 2007; accepted on October 25, 2007
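The Results paragraph describes treating a sentence's dependency structure as a graph and keeping only the shortest path between two entities. The following is a minimal sketch of that idea, not the authors' implementation. The function name `shortest_dependency_path`, the toy sentence, its hand-written edges and the relation labels are all hypothetical; in practice the edges would come from a dependency parser. The sketch also assumes every token in the sentence is unique and that edges can be traversed in either direction.

```python
from collections import deque

# Hypothetical toy parse of "ProteinA activates transcription of GeneB in vitro".
# Each edge is (head, dependent, label); the labels are illustrative only.
TOY_EDGES = [
    ("activates", "ProteinA", "subj"),
    ("activates", "transcription", "obj"),
    ("transcription", "GeneB", "mod_of"),
    ("activates", "vitro", "mod_in"),
]


def shortest_dependency_path(edges, source, target):
    """Return the shortest token path between source and target, or None.

    Dependency edges are treated as undirected, so the path may go up to a
    shared head and back down to the other entity.
    """
    # Build an undirected adjacency map from the (head, dependent, label) edges.
    graph = {}
    for head, dependent, _label in edges:
        graph.setdefault(head, set()).add(dependent)
        graph.setdefault(dependent, set()).add(head)

    if source not in graph or target not in graph:
        return None

    # Breadth-first search: the first time the target is reached, the path
    # recovered through the parent links has the fewest edges.
    parent = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for neighbor in sorted(graph[node]):
            if neighbor not in parent:
                parent[neighbor] = node
                queue.append(neighbor)
    return None


if __name__ == "__main__":
    print(shortest_dependency_path(TOY_EDGES, "ProteinA", "GeneB"))
```

On the toy parse, the path between the two entities drops the modifier "in vitro" and keeps only the tokens that connect them, which is the kind of reduced structure the kernels in the abstract are said to operate on.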