Bioinformatics Advance Access published online on October 22, 2009

Bioinformatics, doi:10.1093/bioinformatics/btp602

© The Author(s) 2009. Published by Oxford University Press.
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.5/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

Evaluation of linguistic features useful in extraction of interactions from PubMed; Application to annotating known, high-throughput and predicted interactions in I2D

Yun Niu 1,*, David Otasek 1 and Igor Jurisica 1,2,*

1 Ontario Cancer Institute, UHN, 101 College Street, Toronto, Ontario M5G 1L7, Canada
2 University of Toronto, Departments of Computer Science and Medical Biophysics, Canada

*To whom correspondence should be addressed. Dr. Igor Jurisica E-mail: juris{at}ai.utoronto.ca


Abstract

Motivation: Identifying and characterizing protein-protein interactions (PPIs) is one of the key aims of biological research. Although previous text-mining research has made substantial progress in automatically detecting PPIs in the literature, the precision and recall of the process still need to improve. More accurate PPI detection will also improve the ability to extract experimental data related to PPIs and to provide multiple lines of evidence for each interaction.

Results: We developed an interaction detection method and explored the usefulness of various features for automatically identifying PPIs in text. The results show that our approach outperforms other systems on the AImed dataset. For the tests in which our system achieves better precision at the cost of reduced recall, we discuss possible approaches for improvement. In addition to the test datasets, we evaluated performance on interactions from five human-curated databases (BIND, DIP, HPRD, IntAct and MINT), where our system consistently identified evidence for about 60% of interactions when both proteins appear in at least one sentence of the PubMed abstract. We then applied the system to extract articles from PubMed to annotate known, high-throughput and interologous interactions in I2D.
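
The condition above, that both proteins appear in at least one sentence of an abstract, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the naive regex sentence splitter and the case-insensitive name matching are all assumptions made for illustration, and a real system would need proper sentence segmentation and protein name normalization.

```python
import re


def split_sentences(text):
    """Naive splitter (assumption): break after ., ! or ? when followed by
    whitespace and an uppercase letter or digit."""
    parts = re.split(r"(?<=[.!?])\s+(?=[A-Z0-9])", text)
    return [p.strip() for p in parts if p.strip()]


def mentions(sentence, name):
    """True if `name` occurs in `sentence` as a whole token (case-insensitive),
    so that 'BRCA1' does not match inside 'BRCA11'."""
    pattern = r"(?<![A-Za-z0-9])" + re.escape(name) + r"(?![A-Za-z0-9])"
    return re.search(pattern, sentence, flags=re.IGNORECASE) is not None


def cooccurrence_sentences(abstract, protein_a, protein_b):
    """Return the sentences of `abstract` that mention both protein names."""
    return [
        s for s in split_sentences(abstract)
        if mentions(s, protein_a) and mentions(s, protein_b)
    ]


# Hypothetical example text, not taken from the paper's data.
example = (
    "BRCA1 binds BARD1 in vivo. TP53 was also studied. "
    "BRCA1 is a tumour suppressor."
)
candidates = cooccurrence_sentences(example, "BRCA1", "BARD1")
```

Only sentences returned by such a filter would be candidates for an interaction classifier; co-occurrence alone does not establish that the sentence describes an interaction.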

Availability: The data and software are available at: http://www.cs.utoronto.ca/~juris/data/BI09/.

Contact: yniu{at}uhnres.utoronto.ca, juris{at}ai.utoronto.ca

Associate Editor: Prof. Burkhard Rost


Received on November 26, 2008; revised on October 2, 2009; accepted on October 16, 2009
