Bioinformatics 2005 21(Suppl 2):ii245-ii251; doi:10.1093/bioinformatics/bti1141
© The Author 2005. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions{at}oxfordjournals.org

A probabilistic model for mining implicit ‘chemical compound–gene’ relations from literature

Shanfeng Zhu 1, Yasushi Okuno 2, Gozoh Tsujimoto 2 and Hiroshi Mamitsuka 1,*

1 Bioinformatics Center, Institute for Chemical Research, Kyoto University, Gokasho, Uji 611-0011, Japan
2 Graduate School of Pharmaceutical Sciences, Kyoto University, Sakyo-ku, Kyoto 606-8501, Japan

*To whom correspondence should be addressed.

Motivation: Chemical compounds have received growing emphasis in molecular biology, and ‘chemical genomics’ has attracted a great deal of attention in recent years. An important issue in current molecular biology is therefore to identify biologically related chemical compounds (more specifically, drugs) and genes. Co-occurrence of biological entities in the literature is a simple, comprehensive and popular technique for finding associations between these entities. Our focus is to mine implicit ‘chemical compound–gene’ relations from co-occurrence in the literature.
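As a rough illustration of what literature co-occurrence means here, the following minimal sketch counts how many records mention a compound and a gene together. The record format, the entity names and the function name `count_cooccurrences` are assumptions made for this example and do not come from the article, which works from MEDLINE records.

```python
from collections import Counter
from itertools import product


def count_cooccurrences(records):
    """Count compound-gene pairs that are mentioned in the same record."""
    counts = Counter()
    for compounds, genes in records:
        # Each distinct pair is counted once per record, however often
        # the two names are repeated inside that record.
        for pair in product(sorted(set(compounds)), sorted(set(genes))):
            counts[pair] += 1
    return counts


# Hypothetical toy records: (compounds mentioned, genes mentioned) per abstract.
records = [
    (["aspirin"], ["PTGS1", "PTGS2"]),
    (["aspirin", "ibuprofen"], ["PTGS2"]),
    (["ibuprofen"], ["PTGS2"]),
]

counts = count_cooccurrences(records)
for (compound, gene), n in sorted(counts.items()):
    print(compound, gene, n)
```

Pairs that never share a record, such as ibuprofen and PTGS1 in this toy input, get a count of zero; those are the 'implicit' relations a model has to infer.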

Results: We propose a probabilistic model, called the mixture aspect model (MAM), and an algorithm for estimating its parameters, to efficiently handle different types of co-occurrence datasets at once. We examined the performance of our approach not only by cross-validation using data generated from MEDLINE records but also by a test using an independent human-curated dataset of the relationships between chemical compounds and genes in the ChEBI database. In both cases we ran experiments on three different types of co-occurrence datasets (i.e. compound–gene, gene–gene and compound–compound co-occurrences). The experimental results showed that MAM trained on all datasets outperformed any simple model trained on other combinations of datasets, with the difference being statistically significant in all cases. In particular, we found that incorporating compound–compound co-occurrences was the most effective in improving predictive performance. Finally, we computed the likelihoods of all unknown compound–gene (more specifically, drug–gene) pairs using our approach and selected the top 20 pairs according to the likelihoods. We validated them from biological, medical and pharmaceutical viewpoints.
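The abstract does not spell out the details of MAM, so the sketch below is not the authors' model. It shows only a plain aspect model fitted by EM to a single compound–gene count table, assuming the standard form p(c, g) = Σ_z p(z) p(c|z) p(g|z). How MAM combines several co-occurrence datasets is not reproduced here. All names (`fit_aspect_model`, the toy counts, the number of aspects) are hypothetical.

```python
import random


def fit_aspect_model(counts, n_aspects=2, n_iter=50, seed=0):
    """Fit p(c, g) = sum_z p(z) p(c|z) p(g|z) to pair counts by EM.

    Plain aspect model on one count table; NOT the article's MAM.
    """
    rng = random.Random(seed)
    compounds = sorted({c for c, _ in counts})
    genes = sorted({g for _, g in counts})

    def random_dist(keys):
        # Random positive weights, normalised to sum to one.
        weights = {k: rng.random() + 0.1 for k in keys}
        total = sum(weights.values())
        return {k: w / total for k, w in weights.items()}

    p_z = [1.0 / n_aspects] * n_aspects
    p_c = [random_dist(compounds) for _ in range(n_aspects)]
    p_g = [random_dist(genes) for _ in range(n_aspects)]

    for _ in range(n_iter):
        new_z = [0.0] * n_aspects
        new_c = [dict.fromkeys(compounds, 0.0) for _ in range(n_aspects)]
        new_g = [dict.fromkeys(genes, 0.0) for _ in range(n_aspects)]
        for (c, g), n in counts.items():
            # E-step: posterior over aspects for this observed pair.
            joint = [p_z[z] * p_c[z][c] * p_g[z][g] for z in range(n_aspects)]
            total = sum(joint)
            if total == 0.0:
                continue
            for z in range(n_aspects):
                resp = n * joint[z] / total
                new_z[z] += resp
                new_c[z][c] += resp
                new_g[z][g] += resp
        # M-step: renormalise the expected counts.
        grand = sum(new_z)
        for z in range(n_aspects):
            if new_z[z] > 0.0:
                p_c[z] = {c: v / new_z[z] for c, v in new_c[z].items()}
                p_g[z] = {g: v / new_z[z] for g, v in new_g[z].items()}
            p_z[z] = new_z[z] / grand

    def score(c, g):
        """Model probability of the pair (c, g), observed or not."""
        return sum(p_z[z] * p_c[z][c] * p_g[z][g] for z in range(n_aspects))

    return score, compounds, genes


# Hypothetical toy counts with two loose groups of compounds and genes.
toy_counts = {
    ("c1", "g1"): 4, ("c1", "g2"): 3, ("c2", "g1"): 5,
    ("c3", "g3"): 4, ("c4", "g3"): 2, ("c4", "g4"): 3,
}
score, compounds, genes = fit_aspect_model(toy_counts)

# Rank the pairs that never co-occurred by their model probability.
unseen = [(c, g) for c in compounds for g in genes if (c, g) not in toy_counts]
ranked = sorted(unseen, key=lambda pair: score(*pair), reverse=True)
```

Ranking unobserved pairs by model probability mirrors, in spirit only, the step in which the authors score all unknown compound–gene pairs and keep the top 20.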

Contact: mami{at}kuicr.kyoto-u.ac.jp


