
Bioinformatics Vol. 17 no. 4 2001
Pages 359-363
© 2001 Oxford University Press


Original Paper

Mining literature for protein–protein interactions

Edward M. Marcotte 1,2,3,*, Ioannis Xenarios 1,* and David Eisenberg 1

1 Molecular Biology Institute, UCLA-DOE Laboratory of Structural Biology & Molecular Medicine, University of California at Los Angeles, PO Box 951570, Los Angeles, CA 90095-1570, USA
2 Protein Pathways Inc., 1145 Gayley Avenue, Ste. 304, Los Angeles, CA 90024, USA
3 Institute of Cellular and Molecular Biology, Department of Chemistry and Biochemistry, University of Texas at Austin, 2500 Speedway, Austin, TX 78712, USA

Received on August 3, 2000; revised on November 16, 2000; accepted on November 22, 2000

Motivation: A central problem in bioinformatics is how to capture information from the vast current scientific literature in a form suitable for analysis by computer. We address the special case of information on protein–protein interactions, and show that the frequencies of words in Medline abstracts can be used to determine whether or not a given paper discusses protein–protein interactions. For those papers determined to discuss this topic, the relevant information can be captured for the Database of Interacting Proteins. Furthermore, suitable gene annotations can also be captured.

Results: Our Bayesian approach scores Medline abstracts for probability of discussing the topic of interest according to the frequencies of discriminating words found in the abstract. More than 80 discriminating words (e.g. complex, interaction, two-hybrid) were determined from a training set of 260 Medline abstracts corresponding to previously validated entries in the Database of Interacting Proteins. Using these words and a log likelihood scoring function, 2000 Medline abstracts were identified as describing interactions between yeast proteins. This approach now forms the basis for the rapid expansion of the Database of Interacting Proteins.
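As an illustration of the scoring idea described above, the following is a minimal Python sketch, not the authors' implementation. The abstract does not give the exact form of the log likelihood function, so this sketch assumes a simple per-word log-odds weight with add-one smoothing, computed from a set of abstracts known to discuss interactions and a set of background abstracts. The function names (`tokenize`, `word_log_odds`, `score_abstract`) and the `min_abs_score` cutoff are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens,
    keeping hyphenated terms such as 'two-hybrid' intact."""
    return re.findall(r"[a-z0-9]+(?:-[a-z0-9]+)*", text.lower())


def word_log_odds(positive_docs, background_docs, min_abs_score=0.5):
    """Assumed scoring scheme: for each word, the log of its smoothed
    relative frequency in the positive (training) abstracts divided by
    its smoothed relative frequency in the background abstracts.
    Only words whose absolute log-odds reach min_abs_score are kept
    as 'discriminating' words."""
    pos = Counter(w for doc in positive_docs for w in tokenize(doc))
    bg = Counter(w for doc in background_docs for w in tokenize(doc))
    vocab = set(pos) | set(bg)
    # Add-one smoothing so unseen words never produce log(0).
    pos_total = sum(pos.values()) + len(vocab)
    bg_total = sum(bg.values()) + len(vocab)
    weights = {}
    for word in vocab:
        score = (math.log((pos[word] + 1) / pos_total)
                 - math.log((bg[word] + 1) / bg_total))
        if abs(score) >= min_abs_score:
            weights[word] = score
    return weights


def score_abstract(text, weights):
    """Sum the log-odds weights of the discriminating words found in an
    abstract; words without a weight contribute nothing."""
    return sum(weights.get(word, 0.0) for word in tokenize(text))
```

Under these assumptions, an abstract would be flagged as likely to discuss protein–protein interactions when its summed score exceeds a threshold chosen on held-out data; the threshold itself is not specified here.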

Contact: marcotte@icmb.utexas.edu; ixenario@mbi.ucla.edu; david@mbi.ucla.edu

* These authors contributed equally to this work.




