Bioinformatics Advance Access originally published online on July 21, 2005
Bioinformatics 2005 21(18):3658-3664; doi:10.1093/bioinformatics/bti586
Resolving abbreviations to their senses in Medline
European Bioinformatics Institute Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
*To whom correspondence should be addressed.
Motivation: Biological literature contains many abbreviations, each of which has one particular sense within a given document. However, most abbreviations do not have a unique sense across the literature, and many documents do not contain the long forms of the abbreviations they use. Resolving an abbreviation in a document consists of retrieving the sense in which it is used. Abbreviation resolution improves the accuracy of document retrieval engines and of information extraction systems.
Results: We combine an automatic analysis of Medline abstracts with linguistic methods to build a dictionary of abbreviation/sense pairs. The dictionary is used to resolve abbreviations that occur together with their long forms. Ambiguous global abbreviations are resolved using support vector machines trained on the context of each instance of the abbreviation/sense pairs previously extracted to build the dictionary. The system disambiguates abbreviations with a precision of 98.9% at a recall of 98.2% (98.5% accuracy). This performance exceeds that of previously reported work.
Availability: The abbreviation resolution module is available at http://www.ebi.ac.uk/Rebholz/software.html
Contact: gaudan{at}ebi.ac.uk
Received on April 1, 2005; revised on June 20, 2005; accepted on July 14, 2005
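The two-stage resolution described in the Results can be illustrated with a minimal sketch. It is not the authors' implementation: the toy dictionary `SENSES`, the training sentences in `TRAIN`, and all function names are hypothetical, and an unregularised linear classifier trained with hinge-loss subgradient steps stands in for the support vector machines. An abbreviation that occurs next to its long form is resolved by dictionary lookup; otherwise the classifier picks the sense whose training contexts best match the words of the document.

```python
import re
from collections import Counter

# Hypothetical toy dictionary: abbreviation -> candidate long forms (senses).
SENSES = {"AR": ["androgen receptor", "aldose reductase"]}

# Hypothetical training contexts, each labelled with the sense in use.
TRAIN = [
    ("androgen receptor (AR) expression in prostate cancer cells", "androgen receptor"),
    ("testosterone binding to the androgen receptor (AR) in prostate tissue", "androgen receptor"),
    ("aldose reductase (AR) inhibitors in diabetic rats", "aldose reductase"),
    ("glucose reduction by aldose reductase (AR) in diabetic lens", "aldose reductase"),
]


def tokenize(text):
    """Lowercase bag-of-words tokens (alphabetic runs only)."""
    return re.findall(r"[a-z]+", text.lower())


def resolve_local(abbr, text):
    """Dictionary stage: return the sense written as 'long form (ABBR)' in the text, if any."""
    lowered = text.lower()
    for sense in SENSES.get(abbr, []):
        if f"{sense} ({abbr.lower()})" in lowered:
            return sense
    return None


def score(weights, feats):
    """Dot product between a sparse weight vector and token counts."""
    return sum(weights.get(tok, 0.0) * n for tok, n in feats.items())


def train(examples, epochs=20, lr=0.5):
    """One-vs-rest linear classifiers fitted with hinge-loss subgradient steps.

    Simplified stand-in for an SVM: no regularisation term and no bias.
    """
    labels = sorted({label for _, label in examples})
    weights = {label: {} for label in labels}
    for _ in range(epochs):
        for text, gold in examples:
            feats = Counter(tokenize(text))
            for label in labels:
                y = 1.0 if label == gold else -1.0
                w = weights[label]
                if y * score(w, feats) < 1.0:  # margin violated: take a step
                    for tok, n in feats.items():
                        w[tok] = w.get(tok, 0.0) + lr * y * n
    return weights


def resolve_global(weights, text):
    """Classifier stage: return the sense with the highest context score."""
    feats = Counter(tokenize(text))
    return max(sorted(weights), key=lambda label: score(weights[label], feats))


def resolve(abbr, text, weights):
    """Try the dictionary stage first, then fall back to the context classifier."""
    return resolve_local(abbr, text) or resolve_global(weights, text)


if __name__ == "__main__":
    model = train(TRAIN)
    print(resolve("AR", "The androgen receptor (AR) mediates hormone signalling.", model))
    print(resolve("AR", "AR signalling was reduced in prostate cancer cells.", model))
    print(resolve("AR", "AR activity was elevated in diabetic rats.", model))
```

In this sketch one model is trained per ambiguous abbreviation, so `model` here only distinguishes the senses of "AR". A realistic setup would train on many Medline contexts per sense and would likely use a regularised SVM from an established library.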




