Bioinformatics Advance Access published online on July 21, 2005
Bioinformatics, doi:10.1093/bioinformatics/bti586
1 European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
* To whom correspondence should be addressed.
Motivation: Biological literature contains many abbreviations, each with one particular sense within a given document. However, most abbreviations do not have a unique sense across the literature. Furthermore, many documents do not contain the long-forms of the abbreviations they use. Resolving an abbreviation in a document consists of retrieving the sense in which it is used. Abbreviation resolution improves the accuracy of document retrieval engines and of information extraction systems.

Results: We combine an automatic analysis of Medline abstracts with linguistic methods to build a dictionary of abbreviation/sense pairs. The dictionary is used to resolve abbreviations that occur with their long-forms. Ambiguous global abbreviations are resolved using Support Vector Machines trained on the context of each instance of the abbreviation/sense pairs previously extracted for the dictionary setup. The system disambiguates abbreviations with a precision of 98.9% and a recall of 98.2% (98.5% accuracy). This performance exceeds that of previously reported work.

Availability: The abbreviation resolution module is available at http://www.ebi.ac.uk/Rebholz/software.html.
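The dictionary-based step described under Results, resolving an abbreviation that occurs next to its long-form, can be illustrated with a minimal sketch. This is not the authors' module: the dictionary contents, the function name `resolve_local`, and the "long-form (ABBR)" pattern are assumptions made for illustration only, and the SVM step for abbreviations without a long-form is not shown.

```python
import re

# Hypothetical toy dictionary mapping an abbreviation to its known
# long-forms (senses). The real dictionary is built from Medline abstracts.
ABBREVIATION_SENSES = {
    "AR": ["androgen receptor", "aldose reductase"],
    "PCR": ["polymerase chain reaction"],
}


def resolve_local(text, senses=ABBREVIATION_SENSES):
    """Resolve abbreviations written as 'long-form (ABBR)' in the text.

    Returns a dict {abbreviation: long-form} containing only those
    abbreviations whose parenthesised mention is directly preceded by
    one of the long-forms listed in the dictionary.
    """
    resolved = {}
    # Find every parenthesised token, e.g. "(AR)".
    for match in re.finditer(r"\(([A-Za-z0-9-]+)\)", text):
        abbreviation = match.group(1)
        # Text immediately before the opening parenthesis, lowercased.
        preceding = text[:match.start()].rstrip().lower()
        for long_form in senses.get(abbreviation, []):
            if preceding.endswith(long_form):
                resolved[abbreviation] = long_form
                break
    return resolved


example = "The androgen receptor (AR) gene was amplified by PCR."
print(resolve_local(example))
```

In this example only `AR` is resolved, because `PCR` appears without its long-form; such an occurrence is the kind of case the abstract assigns to the context-based classifier.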
Received April 1, 2005
Revised June 20, 2005
Accepted July 14, 2005
Resolving abbreviations to their senses in Medline
S. Gaudan, E-mail: gaudan{at}ebi.ac.uk