Integrating image data into biomedical text categorization
School of Computing, Queen's University, Kingston, Ontario, Canada
*To whom correspondence should be addressed.
Categorization of biomedical articles is a central task for supporting various curation efforts. It can also form the basis for effective biomedical text mining. Automatic text classification in the biomedical domain is thus an active research area. Contests organized by the KDD Cup (2002) and the TREC Genomics track (since 2003) defined several annotation tasks that involved document classification, and provided training and test data sets. So far, these efforts have focused on analyzing only the text content of documents. However, as was noted in the KDD'02 text mining contest, where figure captions proved to be an invaluable feature for identifying documents of interest, images often provide curators with critical information. We examine the possibility of using information derived directly from image data, and of integrating it with text-based classification, for biomedical document categorization. We present a method for obtaining features from images and for using them, both alone and in combination with text, to perform the triage task introduced in the TREC Genomics track 2004. The task was to determine which documents are relevant to a given annotation task performed by the Mouse Genome Database curators. We show preliminary results, demonstrating that the method has a strong potential to enhance and complement traditional text-based categorization methods.
Contact: shatkay{at}cs.queensu.ca

