Bioinformatics Advance Access originally published online on August 22, 2006
Bioinformatics 2006 22(18):2298-2304; doi:10.1093/bioinformatics/btl388

© The Author 2006. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oxfordjournals.org

Text similarity: an alternative way to search MEDLINE

James Lewis, Stephan Ossowski, Justin Hicks, Mounir Errami and Harold R. Garner*

University of Texas Southwestern Medical Center, Eugene McDermott Center for Human Growth and Development, Division for Translational Research, 5323 Harry Hines Boulevard, Dallas, TX 75390, USA

*To whom correspondence should be addressed.

Motivation: The most widely used literature search techniques, such as those offered by NCBI's PubMed system, require significant effort on the part of the searcher, and inexperienced searchers do not use these systems as effectively as experienced users. Improved literature search engines can save researchers time and effort by making it easier to locate the most important and relevant literature.

Results: We have created and optimized a new, hybrid search system for Medline that takes natural text as input and then delivers results with high precision and recall. The system combines a fast, low-sensitivity, weighted keyword-based first-pass algorithm, which casts a wide net to gather an initial set of literature, with a unique sentence-alignment based similarity algorithm that rank-orders those results; the combination is sensitive, fast and easy to use. Several text similarity search algorithms, both standard and novel, were implemented and tested in order to determine which obtained the best results in information retrieval exercises.
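
The two-stage design described in the Results can be illustrated with a minimal sketch. This is not the eTBLAST implementation: the IDF-style keyword weighting, the use of Python's `difflib.SequenceMatcher` as a stand-in for sentence alignment, and every function name and parameter (`keyword_score`, `alignment_score`, `search`, `first_pass_size`, `top_k`) are assumptions made only to show the general shape of a cheap keyword first pass followed by a more expensive alignment-based re-ranking.

```python
import math
import re
from collections import Counter
from difflib import SequenceMatcher

# Tiny illustrative stopword list (an assumption, not the authors' list).
STOPWORDS = {"the", "a", "an", "of", "and", "in", "to", "is", "for", "with", "by", "on"}


def tokenize(text):
    """Lowercase word tokens with stopwords removed."""
    return [w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOPWORDS]


def split_sentences(text):
    """Naive sentence splitter on ., ! or ? followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def idf_weights(corpus):
    """Smoothed inverse document frequency for every term in the corpus."""
    n = len(corpus)
    doc_freq = Counter()
    for doc in corpus:
        doc_freq.update(set(tokenize(doc)))
    return {t: math.log((n + 1) / (df + 1)) + 1.0 for t, df in doc_freq.items()}


def keyword_score(query, doc, idf):
    """First pass: sum of the weights of the terms shared by query and document."""
    shared = set(tokenize(query)) & set(tokenize(doc))
    return sum(idf.get(t, 0.0) for t in shared)


def alignment_score(query, doc):
    """Second pass: align each query sentence against its best-matching
    document sentence (word-level) and average the match ratios."""
    q_sents = [t for t in (tokenize(s) for s in split_sentences(query)) if t]
    d_sents = [t for t in (tokenize(s) for s in split_sentences(doc)) if t]
    if not q_sents or not d_sents:
        return 0.0
    best = [
        max(SequenceMatcher(None, q, d, autojunk=False).ratio() for d in d_sents)
        for q in q_sents
    ]
    return sum(best) / len(best)


def search(query, corpus, first_pass_size=3, top_k=2):
    """Gather candidates by keyword score, then rank them by alignment score.
    Returns a list of (document index, alignment score) pairs."""
    idf = idf_weights(corpus)
    candidates = sorted(
        range(len(corpus)),
        key=lambda i: keyword_score(query, corpus[i], idf),
        reverse=True,
    )[:first_pass_size]
    scored = [(i, alignment_score(query, corpus[i])) for i in candidates]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


if __name__ == "__main__":
    docs = [
        "Text similarity search ranks Medline abstracts. Sentence alignment improves the ranking.",
        "Protein folding is studied with molecular dynamics. Simulations are slow.",
        "Keyword search of Medline abstracts is common. Users type short queries.",
    ]
    for index, score in search("Ranking Medline abstracts by text similarity.", docs):
        print(index, round(score, 3))
```

The split mirrors the trade-off the abstract describes: the keyword pass is cheap enough to run over the whole collection, while the alignment pass is only applied to the short candidate list it returns.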

Availability: Literature searching algorithms are implemented in a system called eTBLAST, freely accessible over the web at http://invention.swmed.edu. A variety of other derivative systems and visualization tools provides the user with an enhanced experience and additional capabilities.

Contact: Harold.Garner@UTSouthwestern.edu


Received on May 22, 2006; revised on July 5, 2006; accepted on July 7, 2006





