Bioinformatics Advance Access published online on October 27, 2004
Bioinformatics, doi:10.1093/bioinformatics/bti087
Bioinformatics © Oxford University Press 2004; all rights reserved
1 Department of Electrical and Computer Engineering, Iowa State University, Ames, IA 50011, USA
* To whom correspondence should be addressed.
Summary: MEDLINE/PubMed is one of the most important information sources for bioinformatics text mining. However, working with MEDLINE/PubMed citations still has limitations. For example, PubMed imposes an upper limit of 10,000 on downloads of PMID lists or citations, and MEDLINE files are too large for most off-the-shelf XML parsers. We developed a Java package, MedKit, to work around these limitations and to provide other useful functions, such as random sampling. Its four modules (querier, sampler, fetcher and parser) can work independently or be pipelined in various combinations. MedKit can be used as a stand-alone GUI application or integrated into other text mining systems. Text mining researchers and others may download and use the toolkit free of charge for non-commercial purposes.
Availability: http://metnetdb.gdcb.iastate.edu/medkit
Revised September 13, 2004
Accepted October 7, 2004
Applications note
MedKit: A helper toolkit for automatic mining of MEDLINE/PubMed citations
Daniel Berleant, E-mail: berleant{at}iastate.edu