Bioinformatics Advance Access published online on March 5, 2008
Bioinformatics, doi:10.1093/bioinformatics/btn086
e-LiSe – an online tool for finding needles in the "(Medline) haystack"
1Bioinformatics Department, Institute of Biochemistry and Biophysics, Polish Academy of Sciences, ul. Pawinskiego 5a, 02-106 Warszawa, Poland
2Plant Molecular Biology Department, Warsaw University, Warszawa, Poland
*To whom correspondence should be addressed. Piotr Zielenkiewicz, E-mail: piotr{at}ibb.waw.pl
Abstract
Summary: Literature databases can reveal not only known and true relations between processes but also less studied, non-obvious associations. The main problem in discovering this type of relevant biological information is "selection". The ability to distinguish a true correlation (e.g. between different types of biological processes) from one that appears statistically significant only by random chance is crucial for any bio-medical research, literature mining being no exception. This problem is especially visible when searching for information that has not been studied and described in many publications. A novel bio-linguistic statistical method is therefore required, one capable of "selecting" true correlations even when they are low-frequency associations.
In this paper we present such a statistical approach, based on the Z-score and implemented in the web-based application "e-LiSe".
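To illustrate the general idea of a Z-score in this setting, the following is a minimal sketch and not the e-LiSe implementation, whose exact statistic is not described in this abstract. It assumes a simple binomial null model: a term occurs in each retrieved abstract with its background (whole-database) rate. The function name `z_score`, its parameters, and the numbers in the usage example are all hypothetical.

```python
import math


def z_score(observed: int, sample_size: int, background_rate: float) -> float:
    """Z-score of an observed term count against a binomial null model.

    Assumption (not taken from the paper): under the null hypothesis the
    term appears in each of `sample_size` retrieved abstracts independently
    with probability `background_rate`.
    """
    if not 0.0 < background_rate < 1.0:
        raise ValueError("background_rate must lie strictly between 0 and 1")
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")

    # Mean and standard deviation of a binomial(sample_size, background_rate).
    expected = sample_size * background_rate
    std_dev = math.sqrt(sample_size * background_rate * (1.0 - background_rate))

    # Number of standard deviations the observation lies from expectation.
    return (observed - expected) / std_dev


if __name__ == "__main__":
    # Hypothetical numbers: a term found in 12 of 200 retrieved abstracts,
    # while it occurs in 1% of abstracts in the whole database.
    print(round(z_score(12, 200, 0.01), 2))
```

Under these assumptions, a term seen far more often than its background rate predicts receives a large positive score even when its absolute count is small, which is the kind of low-frequency association the summary refers to.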
Availability: The software is available at http://miron.ibb.waw.pl/elise/
Supplementary materials are available at http://miron.ibb.waw.pl/elise.supplementary/
Contact: piotr{at}ibb.waw.pl
Associate Editor: Prof. Alfonso Valencia
Received on August 2, 2007; revised on February 25, 2008; accepted on March 3, 2008