Bioinformatics Advance Access published online on January 28, 2009
Bioinformatics, doi:10.1093/bioinformatics/btp062
Parallelized prediction error estimation for evaluation of high-dimensional models
Department of Medical Biometry and Statistics, Institute of Medical Biometry and Medical Informatics, University Medical Center Freiburg, 79104 Freiburg, Germany
*To whom correspondence should be addressed. Ms. Christine Porzelius, E-mail: cp{at}fdm.uni-freiburg.de
Abstract
Summary: There is a multitude of new techniques that promise to extract predictive information in bioinformatics applications. It has been recognized that a first step for validation of the resulting model fits should rely on proper use of resampling techniques. However, this advice is frequently not followed, potential reasons being the difficulty of correct implementation and the computational demand. This is addressed by the R package peperr, which is designed for reliable prediction error estimation through resampling, potentially accelerated by parallel execution on a compute cluster. Its interface allows for easy connection to newly developed model fitting routines. Performance evaluation of the latter is furthermore guided by diagnostic plots, which help to detect specific problems due to high-dimensional data structures.
Availability: http://cran.r-project.org, http://www.imbi.uni-freiburg.de/parallel.
Contact: cp{at}fdm.uni-freiburg.de
Supplementary information: Supplementary material is available at Bioinformatics online.
Associate Editor: Prof. David Rocke
Received on July 23, 2008; revised on January 21, 2009; accepted on January 26, 2009
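The resampling idea in the summary can be illustrated with a minimal sketch. It is written in Python rather than R and does not reproduce the interface of peperr: every name below (`fit_mean_model`, `one_bootstrap_error`, `bootstrap_prediction_error`, `apparent_error`) is hypothetical. It assumes squared error as the loss, a toy model that predicts the training mean, and a local thread pool standing in for a compute cluster. Each bootstrap replicate refits the model on a resample and scores it only on the observations left out of that resample, so the replicates are independent and can be dispatched in parallel.

```python
import random
from concurrent.futures import ThreadPoolExecutor
from statistics import mean


def fit_mean_model(train):
    """Toy stand-in for a model fitting routine: predict the training mean."""
    mu = mean(y for _, y in train)
    return lambda x: mu


def one_bootstrap_error(task):
    """Fit on one bootstrap sample and score on the out-of-bag observations."""
    data, fit, seed = task
    rng = random.Random(seed)  # per-replicate seed keeps results reproducible
    n = len(data)
    idx = [rng.randrange(n) for _ in range(n)]  # draw n indices with replacement
    in_bag = set(idx)
    oob = [data[i] for i in range(n) if i not in in_bag]
    if not oob:  # rare: every observation was drawn, nothing left to test on
        return None
    predict = fit([data[i] for i in idx])
    return mean((y - predict(x)) ** 2 for x, y in oob)


def bootstrap_prediction_error(data, fit, n_boot=50, seed=0, workers=4):
    """Average out-of-bag squared error over n_boot replicates run in a pool."""
    tasks = [(data, fit, seed + b) for b in range(n_boot)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        errors = [e for e in pool.map(one_bootstrap_error, tasks) if e is not None]
    return mean(errors)


def apparent_error(data, fit):
    """Error on the training data itself, which is typically too optimistic."""
    predict = fit(data)
    return mean((y - predict(x)) ** 2 for x, y in data)


if __name__ == "__main__":
    rng = random.Random(1)
    data = [(x, 2.0 * x + rng.gauss(0.0, 1.0)) for x in range(30)]
    print("apparent error:     ", apparent_error(data, fit_mean_model))
    print("out-of-bag estimate:", bootstrap_prediction_error(data, fit_mean_model))
```

Because the fitting routine is passed in as an argument, a newly developed model can be evaluated without changing the resampling code. For CPU-bound fits, the thread pool would have to be replaced by a process pool or a cluster backend to gain real speed-up.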