Bioinformatics Advance Access first published online on January 15, 2009
This version published online on January 19, 2009
Bioinformatics, doi:10.1093/bioinformatics/btp022
MSMAD: A computationally efficient method for the analysis of noisy array CGH data.
1Institute of Biostatistics and Analyses, Masaryk University, Kamenice 126/3, 625 00 Brno, Czech Republic
2Institute for Medical Informatics, Statistics and Documentation, Medical University of Graz, Auenbruggerplatz 2, 8036 Graz, Austria
*To whom correspondence should be addressed. Eva Budinska, E-mail: budinska{at}iba.muni.cz
Abstract
Motivation: Genome analysis has become one of the most important tools for understanding the complex process of carcinogenesis. As the resolution of CGH arrays increases, so does the demand for computationally efficient algorithms that detect aberrations effectively even in very noisy data.
Results: We developed a simple, nonparametric technique of high computational efficiency for array CGH analysis. It uses median smoothing as a preprocessing step and a median absolute deviation (MAD) criterion for breakpoint detection. The resulting algorithm has the potential to outperform any single smoothing approach as well as several recently proposed segmentation techniques. We demonstrate its performance on simulated and real data sets in comparison with three other methods for array CGH analysis.
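To make the two ingredients named above concrete, the following is a minimal Python sketch of median smoothing followed by a MAD-scaled breakpoint test. It is an illustration only and not the authors' MSMAD implementation (which is written in R): the function names, the window size `window`, the multiplier `k`, the use of first differences to estimate noise, and the rule of keeping the strongest jump in each run of flagged positions are all assumptions introduced here. The input is assumed to contain at least `2 * window` values.

```python
from statistics import median


def running_median(values, window=5):
    """Median-smooth a sequence with a centered window that shrinks at the edges."""
    half = window // 2
    return [median(values[max(0, i - half): i + half + 1])
            for i in range(len(values))]


def mad(values):
    """Median absolute deviation from the median."""
    values = list(values)
    center = median(values)
    return median(abs(v - center) for v in values)


def detect_breakpoints(log_ratios, window=5, k=3.0):
    """Return candidate breakpoint indices (hypothetical rule, not MSMAD itself).

    A position is flagged when the medians of the smoothed values in the
    windows to its left and right differ by more than k times the MAD of
    the raw first differences. Each run of adjacent flagged positions is
    reduced to the position with the largest jump.
    """
    smoothed = running_median(log_ratios, window)
    diffs = [b - a for a, b in zip(log_ratios, log_ratios[1:])]
    threshold = k * mad(diffs)  # assumed noise scale

    breakpoints, run = [], []
    for i in range(window, len(smoothed) - window + 1):
        left = median(smoothed[i - window:i])
        right = median(smoothed[i:i + window])
        score = abs(right - left)
        if score > threshold:
            run.append((score, i))
        elif run:
            breakpoints.append(max(run, key=lambda s: s[0])[1])
            run = []
    if run:
        breakpoints.append(max(run, key=lambda s: s[0])[1])
    return breakpoints
```

Calling `detect_breakpoints(log_ratios)` on a list of log2 ratios ordered by genomic position returns the indices where a level shift is suspected. Both the smoothing and the threshold rely only on medians, which is why a scheme of this kind is nonparametric and tolerant of outlying probes.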
Implementation: Our approach is implemented in the R language and environment for statistical computing (version 2.6.1 for Windows; R-project, 2007). The code is available at http://www.iba.muni.cz/~budinska/msmad.html
Contact: budinska{at}iba.muni.cz
Associate Editor: Dr. Alex Bateman
Received on June 4, 2008; revised on November 5, 2008; accepted on January 8, 2009