Bioinformatics Advance Access published online on July 4, 2008
Bioinformatics, doi:10.1093/bioinformatics/btn321
A Probe-Density Based Analysis Method for Array CGH data: Simulation, Normalization and Centralization
1Department of Electrical Engineering, National Taiwan University, Taipei, Taiwan 106, 2Research Center for Medical Excellence, National Taiwan University, Taipei, Taiwan 100, 3Genetics Branch, National Cancer Institute, National Institutes of Health, Bethesda, MD 20892, USA, 4Institute of Biotechnology, Center for Systems Biology and Bioinformatics, National Taiwan University, Taipei, Taiwan 106, 5College of Medicine, National Taiwan University, Taipei, Taiwan 100, 6Department of Life Science, National Taiwan University, Taipei, Taiwan 106, 7Graduate Institute of Biomedical Electronics and Bioinformatics, National Taiwan University, Taipei, Taiwan 106, 8Graduate Institute of Epidemiology, National Taiwan University, Taipei, Taiwan 106
*To whom correspondence should be addressed. Eric Y. Chuang, E-mail: chuangey{at}ntu.edu.tw
Abstract
Motivation: Genomic instability is one of the fundamental factors in tumorigenesis and tumor progression. Many studies have shown that copy-number abnormalities at the DNA level are important in the pathogenesis of cancer. Array Comparative Genomic Hybridization (array CGH), which builds on expression microarray technology, can reveal segmental chromosomal copy-number aberrations at high resolution. However, due to the nature of array CGH, many standard processing methods for expression data, such as data normalization, often fail to yield satisfactory results.
Results: We demonstrate a novel array CGH normalization algorithm that provides accurate normalization by exploiting the dependency between neighboring probe measurements in array CGH experiments. To facilitate the study, we developed a Hidden Markov Model (HMM) that simulates a series of array CGH experiments with random DNA copy number alterations, which we use to validate the performance of our normalization. In addition, we applied the proposed normalization algorithm to an array CGH study of lung cancer cell lines. With the proposed algorithm, data quality and the reliability of experimental results are significantly improved, and distinct patterns of DNA copy number alterations are observed among those lung cancer cell lines.
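As an illustration of the simulation idea only, the following is a minimal sketch of how an HMM can generate array CGH log2 ratios in which neighboring probes tend to share a copy-number state. It is not the authors' model: the three copy-number states, the `stay_prob` transition parameter, the Gaussian noise level `noise_sd`, the global `shift` offset, and the function name `simulate_acgh` are all assumptions made for this example.

```python
import math
import random

# Hypothetical copy-number states (loss, normal, gain) and their ideal
# log2 ratios relative to a diploid reference. Assumed for illustration.
STATES = [1, 2, 3]
LOG2_RATIO = {c: math.log2(c / 2) for c in STATES}


def simulate_acgh(n_probes, stay_prob=0.98, noise_sd=0.15, shift=0.0, seed=None):
    """Simulate one array: hidden copy-number states and noisy log2 ratios.

    stay_prob, noise_sd and shift are illustrative values, not taken
    from the paper.
    """
    rng = random.Random(seed)
    state = 2  # the chain is initialised in the normal (diploid) state
    states, ratios = [], []
    for _ in range(n_probes):
        # Markov step: keep the current state with probability stay_prob,
        # otherwise jump to one of the other states chosen uniformly.
        if rng.random() > stay_prob:
            state = rng.choice([s for s in STATES if s != state])
        states.append(state)
        # Emission: ideal log2 ratio, plus Gaussian noise, plus a global
        # offset that a normalization step would need to remove.
        ratios.append(LOG2_RATIO[state] + rng.gauss(0.0, noise_sd) + shift)
    return states, ratios


states, ratios = simulate_acgh(1000, seed=0)
```

Because the true states are known for simulated data, a normalization method can be scored by how well it recovers the `shift` that was injected, which is the general role a simulator plays in validating normalization.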
Supplementary Information: Source code and figures may be found at http://ntumaps.cgm.ntu.edu.tw/aCGH_supplementary.
Contact: chuangey{at}ntu.edu.tw
Keywords: aCGH, Normalization, Centralization, Simulation, Hidden Markov Model.
Associate Editor: Prof. John Quackenbush
Received on January 25, 2008; revised on May 28, 2008; accepted on June 18, 2008