Comparative analysis of microarray normalization procedures: effects on reverse engineering gene networks
¹Department of Biomedical Informatics, Columbia University, 622 West 168th Street, Vanderbilt Clinic 5th Floor and ²Center for Computational Biology and Bioinformatics, Columbia University, 1130 Saint Nicholas Avenue, New York, NY 10032, USA
*To whom correspondence should be addressed.
Abstract
Motivation: An increasingly common application of gene expression profile data is the reverse engineering of cellular networks. However, common procedures to normalize expression profiles generated using the Affymetrix GeneChip technology were originally developed for a rather different purpose, namely the accurate measurement of differential gene expression between two or more phenotypes. As a result, current evaluation strategies lack comprehensive metrics to assess the suitability of available normalization procedures for reverse engineering and, in general, for measuring correlation between the expression profiles of a gene pair.
Results: We benchmark four commonly used normalization procedures (MAS5, RMA, GCRMA and Li-Wong) in the context of established algorithms for the reverse engineering of protein–protein and protein–DNA interactions. Replicate sample, randomized and human B-cell data sets are used as input. Surprisingly, our study suggests that MAS5 provides the most faithful cellular network reconstruction. Furthermore, we identify a crucial step in GCRMA that introduces severe artifacts in the data, leading to a systematic overestimate of pairwise correlation. This has key implications not only for reverse engineering but also for other methods, such as hierarchical clustering, that rely on accurate measurements of pairwise expression profile correlation. We propose an alternative implementation to eliminate this side effect.
Contact: califano{at}c2b2.columbia.edu
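As an illustration of the kind of check the abstract describes, the following minimal sketch computes pairwise Pearson correlation between expression profiles and compares it against a randomized version of the same data. It is not the authors' benchmark: the function names (`pearson`, `shuffle_profiles`, `mean_abs_correlation`), the choice of Pearson correlation, the per-gene shuffling scheme and the toy values are all assumptions made for illustration.

```python
import math
import random
from itertools import combinations


def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # constant profile: correlation is undefined, report 0
    return sxy / math.sqrt(sxx * syy)


def shuffle_profiles(profiles, seed=0):
    """Permute each gene's values across samples independently.

    Hypothetical randomization scheme: it destroys gene-gene dependence
    while keeping every gene's own distribution of values.
    """
    rng = random.Random(seed)
    return [rng.sample(p, len(p)) for p in profiles]


def mean_abs_correlation(profiles):
    """Average |r| over all gene pairs."""
    pairs = list(combinations(profiles, 2))
    return sum(abs(pearson(x, y)) for x, y in pairs) / len(pairs)


if __name__ == "__main__":
    # Toy normalized expression values: rows are genes, columns are samples.
    profiles = [
        [7.1, 7.9, 9.2, 10.1, 11.0, 12.2],
        [6.8, 8.1, 9.0, 10.3, 10.9, 12.0],
        [12.5, 11.1, 10.2, 9.4, 8.0, 7.3],
        [5.0, 5.2, 4.9, 5.1, 5.3, 4.8],
    ]
    print("observed  :", mean_abs_correlation(profiles))
    print("randomized:", mean_abs_correlation(shuffle_profiles(profiles)))
```

On data normalized without artifacts, the randomized value should stay near what chance alone produces for the given number of samples; a normalization step that inflates it would be a warning sign for correlation-based methods.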

