Bioinformatics Advance Access published online on March 5, 2008
Bioinformatics, doi:10.1093/bioinformatics/btn083
Merging Two Gene Expression Studies via Cross Platform Normalization
¹Department of Statistics and Operations Research, University of North Carolina at Chapel Hill
²Department of Mathematical Sciences, Norwegian University of Science and Technology
³Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill
⁴Department of Pathology and Laboratory Medicine, University of North Carolina at Chapel Hill
⁵Department of Genetics, University of North Carolina at Chapel Hill
*To whom correspondence should be addressed. Andrey A Shabalin, E-mail: shabalin{at}email.unc.edu
Abstract
Motivation: Gene expression microarrays are currently used in a variety of biomedical applications. This paper considers the problem of how to merge data sets arising from different gene expression studies of a common organism and phenotype. Of particular interest is how to merge data from different technological platforms.
Results: The paper makes two contributions to the problem. The first is a simple cross-study normalization method, which is based on linked gene/sample clustering of the given data sets. The second is the introduction and description of several general validation measures that can be used to assess and compare cross-study normalization methods. The proposed normalization method is applied to three existing breast cancer data sets, and is compared to several competing normalization methods using the proposed validation measures.
Availability: The Supplementary Materials and XPN Matlab code are publicly available at https://genome.unc.edu/xpn
Contact: shabalin{at}email.unc.edu
Associate Editor: Prof. David Rocke
Received on November 28, 2007; revised on February 7, 2008; accepted on March 1, 2008