Data-adaptive test statistics for microarray data
1Department of Engineering Science, University of Oxford, UK
2Division of Biostatistics, School of Public Health, University of California Berkeley, USA
*To whom correspondence should be addressed.
Motivation: An important task in microarray data analysis is the selection of genes that are differentially expressed between different tissue samples, such as healthy and diseased. However, microarray data contain an enormous number of dimensions (genes) and very few samples (arrays), a mismatch which poses fundamental statistical problems for the selection process that have defied easy resolution.
Results: In this paper, we present a novel approach to the selection of differentially expressed genes in which test statistics are learned from data, using a simple notion of reproducibility in selection results as the learning criterion. Reproducibility, as we define it, can be computed without any knowledge of the ground truth, but takes advantage of certain properties of microarray data to provide an asymptotically valid guide to expected loss under the true data-generating distribution. We are therefore able to minimize expected loss indirectly, and obtain results substantially more robust than those of conventional methods. We apply our method to simulated and oligonucleotide array data.
Availability: By request to the corresponding author.
Contact: sach{at}robots.ox.ac.uk
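The abstract does not specify the family of test statistics or the exact reproducibility measure, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' implementation. It assumes a two-sample t-like statistic with a tunable constant `s0` added to the standard error, and it scores reproducibility as the mean overlap between the top-`k` gene lists obtained from two disjoint random halves of the arrays. All names (`moderated_t`, `top_k`, `reproducibility`, `select_s0`) and all parameters are hypothetical.

```python
import numpy as np


def moderated_t(x, y, s0):
    """Per-gene two-sample statistic; rows are genes, columns are arrays.

    s0 is an assumed tunable constant added to the standard error.
    With s0 = 0 this reduces to the pooled-variance t-statistic.
    """
    nx, ny = x.shape[1], y.shape[1]
    diff = x.mean(axis=1) - y.mean(axis=1)
    pooled = ((nx - 1) * x.var(axis=1, ddof=1)
              + (ny - 1) * y.var(axis=1, ddof=1)) / (nx + ny - 2)
    se = np.sqrt(pooled * (1.0 / nx + 1.0 / ny))
    return diff / (se + s0)


def top_k(stat, k):
    """Indices of the k genes with the largest absolute statistic."""
    return set(np.argsort(-np.abs(stat))[:k].tolist())


def reproducibility(x, y, s0, k, n_splits, rng):
    """Mean fraction of top-k genes shared by two disjoint halves of the data.

    Needs no ground truth. Each group must have at least 4 arrays so that
    every half has the 2 arrays required for a variance estimate.
    """
    nx, ny = x.shape[1], y.shape[1]
    hx, hy = nx // 2, ny // 2
    overlaps = []
    for _ in range(n_splits):
        px, py = rng.permutation(nx), rng.permutation(ny)
        a = top_k(moderated_t(x[:, px[:hx]], y[:, py[:hy]], s0), k)
        b = top_k(moderated_t(x[:, px[hx:]], y[:, py[hy:]], s0), k)
        overlaps.append(len(a & b) / k)
    return float(np.mean(overlaps))


def select_s0(x, y, candidates, k=50, n_splits=20, seed=0):
    """Pick the candidate s0 with the highest reproducibility score."""
    scores = []
    for s0 in candidates:
        # Re-seed so every candidate is scored on the same random splits.
        rng = np.random.default_rng(seed)
        scores.append(reproducibility(x, y, s0, k, n_splits, rng))
    best = int(np.argmax(scores))
    return candidates[best], scores
```

Calling `select_s0(x, y, [0.0, 0.1, 0.5, 1.0])` on two gene-by-array matrices would return the chosen constant together with the score of each candidate. Scoring every candidate on identical splits is a design choice made here so that differences in score reflect the statistic rather than the randomness of the partition.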