Bioinformatics Advance Access published online on October 28, 2004
Bioinformatics, doi:10.1093/bioinformatics/bti092
Bioinformatics © Oxford University Press 2004; all rights reserved
1 Yuan Ji is Assistant Professor, Department of Biostatistics and Applied Mathematics, The University of Texas M.D. Anderson Cancer Center, Houston, TX 77030
* To whom correspondence should be addressed.
The classification of samples using gene expression profiles is an important application in areas such as cancer research and environmental health studies. However, the classification is usually based on a small number of samples, and each sample is a long vector of thousands of gene expression levels. An important issue in parametric modeling for so many gene expression levels is the control of the number of nuisance parameters in the model. Large models often lead to intensive or even intractable computation, while small models may be inadequate for complex data. We propose a two-step empirical Bayes classification method as a solution to this issue. At the first step, we use the model-based cluster algorithm with a non-traditional purpose of assigning gene expression levels to form abundance groups. At the second step, by assuming the same variance for all the genes in the same group, we substantially reduce the number of nuisance parameters in our statistical model. The resulting parsimonious model leads to efficient computation under an empirical Bayes estimation procedure. We consider two real examples and simulate data using our method. Desired low classification error rates are obtained even when a large number of genes are pre-selected for class prediction.
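The two steps described above can be illustrated with a minimal sketch. It is not the authors' procedure. Quantile binning of mean expression stands in for model-based clustering, and a plug-in nearest-centroid rule with group-pooled variances stands in for the empirical Bayes estimator. All function names and the toy data are hypothetical.

```python
from collections import defaultdict


def group_genes_by_abundance(samples, n_groups):
    """Step 1 (stand-in): bin genes into abundance groups by mean expression.

    The paper uses model-based clustering; equal-size quantile bins are
    swapped in here only to keep the sketch short.
    """
    n_genes = len(samples[0])
    means = [sum(s[g] for s in samples) / len(samples) for g in range(n_genes)]
    order = sorted(range(n_genes), key=lambda g: means[g])
    groups = [0] * n_genes
    for rank, g in enumerate(order):
        groups[g] = rank * n_groups // n_genes
    return groups


def fit(samples, labels, groups):
    """Step 2 (stand-in): class centroids plus one pooled variance per group."""
    classes = sorted(set(labels))
    n_genes = len(samples[0])
    centroids = {}
    for c in classes:
        rows = [s for s, y in zip(samples, labels) if y == c]
        centroids[c] = [sum(r[g] for r in rows) / len(rows) for g in range(n_genes)]

    # Pool within-class squared residuals over every gene in the same group,
    # so the model carries one variance per group instead of one per gene.
    rss = defaultdict(float)
    genes_in_group = defaultdict(int)
    for g in range(n_genes):
        genes_in_group[groups[g]] += 1
    for s, y in zip(samples, labels):
        for g in range(n_genes):
            rss[groups[g]] += (s[g] - centroids[y][g]) ** 2

    dof = len(samples) - len(classes)  # assumes more samples than classes
    variances = {
        k: max(rss[k] / (genes_in_group[k] * dof), 1e-12) for k in genes_in_group
    }
    return centroids, variances


def predict(x, centroids, variances, groups):
    """Assign x to the class with the smallest variance-scaled squared distance."""
    def score(c):
        return sum(
            (xg - mg) ** 2 / variances[groups[g]]
            for g, (xg, mg) in enumerate(zip(x, centroids[c]))
        )
    return min(centroids, key=score)


# Toy data: 4 samples, 4 genes (two low-abundance, two high-abundance).
samples = [
    [1.0, 1.2, 8.0, 9.0],
    [1.2, 1.0, 8.4, 9.4],
    [1.1, 1.1, 5.0, 6.0],
    [0.9, 1.3, 5.4, 6.2],
]
labels = ["A", "A", "B", "B"]

groups = group_genes_by_abundance(samples, n_groups=2)
centroids, variances = fit(samples, labels, groups)
predicted = predict([1.0, 1.1, 8.1, 9.1], centroids, variances, groups)
```

With four genes and two groups, the sketch estimates two variances rather than four. That reduction in nuisance parameters is the point of the grouping step, and it becomes substantial when thousands of genes share a small number of groups.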
Revised September 22, 2004
Accepted October 7, 2004
Article
A novel means of using gene clusters in a two-step empirical Bayes method for predicting classes of samples
2 Kam-Wah Tsui is Professor, Department of Statistics, The University of Wisconsin - Madison, Madison, WI 53706
3 KyungMann Kim is Professor, Department of Biostatistics and Medical Informatics, The University of Wisconsin - Madison, Madison, WI 53792
Yuan Ji, E-mail: yuanji{at}mdanderson.org