Skip Navigation


Bioinformatics Advance Access originally published online on October 28, 2004
Bioinformatics 2005 21(7):1055-1061; doi:10.1093/bioinformatics/bti092
This Article
Right arrow Full Text Freely available
Right arrow FREE Full Text (Print PDF) Freely available
Right arrow All Versions of this Article:
21/7/1055    most recent
bti092v1
Right arrow Comments: Submit a response
Right arrow Alert me when this article is cited
Right arrow Alert me when Comments are posted
Right arrow Alert me if a correction is posted
Services
Right arrow Email this article to a friend
Right arrow Similar articles in this journal
Right arrow Similar articles in ISI Web of Science
Right arrow Similar articles in PubMed
Right arrow Alert me to new issues of the journal
Right arrow Add to My Personal Archive
Right arrow Download to citation manager
Right arrow Search for citing articles in:
ISI Web of Science (1)
Right arrowRequest Permissions
Google Scholar
Right arrow Articles by Ji, Y.
Right arrow Articles by Kim, K.
Right arrow Search for Related Content
PubMed
Right arrow PubMed Citation
Right arrow Articles by Ji, Y.
Right arrow Articles by Kim, K.
Social Bookmarking
 Add to CiteULike   Add to Connotea   Add to Del.icio.us  
What's this?

© The Author 2004. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions{at}oupjournals.org

A novel means of using gene clusters in a two-step empirical Bayes method for predicting classes of samples

Yuan Ji 1,*, Kam-Wah Tsui 2 and KyungMann Kim 3

1Department of Biostatistics and Applied Mathematics, The University of Texas M.D. Anderson Cancer Center Houston, TX 77030, USA
2Department of Statistics, The University of Wisconsin-Madison Madison, WI 53706, USA
3Department of Biostatistics and Medical Informatics, The University of Wisconsin-Madison Madison, WI 53792, USA

*To whom correspondence should be addressed.

Motivation: The classification of samples using gene expression profiles is an important application in areas such as cancer research and environmental health studies. However, the classification is usually based on a small number of samples, and each sample is a long vector of thousands of gene expression levels. An important issue in parametric modeling for so many gene expression levels is the control of the number of nuisance parameters in the model. Large models often lead to intensive or even intractable computation, while small models may be inadequate for complex data.

Methodology: We propose a two-step empirical Bayes classification method as a solution to this issue. At the first step, we use the model-based cluster algorithm with a non-traditional purpose of assigning gene expression levels to form abundance groups. At the second step, by assuming the same variance for all the genes in the same group, we substantially reduce the number of nuisance parameters in our statistical model.

Results: The proposed model is more parsimonious, which leads to efficient computation under an empirical Bayes estimation procedure. We consider two real examples and simulate data using our method. Desired low classification error rates are obtained even when a large number of genes are pre-selected for class prediction.

Availability: Supplemental materials including technical details are available at "http://odin.mdacc.tmc.edu/~yuanj/papers/sup.pdf". An R program for computation is available upon request by email to Yuan Ji (yuanji{at}mdanderson.org)

Contact: yuanji{at}mdanderson.org


Add to CiteULike CiteULike   Add to Connotea Connotea   Add to Del.icio.us Del.icio.us    What's this?




Disclaimer: Please note that abstracts for content published before 1996 were created through digital scanning and may therefore not exactly replicate the text of the original print issues. All efforts have been made to ensure accuracy, but the Publisher will not be held responsible for any remaining inaccuracies. If you require any further clarification, please contact our Customer Services Department.