
Bioinformatics Vol. 19 no. 16 2003
pages 2088-2096
© 2003 Oxford University Press

A Bayesian missing value estimation method for gene expression profile data

Shigeyuki Oba 1, Masa-aki Sato 2,5, Ichiro Takemasa 3, Morito Monden 3, Ken-ichi Matsubara 4 and Shin Ishii 1,5,*

1 Graduate School of Information Science, Nara Institute of Science and Technology, 8916-5 Takayama, Ikoma 630-0192, Japan, 2 ATR Human Information Science Laboratories, 2-2-2 Hikaridai, Seika-cho, Soraku-gun, Kyoto, Japan, 3 Graduate School of Medicine, Osaka University, 2-2 Yamadaoka, Suita, Osaka, Japan, 4 DNA Chip Research Institute, 134 Kobecho, Hodogayaku, Yokohama, Japan and 5 CREST, Japan Science and Technology Corporation

Received on March 10, 2003; revised on May 6, 2003; accepted on May 9, 2003

Motivation: Gene expression profile analyses have been used in numerous studies covering a broad range of areas in biology. When unreliable measurements are excluded, missing values are introduced into gene expression profiles. Although existing multivariate analysis methods have difficulty handling missing values, this problem has received little attention. There are many options for dealing with missing values, and they can lead to drastically different results. Ignoring missing values is the simplest option and is frequently applied, but this approach has its flaws. In this article, we propose a missing value estimation method based on Bayesian principal component analysis (BPCA). Although the simultaneous estimation of a probabilistic model and latent variables within the framework of Bayesian inference is not new in principle, an actual BPCA implementation that can estimate arbitrary missing variables is new in terms of statistical methodology.
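The idea in the Motivation paragraph of estimating a model and the missing entries together can be illustrated with a much simpler, non-Bayesian stand-in: alternate between fitting an ordinary low-rank PCA model and re-estimating the missing entries from it. The sketch below is not the BPCA method of this article; the function name `iterative_pca_impute`, the fixed `n_components`, and the stopping rule are assumptions made only for illustration.

```python
import numpy as np

def iterative_pca_impute(X, n_components=2, n_iter=200, tol=1e-8):
    """Fill NaN entries of X (genes x samples) by alternating between
    fitting a rank-k PCA model and re-estimating the missing entries.

    Illustrative only: plain PCA with a hand-chosen rank, not BPCA.
    """
    X = np.asarray(X, dtype=float)
    missing = np.isnan(X)
    # Initial guess: column means computed from the observed entries only.
    col_means = np.nanmean(X, axis=0)
    filled = np.where(missing, col_means, X)
    for _ in range(n_iter):
        # Fit step: centre the current completed matrix and take its SVD.
        mu = filled.mean(axis=0)
        U, s, Vt = np.linalg.svd(filled - mu, full_matrices=False)
        k = n_components
        recon = (U[:, :k] * s[:k]) @ Vt[:k] + mu
        # Estimation step: overwrite only the missing entries.
        new_filled = np.where(missing, recon, X)
        change = np.max(np.abs(new_filled - filled))
        filled = new_filled
        if change < tol:
            break
    return filled
```

Note that this stand-in requires the number of components to be chosen by hand, which is the kind of difficult-to-determine model parameter the abstract says BPCA avoids.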

Results: When applied to DNA microarray data from various experimental conditions, the BPCA method exhibited markedly better estimation ability than other recently proposed methods, such as singular value decomposition and K-nearest neighbors. While the estimation performance of existing methods depends on model parameters whose determination is difficult, our BPCA method is free from this difficulty. Accordingly, the BPCA method provides accurate and convenient estimation for missing values.
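A comparison of estimation ability like the one reported in the Results can be sketched as follows: hide entries whose values are known, impute them, and score the estimates against the hidden truth. The sketch below uses a simplified K-nearest-neighbors imputer (unweighted average over the k closest genes) as the baseline and a normalized root mean squared error as the score. Both the simplified KNN variant and the exact error normalization are assumptions for illustration, not details taken from this abstract, and the names `knn_impute` and `nrmse` are hypothetical.

```python
import numpy as np

def knn_impute(X, k=3):
    """Fill NaN entries of X (genes x samples) with the mean of the k
    nearest genes (rows) that have the target column observed.
    Simplified illustration, not the published KNNimpute algorithm."""
    X = np.asarray(X, dtype=float)
    filled = X.copy()
    observed = ~np.isnan(X)
    for i in np.flatnonzero(~observed.all(axis=1)):
        for j in np.flatnonzero(~observed[i]):
            dists = []
            for r in range(X.shape[0]):
                if r == i or not observed[r, j]:
                    continue
                # Distance over the columns observed in both rows.
                shared = observed[i] & observed[r]
                if shared.any():
                    d = np.sqrt(np.mean((X[i, shared] - X[r, shared]) ** 2))
                    dists.append((d, r))
            if not dists:
                continue  # no usable neighbour: leave the entry as NaN
            dists.sort()
            nearest = [r for _, r in dists[:k]]
            filled[i, j] = X[nearest, j].mean()
    return filled

def nrmse(estimate, truth, mask):
    """Root mean squared error over the masked entries, normalized by
    the standard deviation of the true values at those entries."""
    err = estimate[mask] - truth[mask]
    return np.sqrt(np.mean(err ** 2) / np.var(truth[mask]))

# Toy data: two groups of identical expression profiles.
truth = np.array([[1.0, 2.0, 3.0, 4.0]] * 3 + [[10.0, 20.0, 30.0, 40.0]] * 3)
mask = np.zeros(truth.shape, dtype=bool)
mask[0, 1] = True
mask[3, 2] = True
data = np.where(mask, np.nan, truth)   # hide the masked entries
score = nrmse(knn_impute(data, k=2), truth, mask)
```

Here the choice of `k` plays the role of the model parameter whose determination the Results describe as difficult for existing methods.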

Availability: The software is available at http://hawaii.aist-nara.ac.jp/~shige-o/tools/

Contact: ishii{at}is.aist-nara.ac.jp

* To whom correspondence should be addressed.






