Bioinformatics Advance Access originally published online on August 27, 2004
Bioinformatics 2005 21(2):187-198; doi:10.1093/bioinformatics/bth499
A corrigendum to this article has been published.

Bioinformatics vol. 21 issue 2 © Oxford University Press 2005; all rights reserved.

Missing value estimation for DNA microarray gene expression data: local least squares imputation

Hyunsoo Kim 1, Gene H. Golub 2 and Haesun Park 1,3,*

1 Department of Computer Science and Engineering, University of Minnesota Twin Cities, 200 Union Street S.E., Minneapolis, MN 55455, USA
2 Computer Science Department, Stanford University, Gates Building 2B #280, Stanford, CA 94305-9025, USA
3 The National Science Foundation, 4201 Wilson Boulevard, Arlington, VA 22230, USA

*To whom correspondence should be addressed.

Motivation: Gene expression data often contain missing expression values. Effective missing value estimation methods are needed because many algorithms for gene expression data analysis require a complete matrix of gene array values. In this paper, imputation methods based on the least squares formulation are proposed to estimate missing values in gene expression data. These methods exploit local similarity structures in the data as well as a least squares optimization process.

Results: The proposed local least squares imputation method (LLSimpute) represents a target gene that has missing values as a linear combination of similar genes. The similar genes are chosen either as the k nearest neighbors or as the k coherent genes that have large absolute values of Pearson correlation coefficients. A non-parametric missing value estimation method based on LLSimpute is designed by introducing an automatic k-value estimator. In our experiments, the proposed LLSimpute method shows competitive results when compared with other imputation methods for missing value estimation on various datasets and percentages of missing values in the data.
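
The idea of representing a target gene as a linear combination of similar genes can be illustrated with a minimal sketch. This is not the authors' released software: the function name lls_impute_row, the use of Euclidean distance to pick neighbors, and the restriction of neighbors to genes without missing values are all assumptions made for illustration. The automatic k-value estimator is not shown, and the Pearson-correlation variant would rank candidate genes by absolute correlation instead of distance.

```python
import numpy as np

def lls_impute_row(data, target, k=3):
    """Fill the NaN entries of row `target` using its k most similar rows.

    Hypothetical helper for illustration only; rows are genes and
    columns are experiments.
    """
    row = data[target]
    missing = np.isnan(row)
    if not missing.any():
        return row.copy()

    # Candidate neighbors: other genes with no missing values
    # (a simplifying assumption of this sketch).
    complete = ~np.isnan(data).any(axis=1)
    complete[target] = False
    candidates = np.flatnonzero(complete)

    # Similarity is measured only on the columns where the target is observed.
    dists = np.linalg.norm(data[candidates][:, ~missing] - row[~missing], axis=1)
    nearest = candidates[np.argsort(dists)[:k]]

    A = data[nearest][:, ~missing]  # neighbors at the target's observed columns
    B = data[nearest][:, missing]   # neighbors at the target's missing columns

    # Least squares: find coefficients x minimizing ||A^T x - w||_2,
    # where w holds the target's observed values.
    coeffs, *_ = np.linalg.lstsq(A.T, row[~missing], rcond=None)

    # The same linear combination of the neighbors estimates the missing values.
    filled = row.copy()
    filled[missing] = B.T @ coeffs
    return filled

# Toy example: the target gene is an exact combination 2*g1 + g2,
# with its last expression value removed.
g1 = np.array([1.0, 0.0, 2.0, 1.0, 3.0])
g2 = np.array([0.0, 1.0, 1.0, 2.0, 1.0])
target_gene = 2 * g1 + g2
target_gene[4] = np.nan
data = np.vstack([g1, g2, target_gene])

filled = lls_impute_row(data, target=2, k=2)
print(filled)
```

In this toy case the target is an exact linear combination of its two neighbors, so the least squares fit recovers the removed value; on real expression data the fit is only approximate and the choice of k matters.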

Availability: The software is available at http://www.cs.umn.edu/~hskim/tools.html

Contact: hpark@cs.umn.edu

