Bioinformatics Vol. 17 no. 12 2001
Pages 1143-1151
© 2001 Oxford University Press
Statistical estimation of cluster boundaries in gene expression profile data


1 Laboratory of Mathematics, Saga Medical School, 5-1-1 Nabeshima, Saga, Saga 849-8501, Japan
2 Department of Bioinformatics, Biomolecular Engineering Research Institute, 6-2-3 Furuedai, Suita, Osaka 565-0874, Japan
Received on December 12, 2000; revised on May 17, 2001; accepted on June 11, 2001
Motivation: Gene expression profile data are accumulating rapidly owing to advances in microarray techniques. These data are analyzed by clustering procedures to extract the information about the genes that is inherent in the data. In such clustering analyses, determining the boundaries of gene clusters systematically, rather than by visual inspection and biological knowledge, remains challenging.
Results: We propose a statistical procedure to estimate the number of clusters in the hierarchical clustering of expression profiles. After the hierarchical clustering, the statistical property of the profiles at each node in the dendrogram is evaluated by a statistical measure: the variance inflation factor used in multiple regression analysis. This evaluation leads to an automatic determination of the cluster boundaries without any additional analyses or any biological knowledge of the measured genes. The performance of the procedure is demonstrated on the profiles of 2467 yeast genes, with very promising results.
Availability: A set of programs will be sent electronically upon request.
Contact: horimoto{at}post.saga-med.ac.jp; toh{at}beri.co.jp
* To whom all correspondence should be addressed.
These authors contributed equally to this work.
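
The Results paragraph names the variance inflation factor (VIF) as the quantity evaluated at each dendrogram node. As a minimal illustration of that quantity only, the sketch below computes the textbook VIF for a set of profiles, using the standard identity that the VIF of variable j, 1/(1 - R_j^2), equals the j-th diagonal element of the inverse correlation matrix. The function names, the simulated data, and the layout (rows as conditions, columns as profiles) are assumptions made for illustration; this is not the authors' program, and the abstract does not specify how the VIF is turned into a cluster boundary decision.

```python
import numpy as np


def variance_inflation_factors(profiles):
    """Textbook VIF of each column, from the inverse correlation matrix.

    profiles: array of shape (n_conditions, n_profiles); each column is
    treated as one variable (an assumed layout, not taken from the paper).
    """
    X = np.asarray(profiles, dtype=float)
    corr = np.corrcoef(X, rowvar=False)
    return np.diag(np.linalg.inv(corr))


def vif_by_regression(profiles, j):
    """VIF of column j via its definition: regress column j on the others."""
    X = np.asarray(profiles, dtype=float)
    y = X[:, j]
    others = np.delete(X, j, axis=1)
    design = np.column_stack([np.ones(len(y)), others])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    r_squared = 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
    return 1.0 / (1.0 - r_squared)


# Simulated data (hypothetical, for illustration only).
rng = np.random.default_rng(0)
independent = rng.normal(size=(50, 3))

# Make the third profile nearly a copy of the first one.
collinear = independent.copy()
collinear[:, 2] = collinear[:, 0] + 0.05 * rng.normal(size=50)

vif_independent = variance_inflation_factors(independent)
vif_collinear = variance_inflation_factors(collinear)
```

In this sketch, nearly redundant profiles yield large VIF values, while unrelated profiles yield values close to 1, which is the general behaviour that makes the VIF usable as a measure of how similar a group of profiles is.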
