Bioinformatics Vol. 18 no. 90001 2002
Pages S136-S144
© 2002 Oxford University Press
Discovering statistically significant biclusters in gene expression data
School of Computer Science, Tel-Aviv University, Ramat-Aviv, Tel-Aviv, 69978, Israel
Received on January 24, 2002; revised on March 31, 2002; accepted on March 31, 2002
In gene expression data, a bicluster is a subset of the genes exhibiting consistent patterns over a subset of the conditions. We propose a new method to detect significant biclusters in large expression datasets. Our approach combines graph-theoretic techniques with statistical modelling of the data. Under plausible assumptions, our algorithm runs in polynomial time and is guaranteed to find the most significant biclusters. We tested our method on a collection of yeast expression profiles and on a human cancer dataset. Cross-validation results show high specificity in assigning function to genes based on their biclusters, and in this way we are able to annotate 196 uncharacterized yeast genes. We also demonstrate how the biclusters lead to the detection of new concrete biological associations. In the cancer data we are able to detect and relate finer tissue types than was previously possible. We also show that the method outperforms the biclustering algorithm of Cheng and Church (2000).
Contact: amos{at}tau.ac.il; roded{at}tau.ac.il; rshamir{at}tau.ac.il
* These authors contributed equally to this work.
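As a rough illustration of what "consistent patterns over a subset of the conditions" can mean, the sketch below scores a candidate bicluster with the mean squared residue, the coherence score associated with the Cheng and Church (2000) algorithm mentioned in the abstract. It is not the method proposed in this paper. The function name `mean_squared_residue` and the toy expression matrix are hypothetical and chosen only for illustration.

```python
from statistics import mean


def mean_squared_residue(matrix, genes, conditions):
    """Mean squared residue of the submatrix picked out by the given
    gene (row) and condition (column) indices. Lower means more coherent;
    a perfectly additive pattern scores 0."""
    # Extract the candidate bicluster as a small dense submatrix.
    sub = [[matrix[g][c] for c in conditions] for g in genes]
    row_means = [mean(row) for row in sub]
    col_means = [mean(col) for col in zip(*sub)]
    # All rows have equal length, so the mean of the row means
    # equals the mean of all entries in the submatrix.
    overall = mean(row_means)
    total = 0.0
    for i, row in enumerate(sub):
        for j, value in enumerate(row):
            residue = value - row_means[i] - col_means[j] + overall
            total += residue ** 2
    return total / (len(genes) * len(conditions))


# Hypothetical toy data: rows are genes, columns are conditions.
expression = [
    [1.0, 2.0, 9.0, 3.0],
    [2.0, 3.0, 0.0, 4.0],
    [4.0, 5.0, 7.0, 6.0],
    [5.0, 1.0, 3.0, 8.0],
]

# Genes 0-2 follow the same additive pattern over conditions 0, 1 and 3.
coherent = mean_squared_residue(expression, [0, 1, 2], [0, 1, 3])
# The full matrix has no such shared pattern.
whole = mean_squared_residue(expression, [0, 1, 2, 3], [0, 1, 2, 3])

print("candidate bicluster:", coherent)
print("whole matrix:", whole)
```

In this toy example the three selected genes rise and fall together across the three selected conditions, so the candidate bicluster scores far lower than the full matrix, even though no single gene or condition subset was given in advance. Finding such subsets automatically, and deciding which are statistically significant, is the problem the paper addresses.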