Bioinformatics Vol. 17 no. 2 2001
Pages 126-136
© 2001 Oxford University Press
Original Paper
A hierarchical unsupervised growing neural network for clustering gene expression patterns
1 Bioinformatics, CNIO, Ctra. Majadahonda-Pozuelo, Km 2, Majadahonda, 28220 Madrid
2 Protein Design Group, CNB-CSIC, 28049 Madrid, Spain
Received on August 6, 2000; revised on September 29, 2000; accepted on October 6, 2000
Motivation: We describe a new approach to the analysis of gene expression data from DNA array experiments, using an unsupervised neural network. DNA array technologies allow thousands of genes to be monitored rapidly and efficiently. One aim of these studies is to find correlated gene expression patterns, which is usually achieved by clustering them. The Self-Organising Tree Algorithm (SOTA) (Dopazo,J. and Carazo,J.M. (1997) J. Mol. Evol., 44, 226-233) is a neural network that grows by adopting the topology of a binary tree. The result of the algorithm is a hierarchical clustering obtained with the accuracy and robustness of a neural network.
Results: SOTA clustering confers several advantages over classical hierarchical clustering methods. SOTA is a divisive method: the clustering process is performed from top to bottom, i.e. the highest hierarchical levels are resolved before going into the details of the lowest levels. The growth of the tree can be stopped at any desired hierarchical level. Moreover, a criterion for stopping the growth, based on the approximate probability distribution obtained by randomisation of the original data set, is provided. This criterion gives statistical support to the definition of clusters. In addition, obtaining average gene expression patterns is a built-in feature of the algorithm: the neurons that define the different hierarchical levels represent the averages of the gene expression patterns contained in the corresponding clusters.
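The top-down growth and the randomisation-based stopping rule described above can be illustrated with a small sketch. It is not the SOTA training procedure itself: the neural-network update is replaced here by a plain two-centroid split, and every name (`grow`, `split`, `resource`, `randomised_threshold`, `alpha`, `n_pairs`, `max_depth`) is hypothetical. The threshold is assumed to be a low quantile of the distances between pairs of profiles whose values have been shuffled, which is one way to read "distribution obtained by randomisation".

```python
import random
from statistics import fmean


def euclidean(a, b):
    """Euclidean distance between two equal-length profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def centroid(profiles):
    """Component-wise mean: the average pattern stored at a tree node."""
    return [fmean(col) for col in zip(*profiles)]


def resource(profiles):
    """Mean distance of the members of a cluster to their average pattern."""
    c = centroid(profiles)
    return fmean([euclidean(p, c) for p in profiles])


def randomised_threshold(profiles, alpha=0.05, n_pairs=2000, seed=0):
    """Assumed criterion: the alpha-quantile of pairwise distances after
    shuffling the values inside each profile (needs at least 2 profiles)."""
    rng = random.Random(seed)
    shuffled = [rng.sample(p, len(p)) for p in profiles]
    dists = sorted(euclidean(*rng.sample(shuffled, 2)) for _ in range(n_pairs))
    return dists[int(alpha * (len(dists) - 1))]


def split(profiles, iters=10):
    """Two-centroid split (a stand-in for the neural-network training)."""
    c = centroid(profiles)
    a = max(profiles, key=lambda p: euclidean(p, c))  # far from the average
    b = max(profiles, key=lambda p: euclidean(p, a))  # far from the first seed
    left, right = [], []
    for _ in range(iters):
        left = [p for p in profiles if euclidean(p, a) <= euclidean(p, b)]
        right = [p for p in profiles if euclidean(p, a) > euclidean(p, b)]
        if not left or not right:
            break
        a, b = centroid(left), centroid(right)
    return left, right


def grow(profiles, threshold, max_depth=None, depth=0):
    """Divisive growth: split a node only while its members are more
    heterogeneous than the threshold and the depth limit is not reached."""
    node = {"average": centroid(profiles), "size": len(profiles), "children": []}
    may_descend = max_depth is None or depth < max_depth
    if may_descend and len(profiles) > 1 and resource(profiles) > threshold:
        left, right = split(profiles)
        if left and right:
            node["children"] = [
                grow(left, threshold, max_depth, depth + 1),
                grow(right, threshold, max_depth, depth + 1),
            ]
    return node
```

A call such as `grow(profiles, randomised_threshold(profiles))` returns a nested dictionary in which every node carries the average pattern of the profiles below it. Passing `max_depth` stops the growth at a chosen hierarchical level instead.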
Since SOTA runtimes are approximately linear in the number of items to be classified, it is especially suitable for dealing with very large amounts of data. The proposed method is very general and applies to any data, provided that they can be coded as a series of numbers and that a computable measure of similarity between data items can be used.
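The generality claim can be made concrete with a short sketch: any item coded as a list of numbers can be assigned to the closest average pattern once a distance function is supplied. The two distances below (Euclidean, and one minus the Pearson correlation) are common choices assumed here for illustration; the text above does not name a specific measure, and the function names are hypothetical.

```python
from math import sqrt


def euclidean_distance(a, b):
    """Distance that is sensitive to absolute expression levels."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def pearson_distance(a, b):
    """1 - Pearson correlation: compares the shape of two profiles and
    ignores their scale. Undefined when a profile is constant."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return 1.0 - cov / sqrt(var_a * var_b)


def assign(items, averages, distance):
    """Index of the closest average pattern for every item.

    One pass over the items, so for a fixed set of averages the cost
    grows linearly with the number of items."""
    return [
        min(range(len(averages)), key=lambda k: distance(item, averages[k]))
        for item in items
    ]
```

With `pearson_distance`, the profiles `[1, 2, 3]` and `[2, 4, 6]` are at distance 0 because they have the same shape, whereas `euclidean_distance` separates them; the choice of measure therefore decides what "similar" means for a given data set.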
Availability: A server running the program can be found at: http://bioinfo.cnio.es/sotarray
Contact: jdopazo{at}cnio.es
* To whom correspondence should be addressed.