Bioinformatics Vol. 18 no. 11 2002
Pages 1446-1453
© 2002 Oxford University Press
Gene expression data analysis with a dynamically extended self-organized map that exploits class information
1 Department of Medical Physics,
School of Medicine, University of Patras, 26500 Patras, Greece
2 Department of Information Management,
Technological Educational Institute of Kavala, 65404 Kavala, Greece
Received on September 17, 2001; revised on March 20, 2002; accepted on May 9, 2002
Motivation: Clustering is currently the most popular approach to analyzing genome-wide expression data. A major drawback of most existing clustering methods is that the number of clusters must be specified a priori. Furthermore, purely unsupervised algorithms ignore prior biological knowledge entirely. Moreover, most current tools lack an effective framework for tightly integrating unsupervised and supervised learning in the analysis of high-dimensional expression data, and very few multi-class supervised approaches are designed to make effective use of multiple functional class labels.
Results: The paper adapts a novel self-organizing map, the supervised Network Self-Organized Map (sNet-SOM), to the peculiarities of multi-labeled gene expression data. The sNet-SOM adaptively determines the number of clusters through a dynamic extension process. This process is driven by an inhomogeneous measure that tries to balance unsupervised, supervised and model complexity criteria. Nodes within a rectangular grid are grown at the boundary nodes, weights are rippled from the internal nodes towards the outer nodes of the grid, and whole columns are inserted within the map. The appropriate level of expansion is determined automatically. Multiple sNet-SOM models are constructed dynamically, each for a different unsupervised/supervised balance, and model selection criteria are used to select the optimum one. The results indicate that sNet-SOM performs competitively with other recently proposed approaches for supervised classification at a significantly reduced computational cost, and that it offers extensive potential for exploratory analysis within the same framework. Furthermore, it relies on simple design decisions that are easy to comprehend and computationally efficient.
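The following is an illustrative sketch only, not the authors' implementation (which is in Borland C++ Builder). It shows one plausible way a per-node score could mix an unsupervised term (quantization error) with a supervised term (entropy of the class labels mapped to the node), and how a whole column could be inserted into a rectangular grid of weight vectors. The names `label_entropy`, `node_criterion`, `insert_column`, the parameter `alpha`, and the choice to initialize new weights by averaging neighbouring columns are all assumptions made for illustration; the abstract does not specify the actual measure or the insertion rule.

```python
import math
from collections import Counter


def label_entropy(labels):
    """Shannon entropy (in bits) of the class labels mapped to one node.

    For multi-labeled genes, `labels` can simply list every label of
    every gene assigned to the node (an assumption of this sketch).
    """
    if not labels:
        return 0.0
    counts = Counter(labels)
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def node_criterion(quant_error, labels, alpha):
    """Hypothetical mixed score for one node.

    alpha = 1.0 gives a purely unsupervised score (quantization error);
    alpha = 0.0 gives a purely supervised score (label entropy).
    """
    return alpha * quant_error + (1.0 - alpha) * label_entropy(labels)


def insert_column(grid, col):
    """Return a new grid with one column inserted after index `col`.

    grid[r][c] is a weight vector (list of floats). Each new weight is
    the element-wise mean of its left and right neighbours, which is an
    assumed initialization, not necessarily the one used by sNet-SOM.
    """
    new_grid = []
    for row in grid:
        left, right = row[col], row[col + 1]
        middle = [(a + b) / 2.0 for a, b in zip(left, right)]
        new_grid.append(row[:col + 1] + [middle] + row[col + 1:])
    return new_grid
```

Under these assumptions, sweeping `alpha` over several values and training one map per value would mirror the idea of building multiple models, each for a different unsupervised/supervised balance, before applying a model selection criterion.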
Availability: The source code of the algorithms presented in the paper can be downloaded from http://heart.med.upatras.gr. The implementation is in Borland C++ Builder 5.0.
Contact: severina{at}heart.med.upatras.gr
* To whom correspondence should be addressed.