Bioinformatics Advance Access originally published online on November 16, 2004
Bioinformatics 2005 21(7):1280-1281; doi:10.1093/bioinformatics/bti141

© The Author 2004. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions{at}oupjournals.org

PermutMatrix: a graphical environment to arrange gene expression profiles in optimal linear order

Gilles Caraux 1,2,3,* and Sylvie Pinloche 1

1 Laboratoire d'Informatique de Robotique et de Microélectronique de Montpellier, 161 rue Ada, 34392 Montpellier Cedex 5, France
2 Ecole Nationale Supérieure Agronomique de Montpellier, 2 Place P. Viala, 34060 Montpellier Cedex 02, France
3 Research School of Biological Sciences, Australian National University, Canberra, Australia

*To whom correspondence should be addressed.


    Abstract

Summary: PermutMatrix is a work space designed to graphically explore gene expression data. It relies on the graphical approach introduced by Eisen and also offers several methods for the optimal reorganization of rows and columns of a numerical dataset. For example, several methods are proposed for optimal reorganization of the leaves of a hierarchical clustering tree, along with several seriation or unidimensional scaling methods that do not require any preliminary hierarchical clustering. This program, developed for MS Windows, with MS-Visual C++, has a clear and efficient graphical interface. Large datasets can be thoroughly and quickly analyzed.

Availability: http://www.lirmm.fr/~caraux/PermutMatrix/

Contact: caraux{at}lirmm.fr


    INTRODUCTION
To analyze DNA microarray data, it is very useful to organize and cluster genes according to similarities in their expression profiles. Standard hierarchical clustering methods are appropriate for this operation, as they provide a tree that represents the structure of similarities and in which clusters can be defined. Simultaneous display of the clustering tree and the colored representation of the data matrix, as proposed by Eisen et al. (1998), is a very useful feature, as it can be readily interpreted by biologists. The simplicity and efficacy of this representation have made it very successful. In PermutMatrix, this approach is supplemented with several optimal linear reordering methods, such as reorganization of the leaves of a clustering tree, unidimensional scaling and seriation.

Optimal reorganization of the leaves. In the standard hierarchical clustering approach, the gene order is the order in which the leaves of the clustering tree are enumerated. However, this enumeration is not unique, as swapping the two subtrees at any internal node does not change the topology of the clustering tree. A binary tree with n leaves has n-1 internal nodes, each of which can be flipped independently, so the number of possible enumerations is 2^(n-1). The question then arises as to whether it is possible to choose the best organization of the leaves of the tree, in order to obtain the best graphical display of the data with the Eisen approach. Several methods have been proposed (Bar-Joseph et al., 2001; Degerman, 1982; Gruvaeus and Wainer, 1972) and are available in PermutMatrix (Fig. 1c). They differ with respect to the criterion to be optimized and the optimization algorithm.
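The counting argument and the idea of choosing a best enumeration can be illustrated with a minimal Python sketch. It is not the algorithm of any of the cited methods: it simply enumerates every leaf order reachable by flipping subtrees and keeps the one that minimizes an assumed criterion (the sum of differences between neighbouring leaves). The tree encoding, the function names and the toy values are all hypothetical, and exhaustive search is only feasible for very small trees.

```python
# A tree is either a leaf label (str) or a pair (left, right) of subtrees.

def leaf_orders(tree):
    """Yield every leaf enumeration reachable by flipping subtrees."""
    if isinstance(tree, str):
        yield [tree]
        return
    left, right = tree
    for lo in leaf_orders(left):
        for ro in leaf_orders(right):
            yield lo + ro  # this node kept as is
            yield ro + lo  # this node flipped


def adjacent_cost(order, values):
    """Sum of |difference| between neighbouring leaves (illustrative criterion)."""
    return sum(abs(values[a] - values[b]) for a, b in zip(order, order[1:]))


def best_leaf_order(tree, values):
    """Exhaustive search over all flips; only feasible for very small trees."""
    return min(leaf_orders(tree), key=lambda o: adjacent_cost(o, values))


# Toy data: one expression value per gene (hypothetical numbers).
values = {"g1": 0.0, "g2": 1.0, "g3": 5.0, "g4": 4.0}
tree = (("g1", "g2"), ("g3", "g4"))

all_orders = list(leaf_orders(tree))
best = best_leaf_order(tree, values)
print(len(all_orders))  # number of enumerations for n = 4 leaves
print(best, adjacent_cost(best, values))
```

For this four-leaf tree the enumeration produces 2^(4-1) = 8 orders. Because the count doubles with every additional leaf, practical methods rely on dynamic programming or heuristics rather than enumeration.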

Fig. 1 PermutMatrix has a standard multiwindow operating environment. Various clustering and seriation methods can be selected via the toolbar and parameter options. Each result is presented in a separate window. The figure shows the results of a sample treatment. Window (a) shows the initial data (log2-ratio) prior to clustering, window (b) displays the results of a zero-constraint seriation operation (Gelfand, 1971), window (c) contains a hierarchical clustering result after optimal reorganization (Bar-Joseph et al., 2001) and window (d) shows a reorganized set of classes. The figure also includes interpretation aids: the mean expression profile of a class of genes is given in windows (c) and (d), whereas a classified list of designated gene labels is shown in window (e).

 
Unidimensional scaling and seriation. Hierarchical clustering is not aimed at reordering rows and columns of a dataset, as clustering is not the same operation as ordering. Other methods are specifically designed for ordering objects, such as unidimensional scaling (Hubert and Arabie, 1986) or seriation (Kendall, 1982). Unidimensional scaling methods place a set of objects along a line so that the distances between points best reflect the dissimilarities between objects. Seriation methods (Fig. 1b) assume that there is an unknown order between the objects, and they attempt to infer this order. These methods, some of which are long established, have been widely developed and implemented in archaeology and psychology. For example, they were successfully used to establish the chronology of appearance of ancient objects (Marquardt, 1978). However, these methods are relatively unknown in biology. Five of them are available in PermutMatrix and are described in detail on the program website. The criteria to be optimized in the seriation methods are the same as those used in optimal reordering of the leaves of a tree. However, the algorithms implemented here are heuristics, because the search space is too large and unstructured: there are n! ways to order a set of n objects.
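To make the heuristic nature of such ordering methods concrete, the following Python sketch applies a generic local search (segment reversal, often called 2-opt) to a dissimilarity matrix, accepting a move only when it lowers the sum of dissimilarities between neighbours. This is an assumed stand-in chosen for illustration, not one of the five methods implemented in PermutMatrix; the function names and the toy positions are hypothetical.

```python
import math


def path_cost(order, dist):
    """Sum of dissimilarities between consecutive objects in the order."""
    return sum(dist[a][b] for a, b in zip(order, order[1:]))


def two_opt_seriation(dist, order=None):
    """Local search: reverse segments while this strictly lowers the cost.

    Returns a locally optimal order, which is not guaranteed to be the
    global optimum.
    """
    n = len(dist)
    order = list(range(n)) if order is None else list(order)
    improved = True
    while improved:
        improved = False
        for i in range(n - 1):
            for j in range(i + 1, n):
                candidate = order[:i] + order[i:j + 1][::-1] + order[j + 1:]
                if path_cost(candidate, dist) < path_cost(order, dist) - 1e-12:
                    order = candidate
                    improved = True
    return order


# Toy example: four objects lying on a line, given in shuffled order.
positions = [0, 3, 1, 2]
dist = [[abs(p - q) for q in positions] for p in positions]

order = two_opt_seriation(dist)
print(order, path_cost(order, dist))
print(math.factorial(10))  # size of the search space for only 10 objects
```

The search stops at the first order that no single reversal can improve, which is why different starting orders may lead to different results on real data.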

Identification and reorganization of classes. In PermutMatrix, the clustering tree methods and the seriation or unidimensional scaling methods can be combined. A class can be defined by aggregating, in the clustering tree, all the leaves that descend from the same node. The tree is then no longer fully resolved (Fig. 1d). This reflects the fact that some terminal branchings are not significant, so the associated leaves can be reordered without any tree constraint. The reorganization therefore occurs on two levels: the classes are reordered as the leaves of a tree, and the leaves within each class are linearly reordered by seriation or unidimensional scaling.
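The two-level scheme can be sketched in Python as follows, under strong simplifying assumptions. A class is represented as a frozenset of leaf labels, an internal node as a pair of subtrees, the within-class ordering is found by exhaustive search (a stand-in for a seriation heuristic, viable only for tiny classes), and the classes are then arranged by trying every subtree flip. The data structure, the function names and the toy values are hypothetical and do not describe how PermutMatrix is implemented.

```python
from itertools import permutations

# Toy data: one expression value per gene (hypothetical numbers).
values = {"a": 0, "b": 2, "c": 1, "d": 10, "e": 9}


def dist(x, y):
    """Dissimilarity between two genes (absolute difference of toy values)."""
    return abs(values[x] - values[y])


def path_cost(order):
    """Sum of dissimilarities between neighbouring genes."""
    return sum(dist(x, y) for x, y in zip(order, order[1:]))


def seriate(members):
    """Level 2: order the genes inside one class, free of tree constraints."""
    return list(min(permutations(members), key=path_cost))


def class_orders(tree):
    """Level 1: enumerate arrangements of classes obtained by flipping nodes."""
    if isinstance(tree, frozenset):       # a class: no internal tree structure
        inner = seriate(sorted(tree))
        yield inner
        yield inner[::-1]                 # a linear order can be read both ways
        return
    left, right = tree
    for lo in class_orders(left):
        for ro in class_orders(right):
            yield lo + ro
            yield ro + lo


def two_level_order(tree):
    return min(class_orders(tree), key=path_cost)


# Two classes joined by a single internal node.
tree = (frozenset({"a", "b", "c"}), frozenset({"d", "e"}))
order = two_level_order(tree)
print(order, path_cost(order))
```

Genes of the same class always stay contiguous in the result, because flips only rearrange whole classes while the seriation step only rearranges genes inside a class.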

Manual operations. The PermutMatrix graphical interface allows several manual operations: inversion, permutation, sorting, etc. These operations can be used to refine or locally explore the optimal solutions obtained by the methods given above.


    CONCLUSION
PermutMatrix is a user-friendly, exploratory work space in which the graphical Eisen approach can be easily used and extended to optimal reorganization methods, which are less widely used than hierarchical clustering. These methods usually yield different and complementary results, and therefore contribute to the understanding and identification of different gene expression profiles. The program was designed for MS Windows and accepts input data files in a standard text file format or in Eisen's Cluster format.


    Acknowledgments
 
This program was partly developed during a sabbatical position at the Research School of Biological Sciences of the Australian National University. We are grateful for all support and friendship during this period. We have also been supported by Montpellier L-R Genopole.

Received on June 10, 2004; revised on September 3, 2004; accepted on October 4, 2004

    REFERENCES

    Bar-Joseph, Z., Gifford, D., Jaakkola, T.S. (2001) Fast optimal leaf ordering for hierarchical clustering. Bioinformatics, 17, S22–S29.

    Degerman, R. (1982) Ordered binary trees constructed through an application of Kendall's tau. Psychometrika, 47, 523–527.

    Eisen, M.B., Spellman, P.T., Brown, P.O., Botstein, D. (1998) Cluster analysis and display of genome-wide expression patterns. Proc. Natl Acad. Sci. USA, 95, 14863–14868.

    Gelfand, A.E. (1971) Rapid seriation methods with archaeological application. In Hodson, F.R., Kendall, D.G., Tautu, D.G. (Eds.). Mathematics in the Archaeological and Historical Sciences. Edinburgh University Press, Edinburgh, pp. 186–201.

    Gruvaeus, G. and Wainer, H. (1972) Two additions to hierarchical cluster analysis. Br. J. Math. Stat. Psychol., 25, 200–206.

    Hubert, L.J. and Arabie, P. (1986) Unidimensional scaling and combinatorial optimization. In de Leeuw, J., Heiser, W., Meulman, J., Critchley, F. (Eds.). Multidimensional Data Analysis. DSWO Press, Leiden, The Netherlands, pp. 181–196.

    Kendall, D.G. (1982) Seriation. In Kotz, S. and Johnson, N.L. (Eds.). Encyclopedia of Statistical Sciences, Vol. 8. Wiley-Interscience, New York, NY, pp. 417–424.

    Marquardt, W.H. (1978) Advances in archaeological seriation. In Schiffer, M.B. (Ed.). Advances in Archaeological Method and Theory, Vol. 1. Academic Press, Orlando, FL, pp. 257–314.

