Bioinformatics Advance Access originally published online on October 27, 2004
Bioinformatics 2005 21(6):730-740; doi:10.1093/bioinformatics/bti067
Relational patterns of gene expression via non-metric multidimensional scaling analysis
1Department of Physics, Faculty of Science and Technology, Chuo University, 1-13-27 Kasuga, Bunkyo-ku, Tokyo 112-8551, Japan
2Institute for Science and Technology, Chuo University, 1-13-27 Kasuga, Bunkyo-ku, Tokyo 112-8551, Japan
3Department of Physics, 1110 W. Green Street, Urbana, IL 61801, USA
4Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
*To whom correspondence should be addressed.
Motivation: Microarray experiments produce large-scale data sets that require extensive mining and refining to extract useful information. We demonstrate the usefulness of the (non-metric) multidimensional scaling (MDS) method in analyzing a large number of genes. Applying MDS to microarray data is certainly not new, but existing studies all analyze small numbers (<100) of points. We have been developing an efficient novel algorithm for non-metric MDS (nMDS) analysis of very large data sets as a maximally unsupervised data mining device. We wish to demonstrate its usefulness in the context of bioinformatics; in this paper we do so by unraveling relational patterns among genes from time series data.
Results: The Pearson correlation coefficient with its sign flipped is used to measure the dissimilarity of gene activities in the transcriptional response of cell-cycle-synchronized human fibroblasts to serum. These dissimilarity data have been analyzed with our nMDS algorithm to produce an almost circular relational pattern of the genes. In this example the obtained pattern expresses a temporal order in the data: the temporal expression pattern of the genes rotates along this circular arrangement and is related to the cell cycle. For the data analyzed in this paper we observe the following. If an appropriate preprocessing procedure is applied to the original data set, linear methods such as principal component analysis (PCA) can achieve reasonable results, but without preprocessing they cannot produce a useful picture. Furthermore, even with appropriate preprocessing, the outcomes of linear procedures are not as clear-cut as those of nMDS without preprocessing.
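The dissimilarity measure described above can be illustrated with a minimal Python sketch. It is not the authors' FORTRAN implementation and it does not perform nMDS; it only builds the sign-flipped Pearson correlation matrix that would serve as input to such an analysis. The function name `negative_pearson_dissimilarity` and the synthetic phase-shifted sinusoids are assumptions made for illustration and are not data from the paper.

```python
import numpy as np


def negative_pearson_dissimilarity(expr):
    """Return d[i, j] = -r[i, j], where r is the Pearson correlation
    between the time series of gene i and gene j.

    expr: array of shape (n_genes, n_timepoints).
    """
    expr = np.asarray(expr, dtype=float)
    r = np.corrcoef(expr)  # each row is treated as one variable (gene)
    return -r


# Synthetic illustration only (hypothetical data, not from the paper):
# each "gene" is a sinusoid sampled over one full period, with its own phase.
n_genes, n_time = 8, 12
t = np.linspace(0.0, 2.0 * np.pi, n_time, endpoint=False)
phases = np.linspace(0.0, 2.0 * np.pi, n_genes, endpoint=False)
expr = np.sin(t[None, :] - phases[:, None])

d = negative_pearson_dissimilarity(expr)

# Genes with similar phase are the least dissimilar (values near -1);
# genes in antiphase are the most dissimilar (values near +1).
print(np.round(d[0], 3))
```

In this synthetic case the dissimilarity depends only on the phase difference between two genes, which is the kind of cyclic structure that a circular arrangement would express. The values range from -1 (perfectly correlated) to +1 (perfectly anti-correlated), so they are not distances in the metric sense; a non-metric method relies on their rank order rather than their magnitudes.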
Availability: The FORTRAN source code of the method used in this analysis (pure nMDS) is available at http://www.granular.com/MDS/
Contact: tag{at}granular.com
Supplementary information: http://www.granular.com/MDS/B1_2005.