Bioinformatics Advance Access originally published online on March 29, 2005
Bioinformatics 2005 21(11):2789-2790; doi:10.1093/bioinformatics/bti394
MADE4: an R package for multivariate analysis of gene expression data
1Bioinformatics, Conway Institute, University College Dublin, Dublin 4, Ireland
2Laboratoire de Biométrie et Biologie Évolutive, Université Claude Bernard Lyon 1, 43, bd. du 11 Novembre 1918, 69622 Villeurbanne Cedex, France
*To whom correspondence should be addressed.
| Abstract |
|---|
Summary: MADE4, microarray ade4, is a software package that facilitates multivariate analysis of microarray gene-expression data. MADE4 accepts a wide variety of gene-expression data formats. MADE4 takes advantage of the extensive multivariate statistical and graphical functions in the R package ade4, extending these for application to microarray data. In addition, MADE4 provides new graphical and visualization tools that aid in interpretation of multivariate analysis of microarray data.
Availability: The R package MADE4 is available from http://bioinf.vcd.ie/software and from Bioconductor http://www.bioconductor.org
Contact: aedin.culhane{at}ucd.ie
Supplementary information: MADE4 is well documented. There are tutorials, in the form of vignettes, which describe typical analyses. In addition, the MADE4 manual provides descriptions and examples for each function.
| 1 INTRODUCTION |
|---|
The aim in writing microarray ade4 (MADE4) was to provide a simple-to-use tool for multivariate analysis of microarray data. Multivariate approaches have been applied very successfully in the analysis of microarray data. Principal component analysis (PCA) has been shown to be useful in exploratory analysis of linear trends in data (Raychaudhuri et al., 2000; Crescenzi and Giuliani, 2001). Fellenberg et al. (2001) described the application of correspondence analysis to study the association between microarray samples and genes in a reduced dimensional space. A group ordination approach was applied successfully to classification and class prediction of microarray samples (Culhane et al., 2002). More recently, Culhane et al. (2003) employed a two-table coupling method (co-inertia analysis, CIA) to examine covariant gene-expression patterns between microarray datasets from different platforms.
Although PCA is available in several R packages, including stats and amap, the R package ade4 contains many additional multivariate statistical methods, including methods for the analysis of a single data matrix, the coupling of two data matrices, and multi-table analysis (Thioulouse et al., 1997; Chessel et al., 2004; http://cran.univ-lyon1.fr/doc/Rnews/4Rnews_2004-1.pdf). The methods for integrating multiple datasets make ade4 particularly attractive for the analysis of microarray data. MADE4 was developed as an extension to ade4 to facilitate the input and analysis of microarray data. To provide this functionality, MADE4 is integrated with Bioconductor (Gentleman et al., 2004), probably the most popular microarray analysis software, which contains numerous packages for preprocessing, normalization, gene filtering and analysis of microarray data.
| 2 DATA INPUT |
|---|
MADE4 accepts a wide variety of gene-expression data input formats, including Bioconductor AffyBatch, exprSet, marrayRaw, and standard R matrix formats (data.frame or matrix). MADE4 will automatically recognize these data formats, and no additional data processing is required.
| 3 MULTIVARIATE ANALYSIS |
|---|
The function ord simplifies running ordination methods such as principal component, correspondence or non-symmetric correspondence analysis. It provides a wrapper which calls each of these methods in ade4.
results.coa <- ord(data, type = "coa")
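To make the correspondence analysis step concrete, the following is a minimal base-R sketch of the computation that an ordination of this kind performs. It is not MADE4's or ade4's implementation. The simulated matrix `expr` and the names `gene.co` and `array.co` are assumptions introduced for illustration only.

```r
# Illustrative correspondence analysis in base R (not MADE4's own code).
set.seed(1)
# Hypothetical non-negative expression matrix: 50 genes (rows) x 6 arrays (columns)
expr <- matrix(rexp(300, rate = 0.1), nrow = 50,
               dimnames = list(paste0("gene", 1:50), paste0("array", 1:6)))

P  <- expr / sum(expr)         # correspondence matrix (relative frequencies)
r  <- rowSums(P)               # row (gene) masses
cm <- colSums(P)               # column (array) masses

# Standardized residuals from the independence model r %o% cm
S   <- diag(1 / sqrt(r)) %*% (P - r %o% cm) %*% diag(1 / sqrt(cm))
dec <- svd(S)

k   <- 2                       # number of axes to keep
eig <- dec$d^2                 # principal inertias (eigenvalues)

# Principal coordinates of genes and arrays on the first k axes
gene.co  <- (diag(1 / sqrt(r))  %*% dec$u[, 1:k]) %*% diag(dec$d[1:k])
array.co <- (diag(1 / sqrt(cm)) %*% dec$v[, 1:k]) %*% diag(dec$d[1:k])
rownames(gene.co)  <- rownames(expr)
rownames(array.co) <- colnames(expr)

round(eig[1:k] / sum(eig), 3)  # share of total inertia carried by each axis
```

Because genes and arrays are placed in the same reduced space, a gene lying in the same direction from the origin as an array tends to be relatively highly expressed in that array.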
| 4 BETWEEN GROUPS ANALYSIS (BGA) |
|---|
Between-group analysis (BGA) is a supervised classification method (Culhane et al., 2002). The basis of BGA is to ordinate the groups rather than the individual samples. In tests on two microarray gene-expression datasets, BGA performed comparably to a range of supervised classification methods, including support vector machines and artificial neural networks (Culhane et al., 2002). An attractive feature of BGA is that it is not limited by the large number of genes relative to the number of samples typical of microarray data. BGA of a dataset can be performed using the function bga. The projection of test data on BGA axes can be assessed using the function suppl. Leave-one-out cross validation can be performed using bga.jackknife.
results.bga <- bga(data, classvector)
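The idea of ordinating groups rather than samples, then projecting held-out samples and cross-validating, can be sketched in base R as follows. This is a simplified PCA-based illustration under stated assumptions, not the code behind bga, suppl or bga.jackknife. The helper names `bga_sketch` and `bga_project`, the nearest-centroid assignment rule and the simulated data are all hypothetical.

```r
# A base-R sketch of between-group analysis (PCA-based); not MADE4's implementation.
bga_sketch <- function(train, classes) {
  # train: genes x samples matrix; classes: class label for each sample
  classes <- droplevels(factor(classes))
  centre  <- rowMeans(train)                     # per-gene mean of the training set
  X       <- t(train - centre)                   # samples x genes, gene-centred
  n.g     <- as.vector(table(classes))
  means   <- rowsum(X, classes) / n.g            # group means (groups x genes)
  dec     <- svd(sqrt(n.g / sum(n.g)) * means)   # ordinate the weighted group means
  axes    <- dec$v[, seq_len(nlevels(classes) - 1), drop = FALSE]
  list(centre = centre, axes = axes, centroids = means %*% axes)
}

bga_project <- function(fit, newdata) {
  # Project samples (columns of newdata) onto the between-group axes and
  # assign each one to the nearest group centroid (an assumed, simple rule).
  scores <- t(newdata - fit$centre) %*% fit$axes
  d2 <- apply(fit$centroids, 1, function(g) rowSums(sweep(scores, 2, g)^2))
  d2 <- matrix(d2, nrow = nrow(scores))          # stays a matrix for a single sample
  list(scores = scores,
       predicted = rownames(fit$centroids)[apply(d2, 1, which.min)])
}

# Hypothetical data: 200 genes x 15 arrays in three classes
set.seed(42)
classvector <- factor(rep(c("A", "B", "C"), each = 5))
data <- matrix(rnorm(200 * 15), nrow = 200)
data[1:20,  classvector == "B"] <- data[1:20,  classvector == "B"] + 3
data[11:30, classvector == "C"] <- data[11:30, classvector == "C"] - 3

fit       <- bga_sketch(data, classvector)
train.hit <- bga_project(fit, data)$predicted == as.character(classvector)

# Leave-one-out cross-validation: refit without array i, then project array i
loo <- sapply(seq_len(ncol(data)), function(i) {
  fit.i <- bga_sketch(data[, -i, drop = FALSE], classvector[-i])
  bga_project(fit.i, data[, i, drop = FALSE])$predicted
})
mean(loo == as.character(classvector))           # leave-one-out accuracy
```

With k groups there are at most k - 1 between-group axes, which is why the number of genes does not limit the method. Cross-validation still matters, because samples used to build the axes usually separate more cleanly than new samples do.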
| 5 CO-INERTIA ANALYSIS (CIA) |
|---|
Co-inertia analysis (CIA) has been applied to the cross-platform comparison of microarray gene-expression datasets (Culhane et al., 2003). CIA is a multivariate method that identifies trends or co-relationships in multiple datasets which contain the same cases or variables. That is, either the rows or the columns of a matrix must be matchable. CIA can be applied to datasets where the number of variables (genes) far exceeds the number of samples (arrays) (Fig. 1).
results.cia <- cia(dataset1, dataset2)
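As a rough illustration of what coupling two tables means, the sketch below finds co-inertia axes from the cross-covariance of two gene-centred datasets that share the same arrays. It is PCA-based and written in base R, so it is a simplification and not the procedure used by cia. The simulated `dataset1` and `dataset2` and the `latent` signal are assumptions made for the example.

```r
# Illustrative PCA-based co-inertia analysis in base R (not MADE4's own code).
set.seed(7)
n <- 12                                    # arrays (samples) shared by both datasets
latent <- matrix(rnorm(n * 2), n, 2)       # hypothetical shared biological signal
# Two hypothetical platforms measuring different gene sets (genes x arrays)
dataset1 <- t(latent %*% matrix(rnorm(2 * 300), 2) + 0.5 * matrix(rnorm(n * 300), n))
dataset2 <- t(latent %*% matrix(rnorm(2 * 150), 2) + 0.5 * matrix(rnorm(n * 150), n))

X <- scale(t(dataset1), scale = FALSE)     # arrays x genes, gene-centred
Y <- scale(t(dataset2), scale = FALSE)

# Cross-covariance between the two gene sets, computed over the shared arrays
C   <- crossprod(X, Y) / n
dec <- svd(C, nu = 2, nv = 2)              # paired axes maximizing covariance

# Array scores from each dataset on the first two co-inertia axes
scores1 <- X %*% dec$u
scores2 <- Y %*% dec$v

# RV coefficient: overall similarity of the two array configurations (0 to 1)
rv <- sum(crossprod(X, Y)^2) /
  sqrt(sum(tcrossprod(X)^2) * sum(tcrossprod(Y)^2))
rv
diag(cor(scores1, scores2))                # per-axis agreement between the datasets
```

Only the arrays need to match between the two tables. The gene sets can differ in size and content, which is what makes the approach usable across platforms.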
| 6 VISUALISATION OF RESULTS |
|---|
There are many functions in MADE4 to visualise results. The simplest way to view results produced by ord, bga or cia is to use plot. Microarray samples (or genes) can be colour coded if a vector of class membership is given.
In addition, there are functions for drawing 1D and 3D plots. For example, the function html3D produces output which can be visualized using Jmol, RasMol or Chime (Fig. 2), providing a free and very useful interface for colouring, rotating, zooming and manipulating 3D graphs.
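The colour coding of samples by class membership can be illustrated with base R graphics alone. The sketch below is not MADE4's plot method. It uses prcomp as a stand-in ordination, and the class labels, colours and output file are assumptions chosen for the example.

```r
# Illustrative class-coloured ordination plot in base R (not MADE4's plot method).
set.seed(3)
classvector <- factor(rep(c("tumour", "normal"), each = 6))   # hypothetical classes
data <- matrix(rnorm(100 * 12), nrow = 100)                    # genes x arrays
data[1:10, classvector == "tumour"] <- data[1:10, classvector == "tumour"] + 2

pca <- prcomp(t(data))                          # ordinate the arrays
palette.cols <- c("steelblue", "firebrick")     # one colour per factor level
cols <- palette.cols[as.integer(classvector)]   # map each array to its class colour

outfile <- tempfile(fileext = ".pdf")
pdf(outfile)
plot(pca$x[, 1:2], col = cols, pch = 19, xlab = "Axis 1", ylab = "Axis 2",
     main = "Arrays coloured by class")
legend("topright", legend = levels(classvector), col = palette.cols, pch = 19)
dev.off()
```

Indexing the palette by the integer codes of the factor keeps the point colours and the legend in the same order as the factor levels.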
| Acknowledgments |
|---|
We would like to acknowledge the assistance of Dr Florent Baty, Ian Jeffery and Ailís Fagan in testing development versions of MADE4.
Received on February 9, 2005; revised on March 16, 2005; accepted on March 16, 2005
| REFERENCES |
|---|
Chessel, D., et al. (2004) The ade4 package-I: One-table methods. R News, 4, 5–10.
Crescenzi, M. and Giuliani, A. (2001) The main biological determinants of tumor line taxonomy elucidated by a principal component analysis of microarray data. FEBS Lett., 507, 114–118.
Culhane, A.C., et al. (2002) Between-group analysis of microarray data. Bioinformatics, 18, 1600–1608.
Culhane, A.C., et al. (2003) Cross-platform comparison and visualisation of gene expression data using co-inertia analysis. BMC Bioinformatics, 4, 59.
Fellenberg, K., et al. (2001) Correspondence analysis applied to microarray data. Proc. Natl Acad. Sci. USA, 98, 10781–10786.
Gentleman, R.C., et al. (2004) Bioconductor: open software development for computational biology and bioinformatics. Genome Biol., 5, R80.
Raychaudhuri, S., et al. (2000) Principal components analysis to summarize microarray experiments: application to sporulation time series. Pac. Symp. Biocomput., 455–466.
Thioulouse, J., et al. (1997) ADE-4: a multivariate analysis and graphical display software. Stat. Comput., 7, 75–83.