Bioinformatics Advance Access originally published online on February 2, 2006
Bioinformatics 2006 22(7):897-899; doi:10.1093/bioinformatics/btl025

© The Author 2006. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oxfordjournals.org
The online version of this article has been published under an open access model. Users are entitled to use, reproduce, disseminate, or display the open access version of this article for non-commercial purposes provided that: the original authorship is properly and fully attributed; the Journal and Oxford University Press are attributed as the original place of publication with the correct citation details given; if an article is subsequently reproduced or disseminated not in its entirety but only in part or as a derivative work this must be clearly indicated. For commercial re-use, please contact journals.permissions@oxfordjournals.org

affylmGUI: a graphical user interface for linear modeling of single channel microarray data

James M. Wettenhall, Ken M. Simpson, Keith Satterley* and Gordon K. Smyth

Bioinformatics Division, The Walter and Eliza Hall Institute of Medical Research, 1G Royal Pde, Parkville 3050, Australia

*To whom correspondence should be addressed.


    ABSTRACT

Summary: affylmGUI is a graphical user interface (GUI) to an integrated workflow for Affymetrix microarray data. The user is able to proceed from raw data (CEL files) to QC and pre-processing, and eventually to analysis of differential expression using linear models with empirical Bayes smoothing. Output of the analysis (tables and figures) can be exported to an HTML report. The GUI provides user-friendly access to state-of-the-art methods embodied in the Bioconductor software repository.

Availability: affylmGUI is an R package freely available from http://www.bioconductor.org. It requires R version 1.9.0 or later and Tcl/Tk 8.3 or later, and has been successfully tested on Windows 2000, Windows XP, Linux (RedHat and Fedora distributions) and Mac OS X with X11. Further documentation is available at http://bioinf.wehi.edu.au/affylmGUI

Contact: keith{at}wehi.edu.au


    1 BACKGROUND
 
The Bioconductor project (Gentleman et al., 2004) is an enormous repository of academic software for the analysis of genomic data, especially microarray data. The use of cutting-edge methodology implemented in Bioconductor packages can greatly improve the power and consistency of experimental results (Irizarry et al., 2005; Kooperberg et al., 2005). Yet the command-line computing environment (R Development Core Team, 2005) used for Bioconductor is very challenging for users without programming experience. There is a pressing need for a menu-driven or graphical user interface (GUI) for the Bioconductor packages to allow biologists to access the methodology without becoming R programmers. Wettenhall and Smyth (2004) earlier described a GUI software package, limmaGUI, for the analysis of two-color spotted microarrays. Here we describe a GUI for the analysis of Affymetrix GeneChip data (http://www.affymetrix.com). Although the overall design and philosophy of the new package are similar to those of limmaGUI, Affymetrix data require different analysis tools.

affylmGUI provides an interface to the affy, gcrma, affyPLM and limma packages of Bioconductor. The software is itself implemented as an R package and uses the interface to Tcl/Tk provided by the R package tcltk. The package enables users to pre-process and visualize their data and generate lists of putatively differentially expressed genes (Gierer et al., 2005). Users have a choice of several state-of-the-art pre-processing methods for Affymetrix CEL files and advanced statistical methods for assessing differential expression. The package provides powerful statistical methods for dealing with small sample sizes and with complex experiments involving many different RNA sources. The package is therefore most useful in those experimental situations which are the most challenging.


    2 SESSION CONTROL
 
The analysis session is controlled via a main window. A session begins by prompting the user to specify a targets file. The targets file is a tab-delimited text file specifying the Affymetrix CEL files to be analyzed and the source of RNA hybridized to each chip. A valid targets file features a ‘Name’ column, containing a unique identifier for each array; a ‘Filename’ column, specifying the corresponding CEL file and a ‘Target’ column which indicates the different RNA sources, thereby specifying which arrays are replicates. A session can be saved at any time and reloaded at a later point. An option to export an HTML report featuring diagnostic plots, summary plots and lists of differentially expressed genes is available.
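As an illustration of the targets file layout described above, the following base R sketch writes a small tab-delimited file and checks it for the three columns. The array names, CEL file names, the output file name and the helper check_targets() are all hypothetical; affylmGUI performs its own reading and validation, which may differ.

```r
# Hypothetical targets table: four arrays, two RNA sources (values invented).
targets <- data.frame(
  Name     = c("WT1", "WT2", "Mut1", "Mut2"),
  Filename = c("wt1.CEL", "wt2.CEL", "mut1.CEL", "mut2.CEL"),
  Target   = c("WildType", "WildType", "Mutant", "Mutant"),
  stringsAsFactors = FALSE
)

# Write the table as tab-delimited text (file name is an assumption).
targets_path <- file.path(tempdir(), "Targets.txt")
write.table(targets, targets_path, sep = "\t", quote = FALSE, row.names = FALSE)

# Illustrative check (not part of affylmGUI): required columns, unique names.
check_targets <- function(path) {
  tg <- read.delim(path, stringsAsFactors = FALSE)
  required <- c("Name", "Filename", "Target")
  missing_cols <- setdiff(required, names(tg))
  if (length(missing_cols) > 0) {
    stop("Missing column(s): ", paste(missing_cols, collapse = ", "))
  }
  if (anyDuplicated(tg$Name) > 0) {
    stop("'Name' values must be unique")
  }
  tg
}

tg <- check_targets(targets_path)
table(tg$Target)  # arrays sharing a Target value are replicates
```

Arrays that share a value in the ‘Target’ column are the ones treated as replicates of the same RNA source.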


    3 PRE-PROCESSING AND QUALITY ASSESSMENT
 
After the targets file has been read, the expression data are read from the CEL files using the affy package. At this point, the user may produce several diagnostic plots, including MA-plots of the perfect match probes and histograms of the raw intensities.

Quality assessment is followed by background correction, normalization and summarization of the probe-level data into probe-set expression values. These three steps are accomplished by one of three algorithms, namely Robust Multi-Array Analysis (RMA) (Irizarry et al., 2003; Gautier et al., 2004), GCRMA (Wu et al., 2004) or Robust Probe Level Models (RPLM) (Bolstad, 2005). GCRMA differs from RMA only in the background correction step, using probe sequence information to help estimate the background. This gives more accurate fold changes at the expense of marginally lower precision. RPLM differs from RMA only in the summarization step, using robust M-estimators rather than median polish to summarize the probe-level measurements. If this option is selected, additional quality assessment can be performed by plotting false-color images of the weights from the robust regression to look for spatial artifacts or for whole chips that are outliers (Bolstad et al., 2005). If a chip is of very poor quality, the user may wish to omit it; this requires the creation of a new targets file which does not contain the aberrant chip.
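The median polish summarization mentioned above can be illustrated with base R's medpolish() on a toy probe set. This is only a sketch with invented, noise-free log2 intensities; the real RMA implementation first applies background correction and quantile normalization, which are omitted here.

```r
# Toy log2 perfect-match intensities for one probe set (values invented):
# rows are probes, columns are arrays.
probe_effect <- c(-1, 0, 1)
array_level  <- c(chipA = 8, chipB = 10)
log2_pm <- outer(probe_effect, array_level, "+")
rownames(log2_pm) <- paste0("probe", 1:3)

# Median polish fits overall + probe (row) effect + array (column) effect.
fit <- medpolish(log2_pm, trace.iter = FALSE)

# The probe-set expression value for each array is overall + column effect.
expression <- fit$overall + fit$col
expression
```

In this noise-free case the summary recovers the array levels 8 and 10 exactly. With real data, the use of medians limits the influence of individual outlying probes.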


    4 DIFFERENTIAL EXPRESSION
 
After probe-set expression summaries are obtained, the user can proceed to differential expression. The approach taken by the limma package is to analyze the differential expression in terms of linear models (Smyth, 2005). This approach has many advantages as it allows very general experiments to be analyzed in a unified framework, including factorial, saturated or loop designs and time course experiments, but it requires some mathematical sophistication. It requires the user to specify two matrices: the design matrix, which provides a representation of the RNA targets that have been hybridized to the arrays, and the contrast matrix, which defines which comparisons between the RNA targets are of interest to the experimenter. affylmGUI greatly eases this process by largely automating the formation of the two matrices. The design matrix is constructed without user intervention. A set of dialogs helps the user to define a set of comparisons of interest from which the contrast matrix is constructed (Fig. 1). This could be as simple as a comparison between two groups (e.g. mutant versus wild-type), or something more complicated such as an interaction effect in a factorial design or contrasts in a time course experiment. In simple situations the comparisons are easily selectable using drop-down menus. In more complex situations the contrast matrix is specified using the ‘Advanced’ option in the contrast definition dialog.


Fig. 1 Screenshot of the contrast matrix specification dialog from the Windows version of affylmGUI.
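To make the roles of the design and contrast matrices concrete, here is a minimal base R sketch for a two-group comparison (mutant versus wild-type). The target labels and expression values are invented, and the sketch fits a single probe set by ordinary least squares. In limma the corresponding steps are carried out for all probe sets by lmFit() and contrasts.fit(); the exact matrices that affylmGUI builds internally are not shown in the text, so this is an assumption about their general form.

```r
# RNA source for each of four arrays (labels invented for illustration).
target <- factor(c("WildType", "WildType", "Mutant", "Mutant"),
                 levels = c("WildType", "Mutant"))

# Design matrix: one column per RNA target, one row per array.
design <- model.matrix(~ 0 + target)
colnames(design) <- levels(target)

# Contrast matrix: one row per design column, one column per comparison.
contrast_matrix <- cbind(MutantVsWildType = c(WildType = -1, Mutant = 1))

# Invented log2 expression values for one probe set on the four arrays.
y <- c(7.0, 7.2, 9.0, 9.2)

# Least-squares estimate of the mean log2 expression for each RNA target.
coefs <- qr.solve(design, y)

# Estimated log2-fold change for the contrast of interest.
log_fc <- drop(t(contrast_matrix) %*% coefs)
log_fc
```

More complex comparisons, such as an interaction in a factorial design, only change the entries of the contrast matrix; the design matrix stays the same.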

 
A number of statistics for differential expression are provided. For each contrast, affylmGUI returns the log2-fold change, the moderated t-statistic, P-value and the posterior log-odds of differential expression. This moderated t-statistic is similar to an ordinary t-statistic but with standard errors shrunk towards a common value using empirical Bayes methods (Lönnstedt and Speed, 2002; Smyth, 2004). This provides more stable inference and is particularly effective when the number of replicates is low (Kooperberg et al., 2005). The log-odds of differential expression, or B-statistic, is a Bayesian measure which is essentially equivalent to the moderated-t for ranking purposes. When two or more comparisons have been done, the moderated F-statistics are also computed.
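The shrinkage behind the moderated t-statistic can be sketched in a few lines of base R, following the form given by Smyth (2004): the posterior variance is a weighted average of the prior variance and the gene-wise sample variance, and the statistic is referred to a t-distribution with augmented degrees of freedom. In limma the prior degrees of freedom and prior variance are estimated from all genes; here they are simply assumed (d0 = 4, s0_sq = 0.1), and the function name moderated_t() is hypothetical.

```r
# Moderated t-statistic for one gene, given assumed prior parameters.
#   log_fc       estimated contrast (log2-fold change)
#   s2, df       gene-wise residual variance and its degrees of freedom
#   unscaled_var unscaled variance of the contrast estimate
#   d0, s0_sq    prior degrees of freedom and prior variance (assumed here)
moderated_t <- function(log_fc, s2, df, unscaled_var, d0, s0_sq) {
  s2_post <- (d0 * s0_sq + df * s2) / (d0 + df)  # shrunken variance
  t_mod   <- log_fc / sqrt(s2_post * unscaled_var)
  p_value <- 2 * pt(-abs(t_mod), df = d0 + df)   # two-sided P-value
  list(s2_post = s2_post, t = t_mod, p_value = p_value)
}

# Invented numbers: two groups of two arrays, so unscaled_var = 1/2 + 1/2.
res <- moderated_t(log_fc = 2, s2 = 0.04, df = 2,
                   unscaled_var = 1, d0 = 4, s0_sq = 0.1)

# Ordinary t-statistic for comparison (no shrinkage).
ordinary_t <- 2 / sqrt(0.04 * 1)
c(moderated = res$t, ordinary = ordinary_t)
```

With only two residual degrees of freedom the gene-wise variance is poorly estimated; pulling it towards the common value makes an unusually small sample variance less likely to produce an extreme statistic.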

The differential expression results can be presented in tables or plots. The estimated fold changes may be displayed in MA-plots. The user can display (or export) a table of the top genes for each contrast, ranked in order of log2-fold change, moderated t-statistic, P-value (adjusted for multiple testing, using one of six different methods) or B-statistic. If there are multiple comparisons for each gene, Venn diagrams and heat diagrams can also be generated.
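Ranking a results table and adjusting P-values for multiple testing can be sketched with base R alone. The probe-set identifiers and statistics are invented, and the Benjamini-Hochberg method ("BH") is chosen here only as an example; the text does not say which six adjustment methods the GUI offers.

```r
# Invented results for four probe sets.
results <- data.frame(
  ProbeSet = c("ps1", "ps2", "ps3", "ps4"),
  logFC    = c(2.0, -0.3, 1.1, 0.05),
  P.Value  = c(0.001, 0.02, 0.01, 0.5),
  stringsAsFactors = FALSE
)

# Adjust P-values for multiple testing (method is an assumption).
results$adj.P.Val <- p.adjust(results$P.Value, method = "BH")

# Rank by P-value, or alternatively by absolute log2-fold change.
top_by_p  <- results[order(results$P.Value), ]
top_by_fc <- results[order(-abs(results$logFC)), ]

top_by_p
```

Other adjustment methods available in base R are listed in p.adjust.methods.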


    5 NON-STANDARD ANALYSES
 
If analysis methods or plots other than those available in affylmGUI are desired, a command-line window is provided which allows the user to interact directly with R. In this way, more adventurous users have complete flexibility to access the full power of the underlying packages.

Arbitrary R code can be executed. For example, the user may want to view a list of all objects in the workspace with the function ls(). Doing this after reading in the CEL files would reveal that there is an object called RawAffyData. Suppose a user wishes to use the Affymetrix MAS5 algorithm for pre-processing rather than the algorithms in the affylmGUI menus. This could be achieved by passing RawAffyData to the expresso() function with parameters indicating that MAS5 is desired. Storing the result of this procedure in an object called NormalizedAffyData will ensure that it is recognized by affylmGUI for the purposes of creating plots and fitting linear models.
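The MAS5 example just described might look roughly like the following when typed into the command-line window. This is a sketch, not code from the package: the expresso() arguments shown are those that the affy package's own mas5() wrapper passes (mas5() additionally applies a scaling step), and the guard is an addition so that the snippet does nothing outside a session in which RawAffyData exists. MAS5-style values are commonly log2-transformed before linear modeling; whether that is needed at this point is not covered by the text.

```r
ls()  # list the objects currently in the workspace

# Only proceed if the raw data object and the affy package are available.
can_run <- exists("RawAffyData") && requireNamespace("affy", quietly = TRUE)

if (can_run) {
  # MAS5-style background correction, PM correction and summarization.
  NormalizedAffyData <- affy::expresso(
    RawAffyData,
    bgcorrect.method = "mas",
    normalize        = FALSE,
    pmcorrect.method = "mas",
    summary.method   = "mas"
  )
}
```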

Any code that is used frequently can be saved and incorporated into the pull-down menus.


    Acknowledgments
 
This research was supported by an NHMRC Transitional Institute Grant. The authors thank Terry Speed for discussions and inspiration and many affylmGUI users for bug reports. Funding to pay the Open Access publication charges was provided by an NHMRC Transitional Grant awarded to the Walter and Eliza Hall Institute of Medical Research.

Conflict of Interest: none declared.


    FOOTNOTES
 
Associate Editor: David Rocke

Received on November 17, 2005; revised on January 23, 2006; accepted on January 23, 2006

    REFERENCES

    Bolstad, B. (2005) affyPLM: Methods for fitting probe-level models. R package version 1.6.0.

    Bolstad, B.M., Collin, F., Brettschneider, J., Simpson, K., Cope, L., Irizarry, R.A., Speed, T.P. (2005) Quality Assessment of Affymetrix GeneChip Data. In Gentleman, R., Carey, V., Huber, W., Irizarry, R., Dudoit, S. (Eds.). Bioinformatics and Computational Biology Solutions Using R and Bioconductor, Springer, NY, pp. 33–47.

    Gentleman, R.C., et al. (2004) Bioconductor: open software development for computational biology and bioinformatics. Genome Biol., 5, R80.

    Gautier, L., et al. (2004) affy—analysis of Affymetrix GeneChip data at the probe-level. Bioinformatics, 20, 307–315.

    Gierer, P., et al. (2005) Gene expression profile and synovial microcirculation at early stages of collagen-induced arthritis. Arthritis Res. Ther., 7, R868–R876.

    Irizarry, R.A., et al. (2003) Exploration, normalization and summaries of high density oligonucleotide array probe level data. Biostatistics, 4, 249–264.

    Irizarry, R.A., et al. (2005) Multiple-laboratory comparison of microarray platforms. Nat. Methods, 2, 1–5 [Erratum (2005) Nat. Methods, 2, 477].

    Kooperberg, C., et al. (2005) Significance testing for small microarray experiments. Stat. Med., 24, 2281–2298.

    Lönnstedt, I. and Speed, T.P. (2002) Replicated microarray data. Stat. Sinica, 12, 31–46.

    R Development Core Team (2005) R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria.

    Smyth, G.K. (2004) Linear models and empirical Bayes methods for assessing differential expression in microarray experiments. Stat. Appl. Genet. Mol. Biol., 3, Article 3.

    Smyth, G.K. (2005) Limma: linear models for microarray data. In Gentleman, R., Carey, V., Dudoit, S., Irizarry, R., Huber, W. (Eds.). Bioinformatics and Computational Biology Solutions Using R and Bioconductor, Springer, NY, pp. 397–420.

    Wettenhall, J.M. and Smyth, G.K. (2004) limmaGUI: a graphical user interface for linear modeling of microarray data. Bioinformatics, 20, 3705–3706.

    Wu, Z., et al. (2004) A model based background adjustment for oligonucleotide expression arrays. J. Am. Stat. Assoc., 99, 909–917.


