

Bioinformatics Advance Access originally published online on December 15, 2005
Bioinformatics 2006 22(4):507-508; doi:10.1093/bioinformatics/btk005
A corrigendum to this article has been published.

© The Author 2005. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oxfordjournals.org

EDGE: extraction and analysis of differential gene expression

Jeffrey T. Leek*, Eva Monsen, Alan R. Dabney and John D. Storey

Department of Biostatistics, University of Washington, Seattle, WA 98195, USA

*To whom correspondence should be addressed.


    ABSTRACT

Summary: EDGE (Extraction of Differential Gene Expression) is an open source, point-and-click software program for the significance analysis of DNA microarray experiments. EDGE can perform both standard and time course differential expression analysis. The functions are based on newly developed statistical theory and methods. This document introduces the EDGE software package.

Availability: EDGE is freely available for non-commercial users. EDGE can be downloaded for Windows, Macintosh and Linux/UNIX from http://faculty.washington.edu/jstorey/edge

Contact: jtleek{at}u.washington.edu


    1 INTRODUCTION
DNA microarrays have become a standard tool for identifying and characterizing gene expression variation across differing biological conditions. A variety of software packages are available for the significance analysis of microarray experiments. Many of these packages are closed source, difficult to use or available for only one operating system. Most are unable to analyze data from time course microarray experiments. EDGE is a user-friendly software package that includes functions for missing data imputation, data transformation and visualization, eigengene/eigenarray analysis, hierarchical clustering, differential expression analysis (static and time course) and automatic internet-based NCBI queries of user-chosen genes. EDGE can be used to analyze microarray data across all platforms, although interpretation of the results may depend on the experimental design. The EDGE interface is multithreaded and reports real-time updates of the time remaining in lengthy calculations. Many of these calculations are performed through C++ extensions for R that dramatically reduce computation time. Differential expression analyses in EDGE are based on newly developed statistical methodology, including the Optimal Discovery Procedure for static differential expression (Storey, 2005, http://www.bepress.com/uwbiostat/paper259). EDGE is open source and is available for Windows, Macintosh and Linux/UNIX operating systems.


    2 EDGE
EDGE runs on top of the statistical software package R (R Development Core Team, 2005, http://www.R-project.org). Detailed downloading and installation instructions are available from the EDGE website. At the beginning of each EDGE session, the main menu should appear as in Figure 1. The first step in an EDGE analysis is to load the pre-normalized expression data and covariate files using the Load/Save Expression Data and Covariates menu. (The covariate file contains information about the experimental design, such as the biological group from which each array comes.) If the expression matrix has missing values, they can be imputed using the KNN imputation algorithm (Troyanskaya et al., 2001) from the Impute Missing Data menu. After loading expression data and covariate information, the covariates can be checked for accuracy using the View Covariates menu. It is also possible to center, scale or log transform the expression values using the Transform Data menu.
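To make the imputation step concrete, the following Python sketch illustrates the general idea of KNN imputation on a genes x arrays matrix. It is not EDGE's implementation (EDGE performs this step from its menu, on top of R). The function name `knn_impute`, the choice of an unweighted mean of the neighbors, and the toy data are all assumptions made for illustration.

```python
import math


def knn_impute(matrix, k=2):
    """Fill None entries in a genes x arrays matrix from the k nearest genes.

    Simplified sketch: the distance between two genes is the root mean
    squared difference over the arrays observed in both, and the imputed
    value is the unweighted mean of the neighbours' values on that array.
    """
    def distance(a, b):
        shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
        if not shared:
            return math.inf
        return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))

    imputed = [row[:] for row in matrix]  # leave the input untouched
    for i, row in enumerate(matrix):
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Candidate neighbours: other genes that were observed on array j.
            candidates = [
                (distance(row, other), other[j])
                for g, other in enumerate(matrix)
                if g != i and other[j] is not None
            ]
            candidates = [c for c in candidates if math.isfinite(c[0])]
            candidates.sort(key=lambda c: c[0])
            nearest = candidates[:k]
            if nearest:
                imputed[i][j] = sum(v for _, v in nearest) / len(nearest)
    return imputed


# Toy data (hypothetical): gene 1 is missing its value on the third array.
expression = [
    [1.0, 2.0, 3.0, 4.0],
    [1.1, 2.1, None, 4.1],
    [0.9, 1.9, 2.9, 3.9],
    [8.0, 7.0, 6.0, 5.0],
]
filled = knn_impute(expression, k=2)
```

In this toy example the two genes closest to gene 1 are genes 0 and 2, so the missing value is filled with the mean of their values on the third array. A gene with no usable neighbors is left as missing.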


Fig. 1 The main menu of EDGE.

 
Several tools for visual exploratory analysis are included in the EDGE interface. Boxplots and eigengenes (Alter et al., 2000) can be displayed for each array, or stratified by a covariate, using the Display Boxplots and the Display Eigengenes and Eigenarrays options, respectively. EDGE also allows the user to plot clusters of genes with similar expression patterns (Eisen et al., 1998) from the Display Hierarchical Clustering menu. Clustering can be performed on the entire set of genes, or only on the significant genes from a differential expression analysis. A variety of plotting options are available for visualizing the clusters.
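As a rough illustration of what an eigengene is, the sketch below extracts the leading eigengene of a genes x arrays matrix, that is, the unit-length expression pattern across arrays that accounts for the most variation. Alter et al. (2000) obtain all eigengenes from a singular value decomposition; this dependency-free sketch substitutes power iteration, which recovers only the first one. The function name `leading_eigengene` and the toy matrix are hypothetical, and this is not the code used by EDGE.

```python
import math


def leading_eigengene(matrix, iterations=200):
    """Return (singular value, eigengene) for a genes x arrays matrix.

    Power iteration on X^T X: repeatedly project the current array pattern
    onto the genes and back, renormalising each time. The result is the
    first right singular vector of X, up to sign.
    """
    n_arrays = len(matrix[0])
    # Deterministic, non-constant start, so row-centred data do not map it to zero.
    pattern = [1.0 + j for j in range(n_arrays)]
    for _ in range(iterations):
        # One score per gene: how strongly the gene follows the pattern.
        scores = [sum(x * p for x, p in zip(row, pattern)) for row in matrix]
        # Back to array space: score-weighted sum of the gene profiles.
        updated = [
            sum(row[j] * s for row, s in zip(matrix, scores))
            for j in range(n_arrays)
        ]
        norm = math.sqrt(sum(x * x for x in updated))
        if norm == 0.0:
            raise ValueError("matrix has no variation along the starting pattern")
        pattern = [x / norm for x in updated]
    scores = [sum(x * p for x, p in zip(row, pattern)) for row in matrix]
    singular_value = math.sqrt(sum(s * s for s in scores))
    return singular_value, pattern


# Toy data (hypothetical): every gene follows the same down-flat-up pattern.
centred = [
    [-2.0, 0.0, 2.0],
    [-1.0, 0.0, 1.0],
    [3.0, 0.0, -3.0],
]
singular_value, eigengene = leading_eigengene(centred)
```

Because the sign of a singular vector is arbitrary, an eigengene and its negation describe the same pattern. Further eigengenes would require a full singular value decomposition or deflation, which this sketch omits.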

The Identify Differentially Expressed Genes menu allows users to set options for performing both static and time course differential expression analyses. For a static analysis, the user should select a class variable indicating the biological group assignment, or the option None (within class differential expression) to identify differentially expressed genes in a single biological sample. In the static setting, significance calculations are based on the Optimal Discovery Procedure (Storey, 2005), which estimates the optimal rule for identifying differentially expressed genes (Storey et al., 2005a, http://www.bepress.com/uwbiostat/paper260). For time course data, the user can perform either a ‘between class’ analysis by selecting a variable distinguishing biological groups, or a ‘within class’ analysis by selecting None (within class differential expression) for the class variable. A ‘between class’ analysis assesses the evidence for a difference in expression over time between two or more biological groups, while a ‘within class’ analysis looks for any differential expression over time within a single group. The user must specify a covariate for the time points and, if necessary, should also specify a covariate identifying the individual from which each sample was taken. EDGE implements statistical methodology specifically designed for time course experiments (Storey et al., 2005b).

For either type of analysis, the user should specify the number of permutations to be used in the significance calculations and, in some cases, set a seed for reproducible results. For time course analyses, the user can also specify the type of spline used in fitting the longitudinal model, the dimension of the basis for the spline model and whether to include the baseline expression level in the time course analysis. If the baseline level is included, EDGE will not only identify genes showing different patterns of expression over time, but will also identify genes with different baseline levels of expression.
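The role of the permutation count and the seed can be illustrated with a generic two-group permutation test for a single gene. This is only a sketch of the general idea: EDGE's significance calculations are based on the Optimal Discovery Procedure and its time course methodology, not on the simple mean-difference statistic used here. The function name `permutation_p_value`, its parameters and the toy data are assumptions.

```python
import random


def permutation_p_value(values, labels, n_permutations=1000, seed=0):
    """Permutation p-value for a difference in means between groups 0 and 1.

    Group labels are shuffled n_permutations times with a seeded generator,
    so repeating the call with the same seed reproduces the same p-value.
    """
    def mean_difference(current_labels):
        group_a = [v for v, g in zip(values, current_labels) if g == 0]
        group_b = [v for v, g in zip(values, current_labels) if g == 1]
        return abs(sum(group_a) / len(group_a) - sum(group_b) / len(group_b))

    rng = random.Random(seed)  # private generator: no global state is touched
    observed = mean_difference(labels)
    shuffled = list(labels)
    exceed = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # group sizes are preserved by shuffling
        # Small tolerance so rounding does not hide exact ties.
        if mean_difference(shuffled) >= observed - 1e-12:
            exceed += 1
    # Add one to numerator and denominator so the p-value is never zero.
    return (exceed + 1) / (n_permutations + 1)


# Toy data (hypothetical): one gene measured on four arrays per group.
values = [5.1, 5.3, 4.9, 5.2, 1.0, 1.2, 0.8, 1.1]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
p_first = permutation_p_value(values, labels, n_permutations=1000, seed=42)
p_again = permutation_p_value(values, labels, n_permutations=1000, seed=42)
```

More permutations give a finer-grained p-value at the cost of computation time, and fixing the seed makes a run repeatable, which is the trade-off the two options control.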

Once the appropriate options have been selected and the user clicks GO, the expression analysis is performed and the Differential Expression Results menu is displayed. A significance measure is assigned to each gene via the Q-value methodology (Storey and Tibshirani, 2003). The user can select a Q- or P-value cutoff to display the genes that meet that significance threshold. For advanced users, optional Q-value arguments can also be adjusted. The user can plot a histogram of the P-values from all significance tests, create a Q-plot, or cluster significant genes based on similarities in their expression profiles. If the EDGE session is being performed on a computer with internet access, the user can select a significant gene in the results window, and access NCBI information for that gene name. Results of differential expression analyses can be saved for further analysis or reporting.
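For readers unfamiliar with Q-values, the sketch below shows one simplified way to convert a list of P-values into Q-values in the spirit of Storey and Tibshirani (2003). It assumes a fixed tuning parameter (`lam = 0.5`) for estimating the proportion of true null hypotheses, whereas the published method smooths over a range of values. The function name `q_values` and the example P-values are hypothetical, and the code is not taken from EDGE.

```python
def q_values(p_values, lam=0.5):
    """Convert p-values to q-values using a fixed-lambda estimate of pi0.

    pi0 (the proportion of true nulls) is estimated from the fraction of
    p-values above lam. The q-value of the i-th smallest p-value is the
    minimum of pi0 * m * p / rank over that p-value and all larger ones.
    """
    m = len(p_values)
    pi0 = min(1.0, sum(p > lam for p in p_values) / (m * (1.0 - lam)))
    order = sorted(range(m), key=lambda i: p_values[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so q-values are monotone in p.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pi0 * m * p_values[i] / rank)
        q[i] = running_min
    return q


# Hypothetical p-values for eight genes.
p = [0.001, 0.01, 0.03, 0.2, 0.6, 0.9, 0.04, 0.5]
q = q_values(p)
significant = [i for i, value in enumerate(q) if value <= 0.05]
```

Selecting genes with a Q-value at or below a cutoff, as in the last line, corresponds to the thresholding step described above: the cutoff is the false discovery rate the user is willing to accept among the genes called significant.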


    3 RESULTS
Figure 2 shows the results of a differential expression analysis on a subset of 3170 genes on 15 arrays from the Hedenfalk et al. (2001) study. The analysis compared expression levels for BRCA1 and BRCA2 tumors. EDGE shows substantial improvements over five leading methodologies in the number of genes called significant at a given Q-value cutoff.


Fig. 2 A comparison between EDGE and five leading procedures for identifying differentially expressed genes applied to the Hedenfalk et al. (2001) study. For each Q-value (false discovery rate) cutoff, the number of genes found to be significant is plotted for each procedure. See Storey et al. (2005a) for comparisons based on a 3-sample analysis, where improvements are even greater.

 


    Acknowledgments
 
This software development was supported in part by NIH grant R01 HG002913-01.

Conflict of Interest: none declared.


    FOOTNOTES
 
Associate Editor: John Quackenbush

Received on October 18, 2005; revised on December 10, 2005; accepted on December 11, 2005

    REFERENCES

    Alter, O., et al. (2000) Singular value decomposition for genome-wide expression data processing and modeling. Proc. Natl Acad. Sci. USA, 97, 10101–10106.

    Cui, X., et al. (2005) Improved statistical tests for differential gene expression by shrinking variance components estimates. Biostatistics, 6, 59–75.

    Dudoit, S., et al. (2002) Comparison of discrimination methods for the classification of tumors using gene expression data. J. Am. Stat. Assoc., 97, 77–87.

    Efron, B., et al. (2001) Empirical Bayes analysis of a microarray experiment. J. Am. Stat. Assoc., 96, 1151–1160.

    Eisen, M. B., et al. (1998) Cluster analysis and display of genome-wide expression patterns. Proc. Natl Acad. Sci. USA, 95, 14863–14868.

    Hedenfalk, I., et al. (2001) Gene-expression profiles in hereditary breast cancer. N. Engl. J. Med., 344, 539–548.

    Lonnstedt, I. and Speed, T. (2002) Replicated microarray data. Stat. Sinica, 12, 31–46.

    R Development Core Team (2005) R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria.

    Storey, J.D. (2005) The optimal discovery procedure: a new approach to simultaneous significance testing. UW Biostatistics Working Paper Series, Working Paper 259.

    Storey, J.D. and Tibshirani, R. (2003) Statistical significance for genome-wide studies. Proc. Natl Acad. Sci. USA, 100, 9440–9445.

    Storey, J.D., Dai, J.Y., Leek, J.T. (2005a) The Optimal Discovery Procedure for Large-Scale Significance Testing, with Applications to Comparative Microarray Experiments. UW Biostatistics Working Paper Series, Working Paper 260.

    Storey, J.D., et al. (2005b) Significance analysis of time course microarray experiments. Proc. Natl Acad. Sci. USA, 102, 12837–12842.

    Troyanskaya, O., et al. (2001) Missing value estimation methods for DNA microarrays. Bioinformatics, 17, 520–525.





