Bioinformatics Advance Access originally published online on November 24, 2007
Bioinformatics 2008 24(2):276-278; doi:10.1093/bioinformatics/btm556

© 2007 The Author(s)
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

RReportGenerator: automatic reports from routine statistical analysis using R

Wolfgang Raffelsberger 1,*,†, Yannick Krause 1,†, Luc Moulinier 1, David Kieffer 1, Anne-Laure Morand 2, Laurent Brino 2 and Olivier Poch 1

1Laboratoire de Bioinformatique et Génomique Intégratives, IGBMC, UMR 7104, 67404 Illkirch, France and 2Plate-forme ‘Puces à Cellules Transfectées’, LBGS, CEBGS-IGBMC, 67404 Illkirch, France

*To whom correspondence should be addressed.


    ABSTRACT
 TOP
 ABSTRACT
 1 INTRODUCTION
 2 SOFTWARE OVERVIEW
 3 ANALYSIS SCENARIOS
 4 USAGE AND APPLICATION
 ACKNOWLEDGEMENTS
 REFERENCES
 

Summary: With the establishment of high-throughput (HT) screening methods there is an increasing need for automatic analysis methods. Here we present RReportGenerator, a user-friendly portal for automatic routine analysis using the statistical platform R and Bioconductor. RReportGenerator is designed to analyze data using predefined analysis scenarios via a graphical user interface (GUI). A report in pdf format combining text, figures and tables is automatically generated and results may be exported. To demonstrate suitable analysis tasks we provide direct web access to a collection of analysis scenarios for summarizing data from transfected cell arrays (TCA), segmentation of CGH data, and microarray quality control and normalization.

Availability: RReportGenerator, a user manual and a collection of analysis scenarios are available under a GNU public license on http://www-bio3d-igbmc.u-strasbg.fr/~wraff

Contact: wolfgang.raffelsberger@igbmc.u-strasbg.fr


    1 INTRODUCTION
 
The sequencing of the human genome has opened the way for numerous high-throughput (HT) analysis and high content screening (HCS) techniques. Among the programs and platforms capable of performing statistical analyses, R (R Development Core Team, 2005; http://www.r-project.org/) has gained much popularity, since many active contributors are further developing this open-source language and its additional libraries at CRAN (http://cran.r-project.org/) and Bioconductor (Gentleman et al., 2004). R itself provides a command line interface, which is very powerful but rather difficult to approach for inexperienced users seeking automated solutions for routine analyses.

Several graphical user interfaces (GUIs) for R have been created, e.g. Simple-R (http://www-sre.wu-wien.ac.at/SimpleR), R-pad (http://www.rpad.org/Rpad/) and iPlots (http://rosuda.org/iPlots/). However, these GUIs were not designed specifically for generating reports from routine analysis. In this context we have created RReportGenerator, a GUI giving inexperienced users the possibility to perform automatic routine analysis while benefitting from the advantages of R and its libraries.


    2 SOFTWARE OVERVIEW
 
RReportGenerator calls R and executes the code of a user-selected, predefined ‘Analysis Scenario’ to generate reports automatically via a simple GUI. Using the ‘Library’ button, a steadily growing collection of validated scenarios dedicated to biological research can be accessed directly from our website. This web service also ensures that users always work with the most recent versions of the scenarios. More information about the selected scenario can be displayed via the ‘Infos’ button. During report generation the intermediary .tex file is passed to LaTeX (MiKTeX in the Windows version), which transforms the report into pdf format. Finally, RReportGenerator deletes all temporary files.

It is also possible to keep the intermediary .tex file and the separate PostScript files for plots by selecting the ‘save .tex file’ option, or to generate a .dvi version of the analysis report. Providing a filename in the field ‘Supplemental Data Output File’ activates the option to generate an additional file (if this is part of the scenario code) designed for exporting data (e.g. to spreadsheet programs like Excel). If the input data are not available as a single file, a supplemental input file may be specified via the GUI. Furthermore, the scenario code can be designed to read all files from the directory of a selected input file, e.g. in Affymetrix microarray quality control (QC) scenarios. The default paths for searching input data and scenarios and for saving output can be customized through a configuration window. Internal messages and those created by Sweave are displayed in the ‘Session Window’, allowing the user to monitor the progress of the data analysis. RReportGenerator was written in Tcl/Tk and compiled for use under Linux and Windows (a Mac version will be released soon).
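
The report pipeline described above (Sweave produces a .tex file, LaTeX typesets it, temporary files are removed) can be sketched roughly as follows. This is only an illustration in Python, not the Tcl/Tk code of RReportGenerator: the function names, the choice of `R CMD Sweave`, `pdflatex` and `latex` as commands, and the list of by-product suffixes are assumptions made for the example. The commands are only assembled, not executed.

```python
from pathlib import Path


def report_commands(scenario: Path, dvi: bool = False) -> list[list[str]]:
    """Return the two external commands for one report, without running them."""
    # Step 1: Sweave runs the R chunks and writes <name>.tex
    # into the current working directory.
    weave = ["R", "CMD", "Sweave", str(scenario)]
    # Step 2: LaTeX typesets the .tex file (latex -> .dvi, pdflatex -> .pdf).
    tex_name = scenario.with_suffix(".tex").name
    typeset = ["latex" if dvi else "pdflatex", tex_name]
    return [weave, typeset]


def temporary_files(scenario: Path, keep_tex: bool = False) -> list[Path]:
    """List by-products to delete afterwards; keeping the .tex file is optional."""
    # Assumed by-products: LaTeX auxiliary and log files, plus the .tex file
    # unless the user asked to keep it (cf. the 'save .tex file' option).
    suffixes = [".aux", ".log"] + ([] if keep_tex else [".tex"])
    return [Path(scenario.stem + suffix) for suffix in suffixes]


if __name__ == "__main__":
    for command in report_commands(Path("qc_report.Rnw")):
        print(" ".join(command))
```

Returning the commands as lists keeps the sketch free of side effects; a caller could pass each list to `subprocess.run` once R and a LaTeX distribution are installed.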


    3 ANALYSIS SCENARIOS
 
While our collection of analysis scenarios is continually growing, experienced users can also write and use their own scenarios. Novel analysis scenarios contributed by other researchers will also be made available as part of the web service collection.

Internally, analysis scenarios make use of Sweave (Leisch, 2002), which allows LaTeX markup and R code to be combined in order to integrate text, figures and tables in a pdf report. An example of the code of a very simple analysis scenario is shown in Figure 1. As shown, the R code of scenarios is typically organized in different chunks dedicated to tasks like performing calculations within R (e.g. ‘chunk_read’), printing figures (e.g. ‘chunk_plot1’) or writing files. Only a few special conventions need to be adopted in the Sweave code for use with RReportGenerator. To display a brief summary of the scenario when the ‘Infos’ button of the GUI is clicked, the text between the marks ‘%@RRG_INFO’ and ‘%@RRG_INFO_END’ is extracted. An input file selected via the GUI can be accessed using the term ‘<DATA_IN_FILE>’ (containing the complete path and name). Similarly, the term ‘<DATA_OUT_FILE>’ is used for automatically inserting the ‘supplemental data output file’ name provided through the GUI.


Fig. 1. Example code of a minimal scenario for RReportGenerator. Analysis scenarios use the syntax of Sweave, which allows R code to be woven with LaTeX markup. In addition, we have added specific terms like ‘%@RRG_INFO’ and ‘%@RRG_INFO_END’ for extracting information about the scenario, or ‘<DATA_IN_FILE>’ for automatically addressing a file selected through the GUI from within R commands.
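
To make the scenario conventions concrete, the following Python sketch embeds a hypothetical scenario in Sweave syntax and shows how the info text and the two file terms could be handled. The scenario text is illustrative and is not the code of Figure 1; only the chunk names ‘chunk_read’ and ‘chunk_plot1’ and the special terms come from the text. The function names, the chunk ‘chunk_write’, and the assumption that the terms are replaced textually inside quoted R strings are made up for this example.

```python
import re

# Hypothetical scenario in Sweave (noweb) syntax; the content is illustrative only.
SCENARIO = r"""
%@RRG_INFO
Reads a tab-delimited table and plots a histogram of its first column.
%@RRG_INFO_END
\documentclass{article}
\begin{document}
<<chunk_read, echo=FALSE>>=
dat <- read.delim("<DATA_IN_FILE>")
@
<<chunk_plot1, fig=TRUE, echo=FALSE>>=
hist(dat[[1]])
@
<<chunk_write, echo=FALSE>>=
write.table(dat, file = "<DATA_OUT_FILE>", sep = "\t")
@
\end{document}
"""


def extract_info(scenario: str) -> str:
    """Return the text between the %@RRG_INFO and %@RRG_INFO_END marks."""
    match = re.search(
        r"%@RRG_INFO\s*\n(.*?)\n\s*%@RRG_INFO_END", scenario, re.DOTALL
    )
    return match.group(1).strip() if match else ""


def fill_placeholders(scenario: str, data_in: str, data_out: str) -> str:
    """Insert the file names chosen in the GUI in place of the two terms."""
    # Assumption: plain textual replacement; paths use forward slashes so
    # that they remain valid inside R string literals.
    return (
        scenario.replace("<DATA_IN_FILE>", data_in)
        .replace("<DATA_OUT_FILE>", data_out)
    )


if __name__ == "__main__":
    print(extract_info(SCENARIO))
    print(fill_placeholders(SCENARIO, "/data/plate1.txt", "/data/plate1_export.txt"))
```

Because the info marks start with ‘%’, LaTeX treats them as comments, so a scenario written this way would presumably still be processed as an ordinary Sweave file.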

 

    4 USAGE AND APPLICATION
 
RReportGenerator can be used in a wide range of routine analysis cases in clinical and biological research where the aim and structure of the experiments change rarely, like automatic quality control (e.g. of Affymetrix arrays) or analyzing HCS plates.

This program was designed with three uses in mind. (i) Biologists performing HT or HCS experiments who prefer to avoid command line code and who are primarily interested in automated solutions can obtain well-documented reports. (ii) Bioinformaticians familiar with R can write novel analysis scenarios for automating their analysis tasks. (iii) The reports generated (and the supplementary data output) may be used to standardize the exchange of data between institutional facilities (e.g. from a microarray facility to a biostatistics facility performing the analysis). Our growing collection of validated analysis scenarios covers applications dedicated to performing multiple segmentation methods on comparative genomic hybridization (CGH) data (Marioni et al., 2006), scenarios for conveniently combining multiple QC methods available for Affymetrix gene expression arrays (Bolstad et al., 2003; Gautier et al., 2004; Wilson and Miller, 2005), and a scenario for normalizing and combining technical replicates from two-colour microarrays analysed using MAIA (Novikov and Barillot, 2007). Furthermore, we have developed scenarios for the analysis of transfected cell array (TCA) data with different input formats (tabulated text or multi-sheet Excel files from GE Healthcare IN Cell Analyzers).

In our experience, RReportGenerator has given biologists and operators of HCS projects the means to perform immediate quality control and basic data analysis tasks easily, speeding up the overall workflow. As a consequence, the quality of our routine documentation has improved, biologists can move on more quickly to project-specific interpretation, and bioinformaticians have more time available for project-specific tasks or for writing novel scenarios.

As we are dedicated to improving automatic analysis procedures, in particular for HT testing and HCS data, the list of applications will be further expanded and our tools further developed. In conclusion, we hope that this program and the increasing collection of available analysis scenarios will represent a valuable tool for laboratories performing routine HT experiments.


    ACKNOWLEDGEMENTS
 
The authors thank Guillaume Berthommier (IGBMC) for his assistance, L. Vallar (CRP-Santé Luxembourg) for sharing data, A. Carles (IGBMC) and B. Tiwari (NERC, Oxford, UK) for critical discussions. This project was supported through the European Retinal Research Training Network ‘RETNET’ (MRTN-CT-2003-504003), EVI-Genoret (LSHG-CT-2005-512036), INSERM, CNRS, ULP, Region Alsace, Conseil Général du Bas Rhin, CUS and Cancéropôle Grand Est.

Conflict of Interest: none declared.


    FOOTNOTES
 
†The authors wish it to be known that, in their opinion, the first two authors should be regarded as joint First Authors.

Associate Editor: Chris Stoeckert

Received on June 28, 2007; revised on November 2, 2007; accepted on November 2, 2007

    REFERENCES
 

    Bolstad BM, et al. A comparison of normalization methods for high density oligonucleotide array data based on variance. Bioinformatics (2003) 19:185–193.

    Gautier L, et al. affy – analysis of Affymetrix GeneChip data at the probe level. Bioinformatics (2004) 20:307–315.

    Gentleman RC, et al. Bioconductor: open software development for computational biology and bioinformatics. Genome Biol. (2004) 5:R80.

    Leisch F. Sweave: dynamic generation of statistical reports using literate data analysis. In: Proceedings in Computational Statistics. Härdle W, Rönz B, eds. (2002) Heidelberg: Physica Verlag. 575–580.

    Marioni JC, et al. BioHMM: a heterogeneous hidden Markov model for segmenting array CGH data. Bioinformatics (2006) 22:1144–1146.

    Novikov E, Barillot E. Software package for automatic microarray image analysis (MAIA). Bioinformatics (2007) 23:639–640.

    R Development Core Team. R: A Language and Environment for Statistical Computing. (2005) Vienna, Austria: R Foundation for Statistical Computing. ISBN 3-900051-07-0.

    Wilson CL, Miller CJ. Simpleaffy: a BioConductor package for Affymetrix Quality Control and data analysis. Bioinformatics (2005) 21:3683–3685.

