Bioinformatics Advance Access originally published online on June 22, 2007
Bioinformatics 2007 23(16):2183-2184; doi:10.1093/bioinformatics/btm311

© 2007 The Author(s)
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

beadarray: R classes and methods for Illumina bead-based data

Mark J. Dunning*, Mike L. Smith, Matthew E. Ritchie and Simon Tavaré

Department of Oncology, University of Cambridge, CRUK Cambridge Research Institute, Li Ka Shing Centre, Robinson Way, Cambridge, CB2 0RE, UK

*To whom correspondence should be addressed.


    ABSTRACT
 

Summary: The R/Bioconductor package beadarray allows raw data from Illumina experiments to be read and stored in convenient R classes. Users are free to choose between various methods of image processing, background correction and normalization in their analysis rather than using the defaults in Illumina's proprietary software. The package also allows quality assessment to be carried out on the raw data. The data can then be summarized and stored in a format which can be used by other R/Bioconductor packages to perform downstream analyses. Summarized data processed by Illumina's BeadStudio software can also be read and analysed in the same manner.

Availability: The beadarray package is available from the Bioconductor web page at www.bioconductor.org. A user's guide and example data sets are provided with the package.

Contact: md392{at}cam.ac.uk


    1 INTRODUCTION
 
Illumina have created an alternative microarray technology (BeadArray) based on randomly arranged beads. A specific oligonucleotide sequence is assigned to each bead type, which is replicated about 30 times on an array. A series of decoding hybridizations is used to identify every bead (Gunderson et al., 2004). The high degree of replication makes robust measurements for each bead type possible. We have previously used the beadarray package to demonstrate some of the statistical properties of BeadArrays (Dunning et al., 2006).

BeadArrays can be used for various applications, including gene expression studies (Kuhn et al., 2004), SNP genotyping, methylation profiling and array CGH. Arrays are processed in parallel as a SAM (Sentrix Array Matrix) or BeadChip. A SAM is a plate of 96 uniquely prepared hexagonal BeadArrays, each of which contains around 1500 bead types. The BeadChip technology comprises a series of rectangular strips on a slide, each strip containing about 24 000 bead types. For example, there are six pairs of strips on each Human-6 BeadChip. BeadArrays can be one or two colour depending on the application.

After hybridization and washing, each array is scanned by Illumina scanning software (BeadScan) to produce a TIFF image. The latest version of BeadScan can also output a text file giving the identity and position of each bead on the array.

The Bioconductor project (Gentleman et al., 2004) is an online repository of open source software written using the R programming language. The project aims to provide a range of statistical and graphical tools for analysing genomics data. beadarray was the first Bioconductor package written specifically for Illumina data. Other packages, such as lumi, BeadExplorer and beadarraySNP, are now available. IlluminaGUI (Eggle and Schultz, 2007) provides a graphical interface allowing summarized Illumina data to be analysed via selected Bioconductor packages.


    2 DESCRIPTION
 
2.1 Bead level data
We refer to the collection of TIFF images and text files as the bead level data for an experiment. Bead level data can be read into memory using the readIllumina function. By default, this function will find all images and text files within the current working directory and apply the image processing steps described in Kuhn et al. (2004). Other image processing options are also available. Users can also choose between different background correction methods. Due to the random nature of the technology, each array has a variable number of rows of intensity data. An R environment object is used to store this information in a memory efficient manner. The same environment may be used for single or two channel data from SAMs or BeadChips.
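
To illustrate why a per-array container suits data in which each array has a different number of rows, here is a minimal sketch in Python. It is not the R environment object that beadarray uses; the class name `ArrayBeads`, its fields and its helper methods are all hypothetical and chosen only for this example.

```python
from dataclasses import dataclass, field


@dataclass
class ArrayBeads:
    """Bead level records for one array (hypothetical layout, one entry per bead)."""
    probe_id: list = field(default_factory=list)
    x: list = field(default_factory=list)
    y: list = field(default_factory=list)
    channel_1: list = field(default_factory=list)
    channel_2: list = field(default_factory=list)  # stays empty for one-colour data

    def add_bead(self, probe_id, x, y, channel_1, channel_2=None):
        """Append one decoded bead to this array."""
        self.probe_id.append(probe_id)
        self.x.append(x)
        self.y.append(y)
        self.channel_1.append(channel_1)
        if channel_2 is not None:
            self.channel_2.append(channel_2)

    def n_beads(self):
        return len(self.probe_id)

    def intensities_for(self, probe_id):
        """First-channel intensities of every replicate bead of one bead type."""
        return [v for p, v in zip(self.probe_id, self.channel_1) if p == probe_id]


# One entry per array; arrays need not hold the same number of beads,
# so no rectangular matrix has to be padded.
experiment = {"array_1": ArrayBeads(), "array_2": ArrayBeads()}
experiment["array_1"].add_bead(probe_id=10, x=1.5, y=2.0, channel_1=812.0)
experiment["array_1"].add_bead(probe_id=10, x=7.2, y=3.1, channel_1=790.0)
experiment["array_1"].add_bead(probe_id=22, x=4.4, y=9.8, channel_1=120.0)
experiment["array_2"].add_bead(probe_id=10, x=0.3, y=5.5, channel_1=805.0)
```

The same container serves one-colour and two-colour data because the second channel is simply left empty when it is not measured.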

Typical quality assessment for microarrays involves looking for systematic differences between arrays within an experiment as well as spatial artifacts on each array. Boxplots, density plots and image plots can be generated automatically, summarized in an HTML report and used to identify outlier arrays. Figure 1 shows some of the bead level plotting options available in the beadarray package. These plots can be used to identify problematic arrays (A) or to view the raw bead intensities for particular genes (B) or SNPs (C).
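
The between-array comparison described above can be sketched numerically as follows. This Python example is an illustration only, not part of beadarray: the function names and the `max_shift` threshold of one log2 unit are assumptions made for the example. It computes the quantities a boxplot would display for each array and flags arrays whose median log2 intensity lies far from the other arrays.

```python
import math
from statistics import median, quantiles


def boxplot_stats(intensities):
    """Five-number summary of log2 intensities for one array."""
    logged = [math.log2(v) for v in intensities if v > 0]
    q1, _, q3 = quantiles(logged, n=4)
    return {
        "min": min(logged),
        "q1": q1,
        "median": median(logged),
        "q3": q3,
        "max": max(logged),
    }


def flag_outlier_arrays(experiment, max_shift=1.0):
    """Names of arrays whose median is more than max_shift log2 units
    from the median of all array medians (hypothetical rule)."""
    stats = {name: boxplot_stats(vals) for name, vals in experiment.items()}
    centre = median(s["median"] for s in stats.values())
    return sorted(
        name for name, s in stats.items()
        if abs(s["median"] - centre) > max_shift
    )


# Toy data: array "C" is much dimmer than the other two.
toy = {
    "A": [256, 512, 1024, 512],
    "B": [250, 500, 1000, 520],
    "C": [16, 32, 64, 32],
}
flagged = flag_outlier_arrays(toy)
```

A fixed threshold is a crude rule. In practice the plots themselves are inspected, and spatial artifacts within an array need image plots rather than summary statistics.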


Fig. 1. (A) Image plot showing the variation in log2 foreground intensity across the surface of a BeadArray. An obvious spatial effect which can only be identified from bead level data can be seen. (B) Using bead level information to assess the distributions of particular bead types across a set of arrays. Here, we show the bead level intensities (y axis) of four bead types across three arrays from a spike-in experiment in which the concentration of each probe decreases on each array. (C) Plot of the raw data for allele A versus allele B for a particular SNP across eight arrays. Each colour denotes a different array. Using the bead level data allows three distinct genotypes (AA, AB and BB) to be identified.
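
As a rough illustration of how the three clusters in panel (C) might be separated, the following Python sketch converts a pair of allele intensities into an angle and applies fixed cutoffs. This is not a method provided by beadarray or Illumina: the function names and the cutoffs of 1/3 and 2/3 are hypothetical, and a real genotype caller would estimate cluster positions from the data.

```python
import math


def allele_angle(a, b):
    """Map non-negative allele intensities to [0, 1]: 0 is pure A, 1 is pure B."""
    return (2.0 / math.pi) * math.atan2(b, a)


def call_genotype(a, b, low=1 / 3, high=2 / 3):
    """Assign AA, AB or BB from the angle using fixed, hypothetical cutoffs."""
    theta = allele_angle(a, b)
    if theta < low:
        return "AA"
    if theta > high:
        return "BB"
    return "AB"


calls = [call_genotype(a, b) for a, b in [(1000, 50), (500, 520), (40, 900)]]
```

Working with the angle rather than the raw intensities makes the call less sensitive to overall brightness differences between arrays.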

 
2.2 Bead summary data
After quality assessment has taken place, the replicate beads on each array are summarized to give an average intensity value and variance for each bead type. We refer to this as the bead summary data for an experiment. We use the Illumina default method for calculating these summary values: outliers more than three median absolute deviations (MADs) from the median are removed, and the mean and variance of the remaining beads are calculated. Different MAD cutoffs are possible. Users may also define their own functions to obtain robust summary values, and may calculate summary values on either the original or the logged scale. Alternatively, the bead summary output produced by BeadStudio may be used.
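
The outlier-removal summary described above can be sketched in a few lines. The Python below is an illustration under stated assumptions, not the beadarray implementation: the function name and parameters are hypothetical, and the MAD is used unscaled, whereas R's `mad` function multiplies by a constant of 1.4826 by default, which would widen the cutoff.

```python
import math
from statistics import mean, median, variance


def summarize_bead_type(intensities, n_mads=3.0, log_scale=False):
    """Return (mean, variance, n_kept) for the replicate beads of one bead type.

    Beads further than n_mads unscaled MADs from the median are dropped
    before the mean and sample variance are computed.
    """
    values = [math.log2(v) for v in intensities] if log_scale else list(intensities)
    med = median(values)
    mad = median([abs(v - med) for v in values])
    kept = [v for v in values if abs(v - med) <= n_mads * mad]
    # Sample variance needs at least two beads; report 0.0 otherwise.
    var = variance(kept) if len(kept) > 1 else 0.0
    return mean(kept), var, len(kept)


# Six replicate beads, one of which is a clear outlier.
beads = [100, 102, 98, 101, 99, 500]
avg, var, n_kept = summarize_bead_type(beads)
```

For these six beads the median is 100.5 and the unscaled MAD is 1.5, so the cutoff is 4.5 either side of the median and only the bead at 500 is discarded. Passing a different `n_mads` changes the cutoff, and `log_scale=True` performs the same steps on log2 intensities.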

The contents of the class object used to store bead summary data depend on the type of Illumina technology being analysed. The class is an extension of the eSet class, written by the Bioconductor core development team and designed to store and manipulate data from high-throughput genomic experiments. Using a common class to store data means that beadarray users can interact with other Bioconductor packages. Examples include using affy (Gautier et al., 2004) to normalize the data or limma (Smyth, 2005) to find differentially expressed genes.


    3 DISCUSSION
 
BeadArray technology will become increasingly popular, and we anticipate that beadarray will become an important tool in the analysis of Illumina data. The main benefit of beadarray is its flexibility. The package offers a variety of image processing and background correction methods, rather than the default method used by Illumina, and a choice of scale at all stages of the analysis. Having access to the raw data provides scope for users to develop their own analysis methods, such as genotype calling. Also, beadarray is able to read and process raw data from gene expression, SNP genotyping or methylation arrays. All other Bioconductor packages for Illumina analysis only handle summarized data from specific platforms.

beadarray allows the analysis of Illumina data to be performed entirely in R and on any operating system. A simple script can be used to read raw data, produce diagnostic plots and create summarized data. The package is therefore well suited to core facilities that produce large numbers of arrays, where processing data using BeadStudio may not be feasible and reproducible research is required.

Users should be aware that using bead level data requires large amounts of computer memory. For example, the raw data for a Human-6 BeadChip consists of twelve 80 MB TIFF images and twelve 40 MB text files. Reading these data into memory and analysing them using beadarray currently requires at least 2 GB of RAM and uses around 1 GB of disk space.


    ACKNOWLEDGEMENTS
 
We thank Martin Morgan and Vince Carey for useful advice on the classes used within the package; Roman Sasik for providing code to read TIFF images; Gary Nunn for the spike-in data; Semyon Kruglyak for advice on Illumina algorithms; Matthew Forrest and Barbara Stranger for the example data used in Figure 1; Inma Spiteri for the example data in the package; and Natalie Thorne and Andy Lynch for useful discussions. The authors were supported in part by grants from the MRC (M.J.D.), CRUK (M.L.S., S.T.) and the Isaac Newton Trust (M.E.R.).

Conflict of Interest: none declared.


    FOOTNOTES
 
Associate Editor: Alfonso Valencia

Received on April 12, 2007; revised on May 22, 2007; accepted on June 4, 2007

    REFERENCES
 

    Dunning MJ, et al. Quality control and low-level statistical analysis of Illumina BeadArrays. Revstat (2006) 4:1–30.

    Eggle D, Schultz J. IlluminaGUI: Graphical User Interface for analyzing gene expression data generated on the Illumina platform. In: Bioinformatics (2007) In Press.

    Gautier L, et al. affy–analysis of Affymetrix GeneChip data at the probe level. Bioinformatics (2004) 20:307–315.

    Gentleman RC, et al. Bioconductor: open software development for computational biology and bioinformatics. Genome Biol (2004) 5:R80.

    Gunderson KL, et al. Decoding randomly ordered DNA arrays. Genome Res (2004) 14:870–877.

    Kuhn K, et al. A novel, high-performance random array platform for quantitative gene expression profiling. Genome Res (2004) 14:2347–2356.

    Smyth GK. Limma: linear models for microarray data. In: Bioinformatics and Computational Biology Solutions Using R and Bioconductor—Gentleman R, ed. (2005) New York: Springer. 397–420.




