Bioinformatics Advance Access originally published online on May 26, 2006
Bioinformatics 2006 22(15):1919-1920; doi:10.1093/bioinformatics/btl269
© 2006 The Author(s)
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
ACE-it: a tool for genome-wide integration of gene dosage and RNA expression data
1 Department of Mathematics, Vrije Universiteit De Boelelaan 1081, 1081 HV, Amsterdam, The Netherlands
2 Department of Pathology, VU Medical Center PO Box 7057, 1007 MB, Amsterdam, The Netherlands
*To whom correspondence should be addressed.
| ABSTRACT |
|---|
Summary: We describe a tool, called ACE-it (Array CGH Expression integration tool). ACE-it links the chromosomal position of the gene dosage measured by array CGH to the genes measured by the expression array. ACE-it uses this link to statistically test whether gene dosage affects RNA expression.
Availability: ACE-it is freely available at http://ibivu.cs.vu.nl/programs/acewww/
Contact: b.ylstra{at}vumc.nl
Supplementary Information: Programs, the manual and supplementary information are available on the website.
Gene dosage, among other factors, affects gene expression in tumors (Albertson et al., 2000; Pollack et al., 2002), and can be measured genome-wide at high resolution by array comparative genomic hybridization (array CGH) (Pinkel and Albertson, 2005). Apart from visualization implementations (Pollack et al., 2002; Autio et al., 2003) and the calculation of correlations without formal inference (Nigro et al., 2005), sophisticated techniques for integrating array CGH with expression array data are limited. A statistical tool for detecting genes whose expression is affected by gene dosage within a series of samples is currently not available.
ACE-it tests, for each gene on the expression array, whether the RNA expression ratios at that gene's chromosomal location are affected by gene dosage. For this purpose gene dosage is divided into three categories: loss, normal or gain, defined by a user-chosen cut-off applied to smoothed data (Olshen et al., 2004; Jong et al., 2004). Subsequently, the chromosomal positions are divided into two groups: positions at which the samples show normal and gain, and positions at which the samples show normal and loss. The grouping disregards positions at which all samples fall into one category, at which the samples are distributed evenly over the categories, or at which no sample is normal. Restricting the analysis to these two groups of chromosomal positions is biologically motivated by the assumption that, for a given chromosomal location, either oncogenes or tumor suppressor genes drive the chromosomal gain or loss (Albertson et al., 2000; Pinkel and Albertson, 2005). ACE-it allows a user-defined cut-off for contaminating samples within the groupings.
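The categorization and grouping described above can be sketched as follows. This is an illustrative Python sketch, not ACE-it's own code. The function names, the symmetric cut-off of 0.2 on smoothed log2 ratios, the reading of "balanced" as equal numbers of gains and losses, and the reading of "contaminating samples" as samples called in the opposite aberration category are all assumptions made for the example.

```python
from collections import Counter

def call_dosage(smoothed_log2_ratio, cutoff=0.2):
    """Map a smoothed log2 ratio to 'loss', 'normal' or 'gain'.

    The symmetric cutoff of 0.2 is a hypothetical default; in ACE-it the
    cut-off is chosen by the user.
    """
    if smoothed_log2_ratio <= -cutoff:
        return "loss"
    if smoothed_log2_ratio >= cutoff:
        return "gain"
    return "normal"

def group_position(calls, max_contaminating=0):
    """Assign one chromosomal position to a group, or None if disregarded.

    calls: per-sample dosage calls at this position.
    max_contaminating: assumed meaning: how many samples of the opposite
    aberration category are tolerated within a group.
    """
    counts = Counter(calls)
    n_loss, n_normal, n_gain = counts["loss"], counts["normal"], counts["gain"]
    if n_normal == 0:
        return None  # no normals to compare against
    if n_gain == 0 and n_loss == 0:
        return None  # all samples fall into one category (normal)
    if n_gain == n_loss:
        return None  # balanced distribution (assumed interpretation)
    if n_gain > n_loss and n_loss <= max_contaminating:
        return "normal_vs_gain"
    if n_loss > n_gain and n_gain <= max_contaminating:
        return "normal_vs_loss"
    return None  # too many contaminating samples
```

For example, `group_position([call_dosage(r) for r in (0.0, 0.5, 0.6, 0.1)])` places the position in the normal-versus-gain group, whereas a position with one gain, one loss and one normal is disregarded under these assumptions.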
Thus, normalized expression ratios for a particular gene are compared between the two categories within a group. ACE-it assumes that expression increases with increased gene dosage. This assumption leads to the null hypothesis: (1) the median expression in samples with a gain is equal to or smaller than that in samples with normal gene dosage; or (2) the median expression in samples with normal gene dosage is equal to or smaller than that in samples with a loss. We use the one-sided Wilcoxon rank test to test this null hypothesis, and apply the Benjamini-Hochberg multiplicity correction to the resulting P-values (Benjamini and Hochberg, 1995). Genes whose adjusted P-value is smaller than the user-defined rejection level are considered differentially expressed between gene dosage levels.
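A minimal Python sketch of this testing step is given below; it is an illustration, not ACE-it's implementation. It assumes the rank-sum (Mann-Whitney) form of the Wilcoxon test, since the two categories contain different samples, and it uses a normal approximation with tie and continuity corrections rather than exact P-values. The function names are hypothetical.

```python
import math

def rank_sum_greater(x, y):
    """One-sided rank-sum P-value for the alternative 'x tends to exceed y'.

    Uses the normal approximation with tie and continuity correction,
    which is an assumption of this sketch (exact P-values are preferable
    for very small groups).
    """
    n1, n2 = len(x), len(y)
    if n1 == 0 or n2 == 0:
        return 1.0
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    n = n1 + n2
    ranks = [0.0] * n
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        midrank = (i + 1 + j) / 2.0  # average of the 1-based ranks i+1..j
        for k in range(i, j):
            ranks[k] = midrank
        t = j - i
        tie_term += t ** 3 - t
        i = j
    rank_sum_x = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    u = rank_sum_x - n1 * (n1 + 1) / 2.0
    mean = n1 * n2 / 2.0
    var = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var == 0:
        return 1.0  # all values tied: no evidence either way
    z = (u - mean - 0.5) / math.sqrt(var)
    return 0.5 * math.erfc(z / math.sqrt(2))  # upper-tail normal probability

def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # from the largest P-value down
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted
```

For a gene in the normal-versus-gain group one would call `rank_sum_greater(expr_gain, expr_normal)`; in the normal-versus-loss group, `rank_sum_greater(expr_normal, expr_loss)`. The P-values of all tested genes are then passed together to `benjamini_hochberg` and compared with the rejection level.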
One complicating factor when linking gene dosage and RNA expression is that the expression and CGH arrays are often run on different platforms (Ylstra et al., 2006). As a consequence, the arrays can have different elements spotted, so that the number and chromosomal positions of the measured ratios do not coincide. We solved this by imputing segmented ratios for each chromosomal position not covered by the arrayed elements, such that any given position on the chromosome is assigned the smoothed array CGH value of the closest physical point of measurement. If the expression and CGH arrays are run on the same platform, this feature can be disabled.
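The nearest-measurement imputation described above can be illustrated with a short Python sketch. The function name and the tie-breaking rule (a position exactly midway between two measurements takes the left-hand value) are assumptions of the example, and the positions are assumed to lie on a single chromosome and to be sorted in ascending order.

```python
import bisect

def impute_nearest(cgh_positions, cgh_values, query_position):
    """Return the smoothed CGH value of the measurement closest to query_position.

    cgh_positions: ascending base-pair positions on one chromosome.
    cgh_values: smoothed array CGH values at those positions.
    """
    if not cgh_positions:
        raise ValueError("no CGH measurements on this chromosome")
    i = bisect.bisect_left(cgh_positions, query_position)
    if i == 0:
        return cgh_values[0]   # query lies before the first measurement
    if i == len(cgh_positions):
        return cgh_values[-1]  # query lies after the last measurement
    left, right = cgh_positions[i - 1], cgh_positions[i]
    # Ties go to the left-hand neighbour (arbitrary choice in this sketch).
    if query_position - left <= right - query_position:
        return cgh_values[i - 1]
    return cgh_values[i]
```

With measurements at positions 100, 200 and 400, an expression probe at position 350 would receive the value measured at 400, and a probe at position 120 the value measured at 100.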
ACE-it has been developed for the statistical software package R (http://www.r-project.org/) and a graphical user interface (GUI) for Windows is provided. ACE-it was tested using array expression and array CGH datasets from various institutes and platforms, including a breast tumor series (Pollack et al., 2002). This yielded several genes whose expression is significantly affected by gene dosage, including HER-2/neu (c-erbB-2), which is amplified in 20-30% of breast cancer cases (Fig. 1).
| Acknowledgments |
|---|
This work was in part supported by the Centre for Medical Systems Biology (CMSB), a centre of excellence approved by The Netherlands Genomics Initiative. Funding to pay the Open Access publication charges for this article was provided by the Centre for Medical Systems Biology (CMSB), a centre of excellence approved by The Netherlands Genomics Initiative.
Conflict of Interest: none declared.
| FOOTNOTES |
|---|
Associate Editor: Martin Bishop
Received on February 21, 2006; revised on April 24, 2006; accepted on May 19, 2006
| REFERENCES |
|---|
Albertson, D.G., et al. (2000) Quantitative mapping of amplicon structure by array CGH identifies CYP24 as a candidate oncogene. Nat. Genet., 25, 144–146.
Autio, R., et al. (2003) CGH-Plotter: MATLAB toolbox for CGH-data analysis. Bioinformatics, 19, 1714–1715.
Benjamini, Y. and Hochberg, Y. (1995) Controlling the false discovery rate: a practical and powerful approach to multiple testing. J. R. Stat. Soc. B Methodol., 57, 289–300.
Jong, K., et al. (2004) Breakpoint identification and smoothing of array comparative genomic hybridization data. Bioinformatics, 20, 3636–3637.
Nigro, J.M., et al. (2005) Integrated array-comparative genomic hybridization and expression array profiles identify clinically relevant molecular subtypes of glioblastoma. Cancer Res., 65, 1678–1686.
Olshen, A.B., et al. (2004) Circular binary segmentation for the analysis of array-based DNA copy number data. Biostatistics, 5, 557–572.
Pinkel, D. and Albertson, D.G. (2005) Array comparative genomic hybridization and its applications in cancer. Nat. Genet., 37, S11–S17.
Pollack, J.R., et al. (2002) Microarray analysis reveals a major direct role of DNA copy number alteration in the transcriptional program of human breast tumors. Proc. Natl Acad. Sci. USA, 99, 12963–12968.
Ylstra, B., et al. (2006) BAC to the future! or oligonucleotides: a perspective for micro array comparative genomic hybridization (array CGH). Nucleic Acids Res., 34, 445–450.



