Bioinformatics Advance Access originally published online on January 18, 2007
Bioinformatics 2007 23(6):657-663; doi:10.1093/bioinformatics/btl646

© The Author 2007. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oxfordjournals.org

A faster circular binary segmentation algorithm for the analysis of array CGH data

E. S. Venkatraman* and Adam B. Olshen

Department of Epidemiology and Biostatistics, Memorial Sloan-Kettering Cancer Center, 1275 York Avenue, New York, NY 10021, USA

*To whom correspondence should be addressed.


   Abstract

Motivation: Array CGH technologies enable the simultaneous measurement of DNA copy number for thousands of sites on a genome. We developed the circular binary segmentation (CBS) algorithm to divide the genome into regions of equal copy number. The algorithm tests for change-points using a maximal t-statistic with a permutation reference distribution to obtain the corresponding P-value. The number of computations required for the maximal test statistic is O(N^2), where N is the number of markers. This makes the full permutation approach computationally prohibitive for the newer arrays that contain tens of thousands of markers and highlights the need for a faster algorithm.
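
To make the O(N^2) cost concrete, the following is a minimal Python sketch of the full permutation approach described above. It is an illustration under stated assumptions, not the authors' implementation and not the DNAcopy code: the function names max_t_statistic and permutation_p_value are hypothetical, the statistic is assumed to be the two-sample pooled-variance |t| comparing each arc of markers with its complement, and the hybrid P-value method is not shown.

```python
import math
import random


def max_t_statistic(x):
    """Largest two-sample |t| over all arcs (i, j], 0 <= i < j <= n (brute force)."""
    n = len(x)
    if n < 3:
        raise ValueError("need at least 3 observations")
    # Partial sums: s[k] = x[0] + ... + x[k-1], so an arc sum is s[j] - s[i].
    s = [0.0]
    for v in x:
        s.append(s[-1] + v)
    total = s[n]
    mean = total / n
    sst = sum((v - mean) ** 2 for v in x)  # total sum of squares
    best = 0.0
    # The double loop over (i, j) is the O(N^2) step.
    for i in range(n):
        for j in range(i + 1, n + 1):
            k = j - i  # number of markers inside the arc
            if k == n:  # the complement would be empty
                continue
            arc = s[j] - s[i]
            diff = arc / k - (total - arc) / (n - k)
            between = diff * diff / (1.0 / k + 1.0 / (n - k))
            within = max(sst - between, 1e-12)  # guard against round-off
            t = math.sqrt(between / (within / (n - 2)))
            best = max(best, t)
    return best


def permutation_p_value(x, n_perm=200, seed=0):
    """Permutation P-value: share of shuffles whose maximal |t| reaches the observed one."""
    rng = random.Random(seed)
    observed = max_t_statistic(x)
    y = list(x)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(y)
        if max_t_statistic(y) >= observed:
            exceed += 1
    return observed, exceed / n_perm


# Simulated log-ratios (hypothetical data): 40 markers with a gained
# segment covering markers 16 to 25.
rng = random.Random(1)
data = (
    [rng.gauss(0.0, 0.2) for _ in range(15)]
    + [rng.gauss(1.0, 0.2) for _ in range(10)]
    + [rng.gauss(0.0, 0.2) for _ in range(15)]
)
stat, p = permutation_p_value(data)
print("max |t| =", round(stat, 2), " permutation P-value =", p)
```

Each call to max_t_statistic visits every pair (i, j), and the permutation loop repeats that work once per shuffle, so the total cost grows as the number of permutations times N^2. This is the burden that motivates a faster way to obtain the P-value.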

Results: We present a hybrid approach to obtain the P-value of the test statistic in linear time. We also introduce a rule for stopping early when there is strong evidence for the presence of a change. We show through simulations that the hybrid approach provides a substantial gain in speed with only a negligible loss in accuracy and that the stopping rule further increases speed. We also present the analyses of array CGH data from breast cancer cell lines to show the impact of the new approaches on the analysis of real data.

Availability: An R version of the CBS algorithm has been implemented in the "DNAcopy" package of the Bioconductor project. The proposed hybrid method for the P-value is available in version 1.2.1 or higher and the stopping rule for declaring a change early is available in version 1.5.1 or higher.

Contact: venkatre{at}mskcc.org

Supplementary information: Supplementary data are available at Bioinformatics online.

Associate Editor: Chris Stoeckert


Received on June 6, 2006; revised on December 12, 2006; accepted on December 18, 2006
