
Bioinformatics 2008 24(13):i375-i382; doi:10.1093/bioinformatics/btn188

© 2008 The Author(s)
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

Classification of arrayCGH data using fused SVM

Franck Rapaport 1,2,3,*, Emmanuel Barillot 1,2,3 and Jean-Philippe Vert 1,2,3

1Institut Curie, Centre de Recherche, 2INSERM, U900, Paris, F-75248 France and 3Center for Computational Biology, Ecole des Mines de Paris, 35 rue saint Honore, 77305 Fontainebleau, France

*To whom correspondence should be addressed.


Abstract

Motivation: Array-based comparative genomic hybridization (arrayCGH) has recently become a popular tool to identify DNA copy number variations along the genome. These profiles are starting to be used as markers to improve prognosis or diagnosis of cancer, which implies that methods for automated supervised classification of arrayCGH data are needed. Like gene expression profiles, arrayCGH profiles are characterized by a large number of variables usually measured on a limited number of samples. However, arrayCGH profiles have a particular structure of correlations between variables, due to the spatial organization of bacterial artificial chromosomes along the genome. This suggests that classical classification methods, often based on the selection of a small number of discriminative features, may not be the most accurate methods and may not produce easily interpretable prediction rules.

Results: We propose a new method for supervised classification of arrayCGH data. The method is a variant of the support vector machine (SVM) that incorporates the biological specificities of DNA copy number variations along the genome as prior knowledge. The resulting classifier is a sparse linear classifier based on a limited number of regions automatically selected on the chromosomes, leading to easy interpretation and identification of discriminative regions of the genome. We test this method on three classification problems for bladder and uveal cancer, involving both diagnosis and prognosis. We demonstrate that the introduction of the new prior on the classifier leads not only to more accurate predictions, but also to the identification of known and new regions of interest in the genome.
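
To make the idea of a sparse, region-based linear classifier concrete, here is a minimal sketch in Python. The abstract does not state the optimization problem or the solver, so everything below is an assumption made for illustration: the objective (mean hinge loss plus an L1 penalty for sparsity plus a penalty on differences between weights of adjacent probes, in the spirit of the fused lasso), the plain subgradient descent used to minimize it, the parameter names lam and mu, and the toy data generator make_toy_profiles. It is not the authors' implementation.

```python
import math
import random


def dot(w, x):
    """Inner product of a weight vector and a profile."""
    return sum(wj * xj for wj, xj in zip(w, x))


def sign(v):
    """Sign of v as -1, 0 or 1 (used as a subgradient of |v|)."""
    return (v > 0) - (v < 0)


def objective(w, X, y, lam, mu):
    """Assumed objective: mean hinge loss + lam * sum|w_j| + mu * sum|w_{j+1} - w_j|."""
    hinge = sum(max(0.0, 1.0 - yi * dot(w, xi)) for xi, yi in zip(X, y)) / len(X)
    l1 = sum(abs(wj) for wj in w)
    fused = sum(abs(w[j + 1] - w[j]) for j in range(len(w) - 1))
    return hinge + lam * l1 + mu * fused


def fit(X, y, lam=0.01, mu=0.05, steps=300, lr=0.1):
    """Subgradient descent on the assumed objective; returns the best iterate seen."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    best_w, best_obj = list(w), objective(w, X, y, lam, mu)
    for t in range(1, steps + 1):
        # Subgradient of the L1 penalty (favors few non-zero weights).
        grad = [lam * sign(wj) for wj in w]
        # Subgradient of the fused penalty (favors equal weights on neighbors).
        for j in range(p - 1):
            s = mu * sign(w[j + 1] - w[j])
            grad[j + 1] += s
            grad[j] -= s
        # Subgradient of the mean hinge loss over margin-violating samples.
        for xi, yi in zip(X, y):
            if yi * dot(w, xi) < 1.0:
                for j in range(p):
                    grad[j] -= yi * xi[j] / n
        step = lr / math.sqrt(t)  # decaying step size
        w = [wj - step * gj for wj, gj in zip(w, grad)]
        obj = objective(w, X, y, lam, mu)
        if obj < best_obj:
            best_w, best_obj = list(w), obj
    return best_w


def make_toy_profiles(n=40, p=20, region=range(5, 10), seed=0):
    """Hypothetical data: noisy profiles whose class shifts one contiguous region."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = 1 if i % 2 == 0 else -1
        X.append([rng.gauss(0.0, 0.5) + (label if j in region else 0.0)
                  for j in range(p)])
        y.append(label)
    return X, y


X, y = make_toy_profiles()
w = fit(X, y)
print([round(wj, 2) for wj in w])
```

On data like this, the weights inside the shifted region should dominate those outside it, which is what makes the resulting rule readable as a statement about a genomic region. Subgradient descent does not drive weights exactly to zero, so a dedicated solver would be needed to obtain truly sparse, piecewise-constant weight vectors.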

Availability: All data and algorithms are publicly available.

Contact: franck.rapaport{at}curie.fr







