Bioinformatics Advance Access published online on December 3, 2008

Bioinformatics, doi:10.1093/bioinformatics/btn585

© The Author (2008). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oxfordjournals.org

Aneuploidy Prediction and Tumor Classification with Heterogeneous Hidden Conditional Random Fields

Zafer Barutcuoglu 1, Edoardo M. Airoldi 1,2, Vanessa Dumeaux 3, Robert E. Schapire 1 and Olga G. Troyanskaya 1,2

1Department of Computer Science, Princeton University, 35 Olden Street, Princeton, NJ 08540
2Lewis-Sigler Institute for Integrative Genomics, Carl Icahn Laboratory, Princeton University, Princeton, NJ 08544
3Institute of Community Medicine, Tromso University, Tromso, Norway

*To whom correspondence should be addressed. Dr. Olga G. Troyanskaya, E-mail: ogt{at}cs.princeton.edu


Abstract

Motivation: The heterogeneity of cancer cannot always be recognized by tumor morphology, but may be reflected by the underlying genetic aberrations. Array-CGH methods provide high-throughput data on genetic copy numbers, but determining the clinically relevant copy number changes remains a challenge. Conventional classification methods for linking recurrent alterations to clinical outcome ignore sequential correlations in selecting relevant features. Conversely, existing sequence classification methods can only model overall copy number instability, without regard to any particular position in the genome.

Results: Here we present the Heterogeneous Hidden Conditional Random Field, a new integrated array-CGH analysis method for jointly classifying tumors, inferring copy numbers, and identifying clinically relevant positions in recurrent alteration regions. By capturing the sequentiality as well as the locality of changes, our integrated model provides better noise reduction, and achieves more relevant gene retrieval and more accurate classification than existing methods. We provide an efficient L1-regularized discriminative training algorithm, which notably selects a small set of candidate genes most likely to be clinically relevant and driving the recurrent amplicons of importance. Our method thus provides unbiased starting points in deciding which genomic regions and which genes in particular to pursue for further examination. Our experiments on synthetic data and real genomic cancer prediction data show that our method is superior, both in prediction accuracy and relevant feature discovery, to existing methods. We also demonstrate that it can be used to generate novel biological hypotheses for breast cancer.

Contact: Olga G. Troyanskaya (ogt{at}cs.princeton.edu)

Associate Editor: Dr. Jonathan Wren


Received on July 29, 2008; revised on November 9, 2008; accepted on November 9, 2008
