Skip Navigation


Bioinformatics Advance Access originally published online on June 22, 2007
Bioinformatics 2007 23(17):2256-2264; doi:10.1093/bioinformatics/btm322
This Article
Right arrow Full Text Freely available
Right arrow FREE Full Text (Print PDF) Freely available
Right arrow All Versions of this Article:
23/17/2256    most recent
btm322v1
Right arrow Comments: Submit a response
Right arrow Alert me when this article is cited
Right arrow Alert me when Comments are posted
Right arrow Alert me if a correction is posted
Services
Right arrow Email this article to a friend
Right arrow Similar articles in this journal
Right arrow Similar articles in ISI Web of Science
Right arrow Similar articles in PubMed
Right arrow Alert me to new issues of the journal
Right arrow Add to My Personal Archive
Right arrow Download to citation manager
Right arrowRequest Permissions
Google Scholar
Right arrow Articles by Lottaz, C.
Right arrow Articles by Spang, R.
Right arrow Search for Related Content
PubMed
Right arrow PubMed Citation
Right arrow Articles by Lottaz, C.
Right arrow Articles by Spang, R.
Social Bookmarking
 Add to CiteULike   Add to Connotea   Add to Del.icio.us  
What's this?

© The Author 2007. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oxfordjournals.org

Annotation-based distance measures for patient subgroup discovery in clinical microarray studies

Claudio Lottaz *, Joern Toedling {dagger} and Rainer Spang

Max Planck Institute for Molecular Genetics and Berlin Center for Genome Based Bioinformatics, Ihnestr. 73, D-14195 Berlin, Germany

*To whom correspondence should be addressed.


   Abstract

Motivation: Clustering algorithms are widely used in the analysis of microarray data. In clinical studies, they are often applied to find groups of co-regulated genes. Clustering, however, can also stratify patients by similarity of their gene expression profiles, thereby defining novel disease entities based on molecular characteristics. Several distance-based cluster algorithms have been suggested, but little attention has been given to the distance measure between patients. Even with the Euclidean metric, including and excluding genes from the analysis leads to different distances between the same objects, and consequently different clustering results.

Results: We describe a new clustering algorithm, in which gene selection is used to derive biologically meaningful clusterings of samples by combining expression profiles and functional annotation data. According to gene annotations, candidate gene sets with specific functional characterizations are generated. Each set defines a different distance measure between patients, leading to different clusterings. These clusterings are filtered using a resampling-based significance measure. Significant clusterings are reported together with the underlying gene sets and their functional definition.

Conclusions: Our method reports clusterings defined by biologically focused sets of genes. In annotation-driven clusterings, we have recovered clinically relevant patient subgroups through biologically plausible sets of genes as well as new subgroupings. We conjecture that our method has the potential to reveal so far unknown, clinically relevant classes of patients in an unsupervised manner.

Availability: We provide the R package adSplit as part of Bioconductor release 1.9 and on http://compdiag.molgen.mpg.de/software

Contact: claudio.lottaz{at}molgen.mpg.de

{dagger}Present address: EMBL – European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK.


Received on October 24, 2006; revised on February 8, 2007; accepted on June 11, 2007

Add to CiteULike CiteULike   Add to Connotea Connotea   Add to Del.icio.us Del.icio.us    What's this?




Disclaimer: Please note that abstracts for content published before 1996 were created through digital scanning and may therefore not exactly replicate the text of the original print issues. All efforts have been made to ensure accuracy, but the Publisher will not be held responsible for any remaining inaccuracies. If you require any further clarification, please contact our Customer Services Department.