Bioinformatics Advance Access published online on June 3, 2009

Bioinformatics, doi:10.1093/bioinformatics/btp327

© 2009 The Author(s)
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

Seeing the Forest for the Trees: Using the Gene Ontology to re-structure Hierarchical clustering.

Dikla Dotan-Cohen 1,*, Simon Kasif 2,5 and Avraham A Melkman 1

1Department of Computer Science, Ben-Gurion University, Beer Sheva, Israel 84105
2Department of Biomedical Engineering, Boston University, MA 02215
3Center for Advanced Genomic Technology, Boston University, MA 02215
4Bioinformatics Program, Boston University, MA 02215
5Children's Hospital Boston, Harvard/MIT Program in Health Sciences and Technology, 300 Longwood Avenue, Boston MA 02115

*To whom correspondence should be addressed. Mrs. Dikla Dotan-Cohen, E-mail: dotna{at}cs.bgu.ac.il


Abstract

Motivation: There is growing interest in improving the cluster analysis of expression data by incorporating prior knowledge, such as the GO-annotations of genes, so that the clusters subjected to subsequent scrutiny are more biologically relevant. The structure of the Gene Ontology is a further source of background knowledge, one that can be exploited through semantic similarity.
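To make the notion of semantic similarity concrete, the following is a minimal sketch on a toy ontology. The abstract does not state which similarity measure the authors use, so the measure here (loosely modelled on the Wu-Palmer depth ratio), the term names, the `PARENTS` table, and all function names are illustrative assumptions rather than the paper's method.

```python
from itertools import product

# Hypothetical toy ontology: term -> list of parent terms (a DAG, like GO).
PARENTS = {
    "root": [],
    "metabolism": ["root"],
    "immune_response": ["root"],
    "glucose_metabolism": ["metabolism"],
    "glycolysis": ["glucose_metabolism"],
    "gluconeogenesis": ["glucose_metabolism"],
}

def ancestors(term):
    """Return the term itself plus all of its ancestors."""
    seen = set()
    stack = [term]
    while stack:
        t = stack.pop()
        if t not in seen:
            seen.add(t)
            stack.extend(PARENTS[t])
    return seen

def depth(term):
    """Length of the longest path from the root to the term."""
    parents = PARENTS[term]
    return 0 if not parents else 1 + max(depth(p) for p in parents)

def term_similarity(a, b):
    """Depth of the deepest common ancestor, scaled into [0, 1]."""
    common = ancestors(a) & ancestors(b)
    deepest = max(depth(t) for t in common)
    total = depth(a) + depth(b)
    return 1.0 if total == 0 else 2.0 * deepest / total

def gene_similarity(terms_a, terms_b):
    """Best match over all term pairs; 0.0 if either gene is unannotated.

    Taking the maximum is one simple (assumed) way to cope with genes
    that carry several annotations from different levels of the hierarchy.
    """
    if not terms_a or not terms_b:
        return 0.0
    return max(term_similarity(a, b) for a, b in product(terms_a, terms_b))
```

In this toy example two sibling terms under `glucose_metabolism` score higher than a metabolic term paired with `immune_response`, because the former share a deeper common ancestor.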

Results: We propose here a novel algorithm that integrates semantic-similarities (derived from the ontology structure) into the procedure of deriving clusters from the dendrogram constructed during expression-based hierarchical clustering. Our approach can handle the multiple annotations, from different levels of the GO-hierarchy, which most genes have. Moreover, it treats annotated and unannotated genes in a uniform manner. Consequently, the clusters obtained by our algorithm are characterized by significantly enriched annotations. In both cross-validation tests and when using an external index such as protein-protein interactions, our algorithm performs better than previous approaches.
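The general idea of deriving clusters from a dendrogram with the help of annotations can be sketched as follows. This is not the authors' algorithm, whose details the abstract does not give; it is a simplified illustration under stated assumptions. The dendrogram is a nested tuple, the similarity between two genes is a Jaccard index on their annotation sets (a stand-in for a true ontology-based semantic similarity), and the names `cut`, `coherence`, and `threshold`, as well as the gene and term labels, are hypothetical.

```python
from itertools import combinations

def leaves(node):
    """Gene names under a dendrogram node (leaf: str, inner node: 2-tuple)."""
    if isinstance(node, str):
        return [node]
    left, right = node
    return leaves(left) + leaves(right)

def jaccard(a, b):
    """Stand-in similarity between two non-empty annotation sets."""
    return len(a & b) / len(a | b)

def coherence(genes, annotations):
    """Mean pairwise similarity over annotated genes.

    Returns None when fewer than two genes are annotated, so that
    unannotated genes never force a split on their own.
    """
    annotated = [g for g in genes if annotations.get(g)]
    pairs = list(combinations(annotated, 2))
    if not pairs:
        return None
    total = sum(jaccard(annotations[a], annotations[b]) for a, b in pairs)
    return total / len(pairs)

def cut(node, annotations, threshold):
    """Descend the tree; keep a subtree whole once it is coherent enough."""
    genes = leaves(node)
    score = coherence(genes, annotations)
    if isinstance(node, str) or score is None or score >= threshold:
        return [genes]
    left, right = node
    return cut(left, annotations, threshold) + cut(right, annotations, threshold)

# Toy dendrogram, as might come from expression-based hierarchical clustering.
tree = ((("g1", "g2"), "g3"), ("g4", ("g5", "g6")))
annotations = {
    "g1": {"glycolysis", "glucose_metabolism"},
    "g2": {"glucose_metabolism"},
    "g3": set(),  # unannotated gene: stays with its expression-based subtree
    "g4": {"immune_response"},
    "g5": {"immune_response", "inflammation"},
    "g6": {"immune_response"},
}
clusters = cut(tree, annotations, threshold=0.4)
print(clusters)
```

With these toy inputs the root is split because its annotated genes are on average dissimilar, while each of the two child subtrees is kept as a cluster; the unannotated gene `g3` remains in the subtree where the expression data placed it.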

When applied to human-cancer expression data, our algorithm identifies, among others, clusters of genes related to immune response and glucose metabolism. These clusters are also supported by protein-protein interaction data.

Associate Editor: Prof. David Rocke


Received on December 7, 2009; revised on April 28, 2009; accepted on May 15, 2009
