Bioinformatics Advance Access published online on November 16, 2007
Bioinformatics, doi:10.1093/bioinformatics/btm563
Defining clusters from a hierarchical cluster tree: the Dynamic Tree Cut library for R

(a) Dept. of Human Genetics, University of California at Los Angeles, CA 90095-7088; (b) Rosetta Inpharmatics-Merck Research Laboratories, Seattle, WA
To whom correspondence should be addressed: Prof. Steve Horvath, E-mail: stevetihi{at}yahoo.com
Abstract
Summary: Hierarchical clustering is a widely used method for detecting clusters in genomic data. Clusters are defined by cutting branches off the dendrogram. A common but inflexible method uses a constant height cutoff value; this method exhibits suboptimal performance on complicated dendrograms. We present the Dynamic Tree Cut R library that implements novel dynamic branch cutting methods for detecting clusters in a dendrogram depending on their shape. Compared to the constant height cutoff method, our techniques offer the following advantages: (1) they are capable of identifying nested clusters; (2) they are flexible — cluster shape parameters can be tuned to suit the application at hand; (3) they are suitable for automation; and (4) they can optionally combine the advantages of hierarchical clustering and partitioning around medoids, giving better detection of outliers. We illustrate the use of these methods by applying them to protein–protein interaction network data and to a simulated gene expression data set.
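The limitation of a constant height cutoff described in the Summary can be illustrated with a small toy example. The sketch below is not the Dynamic Tree Cut algorithm and does not use the R library; it is a minimal Python illustration under stated assumptions: the data are made-up 1-D points, the linkage is single linkage, and the names `cut_constant_height`, `POINTS`, and `TARGET` are hypothetical. It shows that when two tight clusters lie close together and a third cluster is loose, no single cut height recovers all three.

```python
# Toy 1-D data (made up for illustration): two tight clusters that sit
# close together, plus one loose cluster far away.
TIGHT_A = [0, 1, 2]
TIGHT_B = [10, 11, 12]
LOOSE_C = [100, 110, 120]
POINTS = TIGHT_A + TIGHT_B + LOOSE_C
TARGET = [TIGHT_A, TIGHT_B, LOOSE_C]


def cut_constant_height(points, height):
    """Cut a single-linkage dendrogram of 1-D points at a fixed height.

    For 1-D data, single-linkage merges happen at the gaps between
    neighbouring sorted points, so cutting at `height` is equivalent to
    splitting wherever a gap exceeds `height`.
    """
    pts = sorted(points)
    clusters = [[pts[0]]]
    for prev, cur in zip(pts, pts[1:]):
        if cur - prev > height:
            clusters.append([cur])      # gap too large: start a new cluster
        else:
            clusters[-1].append(cur)    # merged below the cut: same cluster
    return clusters


if __name__ == "__main__":
    for h in (5, 9, 10):
        print(h, cut_constant_height(POINTS, h))
    # Search a range of integer heights for one that yields the target partition.
    hits = [h for h in range(0, 200) if cut_constant_height(POINTS, h) == TARGET]
    print("heights recovering all three clusters:", hits)
```

In this toy data, separating the two tight clusters requires a cut below their merge height of 8, while keeping the loose cluster intact requires a cut of at least 10, so the two requirements conflict. A branch-cutting method that takes the shape of each branch into account, as the Summary describes, is aimed at this kind of situation.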
Availability: The Dynamic Tree Cut method is implemented in an R library available at http://www.genetics.ucla.edu/labs/horvath/CoexpressionNetwork/BranchCutting.
Contact: Peter.Langfelder{at}gmail.com, BinZhang.ucla{at}gmail.com, SHorvath{at}mednet.ucla.edu
Supplementary information: An in-depth description of the method and a manual for use in R are available at the above URL.
Associate Editor: Dr. Trey Ideker
*The authors wish it to be known that, in their opinion, the first two authors should be regarded as joint First Authors.
Received on September 12, 2007; revised on September 12, 2007; accepted on November 6, 2007