Bioinformatics Advance Access originally published online on September 3, 2009
Bioinformatics 2009 25(21):2795-2801; doi:10.1093/bioinformatics/btp526
Bi-correlation clustering algorithm for determining a set of co-regulated genes
1 Department of Computer Science and Engineering, Netaji Subhash Engineering College, Kolkata 700152 and 2 Machine Intelligence Unit, Indian Statistical Institute, Kolkata 700108, West Bengal, India
* To whom correspondence should be addressed.
Abstract
Motivation: Biclustering has emerged as a powerful tool for identifying groups of co-expressed genes under a subset of the experimental conditions (measurements) in a gene expression dataset. Several biclustering algorithms have been proposed to date. In this article, we address some important shortcomings of these existing biclustering algorithms and propose a new correlation-based biclustering algorithm called the bi-correlation clustering algorithm (BCCA).
Results: BCCA produces a diverse set of biclusters of co-regulated genes over a subset of samples, where all the genes in a bicluster show a similar change of expression pattern over that subset. Moreover, the genes in a bicluster have common transcription factor binding sites in their corresponding promoter sequences. The presence of these common binding sites is evidence that the genes in a bicluster are co-regulated. Biclusters determined by BCCA also show highly enriched functional categories. Using different gene expression datasets, we demonstrate the strength and superiority of BCCA over some existing biclustering algorithms.
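The idea of genes sharing "a similar change of expression pattern over the subset of samples" can be illustrated with a small sketch. This is not the BCCA implementation: the abstract does not specify the correlation measure or any threshold, so the use of Pearson correlation, the cutoff of 0.9, the function names and the toy data are all assumptions made for illustration only.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # A flat profile has no defined correlation; treat it as uncorrelated.
        return 0.0
    return cov / sqrt(var_x * var_y)


def is_correlated_bicluster(expression, genes, conditions, threshold=0.9):
    """Check a candidate bicluster (hypothetical helper, not from the paper).

    Returns True if every pair of the given genes has a Pearson correlation
    of at least `threshold` when restricted to the given condition indices.
    """
    profiles = {g: [expression[g][c] for c in conditions] for g in genes}
    return all(
        pearson(profiles[g1], profiles[g2]) >= threshold
        for g1, g2 in combinations(genes, 2)
    )


# Toy data, made up for illustration: one expression profile per gene,
# one value per experimental condition.
expression = {
    "geneA": [1.0, 2.0, 3.0, 9.0],
    "geneB": [10.0, 20.0, 30.0, 5.0],
    "geneC": [3.0, 2.0, 1.0, 4.0],
}

# geneA and geneB rise together over conditions 0-2 but diverge at condition 3.
print(is_correlated_bicluster(expression, ["geneA", "geneB"], [0, 1, 2]))
print(is_correlated_bicluster(expression, ["geneA", "geneB"], [0, 1, 2, 3]))
```

In this toy example the two genes differ by a factor of ten in absolute expression yet form a coherent group over the first three conditions, which is the kind of pattern a correlation-based criterion captures and a distance-based one may miss. Adding the fourth condition breaks the group, showing why the subset of samples matters as much as the subset of genes.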
Availability: The software for BCCA was developed in C and Visual Basic and runs on Microsoft Windows platforms. It can be downloaded as a zip file from http://www.isical.ac.in/~rajat and must then be installed. Two Word files included in the zip file should be consulted before installing and running the software.
Contact: rajat{at}isical.ac.in
Supplementary information: Supplementary data are available at Bioinformatics online.
Associate Editor: Joaquin Dopazo
Received on January 20, 2009; revised on August 14, 2009; accepted on September 1, 2009