Bioinformatics Advance Access published online on June 9, 2006
Bioinformatics, doi:10.1093/bioinformatics/btl279
1 Department of Computer Science, University of California, Los Angeles, USA; 2 Department of Chemical Engineering, University of California, Los Angeles, USA
* To whom correspondence should be addressed.
Motivation: Network component analysis (NCA) is a method to deduce transcription factor (TF) activities and TF-gene regulation control strengths from gene expression data and a TF-gene binding connectivity network. Previously, this method could analyze a maximum number of regulators equal to the total sample size because of the identifiability limit in data decomposition. As such, the total number of source signal components was limited to the total number of experiments rather than the total number of biological regulators. However, networks that have fewer transcriptome data points than regulators are of interest. Thus it is imperative to develop a theoretical basis that allows realistic source signal extraction based on relatively few data points. On the other hand, such methods would inherently increase numerical challenges leading to multiple solutions. Therefore, solutions to both problems are needed.
Results: We have improved NCA for transcription factor activity (TFA) estimation, based on the observation that most genes are regulated by only a few TFs. This observation leads to the derivation of a new identifiability criterion, tested during numerical iteration, that allows us to decompose data when the number of TFs is greater than the number of experiments. To show that our method works with real microarray data and has biological utility, we analyze Saccharomyces cerevisiae cell cycle microarray data (73 experiments) using a TF-gene connectivity network (96 TFs) derived from ChIP-chip binding data. We compare the results of NCA analysis to results obtained from ChIP-chip regression methods, and we show that NCA and regression produce TFAs that are qualitatively similar, but the NCA TFAs outperform regression in statistical tests. We also show that NCA can extract subtle TFA signals that correlate with known cell cycle TF function and cell cycle phase. Overall we determined that 31 TFs have statistically periodic TFAs in one or more experiments, 75% of which are known cell cycle regulators. In addition, we find that the 12 TFAs that are periodic in two or more experiments correspond to well-known cell cycle regulators. We also investigated TFA sensitivity to the choice of connectivity network; we constructed two networks using different ChIP-chip p-value cut-offs.
Availability: The NCA Toolbox for MATLAB is available at http://www.seas.ucla.edu/~liaoj/download.htm.
Associate Editor: Golan Yona
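The decomposition described in the abstract can be illustrated with a minimal sketch. It assumes the usual NCA model in which the expression matrix E (genes by experiments) is approximated by the product A P, where A holds TF-gene control strengths and is forced to zero wherever the binding network reports no connection, and P holds the TFAs. The function name `nca_als`, its parameters, and the plain alternating least-squares scheme are illustrative assumptions. This is not the NCA Toolbox code, and it does not implement the identifiability test introduced in the paper.

```python
import numpy as np


def nca_als(E, Z, n_iter=200, seed=0):
    """Hypothetical sketch: fit E ~= A @ P with A restricted to the pattern Z.

    E : (n_genes, n_experiments) expression matrix.
    Z : (n_genes, n_tfs) boolean connectivity; True where a TF binds a gene.
    Returns A (control strengths) and P (TF activities).
    """
    Z = np.asarray(Z, dtype=bool)
    rng = np.random.default_rng(seed)
    n_genes, n_tfs = Z.shape

    # Random start that already respects the zero pattern of the network.
    A = Z * rng.standard_normal((n_genes, n_tfs))
    P = np.zeros((n_tfs, E.shape[1]))

    for _ in range(n_iter):
        # Step 1: with A fixed, solve A @ P = E for P in the least-squares sense.
        P = np.linalg.lstsq(A, E, rcond=None)[0]

        # Step 2: with P fixed, refit each gene using only its bound TFs.
        # Because each gene has few regulators, this per-gene problem stays
        # small even when n_tfs exceeds the number of experiments.
        for g in range(n_genes):
            idx = np.flatnonzero(Z[g])
            if idx.size == 0:
                continue
            A[g, idx] = np.linalg.lstsq(P[idx].T, E[g], rcond=None)[0]

    return A, P
```

As with any such factorization, A and P are only determined up to a scaling of each TF, so a normalization step would be needed before comparing TFAs across regulators.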
Received May 27, 2006
Accepted June 2, 2006
Article
Transcriptome network component analysis with limited microarray data
Simon J. Galbraith 1, Linh M. Tran 2, and James C. Liao 2 *
2 Department of Chemical Engineering, University of California, Los Angeles, USA
James C. Liao, E-mail: liaoj{at}seas.ucla.edu
Abstract
These authors contributed equally to this publication.