Bioinformatics Advance Access originally published online on September 6, 2005
Bioinformatics 2005 21(21):4033-4038; doi:10.1093/bioinformatics/bti656
Inference of transcriptional regulatory network by two-stage constrained space factor analysis
Department of Statistics, University of California Los Angeles, CA 90095-1554, USA
*To whom correspondence should be addressed.
| Abstract |
|---|
Motivation: Microarray gene expression and cross-linking chromatin immunoprecipitation data contain voluminous information that can help identify transcriptional regulatory networks at the full genome scale. Such high-throughput data are noisy, however. In contrast, the biomedical literature contains many evidenced transcription factor (TF)-target gene binding relationships that have been elucidated at the molecular level. But such sporadically generated knowledge offers only glimpses of limited patches of the network. How to incorporate this valuable knowledge resource to build more reliable network models remains an open question.
Results: We present a modified factor analysis approach. Our algorithm starts with the evidenced TF-gene linkages. It iterates between the network configuration estimation step and the connection strength estimation step, using the high-throughput data, until convergence. We report two comprehensive regulatory networks obtained for Saccharomyces cerevisiae, one under the normal growth condition and the other under the environmental stress condition.
Contact: kcli{at}stat.ucla.edu
Supplementary information: http://kiefer.stat.ucla.edu/lap2/download/bti656_supplement.pdf
| INTRODUCTION |
|---|
Transcription network modeling is a major step towards deciphering the cellular regulation system. It involves two major tasks: (1) finding the target genes for each transcription factor (TF), and (2) correlating each TF's activity to its target transcripts as the condition varies. The first task specifies the network configuration. Several methods are available. The computational approach includes the inference of TF binding targets by drawing information from TF binding motifs (Qiu, 2003; Pritsker et al., 2004), and from gene-expression dynamics (Pournara and Wernisch, 2004; Qian et al., 2003; Rung et al., 2002; Segal et al., 2003; Zhu et al., 2002). A more direct approach is the genome-wide location analysis, or cross-linking chromatin immunoprecipitation (ChIP), which profiles each TF for its binding sites over the entire genome (Harbison et al., 2004; Lee et al., 2002). Combining ChIP data with microarray gene-expression data can give more interpretable network connectivity estimates (Bar-Joseph et al., 2003; Xu et al., 2004; Zhou et al., 2005). It also serves the purpose of elucidating the relationship between a TF's activity and the abundance of its target transcripts. Among the many related works, of special interest to us is the Network Component Analysis (NCA) model by Liao et al. (2003), which treats TF activities as latent variables. We shall incorporate this idea in developing our method.
Although ChIP and gene-expression data are invaluable for building the transcription network at the genome scale, both are subject to high levels of noise. To minimize the noise interference in network construction, instead of taking a de novo approach, which would require the simultaneous estimation of a very large number of parameters, our idea is to use a set of highly reliable connections as the skeleton for network building. For yeast, more than 1000 evidenced TF-gene relationships exist in the literature, and they have been organized into knowledge bases available on the internet (Lee et al., 2002; Wingender et al., 2001). This source of information provides an excellent starting point for network construction.
We present an algorithm that integrates ChIP data, microarray data and prior biological knowledge to obtain the transcription network. Our approach has several features. First, it utilizes the known TF-gene relationships. Second, it takes into account the combinatorial nature of transcription regulation. Third, it provides an estimate of TF activity, which can be used to further study the transcriptional regulation of the TFs themselves. Fourth, it takes into account the condition specificity in modeling the TF-target gene binding relationship.
| METHODS |
|---|
The two-stage constrained space factor analysis model
To relate TF-gene linkages with transcript abundance, we adopt the factor analysis model (Morrison, 1990), which takes the form of X = LY + E. It is well-known that without any constraint on the loading matrix L, the model is not identifiable. In practice, rotation on the loading matrix is taken to yield interpretable results. Quite often, this reduces the number of non-zero loadings (Morrison, 1990).
Suppose there are N genes, K TFs and J gene-expression conditions. We represent the configuration of a transcription regulatory network by a sparse connection matrix $C_{N \times K} = [c_1, c_2, \ldots, c_K]$ between TFs and genes. Each column vector $c_k$ is composed of 1s and 0s, indicating the binding (1) and non-binding (0) relationship of the k-th TF to each gene.
To apply the factor analysis model, we take X to be the microarray gene-expression profile matrix $G_{N \times J}$, Y to be the TF activity profile matrix $T_{K \times J}$, and L to be the regulation strength matrix $B_{N \times K} = [b_1, b_2, \ldots, b_K]$. We rewrite the model as
$$G = BT + E \tag{1}$$
![]() | (2) |
In our analysis, the expression profiles are already in log ratios. If the network configuration matrix C were pre-specified, model (1) would reduce to the NCA model proposed by Liao et al. (2003), who give conditions for parameter identifiability. The more challenging task for us is to estimate C.
To guide the estimation of C, we require that each element of the configuration matrix C be bounded by the corresponding elements of two matrices, $C^{\mathrm{MIN}}$ and $C^{\mathrm{MAX}}$:
$$c^{\mathrm{MIN}}_{i,k} \le c_{i,k} \le c^{\mathrm{MAX}}_{i,k} \tag{3}$$
We start with $C = C^{\mathrm{MIN}}$. After stabilizing the initial estimates of B and T (see next section), we update the configuration by adding a new linkage that best agrees with the current B and T estimates. We then update B and T. This procedure is repeated until convergence.
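The bounding condition can be illustrated with a small sketch. The toy matrices and the helper name `is_admissible` below are hypothetical and serve only to show how the two bounds delimit the links that may still be added; they are not taken from the paper's data.

```python
import numpy as np

# Hypothetical toy example: 4 genes x 2 TFs.
c_min = np.array([[1, 0],
                  [0, 0],
                  [0, 1],
                  [0, 0]])   # literature-evidenced links (higher-confidence set)
c_max = np.array([[1, 1],
                  [1, 0],
                  [0, 1],
                  [0, 1]])   # evidenced links plus lower-confidence candidates

def is_admissible(c, c_min, c_max):
    """Return True if c_min <= c <= c_max holds element-wise."""
    return bool(np.all(c_min <= c) and np.all(c <= c_max))

c = c.copy() if (c := c_min) is None else c_min.copy()   # the search starts at C = C_MIN
candidates = np.argwhere(c_max - c == 1)                 # (gene, TF) pairs that may still be added
```

With these toy matrices, `candidates` lists the three gene-TF pairs present in `c_max` but not yet in `c`.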
The algorithm
We normalize each gene-expression profile to bring the mean to zero and standard deviation to one. We also normalize each estimated TF activity profile in each iteration of our algorithm.
Step 1. Initial estimation of TF activity profiles T from the higher-confidence set. Set $C = C^{\mathrm{MIN}}$. The initial estimate of the activity profile for a TF is constructed by the consensus of the expression profiles for those genes targeted by this TF, using the leading component of a weighted PCA (Morrison, 1990).
Step 2. Estimation of B and T. After the initial estimate of T is obtained, an alternating least-square procedure (Gifi, 1990) is applied to minimize the sum of squared errors $\lVert G - BT \rVert^2$.
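Step 1 can be sketched as follows. The text does not specify the PCA weights, so this sketch assumes equal weights for all target genes; the function name `initial_activity` and the sign convention are illustrative assumptions as well.

```python
import numpy as np

def initial_activity(g, targets):
    """Leading principal component (over conditions) of a TF's target genes.

    g       : (N, J) expression matrix, rows already standardized
    targets : indices of the genes linked to the TF in C_MIN
    Assumption: every target gene gets equal weight.
    """
    sub = g[targets]                                  # (n_targets, J)
    sub = sub - sub.mean(axis=1, keepdims=True)
    # The first right singular vector is the leading component in condition space.
    _, _, vt = np.linalg.svd(sub, full_matrices=False)
    t = vt[0]
    # The sign of a principal component is arbitrary; align it with the mean target profile.
    if np.dot(t, sub.mean(axis=0)) < 0:
        t = -t
    # Normalize to mean zero and unit standard deviation.
    t = t - t.mean()
    return t / t.std()
```

Calling `initial_activity(g, targets)` once per TF and stacking the results row-wise would give a starting K x J matrix T.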
- Estimating B. Fix the T matrix. For each row vector $g_i$ in matrix G, find all $k^*$ such that $c_{i,k^*} = 1$. Regress $g_i$ against the corresponding $t_{k^*}$. Replace the $b_{i,k^*}$ with the regression coefficients. Here we use ridge regression to deal with the stability issue arising from the collinearity between the regressor variables (Faraway, 2004).
- Estimating T. Fix the B matrix and regress each column $g_j$ of matrix G against B. Replace the corresponding column $t_j$ of matrix T with the estimated coefficients. The two steps are iterated until the sum of squared changes of T is smaller than a cutoff value.
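The two alternating updates can be sketched in NumPy as below. The ridge penalty `lam`, the cutoff `tol`, the iteration cap and all function names are assumptions chosen for illustration; the text does not give their values. The sketch also assumes every TF has at least one target in C, so that no activity profile is identically zero.

```python
import numpy as np

def update_b(g, t, c, lam=0.1):
    """Ridge-regress each gene's profile on the TFs linked to it in C."""
    b = np.zeros(c.shape)
    for i in range(g.shape[0]):
        ks = np.flatnonzero(c[i])                     # TFs with c[i, k] = 1
        if ks.size == 0:
            continue
        x = t[ks].T                                   # (J, n_linked) regressors
        a = x.T @ x + lam * np.eye(ks.size)           # ridge-stabilized normal equations
        b[i, ks] = np.linalg.solve(a, x.T @ g[i])
    return b

def update_t(g, b):
    """Regress every column of G on B, then normalize each TF profile."""
    t, *_ = np.linalg.lstsq(b, g, rcond=None)         # (K, J)
    t = t - t.mean(axis=1, keepdims=True)
    return t / t.std(axis=1, keepdims=True)

def fit_b_t(g, c, t0, lam=0.1, tol=1e-6, max_iter=200):
    """Alternate the two updates until T changes by less than tol."""
    t = t0
    for _ in range(max_iter):
        b = update_b(g, t, c, lam)
        t_new = update_t(g, b)
        change = np.sum((t_new - t) ** 2)
        t = t_new
        if change < tol:
            break
    return update_b(g, t, c, lam), t
```

Because B is refitted only on the entries where C equals 1, the sparsity pattern of the estimated B always matches the current configuration.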
Step 3. Adding new TF-gene relations. The algorithm searches through all TF-gene pairs allowed by $C^{\mathrm{MAX}} - C$ to find a pair that best agrees with the current B and T estimates. Because all gene-expression profiles and TF activity profiles are normalized, this is done efficiently by finding the highest absolute covariance between the residual (unexplained part) of an expression profile and a TF activity profile.
Define matrix $D = C^{\mathrm{MAX}} - C$. First we find the row-wise covariance matrix V between the residual expression matrix $R = G - BT$ and the TF activity matrix T, with $v_{i,k} = \mathrm{cov}(r_i, t_k)$. We then find the pair $\{i^*, k^*\} = \arg\max_{i,k} (|v_{i,k}| \times d_{i,k})$. We assign $c_{i^*,k^*} = 1$ and $b_{i^*,k^*} = \mathrm{cov}(r_{i^*}, t_{k^*})$. Then the estimates of B and T are stabilized as described in Step 2.
We iterate between Steps 2 and 3. In each iteration, we record the total reduction of residual sum of squares (RSS) $\lVert G - BT \rVert^2$. When the average reduction in RSS in the last 10 iterations is less than one-fifth of that of the initial 10, we consider most of the signals in the lower-confidence set to have been picked up, and terminate the iteration.
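The link-selection rule can be written compactly. This is a sketch under stated assumptions: the function name `add_best_link` and the choice to return `None` when no admissible pair has non-zero covariance are illustrative, and the sample covariance uses the usual J - 1 denominator.

```python
import numpy as np

def add_best_link(g, b, t, c, c_max):
    """Add the admissible TF-gene pair whose residual covariance is largest."""
    d = c_max - c                                     # 1 where a link may still be added
    r = g - b @ t                                     # residual expression, (N, J)
    rc = r - r.mean(axis=1, keepdims=True)
    tc = t - t.mean(axis=1, keepdims=True)
    v = rc @ tc.T / (g.shape[1] - 1)                  # v[i, k] = cov(r_i, t_k)
    score = np.abs(v) * d
    i, k = np.unravel_index(np.argmax(score), score.shape)
    if score[i, k] == 0:                              # nothing left to add
        return None
    c, b = c.copy(), b.copy()                         # leave the inputs untouched
    c[i, k] = 1
    b[i, k] = v[i, k]                                 # initial strength for the new link
    return (int(i), int(k)), c, b
```

The returned B is only a starting value; B and T would be re-stabilized by the alternating updates of Step 2 before the next link is considered.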
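The termination rule can be expressed as a small helper. The name `should_stop` is hypothetical, and the sketch assumes the rule is only checked once at least 20 reductions have been recorded, so that the initial and the last windows do not overlap.

```python
def should_stop(rss_reductions, window=10, ratio=0.2):
    """Stop when the recent average RSS reduction falls below ratio x the initial average.

    rss_reductions[n] is the drop in RSS achieved in the n-th iteration of Steps 2 and 3.
    """
    if len(rss_reductions) < 2 * window:              # assumption: wait for two full windows
        return False
    first = sum(rss_reductions[:window]) / window     # average over the initial 10 iterations
    last = sum(rss_reductions[-window:]) / window     # average over the last 10 iterations
    return last < ratio * first
```

For example, ten reductions of 10 followed by ten reductions of 1 give averages of 10 and 1, and since 1 is below one-fifth of 10 the rule signals termination.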
Step 4. Fine-tuning of TF-gene relations. Once the convergence is reached, we use T as the final estimate of TF activity profiles. Based on this estimate, we make an additional effort to fine-tune the network configuration matrix C, using regression variable selection techniques.
For each gene i, to determine its regulator TFs, we find all $k^*$ such that $c^{\mathrm{MAX}}_{i,k^*} = 1$, and consider the multiple linear regression model
$$g_i = \sum_{k^*} b_{i,k^*} t_{k^*} + e_i \tag{4}$$
The data source
The higher-confidence set consists of known gene-TF relationships in the biomedical literature [see TRANSFAC (Wingender et al., 2001) and the website of Young's group (Lee et al., 2002)]. There are a total of 1089 TF-gene relationships, from which we shall construct the matrix $C^{\mathrm{MIN}}$ used in Equation (3).
The lower-confidence set is based on the ChIP dataset (Harbison et al., 2004). We use all TF-gene pairs that were reported to have P-values <0.05. This loose cutoff point is used to lower the false-negative rate. We shall combine the lower- and higher-confidence sets and use matrix $C^{\mathrm{MAX}}$ to represent the information.
Two large-scale microarray datasets are used in this study. The cell-cycle dataset (Spellman et al., 1998) is used for normal growth network estimation. The stress-response dataset (Gasch et al., 2000) is used for stress-specific network estimation.
Time-shifted activity-expression correlation
For the cell-cycle data, we further investigate the time-shifting behavior between a TF's expression profile and its activity profile. Denote the expression profile by $x = (x_1, x_2, \ldots, x_M)$, the activity profile by $y = (y_1, y_2, \ldots, y_M)$ and time points by $t = (t_1, t_2, \ldots, t_M)$. Let $\Delta t$ be the amount of time-shifting in minutes, which takes an integer value between 0 and 20. We first estimate the correlation between $x(t)$ and $y(t + \Delta t)$. We then find the delay $\Delta t$ that maximizes the correlation in absolute value. We estimate $y(t + \Delta t)$ by fitting y with a cubic spline.
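The time-shift search can be sketched with SciPy's `CubicSpline`. Two choices here are assumptions not stated in the text: the function name `best_time_shift`, and the restriction to time points whose shifted position still lies within the sampled range, which avoids extrapolating the spline.

```python
import numpy as np
from scipy.interpolate import CubicSpline

def best_time_shift(x, y, times, max_shift=20):
    """Return the integer delay (minutes) maximizing |corr(x(t), y(t + delay))|."""
    times = np.asarray(times, dtype=float)
    x = np.asarray(x, dtype=float)
    spline = CubicSpline(times, y)                    # smooth interpolant of the activity profile
    best_shift, best_corr = 0, 0.0
    for shift in range(max_shift + 1):
        keep = times + shift <= times[-1]             # assumption: do not extrapolate
        if keep.sum() < 3:
            break
        corr = np.corrcoef(x[keep], spline(times[keep] + shift))[0, 1]
        if abs(corr) > abs(best_corr):
            best_shift, best_corr = shift, corr
    return best_shift, best_corr
```

The sign of the returned correlation is kept, so a strongly negative delayed correlation is reported as such rather than being folded into its absolute value.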
| RESULTS |
|---|
Regulatory network under rich-medium growth condition
Harbison et al. profiled 203 TFs for their genome-wide DNA binding sites under the rich-medium growth condition (Harbison et al., 2004; Lee et al., 2002). We consider only those TFs that have evidenced binding targets. To avoid multiple counting of TF-gene relationships, if a group of TFs (e.g. HAP2/HAP3/HAP4/HAP5) always operate together as a functional unit according to the literature, we count them as one TF. A total of 99 TFs are used in our analysis. Their names and functions provided by the Saccharomyces Genome Database (SGD) (Dwight et al., 2002) are given in Supporting Table 4.
We start with 891 evidenced relationships and 29 154 lower-confidence relationships. Using the cell-cycle microarray data by Spellman et al. (1998), we apply the algorithm as described in the Methods section to reach a final network that has 3846 TF-gene connections. For each TF, we examine the biological processes in which its target genes participate, using the GO Term Finder of SGD (Ashburner et al., 2000; Dwight et al., 2002). The list of genes regulated by each TF can be found at http://www.stat.ucla.edu/~tyu/factor/. The over-represented terms are given in Supporting Table 5.
Some TFs are more specialized, whereas others act on a broader range of cellular processes. The GO slims define broad biological processes (Ashburner et al., 2000; Dwight et al., 2002). For each process, we identify TFs that regulate significant numbers of genes in it (Table 1). We find ABF1, FKH1/2 and INO2/4 to be the leading factors, each acting on 9 of the 33 processes. ABF1 and INO2/4 mostly act on metabolic and transport processes, whereas FKH1/2 mostly acts on cell cycle-related processes. Other widely influential factors include cell cycle-related SWI4, SWI6, and metabolism-related MSN2/4, HAP1, HAP2/3/4/5 and XBP1.
Figure 1 shows the regulatory relationship between TFs. An arrow points from a TF to another TF if the latter is the target gene of the former according to the TF-gene network we constructed. Consistent with their biological roles, we find the cell-cycle regulatory TFs SWI6, SWI4, FKH1/FKH2, ACE2/SWI5, MCM1 and STB1 (diamond nodes, Fig. 1) are linked together. Around the leading hub in the network, GAT1, we find a sub-network that involves nitrogen metabolism-related TFs GAT1, DAL80, DAL81, GZF3, GLN3, and stress-related TFs IXR1, XBP1, YAP1, RPN4 and HAP1 (square nodes, Fig. 1). Interestingly, the two regulators of GAT1 expression are cell-cycle TFs ACE2/SWI5 and FKH1/FKH2.
We investigate whether the activity profile of a TF is correlated with its own gene-expression profile subject to a possible time delay. We consider the alpha-factor data, cdc-15 data and cdc-28 data separately. The elutriation-synchronized data are excluded from our analysis because the time interval (every 30 min) used in collecting the mRNA sample is too long. For each TF, we first compute the activity-expression correlation without time delay. Table 2 lists a total of 17 TFs which have correlation >0.4 in at least two of the three synchronization experiments. For each of the remaining TFs, we analyze the time-shifted activity-expression correlation as described in the Methods section. We find 10 TFs showing delayed activity-expression correlation (see Table 3). As an example, the time-delay pattern for SWI4 is shown in Figure 2. Both the expression and the activity profiles exhibit cell-cycle pattern periodicity. The estimated time lag is 10 min (1/6 cycle).
As suggested by one referee, a related issue that can be addressed by using GO is the functional homogeneity of transcription modules. Similar to the distance measure used by Ye and Godzik (2004) in studying protein domains, we compute the average length of the shortest GO-path between two genes linked to the same TF. The results are summarized in Supporting Figure 3. We find only a marginally significant (P-value 0.0324, one-sided signed rank sum test) decrease of distance when comparing with TF modules obtained by using ChIP data alone. Another measure based on GO-slims yields similar findings (see Supporting Information Text 1 for more discussion). Note that we have not yet paid attention to TF modules defined by a combination of TFs. Although ideally one would expect higher functional homogeneity for such better-defined modules, this is certainly a more complicated problem to address.
Regulatory network under stress condition
TF binding to a subset of the regulatory sequences may be dependent on the environmental conditions of the cell. Harbison et al. (2004) analyzed the genome-wide binding properties of 84 TFs under multiple stress conditions. Combining this dataset with the stress-response microarray gene expression data (Gasch et al., 2000), we shall identify a network underlying the gene-expression regulation in stress conditions.
In the stress-specific ChIP dataset (Harbison et al., 2004), some TFs are profiled in multiple conditions. We include a TF-gene linkage in the lower-confidence set as long as it is observed in one of the conditions. Starting with 579 higher-confidence TF-gene relationships and 29 316 lower-confidence relationships, we apply our algorithm and obtain a network of 8183 TF-gene connections, which involve 49 TFs. The list of genes regulated by each TF can be found at http://www.stat.ucla.edu/~tyu/factor/. For each TF, we find biological processes that are over-represented by its targeted genes (see Supporting Table 6).
We further use the regulatory network to study the 868 environmental stress-response (ESR) genes reported by Gasch et al. (2000). Among the 585 genes repressed in ESR, 98% have connections in our network, compared with 78% for non-ESR genes. At the significance level of $10^{-5}$, seven TFs are identified as major regulators of these genes. The most prominent among them is RAP1, which regulates 93 genes. The others are ARG80/ARG81/ARG82 (regulating 56 genes), RCS1 (52 genes), CBF1/MET4/MET31 (46 genes), HSF1 (45 genes), RTG1/RTG3 (43 genes) and GAT1 (42 genes). Among the 283 upregulated genes in ESR, 97% have connections in our network. At the significance level of $10^{-5}$, six TFs are identified as major regulators of these genes. They are MSN2/MSN4 (79 genes), PHO2 (32 genes), HAP2/HAP3/HAP4/HAP5 (26 genes), AFT2 (26 genes), ROX1 (26 genes) and RPH1 (26 genes).
| DISCUSSION |
|---|
We have presented a method to infer the transcriptional regulatory network at the full genome scale, by integrating information from microarray gene-expression data, genome-wide location (ChIP) data and the evidenced TF-target gene relationships in the biomedical literature. Our method is based on a constrained space factor analysis model, which treats TF activity as hidden variables.
In the analysis of transcriptional regulation, one central theme is how to describe TF activity. By finding co-regulated gene modules, some authors implicitly infer building blocks of the network without modeling the TF activity (Eisen et al., 1998; D'haeseleer et al., 2000; Ihmels et al., 2002, 2004; Kwon et al., 2003; Toh and Horimoto, 2002). Studying co-expression dynamics with other gene-expression levels as indicators of cellular state changes also by-passes the TF activity modeling issue (Li, 2002; Li et al., 2004). Some authors tried to connect the TFs' activity directly with their gene-expression levels (Qian et al., 2003; Segal et al., 2003; Zhu et al., 2002). Bayesian learning by perturbed expression aims at directly finding the network structure, without the need to estimate TF activities (Pe'er et al., 2001; Pournara and Wernisch, 2004; Rung et al., 2002).
As in Liao et al. (2003), our method treats TF activities as hidden variables which help both the network configuration specification and the TF-binding strength modeling simultaneously. The estimation of a TF's activity is independent of the information about its own transcription profile, which allows further analysis of TF behavior as we demonstrated in the Results section.
Different from Liao's NCA model, however, we do not consider the network configuration as given. Instead, we consider both network configuration and connection strength estimation as integral components of a general factor analysis model. We fit the model by iterating between the step of network configuration search and the step of parameter estimation. Several factors necessitate this adaptive model fitting approach. First, high-throughput data contain high levels of biological and measurement noise. Second, we have only incomplete knowledge about the network configuration. Third, there are probably other hidden variables, e.g. unknown TFs that are not included in the model. They may have confounding effects with the variables under study. The use of a prior knowledge base of TF-target gene relationships and our stepwise expansion of the network connections make our approach more robust to these confounding variables. Conceptually, our evolving model approach is analogous to model building by the neural network approach. In neural network modeling, the number of parameters that have to be estimated from the data is overwhelming. Yet with proper training, the network can converge to a useful local optimal solution. Likewise, although the full size factor analysis model (1) has multiple solutions, we aim at converging to a local optimum by adaptive learning. The available TF-target knowledge serves us well in providing a reasonable starting point. As the repertoires of data and knowledge grow richer in the future, we can expect our approach to become even more powerful.
We report two regulatory networks under different growth conditions for Saccharomyces cerevisiae. The network under the normal growth condition is estimated from cell-cycle microarray data and normal growth ChIP data. Based on the TF activity identified from the cell-cycle time-series, new time-shifting relationships are found between the activity and expression of some TFs. The stress-response network is estimated by using stress-response ChIP data and gene-expression data, pooling many stress conditions together. This network explains the expression of 98% of the ESR genes identified by Gasch et al. (2000), and correctly identifies several leading regulators. As might be expected, a comparison between the two networks shows that most TFs regulate different sets of genes. An interesting exception is RAP1 (repressor activator protein). RAP1 regulates 45 genes in the network under the normal growth condition, whereas it regulates 211 genes under the stress condition. These two sets share 27 genes (P-value $10^{-27}$). Furthermore, we find that 26 of the 27 shared genes are associated with protein biosynthesis, a process that is repressed under stress conditions. This is consistent with RAP1's role in ESR regulation (Gasch et al., 2000; Li et al., 1999).
In this report, all gene-expression profiles are standardized before the network estimation starts. We did not filter out genes with less-varying expressions in their original scale. As suggested by a referee, proper pre-screening should help reduce the instability in estimating our model parameters associated with such non-informative genes. During the revision of this paper, we examined the standard deviations in the original expression profiles and compared those for genes in our final network with those for the remaining genes. We find significantly higher expressional variation for the genes in the network (P-value: $1.4 \times 10^{-77}$ for cell-cycle data and $4.4 \times 10^{-129}$ for stress-response data; see Supporting Figure 9). These findings suggest that our results are not overwhelmed by the non-varying expression profiles.
| Acknowledgments |
|---|
We thank Dr Chiara Sabatti and Dr James Liao for helpful discussions. This work is supported by NSF grants DMS-0201005, DMS-0104038 and DMS-0406091.
Conflict of Interest: none declared.
Received on July 25, 2005; revised on August 24, 2005; accepted on August 30, 2005
| REFERENCES |
|---|
Ashburner, M., et al. (2000) Gene Ontology: tool for the unification of biology. Nat. Genet., 25, 25-29.
Bar-Joseph, Z., et al. (2003) Computational discovery of gene modules and regulatory networks. Nat. Biotechnol., 21, 1337-1342.
D'haeseleer, P., et al. (2000) Genetic network inference: from co-expression clustering to reverse engineering. Bioinformatics, 16, 707-726.
Dwight, S.S., et al. (2002) Saccharomyces Genome Database (SGD) provides secondary gene annotation using the Gene Ontology (GO). Nucleic Acids Res., 30, 69-72.
Eisen, M.B., et al. (1998) Cluster analysis and display of genome-wide expression patterns. Proc. Natl Acad. Sci. USA, 95, 14863-14868.
Faraway, J.J. (2004) Linear Models with R. Chapman & Hall/CRC, Boca Raton, FL.
Gasch, A.P., et al. (2000) Genomic expression programs in the response of yeast cells to environmental changes. Mol. Biol. Cell, 11, 4241-4257.
Gifi, A. (1990) Nonlinear Multivariate Analysis. John Wiley & Sons, New York.
Harbison, C.T., et al. (2004) Transcriptional regulatory code of a eukaryotic genome. Nature, 431, 99-104.
Ihmels, J., et al. (2002) Revealing modular organization in the yeast transcriptional network. Nat. Genet., 31, 370-377.
Ihmels, J., et al. (2004) Defining transcription modules using large-scale gene expression data. Bioinformatics, 20, 1993-2003.
Kwon, A.T., et al. (2003) Inference of transcriptional regulation relationships from gene expression data. Bioinformatics, 19, 905-912.
Lee, T.I., et al. (2002) Transcriptional regulatory networks in Saccharomyces cerevisiae. Science, 298, 799-804.
Li, B., et al. (1999) Transcriptional elements involved in the repression of ribosomal protein synthesis. Mol. Cell. Biol., 19, 5393-5404.
Li, K.C. (2002) Genome-wide coexpression dynamics: theory and application. Proc. Natl Acad. Sci. USA, 99, 16875-16880.
Li, K.C., et al. (2004) A system for enhancing genome-wide coexpression dynamics study. Proc. Natl Acad. Sci. USA, 101, 15561-15566.
Liao, J.C., et al. (2003) Network component analysis: reconstruction of regulatory signals in biological systems. Proc. Natl Acad. Sci. USA, 100, 15522-15527.
Morrison, D.F. (1990) Multivariate Statistical Methods. McGraw-Hill Publishing Company, New York.
Pe'er, D., et al. (2001) Inferring subnetworks from perturbed expression profiles. Bioinformatics, 17, Suppl. 1, S215-S224.
Pournara, I. and Wernisch, L. (2004) Reconstruction of gene networks using Bayesian learning and manipulation experiments. Bioinformatics, 20, 2934-2942.
Pritsker, M., et al. (2004) Whole-genome discovery of transcription factor binding sites by network-level conservation. Genome Res., 14, 99-108.
Qian, J., et al. (2003) Prediction of regulatory networks: genome-wide identification of transcription factor targets from gene expression data. Bioinformatics, 19, 1917-1926.
Qiu, P. (2003) Recent advances in computational promoter analysis in understanding the transcriptional regulatory network. Biochem. Biophys. Res. Commun., 309, 495-501.
Rung, J., et al. (2002) Building and analysing genome-wide gene disruption networks. Bioinformatics, 18, Suppl. 2, S202-S210.
Segal, E., et al. (2003) Module networks: identifying regulatory modules and their condition-specific regulators from gene expression data. Nat. Genet., 34, 166-176.
Spellman, P.T., et al. (1998) Comprehensive identification of cell cycle-regulated genes of the yeast Saccharomyces cerevisiae by microarray hybridization. Mol. Biol. Cell, 9, 3273-3297.
Toh, H. and Horimoto, K. (2002) Inference of a genetic network by a combined approach of cluster analysis and graphical Gaussian modeling. Bioinformatics, 18, 287-297.
Wingender, E., et al. (2001) The TRANSFAC system on gene expression regulation. Nucleic Acids Res., 29, 281-283.
Xu, X., et al. (2004) Learning module networks from genome-wide location and expression data. FEBS Lett., 578, 297-304.
Ye, Y. and Godzik, A. (2004) Comparative analysis of protein domain organization. Genome Res., 14, 343-353.
Zhou, X.J., et al. (2005) Functional annotation and network reconstruction through cross-platform integration of microarray data. Nat. Biotechnol., 23, 238-243.
Zhu, Z., et al. (2002) Computational identification of transcription factor binding sites via a transcription-factor-centric clustering (TFCC) algorithm. J. Mol. Biol., 318, 71-81.