Bioinformatics Advance Access originally published online on July 14, 2006
Bioinformatics 2006 22(18):2276-2282; doi:10.1093/bioinformatics/btl380
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Identification of transcription factor cooperativity via stochastic system model


Laboratory of Control and Systems Biology, Department of Electrical Engineering National Tsing Hua University, Hsinchu 300, Taiwan
*To whom correspondence should be addressed.
| ABSTRACT |
|---|
|
|
|---|
Motivation: Transcription factor binding sites are known to co-occur in the same gene owing to cooperativity of the transcription factors (TFs) that bind to them. Genome-wide location data can help us understand how an individual TF regulates its target gene. Nevertheless, how TFs cooperate to regulate their target genes still needs further study. In this study, genome-wide location data and expression profiles are integrated to reveal how TFs cooperate to regulate their target genes from the stochastic system perspective.
Results: Based on a stochastic dynamic model, a new measurement of TF cooperativity is developed according to the regulatory abilities of cooperative TF pairs and the number of their occurrences. Our method is employed to the yeast cell cycle and reveals successfully many cooperative TF pairs confirmed by previous experiments, e.g. Swi4-Swi6 in G1/S phase and Ndd1-Fkh2 in G2/M phase. Other TF pairs with potential cooperativity mentioned in our results can provide new directions for future experiments. Finally, a cooperative TF network of cell cycle is constructed from significant cooperative TF pairs.
Contact: bschen{at}ee.nthu.edu.tw
Supplementary information: http://www.ee.nthu.edu.tw/~bschen/cooperativity/
| 1 INTRODUCTION |
|---|
|
|
|---|
Precise transcriptional control is one of the major reasons for different gene expression and regulation. Owing to advances in DNA microarray technology and genome sequencing, measuring gene expression levels on a genomic scale has become possible. Measuring gene expression time profiles can help us understand mechanisms of transcriptional regulation, including functional significance of cell cycle regulation and response of environmental changes (Spellman et al., 1998; Gasch et al., 2000). However, time profiles alone are not enough to identify precisely the whole transcriptional regulatory network. More precise method also needs the binding information of transcription factors (TFs) to promoters in DNA. Genes are always regulated by a number of upstream regulatory genes through binding of TFs to specific sites in the DNA promoter region. To construct a gene regulatory network, it is important to understand the binding relationship between target genes and TFs. In recent studies, the genome-wide location (ChIP-chip) analysis is employed to obtain the binding information of TFs to promoters in DNA (Iyer et al., 2001; Simon et al., 2001; Lee et al., 2002; Harbison et al., 2004; Chang et al., 2005; Lin et al., 2005). However, these studies did not shed light on the interactions or cooperativity between TFs.
With advances in experimental approaches and abundant data sources, functional genomics has begun to investigate the more complex, cooperative TF interactions to regulate properly gene expression. In order to find cooperative TF interactions, a statistical technique was employed to identify significant homotypic or heterotypic TF binding site clusters (Wagner, 1999). Moreover, the correlation of genome-wide expression profiles was employed to uncover functional motif combinations in the promoters (Pilpel et al., 2001). Furthermore, genome-wide location data (ChIP-chip) and gene expression profiles were integrated to assess TF cooperativity rigorously (Banerjee and Zhang, 2003; Kato et al., 2004; Tsai et al., 2005). However, these studies only use expression correlation score in the view of statistics to determine TF cooperativity. We are not aware of the previous attempts to construct an overall transcriptional regulatory system to uncover TF cooperativity from the dynamic system perspective. In this study, we assess TF cooperativity not only in dynamic system perspective, but also in statistics. We could get more insight into the regulatory mechanism by constructing its dynamic system model and could also make more precise prediction of TF cooperativity.
A dynamic model is often employed to describe a complex and kinetic system in many fields. Systems biology and computational biology methods have recently been widely employed to describe the biological functions from the dynamic system point of view (Hasty et al., 2002; Davidson et al., 2003; Hood, 2003; Tegner et al., 2003; Chen et al., 2004). In this article, we exploit genome-wide location data (Harbison et al., 2004) and gene expression profiles (Spellman et al., 1998) to construct a dynamic regulatory model for the genes of interest. The dynamic regulatory model is developed to describe how the upstream regulatory genes control a target gene to produce the output mRNA expression through its regulatory network. In order to reduce computational complexity, we estimate TF cooperativities one gene by one gene. In the dynamic regulatory model, there are two groups of input regulations affecting expression profiles of the target gene. The first group is due to individual TF regulation, and the other is due to TF cooperativity regulation. After estimating the regulatory abilities of these two groups, we can get all TF abilities for regulating the transcriptional process of the target gene. It is more possible to have TF cooperativity if binding sites for specific pairs of regulators co-occur more frequently within the same promoter regions of target genes (Harbison et al., 2004). Therefore, according to the estimated cooperative TF abilities and the number of TF pairs co-binding to the target genes, a statistical measurement method is specified to assess TF cooperativity rigorously. The statistical measurement method is proposed to reduce the possible estimation error owing to few data points. The new measurement of TF cooperativity is the multiplicative cooperative p-value, which can be obtained by multiplying all individual p-values of different target genes to estimate efficiently the cooperativity of any cooperative TF pairs on a genomic scale. Further, we use the shuffling method to avoid overfitting to confirm the reliability of the proposed method.
Our results show that many cooperative TF pairs that were previously characterized through experiments indeed have a high cooperativity in our analysis, thus validating the method we proposed as a TF cooperativity predictor and providing valuable information for further analysis. A cooperative network is also constructed with significant cooperative TF pairs for the yeast cell cycle. Finally, the results reveal several novel possible cooperative TF pairs not found in previous studies, thus providing new directions for future experiments.
| 2 METHODS |
|---|
|
|
|---|
2.1 Selecting and processing experimental data
We used the genome-wide yeast microarray hybridization data of Spellman et al. (1998) as our mRNA expression profiles. They proposed many experimental methods for resetting the yeast cell cycle to measure mRNA expression profiles for the whole genome comprehensively. Here, we used the 768 cell cycle-related genes selected by Simon et al. (2001) from
factor experimental cell cycle data as the target genes.
factor is one experimental technique to synchronize cell cycle. The genome-wide location data are taken from Harbison et al. (2004), in which the genomic occupancy of 203 DNA-binding TFs is determined in yeast. We selected the significant binding set using a binding p-value p < 0.0015 as the inputs in the dynamic gene transcriptional regulatory model. In genome-wide expression profiles of Spellman et al. (1998), there are many missing data. We used the cubic spline interpolation method (Faires and Burden, 1998), which employs piecewise third-order polynomials to fit data points, for compensating these missing values in the profiles. Furthermore, in order to avoid the overfitting estimation of the profiles, we also use the cubic spline method to interpolate the data points when there are more than six TFs binding to the target gene, which implies that the number of parameters to be estimated is 23. Only 5% of the target genes are bound by more than six TFs (Supplementary Material). In order to fit the gene dynamic model in the linear scale, the microarray data are returned from the log2 scale to the linear scale.
2.2 Dynamic model of gene regulatory networks
First, we consider a gene regulatory network as a system block with several regulatory genes as inputs and a target gene as output. Owing to random noise and fluctuation at the molecular level, the transcriptional behavior of the gene regulatory network is described by a stochastic discrete dynamic equation, and the general form of transcriptional regulation for a target gene is written as follows:
![]() | (1) |
{1, 2, ... , N} represent the regulation functions of N TFs binding to the target gene and bi indicates the regulatory ability of the i-th TF. G[t] is the possible regulatory function of cooperative TFs. k represents the basal level from other factors, and
[t] denotes a stochastic noise owing to model uncertainty and fluctuation of mRNA microarray data in the target gene.
For system identification of gene regulatory network, it is more practical to consider the biochemical reaction between the TF regulation functions xi[t] at the motif binding sites and their relevant mRNA expression profiles yi[t] of the upstream regulatory gene (Goldbeter and Koshland, 1981; Mestl et al., 1995). For this purpose, we describe the binding regulation function xi[t] of the TF as the following sigmoid function
![]() | (2) |
The regulatory function G[t] is combined by all possible TF cooperativities of the target gene. We describe the regulatory function G[t] in Equation (1) as the following:
![]() | (3) |
fi·j(yi[t]·yj[t]) is a sigmoid function of product yi[t] · yj[t] to denote the binding function of cooperative TFs i and j, and ci·j denotes the regulatory ability (or kinetic activity) of the cooperative TFs i and j. N denotes the number of TFs binding to the target gene.
Substituting Equation (3) into Equation (1), we get the following dynamic transcriptional regulatory equation:
![]() | (4) |
|
2.3 Identifying gene regulatory networks
After constructing the discrete stochastic dynamic model, we use the method of maximum likelihood to estimate the parameters. The details are shown in Supplementary Material.
After identifying the parameters, the transcriptional regulatory network of target genes could be expressed as the following dynamic equation:
![]() | (5) |
,
,
,
are the estimated parameters of a, bi, ci·j, k, respectively. Therefore, the TF cooperativity at each target gene could be evaluated by
, the regulatory ability of the cooperative TF pair, in Equation (5). However,
in (5) is only from one target gene point of view. A statistical method should be developed to measure the cooperativity of TFs to combine all
of the TF i and j for all target genes (i.e. on a genomic scale). It would be discussed in the following section.
2.4 Measurement of TF cooperativity
In order to measure the possible TF cooperativity, we define a new measurement, the multiplicative cooperative p-value PC, according to the statistics of regulatory abilities
of all target genes in the dynamic model of Equation (5) and the number of TF pairs appearing in all target genes. For any TF pair, if the binding sites of specific TF pairs co-occur more frequently in the target genes, then it is more likely to have cooperativity between these two TFs (Harbison et al., 2004). In addition, the magnitude of the regulatory ability
of TF cooperation in our dynamic model can be also useful to help us determine the significant TF cooperativity because the magnitude represents its importance in the transcriptional regulation of the target gene.
To define the multiplicative cooperative p-value PC on a genomic scale, we first calculate the individual p-value for each regulatory ability of a cooperative TF pair. The p-value is defined as (PB)m when a cooperative TF pair binds to the m-th target gene. For the m-th target gene bound by the cooperative TF pair i and j, we rewrote its regulatory ability
as
. After constructing 1000 random permutations of xi·j[t] (Supplementary Material) in Equation (4) for the m-th target gene bound by TFs pair i and j, we used these random permutations to estimate 1000 different
rewritten as
and then computed a probability density
of all absolute values
according to their magnitudes by normalization. The probability density distribution
of cooperativity is shown in Figure 2. Then, using
, we calculated the individual p-value (PB)m of TFs i and j at the m-th target gene by
![]() | (6) |
denotes the regulatory ability of the cooperative TF pair i and j at the m-th target gene without random permutation.
|
When calculating individual p-value for each regulatory ability (PB)m, we use 1000 random permutations to infer the probability density function
. Theoretically, the continuous probability density function is obtained when the number of shuffling is infinite, which is not applicable in practice. Thus, we use the discrete normalized histogram to approximate the probability density function instead. In Figure 2, (PB)m should be the area under the smooth curve (pdf), but as we can calculate only the approximation of the pdf, (PB)m would be in fact the numbers of
, which are greater than
normalized by the number of shuffling, i.e. the sum of heights of corresponding bars. In reality, the minimal value of (PB)m is equal to the inverse number of shuffling, i.e. 0.001 in the case of 1000 shuffling. Thus, taking a greater number of shuffling would make quite a different (PB)m value. However, choosing the number of shuffling would not affect the final significant cooperative TF pairs, as long as the number is large enough. So in order to reduce the computation complexity, we choose the number of shuffling to be 1000.
A TF pair may co-occur to bind many target genes, so we could obtain many individual p-values (PB)m, m
{1, 2, ... , L}, where L is the number of the target genes bound by the possible cooperative TF pair i and j. In Figure 1, for example, TFs Mbp1 and Swi6 both bind to HO and CLB5. Therefore, the gene HO has its cMbp1·Swi6 and the corresponding (PB)HO, and so does CLB5 with its cMbp1·Swi6 and the corresponding (PB)CLB5. After computing all the individual p-value (PB)m for the TF pair i and j for all target genes, i.e. (PB)m for m
{1, 2, ... , L}, we define the multiplicative cooperative p-value PC for the TF pair i and j by multiplying all of its (PB)m as follows:
![]() | (7) |
In Equations (6) and (7), the more likely the TF cooperativity is, the smaller PC of the TF pair is. Therefore, there are two situations that make PC smaller. One is the larger number L of the TF pairs co-occurring in all target genes, i.e. with a larger number of individual p-values for the TF pair. This situation is the same with the result that the TF pair has a greater possibility of cooperation if binding sites of specific TF pairs co-occur more frequently in target genes (Harbison et al., 2004). Another situation is that the regulatory ability
of the TF pair i and j is so significant at some target genes that the individual p-value (PB)m of the TF pair is small at the target genes. In this way, the multiplicative cooperative p-value PC also becomes small. Therefore, it is reasonable to use the multiplicative cooperative p-value PC to evaluate the TF cooperativity.
2.5 Determination of significant cooperative TF pairs
After calculation of the multiplicative cooperative p-value PC, we have a list of cooperative TF pairs. The list has
pairs with their PC values, respectively, where N represents the total number of TFs. Sorted by the PC values (the smaller PC means the more likely cooperative TF pair), it can be presented as a list in which we can find the most possible cooperative TF pairs to the least possible ones. However, which TF pairs are considered as significant cooperative TF pairs should be determined.
Because the measurement of TF cooperativity depends on the individual p-values PB and the number of the target genes bound by the possible cooperative TF pair L, we determine the threshold of significant cooperative pairs by PB and L as well. The idea is that we determine the threshold of PB and L, respectively, and then calculate the significance threshold as
![]() | (8) |
| 3 RESULTS |
|---|
|
|
|---|
3.1 Analysis of TF cooperativity
We integrated the expression profiles of the yeast cell cycle taken from Spellman et al. (1998) and genome-wide location data of Harbison et al. (2004) to identify the cooperativity of all possible TF interaction pairs among 203 TFs. After constructing and identifying parameters of gene transcriptional regulatory networks of yeast, we calculated the multiplicative cooperative p-value PC of each possible TF interaction pair. Sorted by PC values, all possible TF cooperative pairs are ranked according to their cooperative possibilities. We then calculate the significance threshold PC,Threshold and determine the significant cooperative TF pairs. In this study, the significance threshold PC,Threshold =1021 and we identified 55 significant cooperative TF pairs, which are shown in Table 1. According to these results, the cooperative network of significant cooperative TF pairs is given in Figure 3. Of the results listed, 72.73% are confirmed by evidence in many studies. Most of these TF cooperative pairs confirmed are related to the cell cycle of yeast. It may be because our source data are the yeast cell cycle expression profiles so that most of these TF cooperative pairs found in the list are related to cell cycle. Moreover, the TF cooperative pairs that have not yet been proved but are found by the method we proposed can provide a direction for future experiments. The details of the TF cooperativities we found are discussed below.
|
|
3.2 Cell cycle
3.2.1 G1/S Phase (Swi4-Swi6, Mbp1-Swi6, Mbp1-Swi4, Stb1-Swi6)
The G1 to S phase transition of the eukaryotic cell cycle is a critical point for the coordination of cell cycle progression with cellular growth. SBF and MBF are sequence-specific TFs that activate gene expression to mediate the G1/S transition of the cell cycle in yeast (Iyer et al., 2001). SBF is a protein complex of Swi4 and Swi6, and MBF is a protein complex of Mbp1 and Swi6 (Koch et al., 1993). The related Swi4 and Mbp1 proteins are the DNA-binding components of the respective factors, and Swi6 may have a regulatory function (Dirick et al., 1992; Primig et al., 1992). The cooperativity between Swi4 and Swi6 is included in our results, and so is the cooperativity between Mbp1 and Swi6. In addition, Mbp1 and Swi4 sharing 50% identity in their DNA binding domains are found with the probable cooperativity in our results (Koch et al., 1993).
In the late G1 phase, Stb1 is an important regulator controlling the timing of start transcription that is revealed in the absence of the G1 regulator Cln3 and binds to Swi6 in vivo (Ho et al., 1999). The interaction of Stb1 and Swi6 confirmed by Ho et al. (1999) and Costanzo et al. (2003) is also found in our results.
3.2.2 G2/M Phase (Ndd1-Fkh1, Ndd1-Fkh2, Fkh1-Fkh2, Mcm1-Fkh2, Ndd1-Mcm1)
In the G2/M phase, the important regulator is SFF, which is a larger transcription factor containing Ndd1, Fkh1 and Fkh2 (Koranda et al., 2000; Kumar et al., 2000; Pic et al., 2000; Futcher, 2002). Our results strongly indicate the cooperativity between Ndd1 and Fkh2, which are components of SFF. Another cooperativity between Ndd1 and Fkh1, which are also components of SFF, is found in our results, too. In addition, Fkh1 and Fkh2 are found with cooperativity in our results. Fkh1 and Fkh2 share 72% identical DNA binding domain, and the double mutant of Fkh1 and Fkh2 displays obvious morphological change (Kumar et al., 2000).
SFF is thought to regulate a program of mitotic transcription in conjunction with the transcription factor Mcm1. Moreover, Fkh2, a component of SFF, assembles into a ternary complex with Mcm1 on both the SWI5 and CLB2 cell cycle genes (Koranda et al., 2000; Kumar et al., 2000). The fact that Fkh2 shows cooperativity with Mcm1 is also listed in our result. In addition to Ndd1 and Fkh2, Mcm1 and Fkh2, another cooperative TF pair in the G2/M phase, Ndd1 and Mcm1, is found in our results. For the above cooperative TF pairs found in the G2/M phase, it is not surprising to find experimentally that Mcm1, Fkh2 and Ndd1 also form a complex to regulate the CLB2 gene and other genes (Kumar et al., 2000; Zhu et al., 2000).
3.2.3 M/G1 Phase (Ace2-Swi5)
In the M/G1 phase, we found that the TF pair, Ace2 and Swi5, is cooperative. Ace2 and Swi5 is a pair of TFs of yeast that regulates the expression of many cell cycle-specific genes (Doolin et al., 2001). In recent studies, Ace2 and Swi5 cooperate to induce the expressions of a subset of genes, but the antagonistic interaction between Ace2 and Swi5 has been found (Doolin et al., 2001). With 82% identical DNA binding domains, Ace2 and Swi5 bind to the same DNA sequence (McBride et al., 1999), and it is possible that proteins compete for access to these promoters, but only one activates transcription (Doolin et al., 2001). Therefore, one partner of Swi5 and Ace2 sometimes can have a stronger contribution towards regulation, and the finding of antagonistic interaction of Ace2 and Swi5 is not surprising.
3.3 Mating (Ste12Dig1)
The TF Ste12 is responsible for activating genes in response to MAP kinase cascades controlling mating and filamentous growth. Two inhibitors Dig1 and Dig2 regulate Ste12 negatively (Olson et al., 2000). It was found that Dig1 and Dig2 do not function through redundant mechanisms, but rather inhibit pheromone-responsive transcription through interactions with separate regions of Ste12 (Olson et al., 2000). In our results, we indeed found that Ste12 cooperates with Dig1. However, in the location data of Harbison et al., Dig2 is not listed in the 203 TF. That is why another protein Dig2 was not found to show cooperativity with Ste12 in our results. If the genome-wide location data are comprehensive for more TFs, we believe that our method can predict more cooperative TF pairs, like Dig2 and Ste12.
3.4 Comparison with results of other methods
Pilpel et al. (2001) uncover functional motif combinations in the promoters of Saccharomyces cerevsiae using microarray data. They uncover not only cell cycle-related motifs, but also sporulation and various stress responses. Focusing on cell cycle, they identified only 10 motif pairs. Comparing our results with the cell cycle results from Pilpel et al. (2001), there are only three TF pairs in common (Supplementary Material). This may be because they only use microarray data but not use ChIP-chip data to infer combinatorial motifs.
Banerjee and Zhang (2003) integrated genome-wide location data from Lee et al. (2002) and gene expression data from Cho et al. (1998) to infer cooperativity among transcription factors by expression correlation. Comparing the TF cooperativities we found with Banerjee and Zhang's results show that many cooperative TF pairs confirmed by literature evidences are found in both results even though their study based on different dataset (Supplementary Material). Furthermore, our results indicate more cooperative TF pairs confirmed by literature evidences but not found by Banerjee and Zhang's methods. However, there are still four pairs, namely Arg80Arg81, Ace2Hsf1, Hsf1Skn7 and Hir1Hir2, found in Banerjee and Zhang's results but not in our results. The details can be seen in Supplementary Material.
Tsai et al. (2005) use statistical methods (ANOVA) to identify synergistic pairs of yeast cell cycle TFs. They combined ChIP-chip data from Harbison et al. (2004) and microarray data from Spellman et al. (1998) as we did. Comparing our results with Tsai et al.'s results (confident synergistic TF pairs as stated in their study), we find that many cooperative TF pairs confirmed by literature evidences are found in both results. Once more, our method finds more real cooperative TF pairs than Tsai et al.'s method. The overlap of our results, Tsai et al.'s results and cooperative TF pairs supported by literature evidences is shown in Figure 4. In addition, there are four pairs, Fkh2-Swi6, Fkh1-Mbp1, Dat1-Hap1 and Hap1-Rap1, which are still not confirmed by literature evidences, are also found by both results. These TF pairs provide directions for future experiments. From Figure 4, we also find that there are three pairs, Hir1Hir2, Hir1Hir3 and Hir2Hir3, found in Tsai et al.'s results but not in ours. These three TF pairs could be considered as false negative TF pairs of our method. After checking these three cooperative TF pairs in detail, we find that Hir1 and Hir2 co-occurred to bind only six target genes of the 768 cell cycle genes from the location data of Harbison et al. (2004) at p-value p < 0.0015. For this cooperative TF pair, the number L in Equation (7) is only 6, and that is why its PC is not small enough to be considered significant. The situations of Hir1Hir3 and Hir2Hir3 are the same, thus these three pairs are considered as false negatives in our results. Finally, we calculated the confirmed rates, which are the rates of both results compared with literature evidences, of the TF pairs listed in our findings and Tsai et al.'s results (confident synergistic TF pairs as stated in their study). The confirmed rate of our method, 52.73%, is higher than that of Tsai et al.'s, 28.89%.
|
| 4 DISCUSSION |
|---|
|
|
|---|
In this study, we successfully identified cooperative TF pairs by integrating genome-wide location data and expression profiles. Our method is based on transcriptional regulatory mechanisms of TFs and their cooperativities at binding sites from dynamic system perspective. Unlike others' studies using statistical correlation (Pilpel et al., 2001; Banerjee and Zhang, 2003; Tsai et al., 2005), our method provides the view of dynamic regulatory systems to mimic the transcriptional procedure. The measure of TF cooperativity of TFs is developed by a multiplicative cooperative p-value according to the statistics of estimated regulatory ability of cooperative genes in the dynamic regulatory model. We developed a method to determine significant cooperative TF pairs among all possible TF cooperativities as well. Our results showed that many cooperative TF pairs identified by our method are confirmed by literature evidences. We also indicated several possible TF pairs with cooperativity that are listed in our results but not confirmed yet. These provide new directions for future experiments. Moreover, it was shown that the confirmed rate of our method is higher than other methods using statistical correlation (Pilpel et al., 2001; Banerjee and Zhang, 2003; Tsai et al., 2005).
From the cooperative network of significant TF cooperative pairs shown in Figure 3, we can find that the interaction links among several TFs (Swi4, Swi6, Fkh1, Fkh2, Ndd1 and Mbp1) are more compact than others (compactness means that there are more interaction links with the TFs). All of these TFs have important roles in the yeast cell cycle, so it is likely that there exist TF cooperativities among these TFs. However, some of these TFs may co-bind to many target genes because of their importance for the yeast cell cycle but nevertheless they show no cooperativity among them. In our analysis, this situation will lead to a false positive (the false positive pairs are listed in the Supplementary Material). It may be another possible reason why the interaction links among these important TFs are more compact. We also list the TF cooperative pairs that are confirmed by literature evidences but not shown up in our results, i.e., the false negatives. The false negative rate is 27.27% (Supplementary Material).
In order to confirm the reliability of the proposed method, the shuffling method was used to test the overfitting. In the expression data shuffling case, only 60% of TF cooperativities are found, which may be due to the correct ChIP-chip data and the uncertainty reduction by p-value detection method. In the location data shuffling, no cooperativities of TFs are found due to the same significance threshold PC,Threshold as the original result without shuffling, i.e. 1021. When comparing the results, we found that there is no overfitting by the proposed method (Supplementary Material).
With the microarray data of the yeast cell cycle of Spellman et al. (1998) and the ChIP-chip data of Harbison et al. (2004), it is shown that our method can successfully predict the cooperative TF pairs of the yeast cell cycle. If the gene expression profiles and genome-wide location data of other process for yeast or other species, such as the heat shock stress, are available, our method could be employed to identify the cooperative TF pairs of different stress conditions or other species (Supplementary Material). Furthermore, our method can be extended to identify the cooperativities among more than two TFs by adding extra terms generated by multiplications of their expression profiles through the sigmoid function. Moreover, we can integrate protein-protein interaction data to determine interactions not only among TFs but also among TFs and other proteins in our model (Supplementary Material). That is, by modifying the method we proposed, we can construct an integrated cellular network of transcription regulation and proteinprotein interaction.
| Acknowledgments |
|---|
The authors thank Professor Wen-Ping Hsieh for her valuable discussions and comments.
Conflict of Interest: none declared.
| FOOTNOTES |
|---|
The authors wish it to be known that, in their opinion, the first two authors should be regarded as joint First Authors. Associate Editor: Martin Bishop
Received on May 8, 2006; revised on June 30, 2006; accepted on July 6, 2006
| REFERENCES |
|---|
|
|
|---|
Banerjee, N. and Zhang, M.Q. (2003) Identifying cooperativity among transcription factors controlling the cell cycle in yeast. Nucleic Acids Res, . 31, 70247031
Chang, W.C., et al. (2005) Quantitative inference of dynamic regulatory pathways via microarray data. BMC Bioinformatics, 6, 44[CrossRef][Medline].
Chen, H.C., et al. (2004) Quantitative characterization of the transcriptional regulatory network in the yeast cell cycle. Bioinformatics, 20, 19141927
Cho, R.J., et al. (1998) A genome-wide transcriptional analysis of the mitotic cell cycle. Mol. Cell, 2, 6573[CrossRef][ISI][Medline].
Costanzo, M., et al. (2003) G1 transcription factors are differentially regulated in Saccharomyces cerevisiae by the Swi6-binding protein Stb1. Mol. Cell Biol, . 23, 50645077
Davidson, E.H., et al. (2003) Regulatory gene networks and the properties of the developmental process. Proc. Natl. Acad. Sci. USA, 100, 14751480
Dirick, L., et al. (1992) A central role for SWI6 in modulating cell cycle start-specific transcription in yeast. Nature, 357, 50813[CrossRef][Medline].
Doolin, M.T., et al. (2001) Overlapping and distinct roles of the duplicated yeast transcription factors Ace2p and Swi5p. Mol. Microbiol, . 40, 422432[CrossRef][ISI][Medline].
Faires, J.D. and Burden, R. Numerical Methods, (1998) 2nd edn Brooks/Cole Publishing Company, Pacific Grove, CA, pp. 8791.
Futcher, B. (2002) Transcriptional regulatory networks and the yeast cell cycle. Curr. Opin. Cell Biol, . 14, 676683[CrossRef][ISI][Medline].
Gasch, A.P., et al. (2000) Genomic expression programs in the response of yeast cells to environmental changes. Mol. Biol. Cell, 11, 42414257
Goldbeter, A. and Koshland, D.E., Jr. (1981) An amplified sensitivity arising from covalent modification in biological systems. Proc. Natl. Acad. Sci. USA, 78, 68406844
Harbison, C.T., et al. (2004) Transcriptional regulatory code of a eukaryotic genome. Nature, 431, 99104[CrossRef][Medline].
Hasty, J., et al. (2002) Engineered gene circuits. Nature, 420, 224230[CrossRef][Medline].
Ho, Y., et al. (1999) Regulation of transcription at the Saccharomyces cerevisiae start transition by Stb1, a Swi6-binding protein. Mol. Cell Biol, . 19, 52675278
Hood, L. (2003) Systems biology: integrating technology, biology, and computation. Mech. Ageing. Dev, . 124, 916[CrossRef][ISI][Medline].
Iyer, V.R., et al. (2001) Genomic binding sites of the yeast cell-cycle transcription factors SBF and MBF. Nature, 409, 533538[CrossRef][Medline].
Kato, M., et al. (2004) Identifying combinatorial regulation of transcription factors and binding motifs. Genome Biol, . 5, R56[CrossRef][Medline].
Koch, C., et al. (1993) A role for the transcription factors Mbp1 and Swi4 in progression from G1 to S phase. Science, 261, 15511557
Koranda, M., et al. (2000) Forkhead-like transcription factors recruit Ndd1 to the chromatin of G2/M-specific promoters. Nature, 406, 9498[CrossRef][Medline].
Kumar, R., et al. (2000) Forkhead transcription factors, Fkh1p and Fkh2p, collaborate with Mcm1p to control transcription required for M-phase. Curr. Biol, . 10, 896906[CrossRef][ISI][Medline].
Lee, T.I., et al. (2002) Transcriptional regulatory networks in Saccharomyces cerevisiae. Science, 298, 799804
Lin, L.H., et al. (2005) Dynamic modeling of cis-regulatory circuits and gene expression prediction via cross-gene identification. BMC Bioinformatics, 6, 258[CrossRef][Medline].
Manke, T., et al. (2003) Correlating proteinDNA and proteinprotein interaction networks. J. Mol. Biol, . 333, 7585[CrossRef][ISI][Medline].
McBride, H.J., et al. (1999) Distinct regions of the Swi5 and Ace2 transcription factors are required for specific gene activation. J. Biol. Chem, . 274, 2102921036
Mestl, T., et al. (1995) A mathematical framework for describing and analyzing gene regulatory networks. J. Theor. Biol, . 176, 291300[CrossRef][ISI][Medline].
Olson, K.A., et al. (2000) Two regulators of Ste12p inhibit pheromone-responsive transcription by separate mechanisms. Mol. Cell Biol, . 20, 41994209
Pic, A., et al. (2000) The forkhead protein Fkh2 is a component of the yeast cell cycle transcription factor SFF. EMBO J, . 19, 37503761[CrossRef][ISI][Medline].
Pilpel, Y., et al. (2001) Identifying regulatory networks by combinatorial analysis of promoter elements. Nat. Genet, . 29, 153159[CrossRef][ISI][Medline].
Primig, M., et al. (1992) Anatomy of a transcription factor important for the start of the cell cycle in Saccharomyces cerevisiae. Nature, 358, 593597[CrossRef][Medline].
Simon, I., et al. (2001) Serial regulation of transcriptional regulators in the yeast cell cycle. Cell, 106, 697708[CrossRef][ISI][Medline].
Spellman, P.T., et al. (1998) Comprehensive identification of cell cycle-regulated genes of the yeast Saccharomyces cererisiae by microarray hybridization. Mol. Biol. Cell, 9, 32733297
Tegner, J., et al. (2003) Reverse engineering gene networks: integrating genetic perturbations with dynamical modeling. Proc. Natl. Acad. Sci. USA, 100, 59445949
Tsai, H.K., et al. (2005) Statistical methods for identifying yeast cell cycle transcription factors. Proc. Natl. Acad. Sci. USA, 102, 1353213537
Wagner, A. (1999) Genes regulated cooperatively by one or more transcription factors and their identification in whole eukaryotic genomes. Bioinformatics, 15, 776784
Zhu, G., et al. (2000) Two yeast forkhead genes regulate the cell cycle and pseudohyphal growth. Nature, 406, 9094[CrossRef][Medline].
This article has been cited by other articles:
![]() |
D. Datta and H. Zhao Statistical methods to infer cooperative binding among transcription factors in Saccharomyces cerevisiae Bioinformatics, February 15, 2008; 24(4): 545 - 552. [Abstract] [Full Text] [PDF] |
||||
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||








. The pdf
for the m-th target gene bound by TF pair i and j. The individual p-value
of the m-th target gene is the sum of heights of corresponding bars in practice.



