

Bioinformatics Advance Access originally published online on November 30, 2004
Bioinformatics 2005 21(8):1295-1300; doi:10.1093/bioinformatics/bti172

© The Author 2004. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions{at}oupjournals.org

Synergy of human Pol II core promoter elements revealed by statistical sequence analysis

Naum I. Gershenzon and Ilya P. Ioshikhes *

Department of Biomedical Informatics, The Ohio State University 3184 Graves Hall, 333 W. 10th Avenue, Columbus, OH 43210, USA

*To whom correspondence should be addressed.


    Abstract
 

Motivation: The subject of our paper is bioinformatics analysis of the distinguishing features of human promoter DNA sequences, in particular of synergetic combinations of core promoter elements therein. We suppose that specific scenarios of transcription initiation are essentially related to various particular implementations of the interaction of basal transcription machinery with promoter DNA, depending on the presence and mutual positioning of core promoter elements.

Results: In addition to the combinations of core promoter elements previously experimentally confirmed [TATA box and Initiator (Inr), Downstream Promoter Element (DPE) and Inr, and TFIIB recognition element (BRE) and TATA box] we propose other alternate synergetic combinations: BRE and Inr, BRE and DPE, and TATA and DPE with respective models. The suggestion is based on a high statistical significance of the alternate combinations in promoters, comparable with the significance of the known combinations. We also present arguments that the BRE element is statistically more important than previously thought, and suggest possible mechanisms of action of the core elements in the promoters with multiple transcription start sites.

Contact: ioschikhes-1{at}medctr.osu.edu

Supplementary information: Supplementary information is available at http://bmi.osu.edu/~ilya/synergy/Gershenzon_SuppMat-R.pdf


    INTRODUCTION
 
Gene transcription is a multi-step, multi-level process involving many transcription factors. A fundamental step of the transcription initiation is an interaction of the basal transcription machinery [also named pre-initiation complex (PIC)] with a core promoter area of DNA spanning about ± 40 bp around the transcription start site (TSS) (compare Smale and Kadonaga, 2003; Butler and Kadonaga, 2002; Zhang, 1998; Lewis et al., 2000). So far a few core-promoter elements have been found to be a target for the basal machinery. The most common elements are TATA box, Initiator (Inr), Downstream Promoter Element (DPE), and TFIIB recognition element (BRE) (Smale and Kadonaga, 2003).

There are a few general TFs (TFIIA, B, D, E, F and H) necessary for successful initiation of transcription (Roeder, 1996; Orphanides et al., 1996; Nikolov and Burley, 1997; Hampsey, 1998). Despite a diversity of scenarios, TFIID always plays the central role in this process (Burley and Roeder, 1996; Burke and Kadonaga, 1997), acting in cooperation (synergy) with the core promoter elements and/or specific TFs (Nikolov and Burley, 1997; Hampsey, 1998; Lemon and Tjian, 2000). TFIID consists of TATA Binding Protein (TBP) and at least 12 TBP-associated factors (TAFs) (Green, 2000). In the TATA box-containing (TATA+) promoters, TBP binding starts the process of PIC formation. In the absence of the TATA box (TATA-less promoters), TAFs bind to DNA and/or to other TFs in order to involve TFIID (and TBP) in PIC (Burke and Kadonaga, 1997; Zenzie-Gregory et al., 1993; Martinez et al., 1995; Tsai and Sigler, 2000). Several combinations of the core-promoter elements were found to be synergistically advantageous for transcription initiation: TATA box and Inr (O'Shea-Greenfield and Smale, 1992; Emami et al., 1997), DPE and Inr (Burke and Kadonaga, 1997; Zhou and Chiang, 2001), and BRE and TATA box (Tsai and Sigler, 2000; Lagrange et al., 1998). In the present study, we discovered the high statistical significance of other combinations of the core-promoter elements, suggesting the existence of additional synergetic combinations: BRE and Inr, TATA box and DPE, and BRE and DPE.

The statistics of the core elements for human promoters still remain obscure even for the most studied elements like TATA box and Inr. TATA-containing promoters were historically discovered first and the TATA box was thought to be the universal promoter element (Butler and Kadonaga, 2002). The TATA-less promoters were found several years later, and the estimated percentage of TATA-containing promoters in the total number of studied promoters has decreased steadily since: from 78% (Bucher, 1990) to 64% (Babenko et al., 1999) to 32% (Suzuki et al., 2001). The estimates for Inr-containing promoters are also inconsistent: 60% (Bucher, 1990) and 85% (Suzuki et al., 2001). The DPE was mainly studied in Drosophila (Kutach and Kadonaga, 2000). It was shown that DPE is conserved from Drosophila to human (Burke and Kadonaga, 1997); however, so far only one human gene with DPE has been experimentally studied (Zhou and Chiang, 2001). Few human genes with functional BRE have actually been investigated (Lagrange et al., 1998; Tsai and Sigler, 2000), so the general role of the BRE element is still in question (Smale and Kadonaga, 2003). In this paper, we give statistics of the aforementioned core-promoter elements and their synergetic combinations, both those previously described and those suggested herein. These statistics are based only on an examination of the presence of the element motifs defined by respective position weight matrices or consensus sequences. The actual functionality of each individual element in each individual gene is beyond the scope of this article.

Despite the complexity and diversity of the biochemical interactions between the basal machinery and the core-promoter sequence, these interactions can be considered as one between the different parts of PIC and DNA through the core promoter elements (Smale and Kadonaga, 2003). The hypothesis behind our research is that specific scenarios of the transcription initiation are essentially various particular implementations of the general PIC–DNA interaction, depending on the presence and mutual positioning of different core promoter elements.

Based on statistical analysis we will examine the following particular questions in order to check this hypothesis:

  1. How many known human promoters follow known scenarios of the interaction of the basal machinery and DNA? In particular, the transcription of how many promoters is guided by the TATA box and/or by any of the known synergetic combinations?
  2. Can statistical analysis suggest new scenarios?
  3. Do all known promoters contain at least one known core-promoter element at a position where it is able to function?
  4. Do the four known core-promoter elements play any role in transcription of genes with multiple start sites (MSS) promoters? In particular, is there any correlation between the multiple TSS positions and positions of the core-promoter elements?
  5. What is the relationship between core-promoter elements and CpG islands?


    DATA AND METHODS
 
A total of 1871 non-redundant human promoter sequences from the Eukaryotic Promoter Database (EPD) release 75 (http://www.epd.isb-sib.ch) and 8793 human promoters from the Database of Transcriptional Start Sites (DBTSS) (http://dbtss.hgc.jp/index.html) were used for statistical analyses as two separate datasets. We also constructed a small test set of 27 human promoters with MSS (see Supplementary Material 1). This set was utilized to analyze the statistics of core-promoter elements in MSS promoters. Each promoter was considered several times, one time for each known TSS, so the total number of sequences in this set is 107. The software package, Promoter Classifier (Gershenzon and Ioshikhes, 2005) (available at http://www.bmi.osu.edu/~ilya/promoter_classifier/) was used for statistical analysis.

We exploit the idea that due to evolution, the motifs necessary for promoter regulation have been preserved in a promoter area and, therefore, their occurrence frequencies there are far from random. So the statistical analysis of averaged positional distribution of the element's occurrence frequency (OFi = ni/Ns, where ni is the number of promoters containing a considered element centered at position i in Ns aligned promoter sequences) is the main method of our investigation. To find the element's occurrence frequency distribution we scan each promoter sequence at each position by respective weight matrix or motif consensus. We examine the presence of the core-promoter elements and relations between the elements in different subsets of human promoters. To implement this strategy we divided all three datasets into subsets (the respective subsets for EPD are available in the Sequence Supplementary Material). To extract a subset of promoter sequences containing the TATA box or Inr element at their functional positions, the positional weight matrices (PWM) with optimal cut-off values (Table 1) were applied (Bucher, 1990). We define the TATA or Inr element as being present at a certain position if the PWM score at this position exceeds the cut-off value, and define the element to be absent at this position otherwise. Since there are no matrices for DPE and BRE, we matched 5 out of 5 letters and 6 out of 7 for the DPE and BRE consensuses (Smale and Kadonaga, 2003), respectively.
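The consensus-based part of this scan can be sketched as follows. This is a minimal illustration, not the Promoter Classifier code: the function names are hypothetical, the IUPAC strings shown (`SSRCGCC` for BRE, `RGWYV` for DPE) are assumed renderings of the consensuses from Smale and Kadonaga (2003), and positions are 0-based start offsets rather than element centers relative to the TSS.

```python
# IUPAC nucleotide codes mapped to the set of bases each one allows.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}

BRE_CONSENSUS = "SSRCGCC"  # assumed IUPAC form; matched 6 out of 7
DPE_CONSENSUS = "RGWYV"    # assumed IUPAC form; matched 5 out of 5


def count_matches(site, consensus):
    """Number of positions at which `site` agrees with the consensus."""
    return sum(base in IUPAC[code] for base, code in zip(site, consensus))


def hit_positions(seq, consensus, min_matches):
    """0-based start offsets where at least `min_matches` letters match."""
    k = len(consensus)
    return [i for i in range(len(seq) - k + 1)
            if count_matches(seq[i:i + k], consensus) >= min_matches]


def occurrence_frequency(promoters, consensus, min_matches):
    """OF_i = n_i / N_s for every start offset i of the aligned promoters."""
    length = min(len(p) for p in promoters)
    counts = [0] * (length - len(consensus) + 1)
    for p in promoters:
        for i in hit_positions(p[:length], consensus, min_matches):
            counts[i] += 1
    return [n / len(promoters) for n in counts]


# Toy aligned "promoters" (made up for illustration only).
toy = ["AAGGCGCCAA", "TTCCGCGCCT", "AAAAAAAAAA"]
print(occurrence_frequency(toy, BRE_CONSENSUS, 6))
```

A PWM-based scan for TATA and Inr would follow the same pattern, with `count_matches` replaced by a matrix score compared against the cut-off value.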


 
Table 1 The parameters of core-promoter elements

 
We used the same parameters to extract subsets containing known synergetic combinations, yet the respective elements had to be placed at their experimentally defined synergetic distance from one another. The distances between the elements in the remaining combinations were chosen based on the positions of the respective elements in the known combinations.

To estimate the statistical significance of the occurrence frequency of an element or synergetic combination in the respective functional window, we calculated a statistical significance parameter, dS, measured in units of standard deviation: dS = (Nin – Nout)/√Nout, where Nin is the number of occurrences of an element or combination inside its functional window and Nout is the number of occurrences of that element or combination in the average interval of the same length outside the functional window. Since MSS promoters may contain core elements in several positions, to calculate statistical significance for the MSS dataset we use the value Nout from the EPD dataset, recalculated accordingly [Nout(MSS) = Nout(EPD) × 107/1871].
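A minimal sketch of this calculation is given below. It takes the standard deviation of the background count to be the square root of Nout (a Poisson-type assumption, which reproduces the deviations quoted in RESULTS for the TATA+ comparison). The function names and the example counts are hypothetical.

```python
import math


def significance(n_in, n_out):
    """dS in units of standard deviation, assuming StD = sqrt(n_out)."""
    return (n_in - n_out) / math.sqrt(n_out)


def rescale_background(n_out_epd, n_mss=107, n_epd=1871):
    """Nout(MSS) = Nout(EPD) * 107/1871, as described in the text."""
    return n_out_epd * n_mss / n_epd


# Made-up counts: 125 hits inside the window, 100 in an average outside interval.
print(significance(125, 100))      # 2.5
print(rescale_background(1871))    # 107.0
```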

To examine the correlation of the core elements and their combinations with CpG islands, we divided the datasets into subsets with a CpG island spanning the TSS (CpG+) and without one (CpG-less). The commonly used parameters of a CpG island (Gardiner-Garden and Frommer, 1987) were applied: (i) the length is over 200 bp; (ii) over 50% of nucleotides are G or C; and (iii) the ratio of observed/expected CG dinucleotides [NCG × L/(NC × NG)] exceeds 0.6. Here L is the length of the window considered; NCG is the number of CG dinucleotides; and NC and NG are the numbers of nucleotides C and G, respectively, in that window. We scan over the sequences with window L = 100, starting from position –200 bp and ending at position +100 for the 5' end of the window. The promoter is considered to have a CpG island if the combined length of overlapping windows which satisfy criteria (ii) and (iii) exceeds 200 bp. Since the 3' ends of EPD sequences are defined up to +100 bp only, for them the window size L starting from the position +1 was shrunk accordingly.
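The three criteria can be sketched in code as follows. This is an illustration under stated assumptions, not the authors' implementation: the "combined length of overlapping windows" is interpreted here as the longest contiguous stretch covered by passing windows, the shrinking of the window near the 3' end of EPD sequences is not modeled, and the function names are hypothetical.

```python
def window_passes(window, min_gc=0.5, min_ratio=0.6):
    """Criteria (ii) and (iii) for one window of uppercase DNA."""
    n_c, n_g = window.count("C"), window.count("G")
    if n_c == 0 or n_g == 0:
        return False
    n_cg = window.count("CG")
    gc_fraction = (n_c + n_g) / len(window)
    obs_exp = n_cg * len(window) / (n_c * n_g)
    return gc_fraction > min_gc and obs_exp > min_ratio


def has_cpg_island(seq, window=100, min_length=200):
    """Criterion (i): passing windows must jointly cover more than min_length bp."""
    covered = [False] * len(seq)
    for start in range(len(seq) - window + 1):
        if window_passes(seq[start:start + window]):
            for j in range(start, start + window):
                covered[j] = True
    longest = run = 0
    for flag in covered:
        run = run + 1 if flag else 0
        longest = max(longest, run)
    return longest > min_length


print(has_cpg_island("CG" * 150))  # True: 300 bp, all windows pass
print(has_cpg_island("AT" * 150))  # False: no C or G at all
```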


    RESULTS
 
To define an interval (window) for a functional position of a given element we considered the distribution of the element's occurrence frequency along the promoters. For both databases, we found unambiguous maxima for the occurrence frequencies of the centers of the TATA and Inr elements at positions –28 and +1, respectively (see Figs 1 and 2 in Supplementary Material 2), which is consistent with the known functional positions of these elements. The occurrence frequency of the TATA box is essentially larger in the window (–33 to –23 bp) than in the surrounding area. We consider this window as functional for the TATA box. For the Inr element the functional window is (–5 to +6 bp) since the TSS position in EPD is defined with the accuracy ± 5 bp (Cavin Périer et al., 1998). Since DPE works in cooperation with Inr if positioned 27 bp downstream from it (Burke and Kadonaga, 1997), we applied the window (28 – 5) to (28 + 5) bp for the DPE. BRE is shifted from the TATA box in the 5' direction by a distance equal to the BRE length plus the 2 bp between the center of the TATA element and its first ‘T’ (Tsai and Sigler, 2000). So the functional window for BRE is (–33 –7 –2 to –23 –7 –2 bp), i.e. –42 to –32 bp. Note that the occurrence frequencies of all four core-promoter elements at their functional windows are essentially larger than in the rest of the promoter area. Indeed, the statistical significance ranges from 5.0 (11.0) StD for DPE up to 52.0 (46.1) StD for the TATA box (Table 1, the last two columns). Hereafter the first number refers to EPD and the following number (in parentheses) refers to DBTSS. While the high levels of statistical significance for the TATA box and Inr elements are not surprising, the high statistical significance of the DPE and BRE elements at their expected functional positions has never been revealed before for human genes.
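The window arithmetic above can be written out explicitly. The sketch below only restates the numbers given in the text; the dictionary and function names are hypothetical, window bounds are assumed to be inclusive, and positions are treated as plain integers relative to the TSS.

```python
TATA_WINDOW = (-33, -23)
BRE_LENGTH = 7              # length of the BRE consensus
TATA_CENTER_TO_FIRST_T = 2  # bp between the TATA center and its first 'T'

shift = BRE_LENGTH + TATA_CENTER_TO_FIRST_T

FUNCTIONAL_WINDOWS = {
    "TATA": TATA_WINDOW,
    "Inr": (-5, 6),
    "DPE": (28 - 5, 28 + 5),
    "BRE": (TATA_WINDOW[0] - shift, TATA_WINDOW[1] - shift),
}


def in_functional_window(element, center):
    """True if an element centered at `center` lies in its functional window."""
    low, high = FUNCTIONAL_WINDOWS[element]
    return low <= center <= high


print(FUNCTIONAL_WINDOWS["BRE"])          # (-42, -32)
print(in_functional_window("TATA", -28))  # True
```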

Tables 2 and 3 present the percentages of the core-promoter elements in different EPD and DBTSS subsets. According to these data, half of the promoters, 49.0% (48.4%), have the Inr element at a functional position, only 21.8% (10.4%) have a TATA box, 24.6% (24.6%) contain DPE, and 24.5% (25.5%) have BRE.


 
Table 2 The distribution of elements in different subsets of promoters from the EPD database

 

 
Table 3 The distribution of elements in different subsets of DBTSS database

 
As we see, the percentage of the TATA+ promoters is much lower than even the minimal previous estimate (32%, Suzuki et al., 2001). Comparison of the absolute number of the TATA+ promoters (Tables 2 and 3) with the number expected from Suzuki's estimate for the EPD and DBTSS datasets (599 and 2814 sequences, respectively) gives a difference of 7.8 (35.8) StD below the estimate. The TATA+ promoters have a larger probability of having an Inr element than TATA-less promoters. Indeed, 61.9% (57.9%) of TATA+ promoters have Inr compared with 45.4% (47.3%) of TATA-less promoters. The presence of DPE is virtually irrelevant to the presence of the TATA box or Inr elements (Tables 2 and 3), in contrast to Drosophila (Kutach and Kadonaga, 2000). The BRE-containing promoters ‘prefer’ to be TATA-less promoters: 28.1% (26.9%) of TATA-less promoters contain BRE versus 11.8% (13.8%) of TATA+ promoters. The majority of the promoters, 77.3% (74.3%), have at least one of four core-promoter elements at its functional position and 41.8% (44.1%) have only one element including TATA – 5.5% (2.9%), Inr – 20.1% (23.0%), DPE – 6.6% (8.4%), and BRE – 9.6% (9.8%). The list of promoters from EPD with no core-promoter elements at a functional position may be found in the Sequence Supplementary Material.
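The 7.8 (35.8) StD figure can be checked with a short calculation. This sketch assumes the deviation is measured in units of the square root of the expected count, and it derives the observed TATA+ counts from the quoted percentages (21.8% and 10.4%) because the table values are not reproduced here; the variable names are hypothetical.

```python
import math

SUZUKI_FRACTION = 0.32  # TATA+ fraction estimated by Suzuki et al. (2001)

# dataset name -> (number of promoters, observed TATA+ fraction from the text)
datasets = {"EPD": (1871, 0.218), "DBTSS": (8793, 0.104)}


def deviations_below(expected, observed):
    """How many StD (taken as sqrt(expected)) the observed count falls short."""
    return (expected - observed) / math.sqrt(expected)


results = {}
for name, (n_seq, fraction) in datasets.items():
    expected = round(SUZUKI_FRACTION * n_seq)  # 599 for EPD, 2814 for DBTSS
    observed = round(fraction * n_seq)         # derived, not taken from the tables
    results[name] = round(deviations_below(expected, observed), 1)

print(results)  # {'EPD': 7.8, 'DBTSS': 35.8}
```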

Table 4 shows the distances between the elements, percentages, actual numbers and statistical significance of promoters having combinations of the elements in the entire EPD and DBTSS datasets. We also calculated the percentages of all combinations (the last sub-column in columns 4 and 5) with the distances being the same as in column 3 plus one in both directions. The results clearly indicate the high statistical significance of the occurrence frequencies of all considered combinations in both promoter databases (the third sub-column in columns 4 and 5) (see also Figs 1–6 of Supplementary Material 3). The data are consistent between the two databases. Widening the range of distances between elements increases the percentages of the promoters containing the respective combinations while preserving the ratios between them.


 
Table 4 The statistical parameters of combinations of core elements

 
The presence of a CpG island essentially affects the promoter contents. The distributions of elements and their synergetic combinations for the CpG+ and CpG-less subsets of promoters are presented in Table 5. As expected, the percentage of TATA+ promoters in the CpG-less subset is much higher than in CpG+ (Table 5, first line). However, still 13.3% (6.9%) of promoters with a CpG island have a TATA box. The percentage of Inr+ promoters in the CpG-less subset is also higher than in CpG+ (second line). The presence of DPE is slightly more probable in the absence of CpG islands (third line). Note that the statistical significances of the occurrence frequency of the TATA box, Inr and DPE elements are high for both CpG+ and CpG-less subsets for both databases. Thus, our statistics do not confirm the widely held opinion that ‘CpG islands usually lack consensus or near-consensus TATA boxes, DPE elements, or Inr elements’ (Smale and Kadonaga, 2003). The BRE is the only element whose presence is much more probable in the CpG+ promoters [30.9% (33.4%) in CpG+ versus 9.7% (7.7%) in CpG-less]. The statistical significance of BRE is high for the CpG+ subset and non-substantial for the CpG-less subset (line 4). All six combinations of elements (with the exception of combinations with BRE in the CpG-less subset) have a high level of statistical significance in both subsets for both databases (lines 5–10).


 
Table 5 The percentage (%), absolute number (N) and statistical significance (dS) of elements and their synergetic combinations in CpG+ and CpG-less promoters calculated for EPD and DBTSS promoter databases (respective P-values are less than 0.0001 if dS ≥ 3.8StD)

 
We found that 83 of 107 MSS promoters (i.e. 76.9%) contain at least one core-promoter element in the functional position relative to the TSS. This percentage is practically the same as for all promoters from both datasets. The statistical significance of the presence of any one of the four elements in the functional position is comparatively high for a relatively small dataset: dS = 3.5StD, P = 0.0005. Remarkably, the portion of MSS promoters containing BRE (29.6%) is larger than on average in the EPD/DBTSS datasets. Since the dS value is roughly proportional to the square root of the number of sequences, one may expect a respective decrease of the statistical significance of every particular element on a small dataset. For example, one would expect the statistical significance of every element in the MSS promoters to be approximately 4.2 [√(1871/107)] times lower than for the EPD database (Table 1). BRE is the only element whose statistical significance exceeds the expectation, reaching dS = 3.4StD, P = 0.0007 in the MSS TATA-less promoters. Thus the presence of the BRE element in the CpG+ and MSS promoters is comparable with the presence of the TATA box in the CpG-less promoters.
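The factor of 4.2 follows from the dataset sizes: if the counts inside and outside the functional window both grow in proportion to the number of sequences, a significance measured in units of the square root of the background count grows as the square root of that number. A small sketch under this assumption (the function name is hypothetical, and the 52.0 StD value is the EPD figure for the TATA box quoted earlier):

```python
import math

N_EPD, N_MSS = 1871, 107

scaling_factor = math.sqrt(N_EPD / N_MSS)
print(round(scaling_factor, 1))  # 4.2


def expected_ds_on_subset(ds_full, n_full, n_subset):
    """Expected dS on a smaller dataset if dS scales as sqrt(number of sequences)."""
    return ds_full / math.sqrt(n_full / n_subset)


# Example: the TATA box reaches 52.0 StD on EPD; the value expected on 107 sequences.
print(round(expected_ds_on_subset(52.0, N_EPD, N_MSS), 1))
```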


    DISCUSSION
 
The TATA box at position –26 to –30 and the Inr element around the TSS enable successful transcription initiation (O'Shea-Greenfield and Smale, 1992). In this scenario, TFIID presumably binds to DNA through both the TATA box and Inr elements (see the schematic representation in Fig. 1A). Only 9.4% (4.3%) of the promoters (Table 4) contain TATA and Inr at the synergetic distance. How does the transcription machinery work in the rest of the promoters?



 
Fig. 1 Illustration of different possible scenarios of interaction between general TFs (TFIID and TFIIB) and core-promoter elements. The lower bar on each picture represents promoter area of DNA. TSS is placed at position +1. The bold black lines indicate interaction between the TFs or between the TFs and binding elements.

 
The DPE element was found to be a target for TFIID and, in some cases, leads the transcription initiation in cooperation with the Inr element (Burke and Kadonaga, 1997) (Fig. 1B). Some of the TAFs [TAFII55 was found to be one of them (Zhou and Chiang, 2001)] bind to the DPE motif and attract TFIID to DNA.

TFIIB plays a central role in TSS selection as well as in the PIC assembly connecting TFIID and Pol II (Hawkes and Roberts, 1999; Fairley et al., 2002). In the presence of BRE, TFIIB binds to DNA immediately upstream of the TATA box and to TFIID to direct transcription (Tsai and Sigler, 2000; Lagrange et al., 1998) or to repress it (Evans et al., 2001) (Fig. 1C).

Note the common features of the aforementioned combinations: (1) all of them involve TFIID, and TBP binds to DNA regardless of the presence/absence of the TATA box; (2) TFIID covers the TSS area; (3) the distance from the TSS to the edge of the complex is approximately the same (~30–40 bp). Combinations BRE_Inr, BRE_DPE and TATA_DPE also satisfy these requirements. These combinations are present in a number of promoters comparable with that of the three previous combinations, with comparable statistical significance (Table 1). They may therefore also be considered as possible synergetic combinations of core-promoter elements (Fig. 1D–F).

The following arguments show the possibility of synergy between the BRE and Inr elements. An essential part of the TATA-less promoters [28.1% (26.9%)] contains BRE (Tables 2 and 3). There is experimental evidence that TFIIB may recognize BRE directly, not necessarily through interaction with TBP (Lagrange et al., 1998). As in the TATA+ promoters, TFIIB can bind to the BRE motif of DNA and to some TAFs of the TFIID complex (Fig. 1D), attracting TFIID to DNA. It was found that non-sequence-specific bound TBP (i.e. bound not to the TATA box element) is also active in assembling PIC (Coleman and Pugh, 1995); so as in the Inr_DPE promoters, TBP could bind to the DNA upstream of TSS. The interaction of TFIID and Inr may create a stable complex as in the TATA_Inr case. The percentage of combination BRE_Inr is comparable with the TATA_Inr combination and the statistical significance of the former combination is high: 9.6 (8.8) StD (Table 4). This observation partially supports the statement that ‘IIB–BRE interaction will play a role, possibly a dominant role, in preinitiation complex assembly and transcription initiation at TATA-less promoters’ (Lagrange et al., 1998).

The same arguments also work for the BRE_DPE combination. Indeed, the subset TATA-less_Inr-less contains much more BRE [31.2% (27.5%)] than the subset TATA+Inr+ [12.6% (14.1%)]. The statistical significance of this combination is also high: 8.0 (19.3) StD (Table 4). In this case TFIIB binds to both the TFIID and the BRE motif, and TFIID through TAFs binds to the DPE motif (Fig. 1E).

Finally, the combination TATA_DPE [Fig. 1F, statistical significance 25.5 (19.9) StD] also may work in the framework described above: TFIID binds to the TATA box through TBP, and another part of TFIID binds to DPE. Of course, in such promoters the TATA box may be strong enough to start transcription (at least in vitro) alone (Burke and Kadonaga, 1997). Hypothetically, in vivo, when many subtle factors are essential for transcription regulation, the TATA and DPE elements placed at their functional positions could work synergistically.

As we have already mentioned, the majority of promoters have at least one core-promoter element at a functional position. These elements can work as an anchor for the basal machinery. In many cases, the presence of a synergetic combination of two elements, which is much stronger than a single element, dictates the position of the TSS. In other cases, the position of one core element plus the position of binding sites of non-general transcription factors, like Sp1, which interacts with both DNA and PIC, define the position of the TSS (Liao et al., 1994). In any case, most likely the presence of any core element is beneficial for transcription initiation. Usually the promoters with strong synergetic combinations have a single start site (SSS). If there are no such combinations, as in TATA-less_Inr-less promoters, the presence of core elements could possibly initiate multiple weak TSSs in a so-called initiation window of MSS promoters (Lin et al., 2001). This is consistent with the suggestion that the TSS positions in MSS promoters are defined in part by the positions of the core elements. (See Supplementary Material 4 for an example of an MSS sequence with several core-promoter elements at functional positions.)

In order to minimize possible database biases we used two different promoter databases, EPD and DBTSS. Comparisons show that both databases give, in general, consistent results (Tables 1–5). The only visible difference is between TATA+ promoters: 21.8% for EPD and 10.4% for DBTSS. This discrepancy may be explained by the difference in database creation and the fivefold difference in volume. The EPD database is a collection of experimentally defined promoters (Cavin Périer et al., 1998). The DBTSS promoters were identified by the mRNA start sites determined by large-scale sequencing of cDNA libraries constructed by the ‘oligo-capping’ method (Suzuki et al., 2001). So the percentage of the TATA+ promoters in the EPD database is higher since the TATA-containing promoters are more accessible for experimental analysis by standard start-site mapping techniques and hence were more likely to be discovered. We have already mentioned that the maximal occurrence frequency of the Inr element is placed at position +1. The frequencies at the positions of the nearest neighbors (–1, +1) bp are approximately the same as an average occurrence frequency. This pattern is true for both the CpG+ and CpG-less subsets. This means that for both databases in the majority of (at least) Inr+ promoters the TSS position was defined with high accuracy.

The most important conclusions of this study are: (1) The portion of the TATA+ promoters is just 10 to 20% of all known human promoters. (2) The statistical significances of the occurrence frequency of the DPE and BRE elements at their experimentally defined functional positions are high, indicating that a considerable number of human genes use these elements for transcription. (3) The combinations of the core-promoter elements such as BRE and Inr, TATA and DPE, and BRE and DPE are statistically as important as known synergetic combinations such as TATA_Inr, Inr_DPE and TATA_BRE, suggesting that the former combinations may also work synergistically. (4) The high percentage and statistical significance of MSS promoters having core-promoter elements at functional positions suggest that those elements define the position of the TSS in MSS promoters. (5) The high percentage and statistical significance of BRE, especially in CpG+ and MSS promoters, suggest that this element may be functional in many promoters including TATA-less promoters. (6) Approximately one-fourth of all promoters do not have any of the four core-promoter elements, suggesting the existence of other, yet undiscovered core elements.


    Acknowledgments
 
We are grateful to L.F. Johnson, J. Kadonaga and M.Q. Zhang for their useful comments on this work.

Received on July 29, 2004; revised on November 3, 2004; accepted on November 20, 2004

    REFERENCES
 

    Babenko, V.N., Kosarev, P.S., Vishnevsky, O.V., Levitsky, V.G., Basin, V.V., Frolov, A.S. (1999) Investigating extended regulatory regions of genomic DNA sequences. Bioinformatics, 15, 644–653.

    Bucher, P. (1990) Weight matrix descriptions of four eukaryotic RNA polymerase II promoter elements derived from 502 unrelated promoter sequences. J. Mol. Biol., 212, 563–578.

    Burke, T.W. and Kadonaga, J.T. (1997) The downstream core promoter element, DPE, is conserved from Drosophila to humans and is recognized by TAFII60 of Drosophila. Genes Dev., 11, 3020–3031.

    Burley, S.K. and Roeder, R.G. (1996) Biochemistry and structural biology of transcription factor IID (TFIID). Annu. Rev. Biochem., 65, 769–799.

    Butler, J.E. and Kadonaga, J.T. (2002) The RNA polymerase II core promoter: A key component in the regulation of gene expression. Genes Dev., 16, 2583–2592.

    Cavin Périer, R., Junier, T., Bucher, P. (1998) The Eukaryotic Promoter Database EPD. Nucleic Acids Res., 26, 353–357.

    Coleman, R.A. and Pugh, B.F. (1995) Evidence for functional binding and stable sliding of the TATA binding protein on nonspecific DNA. J. Biol. Chem., 270, 13850–13859.

    Emami, K.H., Jain, A., Smale, S.T. (1997) Mechanism of synergy between TATA and initiator: synergistic binding of TFIID following a putative TFIIA-induced isomerization. Genes Dev., 11, 3007–3019.

    Evans, R., Fairley, J.A., Roberts, S.G. (2001) Activator-mediated disruption of sequence-specific DNA contacts by the general transcription factor TFIIB. Genes Dev., 15, 2945–2949.

    Fairley, J.A., Evans, R., Hawkes, N.A., Roberts, S.G. (2002) Core promoter-dependent TFIIB conformation and a role for TFIIB conformation in transcription start site selection. Mol. Cell Biol., 22, 6697–6705[Abstract/Free Full Text].

    Gardiner-Garden, M. and Frommer, M. (1987) CpG islands in vertebrate genomes. J. Mol. Biol., 196, 261–282[CrossRef][ISI][Medline].

    Gershenzon, N. and Ioshikhes, I. (2005) Promoter Classifier: software package for promoter database analysis. Appl. Bioinformatics, 4, (in press).

    Green, M.R. (2000) TBP-associated factors (TAFIIs): multiple, selective transcriptional mediators in common complexes. Trends Biochem. Sci., 25, 59–63[CrossRef][ISI][Medline].

    Hampsey, M. (1998) Molecular genetics of the RNA polymerase II general transcriptional machinery. Microbiol. Mol. Biol. Rev., 62, 465–503[Abstract/Free Full Text].

    Hawkes, N.A. and Roberts, S.G.E. (1999) The role of human TFIIB in transcription start site selection in vitro and in vivo. J. Biol. Chem., 274, 14337–14343[Abstract/Free Full Text].

    Kutach, A.K. and Kadonaga, J.T. (2000) The downstream promoter element DPE appears to be as widely used as the TATA box in Drosophila core promoters. Mol. Cell. Biol., 20, 4754–4764.

    Lagrange, T., Kapanidis, A.N., Tang, H., Reinberg, D., Ebright, R.H. (1998) New core promoter element in RNA polymerase II-dependent transcription: Sequence-specific DNA binding by transcription factor IIB. Genes Dev., 12, 34–44.

    Lemon, B. and Tjian, R. (2000) Orchestrated response: A symphony of transcription factors for gene control. Genes Dev., 14, 2551–2569.

    Lewis, B.A., Kim, T.K., Orkin, S.H. (2000) A downstream element in the human beta-globin promoter: evidence of extended sequence-specific transcription factor IID contacts. Proc. Natl Acad. Sci. USA, 97, 7172–7177.

    Liao, W.-C., Geng, Y., Johnson, L.F. (1994) In vitro transcription of the TATAA-less mouse thymidylate synthase promoter: multiple transcription start points and evidence for bidirectionality. Gene, 146, 183–189.

    Lin, Y., Ince, T.A., Scotto, K.W. (2001) Optimization of a versatile in vitro transcription assay for the expression of multiple start site TATA-less promoters. Biochemistry, 40, 12959–12966.

    Martinez, E., Zhou, Q., L'Etoile, N.D., Oelgeschlager, T., Berk, A.J., Roeder, R.G. (1995) Core promoter-specific function of a mutant transcription factor TFIID defective in TATA-Box binding. Proc. Natl Acad. Sci. USA, 92, 11864–11868.

    Nikolov, D.B. and Burley, S.K. (1997) RNA polymerase II transcription initiation: a structural view. Proc. Natl Acad. Sci. USA, 94, 15–22.

    Orphanides, G., Lagrange, T., Reinberg, D. (1996) The general transcription factors of RNA polymerase II. Genes Dev., 10, 2657–2662.

    O'Shea-Greenfield, A. and Smale, S.T. (1992) Roles of TATA and initiator elements in determining the start site location and direction of RNA polymerase II transcription. J. Biol. Chem., 267, 1391–1402.

    Roeder, R.G. (1996) The role of general initiation factors in transcription by RNA polymerase II. Trends Biochem. Sci., 21, 327–335.

    Smale, S.T. and Kadonaga, J.T. (2003) The RNA polymerase II core promoter. Annu. Rev. Biochem., 72, 449–479.

    Suzuki, Y., Tsunoda, T., Sese, J., Taira, H., Mizushima-Sugano, J., Hata, H., Ota, T., Isogai, T., Tanaka, T., Nakamura, Y., et al. (2001) Identification and characterization of the potential promoter regions of 1031 kinds of human genes. Genome Res., 11, 677–684.

    Tsai, F.T.F. and Sigler, P.B. (2000) Structural basis of preinitiation complex assembly on human Pol II promoters. EMBO J., 19, 25–36.

    Zenzie-Gregory, B., Khachi, A., Garraway, I.P., Smale, S.T. (1993) Mechanism of initiator-mediated transcription: evidence for a functional interaction between the TATA-binding protein and DNA in the absence of a specific recognition sequence. Mol. Cell. Biol., 13, 3841–3849.

    Zhang, M.Q. (1998) A discrimination study of human core-promoters. Pac. Symp. Biocomput. 1998, 240–251.

    Zhou, T. and Chiang, C.-M. (2001) The intronless and TATA-less human TAFII55 gene contains a functional initiator and a downstream promoter element. J. Biol. Chem., 276, 25503–25511.





