Skip Navigation


Bioinformatics Advance Access originally published online on March 28, 2007
Bioinformatics 2007 23(10):1307-1308; doi:10.1093/bioinformatics/btm113
This Article
Right arrow Abstract Freely available
Right arrow FREE Full Text (Print PDF) Freely available
Right arrow All Versions of this Article:
23/10/1307    most recent
btm113v1
Right arrow Alert me when this article is cited
Right arrow Alert me if a correction is posted
Services
Right arrow Email this article to a friend
Right arrow Similar articles in this journal
Right arrow Similar articles in PubMed
Right arrow Alert me to new issues of the journal
Right arrow Add to My Personal Archive
Right arrow Download to citation manager
Right arrow Search for citing articles in:
ISI Web of Science (3)
Right arrowRequest Permissions
Google Scholar
Right arrow Articles by Zhu, Q.-H.
Right arrow Articles by Luo, J.
Right arrow Search for Related Content
PubMed
Right arrow PubMed Citation
Right arrow Articles by Zhu, Q.-H.
Right arrow Articles by Luo, J.
Social Bookmarking
 Add to CiteULike   Add to Connotea   Add to Del.icio.us  
What's this?

© The Author 2007. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oxfordjournals.org

DPTF: a database of poplar transcription factors

Qi-Hui Zhu 1, An-Yuan Guo 1, Ge Gao 1, Ying-Fu Zhong 1, Meng Xu 2, Minren Huang 2 and Jinchu Luo 1,*

1Center for Bioinformatics, National Laboratory of Protein Engineering and Plant Genetic Engineering and College of Life Sciences, Peking University, Beijing 100871, P. R. China and 2The Key Laboratory of Forest Genetics and Gene Engineering, Nanjing Forest University, Nanjing, 210037, China

*To whom correspondence should be addressed.


    ABSTRACT
 TOP
 ABSTRACT
 1 INTRODUCTION
 2 IDENTIFICATION AND ANNOTATION...
 3 IMPLEMENTATION AND USER...
 4 DISCUSSION
 ACKNOWLEDGEMENTS
 REFERENCES
 

Summary: The database of poplar transcription factors (DPTF) is a plant transcription factor (TF) database containing 2576 putative poplar TFs distributed in 64 families. These TFs were identified from both computational prediction and manual curation. We have provided extensive annotations including sequence features, functional domains, GO assignment and expression evidence for all TFs. In addition, DPTF contains cross-links to the Arabidopsis and rice transcription factor databases making it a unique resource for genome-scale comparative studies of transcriptional regulation in model plants.

Availiability: DPTF is available at http://dptf.cbi.pku.edu.cn

Contact: dptf{at}mail.cbi.pku.edu.cn


    1 INTRODUCTION
 TOP
 ABSTRACT
 1 INTRODUCTION
 2 IDENTIFICATION AND ANNOTATION...
 3 IMPLEMENTATION AND USER...
 4 DISCUSSION
 ACKNOWLEDGEMENTS
 REFERENCES
 
The genus Populus possesses many characteristics that make it ideal for functional genomic studies. Populus is a model system for the investigating wood development, crown formation and disease resistance in perennial plants. The availability of the recently sequenced black cottonwood (Populus trichocarpa) genome (Tuskan et al., 2006), together with two herbaceous plant genomes (Arabidopsis and rice), allows investigation of similarities and differences in transcriptional regulators between dicotyledon and monocotyledon (poplar versus rice). These genomes also allow comparative analysis of woody and herbaceous plants (poplar versus Arabidopsis).

Transcription factors (TFs) play critical roles in the regulation of gene expression. Genome-wide identification, characterization and annotation of TFs may shed light on understanding the biological function of TFs and exploring the mechanisms of transcriptional regulation. To gain a comprehensive knowledge of TFs in woody plant, we have identified 2576 TFs in P.trichocarpa systematically, and constructed a database of poplar TFs (DPTF). DPTF provides extensive annotations of poplar TFs through functional inferences by sequence analysis and expression profiles. In addition, DPTF also contains evolutionary relationships of TF homologs among poplar, Arabidopsis and two subspecies of Asian cultivated rice, indica and japonica.


    2 IDENTIFICATION AND ANNOTATION OF PUTATIVE TRANSCRIPTION FACTORS
 TOP
 ABSTRACT
 1 INTRODUCTION
 2 IDENTIFICATION AND ANNOTATION...
 3 IMPLEMENTATION AND USER...
 4 DISCUSSION
 ACKNOWLEDGEMENTS
 REFERENCES
 
We obtained 45 555 predicted P.trichocarpa proteins from the DOE Joint Genome Institute repository (JGI: http://genome.jgi-psf.org/Poptr1_1/Poptr1_1.home.html). In order to provide a complete collection of the poplar TFs and improve the quality of database, we combined automated search and manual curation using the same approach we applied to the construction of the databases of Arabidopsis and rice TFs (DATF, Gao et al., 2006; DRTF, Guo et al., 2005). A value of 0.01 was used as the common E-value cut-off for most TF families in HMMER search. To reduce the false positive and false negative rates, we manually checked and refined the results by inspecting the raw alignments. Some families (e.g. Alfin, HRT, LUG and NZZ) which were either characterized recently or contained only a few members did not have DNA-binding domain profiles. For those families, representative sequences were used as seeds to search for BLAST homologs and the E-value cut-off were manually inspected (for details see the DPTF Help page). As a result of this analysis, a total of 2576 putative TFs were identified and classified into 64 families.

To provide comprehensive information on the putative TFs identified, we made extensive annotations at both family and gene levels. For each TF family, brief introductions and key references were included in the summary pages of the DPTF database website. For each identified TF, DPTF supplied links to various public sequence databases including UniProt (Wu et al., 2006), RefSeq (Pruitt et al., 2005), EMBL (Cochrane, et al., 2006) and TransFac (Matys et al., 2006). Furthermore, domain structures were identified and annotated by InterProScan (Quevillon et al., 2005), and cross-links to corresponding databases were also provided.

In addition to the annotations based on sequence analysis, we also collected available expression profiling data for all putative TFs. Expression patterns based on EST data were extracted from the NCBI UniGene database. Microarray expression data generated by eQTL analysis of adventitious root development in Populus (unpublished data) were also integrated. The expression information may give valuable insights for further analysis of transcriptional regulations.

A preliminary analysis of the evolutionary relationship between transcription factors in three model plants was conducted. Potential orthologs in Arabidopsis and rice (Oryza sativa ssp. indica and ssp. japonica) were detected from best-reciprocal BLAST hits. In addition, neighbor-joining phylogenetic trees were also constructed for each family. These trees can be browsed on the DPTF database website or downloaded as vector graphics format files.


    3 IMPLEMENTATION AND USER INTERFACE
 TOP
 ABSTRACT
 1 INTRODUCTION
 2 IDENTIFICATION AND ANNOTATION...
 3 IMPLEMENTATION AND USER...
 4 DISCUSSION
 ACKNOWLEDGEMENTS
 REFERENCES
 
DPTF allows users to browse the content taxa by family or chromosome, to query by JGI poplar gene IDs or a combination of keywords. DPTF users can retrieve gene expression information at the genome level. BLAST searches against CDS or protein sequences of poplar, Arabidopsis and rice TFs are also available. All the TF CDS and protein sequences can be downloaded through the DPTF website for high-throughput analysis.


    4 DISCUSSION
 TOP
 ABSTRACT
 1 INTRODUCTION
 2 IDENTIFICATION AND ANNOTATION...
 3 IMPLEMENTATION AND USER...
 4 DISCUSSION
 ACKNOWLEDGEMENTS
 REFERENCES
 
The poplar genome size was estimated to be roughly 485 Mb which is nearly four times of the Arabidopsis genome. In spite of this size difference, the predicted number of genes in poplar was estimated at only twice that of Arabidopsis (Tuskan et al., 2006). Using the same methods and similar cut-off criteria, 2576 putative TFs were identified in poplar, 1922 TFs in Arabidopsis, 2025 in indica rice and 2384 in japonica rice. Some gene families such as MADS have a similar number of members in these two species (111 in poplar and 104 in Arabidopsis). A few families such as GeBP contain fewer genes in poplar (7 in poplar and 21 in Arabidopsis). Using the best-reciprocal BLAST search approach, 49% potential orthologs in Arabidopsis, and 78% in rice were located.

The DPTF database addresses the lack of poplar gene expression data collected in research-accessible public databases. In addition, <40% genes have matched ESTs in UniGene. Expression information from genome-wide P.deltoides and P.euramericana microarray experiments have been collected, and this data provides expression profiling information for nearly 90% putative TFs. For example, all the 12 BES1 and 13 ARID TFs have been annotated with microarray data, while only three TFs in these two families have EST evidence in public databases. The detailed expression information in DPTF may provide useful information for further functional assessment.

As a large and long-lived forest tree growing in extensive wild populations across continents, poplar has been evolving under selective pressure different from those of annual herbaceous species. Its development involves extensive secondary growth. For example, some TFs in MYB-R2R3 and ZF-HD appear to play an important role in regulating wood formation (Hertzberg et al., 2001; Plomion et al., 2001). Analysis of data with the same cut-off standard identified 216 MYB and 25 ZF-HD TFs in poplar but 150 MYB and 16 ZF-HD TFs in Arabidopsis, and 138 MYB and 15 ZF-HD TFs in rice. These results indicate that some poplar-specific TFs in these two families might play a role in wood formation.

DPTF is the first database of TFs for the perennial woody plant and the first platform to collate the TF databases for poplar, Arabidopsis and rice. DPTF will provide comprehensive and valuable resources for research on transcriptional regulation and comparative genomics. DPTF will be maintained and updated regularly as more data and information become available.


    ACKNOWLEDGEMENTS
 TOP
 ABSTRACT
 1 INTRODUCTION
 2 IDENTIFICATION AND ANNOTATION...
 3 IMPLEMENTATION AND USER...
 4 DISCUSSION
 ACKNOWLEDGEMENTS
 REFERENCES
 
This study was supported by grants: 2003CB715900 (973), 90408015 and 30671705 (NSFC), 20060390012 (MOE), 2006AA02Z334 (863) and the China High-Tech Platform Program.

Conflict of Interest: none declared.


    FOOTNOTES
 
Associate Editor: Alex Bateman

Received on November 27, 2006; revised on March 12, 2007; accepted on March 15, 2007

    REFERENCES
 TOP
 ABSTRACT
 1 INTRODUCTION
 2 IDENTIFICATION AND ANNOTATION...
 3 IMPLEMENTATION AND USER...
 4 DISCUSSION
 ACKNOWLEDGEMENTS
 REFERENCES
 

    Cochrane G, et al. EMBL nucleotide sequence database: developments in 2005. Nucleic Acids Res., ( (2006) ) 34, : D10–D15.[Abstract/Free Full Text].

    Gao G, et al. DRTF: a database of rice transcription factors. Bioinformatics, ( (2006) ) 22, : 1286–1287.[Abstract/Free Full Text].

    Guo A, et al. DATF: a database of Arabidopsis transcription factors. Bioinformatics, ( (2005) ) 21, : 2568–2569.[Abstract/Free Full Text].

    Hertzberg M, et al. A transcriptional roadmap to wood formation. Proc. Natl Acad. Sci. USA, ( (2001) ) 98, : 14732–14737.[Abstract/Free Full Text].

    Matys V, et al. TRANSFAC and its module TRANSCompel: transcriptional gene regulation in eukaryotes. Nucleic Acids Res., ( (2006) ) 34, : D108–D110.[Abstract/Free Full Text].

    Plomion C, et al. Wood formation in trees. Plant Physiol., ( (2001) ) 127, : 1513–1523.[Free Full Text].

    Pruitt KD, et al. NCBI Reference Sequence (RefSeq): a curated non-redundant sequence database of genomes, transcripts and proteins. Nucleic Acids Res., ( (2005) ) 33, : D501–D504.[Abstract/Free Full Text].

    Quevillon E, et al. InterProScan: protein domains identifier. Nucleic Acids Res., ( (2005) ) 33, : W116–W120.[Abstract/Free Full Text].

    Tuskan GA, et al. The genome of black cottonwood, Populus trichocarpa (Torr. & Gray). Science, ( (2006) ) 313, : 1596–1604.[Abstract/Free Full Text].

    Wu CH, et al. The Universal Protein Resource (UniProt): an expanding universe of protein information. Nucleic Acids Res., ( (2006) ) 34, : D187–D191.[Abstract/Free Full Text].


Add to CiteULike CiteULike   Add to Connotea Connotea   Add to Del.icio.us Del.icio.us    What's this?


This article has been cited by other articles:


Home page
jashsHome page
B. Cai, C.-H. Li, A.-S. Xiong, R.-H. Peng, J. Zhou, F. Gao, Z. Zhang, and Q.-H. Yao
DGTF: A Database of Grape Transcription Factors
J. Amer. Soc. Hort. Sci., May 1, 2008; 133(3): 459 - 461.
[Abstract] [Full Text] [PDF]


Home page
Plant Physiol.Home page
P. J. Rushton, M. T. Bokowiec, S. Han, H. Zhang, J. F. Brannock, X. Chen, T. W. Laudeman, and M. P. Timko
Tobacco Transcription Factors: Novel Insights into Transcriptional Regulation in the Solanaceae
Plant Physiology, May 1, 2008; 147(1): 280 - 295.
[Abstract] [Full Text] [PDF]


Home page
Nucleic Acids ResHome page
A.-Y. Guo, X. Chen, G. Gao, H. Zhang, Q.-H. Zhu, X.-C. Liu, Y.-F. Zhong, X. Gu, K. He, and J. Luo
PlantTFDB: a comprehensive plant transcription factor database
Nucleic Acids Res., January 11, 2008; 36(suppl_1): D966 - D969.
[Abstract] [Full Text] [PDF]


This Article
Right arrow Abstract Freely available
Right arrow FREE Full Text (Print PDF) Freely available
Right arrow All Versions of this Article:
23/10/1307    most recent
btm113v1
Right arrow Alert me when this article is cited
Right arrow Alert me if a correction is posted
Services
Right arrow Email this article to a friend
Right arrow Similar articles in this journal
Right arrow Similar articles in PubMed
Right arrow Alert me to new issues of the journal
Right arrow Add to My Personal Archive
Right arrow Download to citation manager
Right arrow Search for citing articles in:
ISI Web of Science (3)
Right arrowRequest Permissions
Google Scholar
Right arrow Articles by Zhu, Q.-H.
Right arrow Articles by Luo, J.
Right arrow Search for Related Content
PubMed
Right arrow PubMed Citation
Right arrow Articles by Zhu, Q.-H.
Right arrow Articles by Luo, J.
Social Bookmarking
 Add to CiteULike   Add to Connotea   Add to Del.icio.us  
What's this?