
Bioinformatics Vol. 19 no. 18 2003
pages 2369-2380
© 2003 Oxford University Press

Combining phylogenetic data with co-regulated genes to identify regulatory motifs

Ting Wang and Gary D. Stormo *

Department of Genetics, Washington University Medical School, St. Louis, MO 63110, USA

Received on April 3, 2003; revised on June 2, 2003; accepted on June 12, 2003

Motivation: Discovery of regulatory motifs in unaligned DNA sequences remains a fundamental problem in computational biology. Two categories of algorithms have been developed to identify common motifs from a set of DNA sequences. The first can be called a ‘multiple genes, single species’ approach. It assumes that a degenerate motif is embedded in some or all of the otherwise unrelated input sequences and tries to describe a consensus motif and identify its occurrences. It is often used for co-regulated genes identified through experimental approaches. The second can be called a ‘single gene, multiple species’ approach. It requires orthologous input sequences and tries to identify unusually well-conserved regions by phylogenetic footprinting. Both approaches perform well, but each has some limitations. It is tempting to combine the knowledge of co-regulation among different genes and conservation among orthologous genes to improve our ability to identify motifs.

Results: Based on the Consensus algorithm previously established by our group, we introduce a new algorithm called PhyloCon (Phylogenetic Consensus) that takes into account both conservation among orthologous genes and co-regulation of genes within a species. This algorithm first aligns conserved regions of orthologous sequences into multiple sequence alignments, or profiles, then compares profiles representing non-orthologous sequences. Motifs emerge as common regions in these profiles. Here we present a novel statistic to compare profiles of DNA sequences and a greedy approach to search for common subprofiles. We demonstrate that PhyloCon performs well on both synthetic and biological data.
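The profile-comparison step described above can be illustrated with a small sketch. The abstract does not spell out the statistic or the search procedure, so everything here is an assumption made for illustration: columns are scored with an average log-likelihood ratio against a uniform background, a fixed pseudocount of 1 is used, and an exhaustive scan over ungapped windows stands in for the greedy subprofile search. The function names (`column_counts`, `column_score`, `best_ungapped_match`) are hypothetical and are not taken from the PhyloCon software.

```python
import math

BASES = "ACGT"

def column_counts(column):
    """Count each base in one alignment column, e.g. 'AAG' -> A:2, G:1."""
    return {b: column.count(b) for b in BASES}

def column_score(col1, col2, background=None, pseudocount=1.0):
    """Average log-likelihood ratio between two alignment columns.

    Each column's counts are scored against the other column's smoothed
    base frequencies, relative to the background, and the sum is divided
    by the total number of bases. This is an illustrative statistic,
    assumed here rather than quoted from the paper.
    """
    if background is None:
        background = {b: 0.25 for b in BASES}  # assumed uniform background
    n1, n2 = column_counts(col1), column_counts(col2)

    def freqs(counts):
        total = sum(counts.values()) + pseudocount
        return {b: (counts[b] + pseudocount * background[b]) / total
                for b in BASES}

    f1, f2 = freqs(n1), freqs(n2)
    numerator = sum(
        n1[b] * math.log(f2[b] / background[b])
        + n2[b] * math.log(f1[b] / background[b])
        for b in BASES
    )
    return numerator / (sum(n1.values()) + sum(n2.values()))

def best_ungapped_match(profile1, profile2, width):
    """Return (score, i, j) for the best pair of width-long windows.

    A profile is a list of column strings. Every window of profile1 is
    compared with every window of profile2, column by column, without gaps.
    """
    best = None
    for i in range(len(profile1) - width + 1):
        for j in range(len(profile2) - width + 1):
            score = sum(
                column_score(profile1[i + k], profile2[j + k])
                for k in range(width)
            )
            if best is None or score > best[0]:
                best = (score, i, j)
    return best

# Two toy profiles, each built from three aligned orthologous sequences.
profile_a = ["AAA", "CCC", "GGG", "TTT", "ACG"]
profile_b = ["TTA", "CCC", "GGG", "TTT"]
score, start_a, start_b = best_ungapped_match(profile_a, profile_b, width=3)
print(start_a, start_b, round(score, 3))
```

In this toy input the shared conserved block (columns CCC, GGG, TTT) is the highest-scoring window pair, which is the sense in which motifs "emerge as common regions" of the profiles. A practical implementation would also need a significance estimate for the scores and a way to merge matching profiles, neither of which is attempted here.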

Availability: Software available upon request from the authors. http://ural.wustl.edu/softwares.html

Contact: stormo{at}ural.wustl.edu

* To whom correspondence should be addressed at 4566 Scott Avenue, Campus Box 8232, St. Louis, MO 63110, USA.



