Bioinformatics Advance Access published online on August 19, 2004
Bioinformatics, doi:10.1093/bioinformatics/bth480
Bioinformatics © Oxford University Press 2004; all rights reserved
1 Cold Spring Harbor Laboratory, 1 Bungtown Road, Cold Spring Harbor, NY 11724, USA; Department of Physics and Astronomy, State University of New York, Stony Brook, NY 11794, USA
* To whom correspondence should be addressed. E-mail: dschones{at}cshl.edu.
Motivation: Transcription factor binding sites in promoter sequences of higher eukaryotes are commonly modeled using position frequency matrices. The ability to compare position frequency matrices representing binding sites is especially important for de novo sequence motif discovery, where it is desirable to compare putative matrices to one another and to known matrices.

Results: We describe a method for quantifying position frequency matrix similarity based on product-multinomial distributions, demonstrate its ability to identify similar position frequency matrices, and show that it has a better false positive to false negative ratio than existing methods. We group transcription factor binding site frequency matrices from two libraries into matrix families, and identify the matrices that are common and unique to these libraries. We identify similarities and differences between the skeletal-muscle-specific and non-muscle-specific frequency matrices for the binding sites of Mef-2, Myf, Sp-1, SRF and TEF of Wasserman and Fickett (1998). We further identify known frequency matrices and matrix families that are strongly similar to the matrices given by Wasserman and Fickett. We provide methodology and tools to compare and query libraries of frequency matrices for transcription factor binding sites.

Availability: The software can be used over the web at http://rulai.cshl.edu/MatCompare.

Supplementary Information: Database and clustering statistics, matrix families, and representatives are available at http://rulai.cshl.edu/MatCompare/Supplementary.
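The abstract does not state which statistic is used, so the following is only an illustrative sketch of one way to compare matrices under a product-multinomial view: each column is treated as an independent multinomial sample of A, C, G, T counts, and two aligned columns are tested for homogeneity with a Pearson chi-square test. The function names (`chi2_sf`, `column_homogeneity`, `compare_matrices`) and the choice of test are assumptions made for illustration, not the authors' implementation.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable, closed forms for df 0..3."""
    if df == 0 or x <= 0:
        return 1.0
    if df == 1:
        return math.erfc(math.sqrt(x / 2))
    if df == 2:
        return math.exp(-x / 2)
    if df == 3:
        return (math.erfc(math.sqrt(x / 2))
                + math.sqrt(2 * x / math.pi) * math.exp(-x / 2))
    raise ValueError("df must be between 0 and 3 for DNA columns")

def column_homogeneity(col_a, col_b):
    """Pearson chi-square test that two (A, C, G, T) count columns share one multinomial.

    Returns (statistic, p_value). Illustrative assumption, not the published method.
    """
    n_a, n_b = sum(col_a), sum(col_b)
    if n_a == 0 or n_b == 0:
        raise ValueError("each column must contain at least one observation")
    total = n_a + n_b
    stat, used = 0.0, 0
    for a, b in zip(col_a, col_b):
        pooled = a + b
        if pooled == 0:
            continue  # nucleotide never observed in either column
        used += 1
        exp_a = n_a * pooled / total  # expected count under homogeneity
        exp_b = n_b * pooled / total
        stat += (a - exp_a) ** 2 / exp_a + (b - exp_b) ** 2 / exp_b
    df = max(used - 1, 0)  # one degree of freedom per observed nucleotide, minus one
    return stat, chi2_sf(stat, df)

def compare_matrices(pfm_a, pfm_b):
    """Per-column p-values for two equal-width matrices that are already aligned."""
    if len(pfm_a) != len(pfm_b):
        raise ValueError("matrices must have the same number of columns")
    return [column_homogeneity(a, b)[1] for a, b in zip(pfm_a, pfm_b)]

# Hypothetical count matrices: one list of (A, C, G, T) counts per position.
pfm_x = [[10, 0, 0, 0], [0, 10, 0, 0], [2, 3, 3, 2]]
pfm_z = [[0, 0, 0, 10], [10, 0, 0, 0], [2, 3, 3, 2]]
p_values = compare_matrices(pfm_x, pfm_z)
```

Small p-values flag columns where the two matrices are unlikely to come from the same distribution. This sketch ignores alignment offsets, reverse complements, and small-sample corrections, all of which a practical comparison tool would need to address.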
Revised July 28, 2004
Accepted August 13, 2004
Article
Similarity of position frequency matrices for transcription factor binding sites
2 Cold Spring Harbor Laboratory, 1 Bungtown Road, Cold Spring Harbor, NY 11724, USA; Computer Science Department, Portland State University, P.O. Box 751, Portland, OR 97207, USA
3 Cold Spring Harbor Laboratory, 1 Bungtown Road, Cold Spring Harbor, NY 11724, USA