Bioinformatics Advance Access originally published online on August 19, 2004
Bioinformatics 2005 21(3):307-313; doi:10.1093/bioinformatics/bth480
Bioinformatics vol. 21 issue 3 © Oxford University Press 2005; all rights reserved.
Similarity of position frequency matrices for transcription factor binding sites
1 Cold Spring Harbor Laboratory, 1 Bungtown Road, Cold Spring Harbor, NY 11724, USA
2 Department of Physics and Astronomy, State University of New York, Stony Brook, NY 11794, USA
3 Computer Science Department, Portland State University, PO Box 751, Portland, OR 97207, USA
*To whom correspondence should be addressed.
Motivation: Transcription-factor binding sites (TFBSs) in promoter sequences of higher eukaryotes are commonly modeled using position frequency matrices (PFMs). The ability to compare PFMs representing binding sites is especially important for de novo sequence motif discovery, where it is desirable to compare putative matrices to one another and to known matrices.
Results: We describe a method for quantifying PFM similarity based on product multinomial distributions, demonstrate its ability to identify PFM similarity and show that it has a better false-positive to false-negative ratio than existing methods.
We grouped TFBS frequency matrices from two libraries into matrix families and identified the matrices that are common and unique to these libraries. We identified similarities and differences between the skeletal-muscle-specific and non-muscle-specific frequency matrices for the binding sites of Mef-2, Myf, Sp-1, SRF and TEF of Wasserman and Fickett. We further identified known frequency matrices and matrix families that were strongly similar to the matrices given by Wasserman and Fickett. We provide methodology and tools to compare and query libraries of frequency matrices for TFBSs.
Availability: Software is available to use over the Web at http://rulai.cshl.edu/MatCompare
Contact: dschones@cshl.edu
Supplementary information: Database and clustering statistics, matrix families and representatives are available at http://rulai.cshl.edu/MatCompare/Supplementary
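
As an illustration of what comparing PFMs under a product multinomial model can look like, the sketch below treats each column of a PFM as a sample from its own multinomial distribution over A, C, G and T, and tests two aligned columns for homogeneity with a Pearson chi-square statistic. This is a minimal sketch under stated assumptions, not the procedure implemented in MatCompare: the function names, the list-of-columns representation, the requirement that the two matrices are already aligned and of equal width, and the 0.05 threshold are all chosen here for illustration only.

```python
import math

# Assumed representation: a PFM is a list of columns, and each column is a
# 4-tuple of nucleotide counts in the order A, C, G, T.


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable, for df in 0..3."""
    if df == 0:
        return 1.0  # nothing to compare: both columns use a single nucleotide
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df == 2:
        return math.exp(-x / 2.0)
    if df == 3:
        return (math.erfc(math.sqrt(x / 2.0))
                + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0))
    raise ValueError("df must be between 0 and 3 for nucleotide columns")


def column_homogeneity(col_a, col_b):
    """Pearson chi-square test that two count columns share one multinomial.

    Returns (statistic, p_value). Nucleotides absent from both columns are
    skipped and reduce the degrees of freedom accordingly.
    """
    n_a, n_b = sum(col_a), sum(col_b)
    if n_a == 0 or n_b == 0:
        raise ValueError("each column must contain at least one observation")
    total = n_a + n_b
    stat = 0.0
    used = 0
    for a, b in zip(col_a, col_b):
        pooled = a + b
        if pooled == 0:
            continue
        used += 1
        for observed, n in ((a, n_a), (b, n_b)):
            expected = n * pooled / total
            stat += (observed - expected) ** 2 / expected
    df = used - 1
    return stat, chi2_sf(stat, df)


def similar_columns(pfm_a, pfm_b, alpha=0.05):
    """Flag, per aligned column, whether homogeneity is NOT rejected at alpha."""
    if len(pfm_a) != len(pfm_b):
        raise ValueError("this sketch assumes aligned matrices of equal width")
    return [column_homogeneity(a, b)[1] >= alpha for a, b in zip(pfm_a, pfm_b)]


if __name__ == "__main__":
    # Hypothetical count matrices, invented for this example.
    pfm_x = [(18, 1, 0, 1), (0, 0, 20, 0), (2, 2, 2, 14)]
    pfm_y = [(9, 0, 1, 0), (0, 1, 9, 0), (1, 1, 1, 7)]
    print(similar_columns(pfm_x, pfm_y))
```

The chi-square approximation is unreliable when column counts are small, and a practical comparison would also need to consider different relative offsets and the reverse complement of one matrix; both are left out of this sketch.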