Bioinformatics Advance Access originally published online on September 28, 2004
Bioinformatics 2005 21(5):596-600; doi:10.1093/bioinformatics/bti041
CIS: compound importance sampling method for protein-DNA binding site p-value estimation



1 School of Computer Science & Engineering, The Hebrew University Jerusalem 91904, Israel
2 Hadassah Medical School, The Hebrew University Jerusalem 91120, Israel
*To whom correspondence should be addressed.
Motivation: A key aspect of transcriptional regulation is the binding of transcription factors to sequence-specific binding sites that allow them to modulate the expression of nearby genes. Given models of such binding sites, one can scan regulatory regions for putative binding sites and construct a genome-wide regulatory network. In such genome-wide scans, it is crucial to control the number of false-positive predictions. Recently, several studies have demonstrated the benefits of modeling dependencies between positions within the binding site. Yet, computing the statistical significance of putative binding sites in this scenario remains a challenge.
Results: We present a general, accurate and efficient method for computing p-values of putative binding sites that is applicable to a large class of probabilistic binding site and background models. We demonstrate the accuracy of the method on synthetic and real-life data.
Availability: The procedure for scanning DNA sequences and computing the statistical significance of putative binding site scores is available upon request at http://compbio.cs.huji.ac.il/CIS/
Contact: nir{at}cs.huji.ac.il
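Illustration: The abstract does not spell out the CIS procedure, so the following is only a minimal sketch of plain importance sampling for a binding site p-value, not the authors' compound method. It assumes a toy position weight matrix with independent positions and a uniform i.i.d. background; the names `make_pwm`, `score`, `exact_p_value` and `importance_sampling_p_value` are hypothetical. Sequences are drawn from the motif model rather than from the background, so high-scoring sequences are sampled often, and each sample is reweighted by the likelihood ratio P_background(x) / P_motif(x), which equals exp(-score) for a log-odds score.

```python
import itertools
import math
import random

BASES = "ACGT"
# Assumption: uniform, position-independent background model.
BACKGROUND = {b: 0.25 for b in BASES}


def make_pwm(consensus, p_match=0.7):
    """Toy PWM: p_match on the consensus base, the rest split evenly."""
    p_other = (1.0 - p_match) / 3.0
    return [{b: (p_match if b == c else p_other) for b in BASES}
            for c in consensus]


def score(seq, pwm):
    """Log-odds score: log P_motif(seq) - log P_background(seq)."""
    return sum(math.log(col[b] / BACKGROUND[b]) for b, col in zip(seq, pwm))


def exact_p_value(pwm, threshold):
    """P_background(score >= threshold) by full enumeration (short motifs only)."""
    total = 0.0
    for seq in itertools.product(BASES, repeat=len(pwm)):
        if score(seq, pwm) >= threshold:
            total += math.prod(BACKGROUND[b] for b in seq)
    return total


def importance_sampling_p_value(pwm, threshold, n_samples, rng):
    """Estimate the same p-value by sampling from the motif model.

    Each sample x is weighted by P_background(x) / P_motif(x) = exp(-score(x)).
    """
    total = 0.0
    for _ in range(n_samples):
        seq = [rng.choices(BASES, weights=[col[b] for b in BASES])[0]
               for col in pwm]
        s = score(seq, pwm)
        if s >= threshold:
            total += math.exp(-s)
    return total / n_samples


if __name__ == "__main__":
    pwm = make_pwm("TATAAT")
    rng = random.Random(0)
    print("exact:   ", exact_p_value(pwm, 4.0))
    print("estimate:", importance_sampling_p_value(pwm, 4.0, 20000, rng))
```

For a 6-position motif the exact value can still be enumerated, which gives a check on the estimate; for longer motifs or models with dependencies between positions, enumeration is infeasible and a sampling estimate of this kind is the practical route.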
