Bioinformatics Advance Access originally published online on March 3, 2005
Bioinformatics 2005 21(10):2254-2263; doi:10.1093/bioinformatics/bti361
Non-additivity in protein–DNA binding
1Department of Physics and Astronomy and BioMaps Institute, Rutgers, The State University of New Jersey, 136 Frelinghuysen Road, Piscataway, NJ 08854-8019, USA
2Laboratoire de Biochimie Théorique, CNRS UPR 9080, Institut de Biologie Physico-Chimique, 13 rue Pierre et Marie Curie, Paris 75005, France
*To whom correspondence should be addressed.
Motivation: Localizing protein binding sites within genomic DNA is of considerable importance, but remains difficult for protein families, such as transcription factors, which have loosely defined target sequences. It is generally assumed that protein affinity for DNA involves additive contributions from successive nucleotide pairs within the target sequence. This is not necessarily true, and non-additive effects have already been experimentally demonstrated in a small number of cases. The principal origin of non-additivity is the so-called indirect component of protein–DNA recognition, which is related to the sequence dependence of the DNA deformation induced during complex formation. Non-additive effects are difficult to study because they require the identification of many more binding sequences than are normally necessary for describing additive specificity (typically via the construction of weight matrices).
Results: In the present work, we use theoretically estimated binding energies as a basis for overcoming this problem. Our approach enables us to study the full combinatorial set of sequences for a variety of DNA-binding proteins, to make a detailed analysis of non-additive effects and to exploit this information to improve binding site predictions using either weight matrices or support vector machines. The results underline the fact that, even in the presence of significant deformation, non-additive effects may involve only a limited number of dinucleotide steps. This information helps to reduce the number of binding sites that need to be identified for successful predictions and to avoid problems of over-fitting.
Availability: The SVM software is available upon request from the authors.
Contact: anirvans{at}physics.rutgers.edu
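The distinction drawn in the abstract between additive and non-additive specificity can be illustrated with a minimal Python sketch. It is not the authors' method or software: the function names, the toy weight matrix and the single dinucleotide correction are all hypothetical values chosen for illustration, and the paper's energies come from theoretical estimates that are not reproduced here. The sketch scores every sequence of a given length with a weight matrix (one independent term per position), then adds a correction for a selected dinucleotide step, and counts how many parameters each model has to fit.

```python
from itertools import product

BASES = "ACGT"


def additive_energy(seq, weights):
    """Weight-matrix model: sum one independent term per position."""
    return sum(weights[i][base] for i, base in enumerate(seq))


def pairwise_energy(seq, weights, step_terms):
    """Additive energy plus corrections for selected dinucleotide steps.

    step_terms maps (position, dinucleotide) to a correction; any step
    that is not listed contributes nothing, so only a limited number of
    steps need to be parameterized.
    """
    correction = sum(
        step_terms.get((i, seq[i:i + 2]), 0.0) for i in range(len(seq) - 1)
    )
    return additive_energy(seq, weights) + correction


def all_sequences(length):
    """Enumerate the full combinatorial set of 4**length sequences."""
    return ("".join(p) for p in product(BASES, repeat=length))


def parameter_counts(length):
    """Table entries for a weight matrix versus a full dinucleotide model."""
    return 4 * length, 16 * (length - 1)


# Hypothetical toy parameters (arbitrary units), not taken from the paper.
weights = [
    {"A": 0.0, "C": 1.0, "G": 1.0, "T": 0.5},
    {"A": 1.0, "C": 0.0, "G": 0.5, "T": 1.0},
    {"A": 0.5, "C": 1.0, "G": 0.0, "T": 1.0},
]
step_terms = {(0, "AC"): -0.75}  # one non-additive step, assumed for illustration

# Rank all 64 trinucleotides from lowest to highest energy.
ranked = sorted(
    all_sequences(3), key=lambda s: pairwise_energy(s, weights, step_terms)
)
```

Under these assumed parameters, a full dinucleotide table for a site of length L has 16(L - 1) entries against 4L for the weight matrix, which is one way to see why fitting non-additive terms from experimentally identified sites alone requires many more sequences, and why restricting the corrections to a few steps limits over-fitting.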
This article has been cited by other articles:
N. A. Temiz and C. J. Camacho. Experimentally based contact energies decode interactions responsible for protein-DNA affinity and the role of molecular waters at the binding interface. Nucleic Acids Res., June 15, 2009; (2009) gkp289v2.
S. Ahmad, O. Keskin, A. Sarai, and R. Nussinov. Protein-DNA interactions: structural, thermodynamic and clustering patterns of conserved residues in DNA-binding proteins. Nucleic Acids Res., October 1, 2008; 36(18): 5922-5932.
J. Liu and G. D. Stormo. Context-dependent DNA recognition code for C2H2 zinc-finger transcription factors. Bioinformatics, September 1, 2008; 24(17): 1850-1857.
S. Gunewardena and Z. Zhang. A hybrid model for robust detection of transcription factor binding sites. Bioinformatics, February 15, 2008; 24(4): 484-491.
H. Faiger, M. Ivanchenko, and T. E. Haran. Nearest-neighbor non-additivity versus long-range non-additivity in TATA-box structure and its implications for TBP-binding mechanism. Nucleic Acids Res., July 26, 2007; 35(13): 4409-4419.
D. GuhaThakurta. Computational identification of transcriptional regulatory elements in DNA sequence. Nucleic Acids Res., July 19, 2006; 34(12): 3585-3598.