Bioinformatics Advance Access published online on February 22, 2005
Bioinformatics, doi:10.1093/bioinformatics/bti336
1 State Scientific Centre "GosNIIGenetika", 1st Dorozhny pr. 1, Moscow, 117545, Russia
* To whom correspondence should be addressed.
Motivation: Transcription regulatory protein factors often bind DNA as homo- or heterodimers. They therefore recognize structured DNA motifs: inverted repeats, direct repeats, or spaced motif pairs. However, these motifs are often difficult to identify because they are highly divergent. Including the motif structure explicitly in the motif recognition algorithm improves recognition of highly divergent motifs as well as estimation of the motif's geometric parameters.
Results: We present a modification of the Gibbs sampling motif extraction algorithm, SeSiMCMC (Sequence Similarities by Markov Chain Monte-Carlo), which finds structured motifs of these types, as well as non-structured motifs, in a set of unaligned DNA sequences. It employs improved estimators of motif and spacer lengths. The probability that a sequence does not contain any motif is accounted for in a rigorous Bayesian manner. We applied the algorithm to a set of upstream regions of genes from two Escherichia coli regulons involved in respiration. We demonstrate that accounting for a symmetric motif structure allows the algorithm to identify weak motifs more accurately. In the examples studied, ArcA binding sites had the structure of a direct spaced repeat, whereas NarP binding sites exhibited a palindromic structure.
Availability: The WWW interface of the program, its FreeBSD (4.0) and Windows 32 console executables, and some additional documentation are available at http://bioinform.genetika.ru/SeSiMCMC.
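To make the central idea concrete, the following is a minimal Python sketch of a Gibbs site sampler in which a palindromic constraint can be switched on. It is not the SeSiMCMC implementation: it assumes exactly one ungapped site per sequence, a fixed motif width, and a uniform background. It omits the spacer, the length estimators, and the Bayesian treatment of motif-free sequences described above. All function names and parameters (`build_profile`, `site_weight`, `gibbs_sample`, `pseudocount`, `sweeps`) are hypothetical and chosen only for illustration.

```python
import random

ALPHABET = "ACGT"
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(site):
    """Return the reverse complement of an uppercase ACGT string."""
    return "".join(COMPLEMENT[b] for b in reversed(site))


def build_profile(sites, width, palindromic=False, pseudocount=0.5):
    """Column-wise base frequencies estimated from aligned sites."""
    if palindromic:
        # A palindromic motif equals its own reverse complement, so every
        # site contributes counts in both orientations. This halves the
        # number of free parameters, which is what helps with weak motifs.
        sites = sites + [reverse_complement(s) for s in sites]
    profile = []
    for col in range(width):
        counts = {b: pseudocount for b in ALPHABET}
        for s in sites:
            counts[s[col]] += 1.0
        total = sum(counts.values())
        profile.append({b: counts[b] / total for b in ALPHABET})
    return profile


def site_weight(site, profile, background=0.25):
    """Likelihood ratio of a site under the profile vs. a uniform background."""
    weight = 1.0
    for col, base in enumerate(site):
        weight *= profile[col][base] / background
    return weight


def gibbs_sample(seqs, width, palindromic=False, sweeps=200, seed=0):
    """Sample one site start per sequence; each sequence must be >= width long."""
    rng = random.Random(seed)
    positions = [rng.randrange(len(s) - width + 1) for s in seqs]
    for _ in range(sweeps):
        for i, seq in enumerate(seqs):
            # Build the profile from every sequence except the current one.
            others = [seqs[j][positions[j]:positions[j] + width]
                      for j in range(len(seqs)) if j != i]
            profile = build_profile(others, width, palindromic)
            # Score every candidate start and resample proportionally.
            weights = [site_weight(seq[p:p + width], profile)
                       for p in range(len(seq) - width + 1)]
            positions[i] = rng.choices(range(len(weights)), weights=weights)[0]
    return positions
```

Calling `gibbs_sample(seqs, width=6, palindromic=True)` on a list of uppercase DNA strings returns one sampled start position per sequence. Because the sampler is stochastic, results depend on the seed and on the number of sweeps.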
Received January 27, 2005
Accepted February 16, 2005
Article
A Gibbs sampler for identification of symmetrically structured, spaced DNA motifs with improved estimation of the signal length
2 State Scientific Centre "GosNIIGenetika", 1st Dorozhny pr. 1, Moscow, 117545, Russia; Institute of Information Transmission Problems, Russian Academy of Sciences, Bolshoi Karetny per. 19, Moscow, 127994, Russia
3 Institute of Information Transmission Problems, Russian Academy of Sciences, Bolshoi Karetny per. 19, Moscow, 127994, Russia; Dept. of Bioengineering and Bioinformatics, Moscow State University, Lab. Bldg B, Vorobiovy Gory 1-73, Moscow 119992, Russia
4 State Scientific Centre "GosNIIGenetika", 1st Dorozhny pr. 1, Moscow, 117545, Russia; Dept. of Bioengineering and Bioinformatics, Moscow State University, Lab. Bldg B, Vorobiovy Gory 1-73, Moscow 119992, Russia
5 State Scientific Centre "GosNIIGenetika", 1st Dorozhny pr. 1, Moscow, 117545, Russia; Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, Vavilova 32, Moscow 119991, Russia
A. V. Favorov, E-mail: favorov{at}sensi.org