Bioinformatics, Vol 14, 55-67, Copyright © 1998 by Oxford University Press
I Rigoutsos and A Floratos
MOTIVATION: The discovery of motifs in biological sequences is an important
problem.
RESULTS: This paper presents a new algorithm for the discovery of
rigid patterns (motifs) in biological sequences. Our method is
combinatorial in nature and able to produce all patterns that appear in at
least a (user-defined) minimum number of sequences, yet it manages to be
very efficient by avoiding the enumeration of the entire pattern space.
Furthermore, the reported patterns are maximal: no reported pattern can
be made more specific and still appear at exactly the same
positions within the input sequences. The effectiveness of the proposed
approach is showcased on a number of test cases which aim to: (i) validate
the approach through the discovery of previously reported patterns; and (ii)
demonstrate the capability to automatically identify highly selective
patterns particular to the sequences under consideration. Finally,
experimental analysis indicates that the algorithm is output sensitive,
i.e. its running time is quasi-linear in the size of the generated output.
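
The two properties named in the abstract, minimum support and maximality, can be illustrated with a small brute-force sketch. This is not the TEIRESIAS algorithm itself, which avoids this kind of exhaustive checking. It only makes the definitions concrete. The wildcard character `.`, the function names, the `max_ext` bound on how far a pattern may be extended, and the toy sequences are all assumptions made for illustration.

```python
from typing import List, Set, Tuple

WILDCARD = "."  # assumed "don't care" symbol for a rigid pattern


def matches_at(pattern: str, seq: str, offset: int) -> bool:
    """True if the rigid pattern matches seq starting at offset."""
    if offset < 0 or offset + len(pattern) > len(seq):
        return False
    return all(p == WILDCARD or p == seq[offset + i]
               for i, p in enumerate(pattern))


def offset_list(pattern: str, seqs: List[str]) -> Set[Tuple[int, int]]:
    """All (sequence index, start offset) pairs where the pattern occurs."""
    return {(s, j)
            for s, seq in enumerate(seqs)
            for j in range(len(seq) - len(pattern) + 1)
            if matches_at(pattern, seq, j)}


def support(pattern: str, seqs: List[str]) -> int:
    """Number of distinct sequences containing the pattern."""
    return len({s for s, _ in offset_list(pattern, seqs)})


def is_maximal(pattern: str, seqs: List[str], max_ext: int = 3) -> bool:
    """Simplified maximality test (illustrative only).

    The pattern is NOT maximal if some wildcard, or some position up to
    max_ext places to the left or right of the pattern, holds the same
    residue in every occurrence: fixing that residue would make the
    pattern more specific without losing any occurrence.
    """
    occ = offset_list(pattern, seqs)
    if not occ:
        return False
    for r in range(-max_ext, len(pattern) + max_ext):
        if 0 <= r < len(pattern) and pattern[r] != WILDCARD:
            continue  # already a fixed residue
        column = set()
        for s, j in occ:
            pos = j + r
            if pos < 0 or pos >= len(seqs[s]):
                column = None  # extension would drop this occurrence
                break
            column.add(seqs[s][pos])
        if column is not None and len(column) == 1:
            return False
    return True


# Toy input (hypothetical sequences, not taken from the paper).
seqs = ["ACDEFG", "XACQEFY", "ZZACWEF"]
print(support("AC.EF", seqs), is_maximal("AC.EF", seqs))
print(support("A..EF", seqs), is_maximal("A..EF", seqs))
```

Here `A..EF` and `AC.EF` occur at the same positions, so only the more specific `AC.EF` would be reported as maximal under this simplified test.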
ARTICLES
Combinatorial pattern discovery in biological sequences: The TEIRESIAS algorithm [published erratum appears in Bioinformatics 1998;14(2):229]
Computational Biology Center, IBM Thomas J. Watson Research Center, Yorktown Heights, NY 10598, USA.