© Oxford University Press

Discovering simple DNA sequences by the algorithmic significance method

Aleksandar Milosavljevic¹ and Jerzy Jurka

Linus Pauling Institute of Science and Medicine, 440 Page Mill Rd, Palo Alto, CA 94306, USA

¹To whom reprint requests should be sent. Present address: Biological and Medical Research Division, Bldg 202, Argonne National Laboratory, Argonne, IL 60439-4833, USA

A new method, ‘algorithmic significance’, is proposed as a tool for discovering patterns in DNA sequences. The main idea is that patterns can be discovered by finding ways to encode the observed data concisely. In this sense, the method can be viewed as a formal version of Occam's Razor. In this paper, the method is applied to discover significantly simple DNA sequences. We define a DNA sequence to be simple if it contains repeated occurrences of certain ‘words’ and can therefore be encoded in a small number of bits. This definition includes minisatellites and microsatellites. A standard dynamic programming algorithm for data compression is applied to compute the minimal encoding lengths of sequences in linear time. An electronic mail server that identifies simple sequences using the proposed method has been installed at the Internet address pythia@anl.gov.
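
The idea of measuring simplicity by encoding length can be illustrated with a minimal Python sketch. It is not the authors' encoding scheme or their linear-time algorithm: the cost model (one flag bit per step, 2 bits per literal letter, two log2(n)-bit numbers per pointer to an earlier word), the function names, and the `min_word` parameter are all assumptions made for illustration, and the naive dynamic program below runs in polynomial rather than linear time. The baseline of 2 bits per letter stands in for a uniform null model; for any decodable code, a standard counting argument bounds the probability of saving d or more bits under that null model by 2^-d.

```python
import math


def min_encoding_bits(seq: str, min_word: int = 2) -> float:
    """Minimal encoding length of seq, in bits, under a toy cost model.

    Assumed scheme (illustrative only): each step is either a literal
    letter (1 flag bit + 2 bits) or a pointer to an earlier occurrence
    of a word (1 flag bit + position and length, log2(n) bits each).
    """
    n = len(seq)
    if n == 0:
        return 0.0
    literal_bits = 1 + 2.0
    pointer_bits = 1 + 2 * math.log2(n)

    # best[i] = fewest bits needed to encode the prefix seq[:i]
    best = [0.0] + [math.inf] * n
    for i in range(1, n + 1):
        # Option 1: emit seq[i-1] as a literal letter.
        best[i] = best[i - 1] + literal_bits
        # Option 2: emit seq[j:i] as a pointer to an earlier occurrence
        # that starts before j (overlap with the copy is allowed).
        for j in range(i - min_word, -1, -1):
            if seq.find(seq[j:i], 0, i - 1) == -1:
                break  # a longer word ending at i cannot match either
            best[i] = min(best[i], best[j] + pointer_bits)
    return best[n]


def savings_bits(seq: str) -> float:
    """Bits saved relative to the 2-bits-per-letter baseline."""
    return 2 * len(seq) - min_encoding_bits(seq)


if __name__ == "__main__":
    for s in ("ACGT" * 8, "ACGT"):
        print(s, min_encoding_bits(s), savings_bits(s))
```

Under these assumptions, a tandem repeat such as `"ACGT" * 8` is encoded as four literals followed by a single pointer, so its savings are large and positive, while a short sequence with no repeated word costs more than the baseline because of the flag bits.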


Received on July 20, 1992; accepted on January 5, 1993

