Bioinformatics, Vol 14, 715-725, Copyright © 1998 by Oxford University Press
P Vincens, L Buffat, C Andre, JP Chevrolat, JF Boisvieux and S Hazout2
MOTIVATION: Complete genomic sequences will become available in the future.
New methods are required to deal efficiently with very large sequences
(beyond 100 kb). One of the main aims of such work is to increase our
understanding of genome organization and evolution, which requires studying
the locations of regions of similarity.

RESULTS: We present here
a new tool, ASSIRC ('Accelerated Search for SImilarity Regions in
Chromosomes'), for finding regions of similarity in genomic sequences. The
method involves three steps: (i) identification of short exact chains of
fixed size, called 'seeds', common to both sequences, using hashing
functions; (ii) extension of these seeds into putative regions of
similarity by a 'random walk' procedure; (iii) final selection of regions
of similarity by assessing alignments of the putative sequences. We used
simulations to estimate the proportion of regions of similarity not
detected for particular region sizes, base identity proportions and seed
sizes. This approach can be tailored to the user's specifications. We
looked for regions of similarity between two yeast chromosomes (V and IX).
The efficiency of the approach was compared with that of the conventional
programs BLAST and FASTA by assessing the CPU time required and the regions
of similarity found for the same data set.

AVAILABILITY: Source programs are freely available at the following address:
ftp://ftp.biologie.ens.fr/pub/molbio/assirc.tar.gz

CONTACT: vincens@biologie.ens.fr, hazout@urbb.jussieu.fr
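
Step (i) of the method, finding exact words of fixed size common to both sequences, can be illustrated with a minimal sketch. This is not the ASSIRC source code: the abstract does not say which hashing functions are used, so the sketch assumes a Python dictionary as the hash table. The function name `find_seeds` and the parameter `k` (seed size) are hypothetical names chosen for illustration.

```python
from collections import defaultdict


def find_seeds(seq_a, seq_b, k):
    """Return sorted (i, j) pairs such that seq_a[i:i+k] == seq_b[j:j+k].

    Illustrative only: a dict stands in for the hashing functions
    mentioned in the abstract, whose details are not given there.
    """
    if k <= 0:
        raise ValueError("k must be positive")

    # Hash every k-length word of the first sequence to its start positions.
    index = defaultdict(list)
    for i in range(len(seq_a) - k + 1):
        index[seq_a[i:i + k]].append(i)

    # Scan the second sequence and look each word up in the table.
    seeds = []
    for j in range(len(seq_b) - k + 1):
        for i in index.get(seq_b[j:j + k], ()):
            seeds.append((i, j))

    seeds.sort()
    return seeds


if __name__ == "__main__":
    # Toy sequences, not data from the paper.
    print(find_seeds("ACGTACGT", "TTACGTAA", 4))
    # -> [(0, 2), (1, 3), (3, 1), (4, 2)]
```

Building the table takes one pass over the first sequence and each lookup takes roughly constant time, so the search grows with the sum of the sequence lengths plus the number of seeds found, rather than with the product of the lengths. Seeds that share the same offset `i - j` lie on one diagonal and are natural candidates for the extension step (ii).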
A strategy for finding regions of similarity in complete genome sequences
1Departement de Biologie (FR 36), Ecole Normale Superieure, 46 rue d'Ulm, 75230 Paris Cedex 05, France.


