Bioinformatics 2005 21(Suppl 1):i351-i358; doi:10.1093/bioinformatics/bti1018
© The Author 2005. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions{at}oupjournals.org

De novo identification of repeat families in large genomes

Alkes L. Price, Neil C. Jones and Pavel A. Pevzner*

Department of Computer Science and Engineering, University of California San Diego, La Jolla, CA 92093-0114, USA

*To whom correspondence should be addressed.

Every time we compare two species that are closer to each other than either is to humans, we get nearly killed by unmasked repeats.

Webb Miller (Personal communication)

Motivation: De novo repeat family identification is a challenging algorithmic problem of great practical importance. As the number of genome sequencing projects increases, there is a pressing need to identify the repeat families present in large, newly sequenced genomes. We develop a new method for de novo identification of repeat families via extension of consensus seeds; our method enables a rigorous definition of repeat boundaries, a key issue in repeat analysis.
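The abstract does not spell out how a consensus seed is extended, so the following is only a toy sketch of the general idea, not RepeatScout's actual scoring or boundary rule. It assumes a seed is simply the most frequent l-mer, and that extension proceeds one base at a time by majority vote across all occurrences until agreement drops below a threshold. The function names and the parameters `l` and `min_agree` are hypothetical and chosen for illustration.

```python
from collections import Counter, defaultdict


def most_frequent_lmer(genome: str, l: int) -> tuple[str, list[int]]:
    """Return the most frequent l-mer and the start positions where it occurs."""
    positions = defaultdict(list)
    for i in range(len(genome) - l + 1):
        positions[genome[i:i + l]].append(i)
    # Ties are broken in favour of the l-mer seen first in the sequence.
    seed = max(positions, key=lambda k: len(positions[k]))
    return seed, positions[seed]


def extend_right(genome: str, seed: str, starts: list[int],
                 min_agree: float = 0.6) -> str:
    """Grow a consensus to the right by majority vote over all occurrences.

    Assumption: extension stops when the most common next base is supported
    by fewer than `min_agree` of the occurrences. This simple stopping rule
    stands in for a rigorous boundary definition.
    """
    consensus = seed
    offset = len(seed)
    while True:
        column = Counter(genome[s + offset] for s in starts
                         if s + offset < len(genome))
        if not column:
            break  # every occurrence has run off the end of the sequence
        base, votes = column.most_common(1)[0]
        if votes / len(starts) < min_agree:
            break  # occurrences diverge here: treat as the repeat boundary
        consensus += base
        offset += 1
    return consensus


# Toy sequence: three copies of a 10 bp repeat separated by unrelated flanks.
repeat = "ACGTTGCAGT"
genome = "CC" + repeat + "AAA" + repeat + "GGC" + repeat + "TA"

seed, starts = most_frequent_lmer(genome, l=4)
consensus = extend_right(genome, seed, starts)
print(seed, starts, consensus)  # prints: ACGT [2, 15, 28] ACGTTGCAGT
```

In this toy input the three copies are exact, so the vote is unanimous inside the repeat and splits three ways at the first flanking base, which is where extension stops. Real genomes contain diverged and truncated copies, which is why a principled alignment-based score is needed in practice.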

Results: Our RepeatScout algorithm is more sensitive and is orders of magnitude faster than RECON, the dominant tool for de novo repeat family identification in newly sequenced genomes. Using RepeatScout, we estimate that ~2% of the human genome and 4% of mouse and rat genomes consist of previously unannotated repetitive sequence.

Availability: Source code is available for download at http://www-cse.ucsd.edu/groups/bioinformatics/software.html

Contact: ppevzner{at}cs.ucsd.edu


Received on January 15, 2005; accepted on March 27, 2005
