Bioinformatics 2005 21(Suppl 3):iii31-iii38; doi:10.1093/bioinformatics/bti1200

© The Author 2005. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions{at}oxfordjournals.org

DNA assembly with gaps (Dawg): simulating sequence evolution

Reed A. Cartwright

Department of Genetics, University of Georgia, Athens, GA 30602-7223, USA

Motivation: Relationships among taxa are inferred from biological data using phylogenetic methods. Very few known phylogenies exist against which to test the accuracy of these inferences, so simulated data must be used to test the methods that produce them. Researchers have few or no options for simulations suited to studying the impact of insertions, deletions and alignments on phylogenetic accuracy.

Results: To fill this gap, I have developed a new algorithm for indel formation and incorporated it into a new, flexible and portable application for sequence simulation. The application, called Dawg, simulates phylogenetic evolution of DNA sequences in continuous time using the general time reversible model with gamma and invariant rate heterogeneity and a novel length-dependent model of indel formation. On completion, Dawg produces the true alignment of the simulated sequences. Unlike other applications, Dawg allows indel lengths to be explicitly distributed according to a biologically realistic power law. Many options are available to allow users to customize their simulations and results. Because simulating with indels would be problematic if biologically realistic parameters could not be estimated, a script is provided with Dawg that can estimate the parameters of indel formation from sequence data. Dawg was applied to the sequences of four chloroplast trnK introns, where it was used to parametrically bootstrap an estimate of the rate of indel formation for the phylogeny. Because Dawg can assist in parametric bootstrapping of sequence data, it is useful beyond phylogenetics, for example in studying alignment algorithms or parameters of molecular evolution.
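As an illustration of what drawing indel lengths from a power law can look like, the following Python sketch samples lengths with probability proportional to k^(-a) for k = 1, ..., max_len. It is not taken from Dawg's source code: the function names, the truncation at `max_len`, and the exponent value used in the usage lines are all assumptions made only for this example.

```python
import random
from bisect import bisect_left
from itertools import accumulate


def power_law_weights(exponent, max_len):
    """Unnormalized weights k**(-exponent) for k = 1..max_len (hypothetical helper)."""
    return [k ** -exponent for k in range(1, max_len + 1)]


def make_indel_length_sampler(exponent, max_len, rng=None):
    """Return a function that draws indel lengths from a truncated power law.

    Assumption: lengths are capped at max_len so the distribution can be
    normalized by a finite sum; the abstract does not say how Dawg does this.
    """
    rng = rng or random.Random()
    cdf = list(accumulate(power_law_weights(exponent, max_len)))
    total = cdf[-1]

    def sample():
        # Inverse-CDF sampling: find the first length whose cumulative
        # weight reaches a uniform draw on [0, total).
        u = rng.random() * total
        index = min(bisect_left(cdf, u), max_len - 1)
        return index + 1

    return sample


if __name__ == "__main__":
    # Exponent 1.5 and a cap of 100 are illustrative values, not estimates.
    sample = make_indel_length_sampler(1.5, 100, random.Random(1))
    print([sample() for _ in range(20)])
```

Short indels dominate under such a distribution, while occasional long indels still occur, which is the qualitative behaviour a power law is meant to capture.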

Availability: Dawg 1.0.0 can be obtained at the following websites: http://www.genetics.uga.edu/sw/ or http://scit.us/dawg/. The package includes source code, example files, a brief manual and helper scripts. Binary distributions are available for Windows and Macintosh OS X. A development page for Dawg exists at http://scit.us/dawg/, with links to a Subversion repository, mailing lists and updated versions.

Contact: rac{at}uga.edu


Received on May 29, 2005; accepted on August 16, 2005



