Skip Navigation


Bioinformatics Advance Access originally published online on January 19, 2008
Bioinformatics 2008 24(5):711-712; doi:10.1093/bioinformatics/btn026
This Article
Right arrow Abstract Freely available
Right arrow FREE Full Text (Print PDF) Freely available
Right arrow All Versions of this Article:
24/5/711    most recent
btn026v1
Right arrow Comments: Submit a response
Right arrow Alert me when this article is cited
Right arrow Alert me when Comments are posted
Right arrow Alert me if a correction is posted
Services
Right arrow Email this article to a friend
Right arrow Similar articles in this journal
Right arrow Similar articles in ISI Web of Science
Right arrow Similar articles in PubMed
Right arrow Alert me to new issues of the journal
Right arrow Add to My Personal Archive
Right arrow Download to citation manager
Right arrowRequest Permissions
Google Scholar
Right arrow Articles by Gog, S.
Right arrow Articles by Ohlebusch, E.
Right arrow Search for Related Content
PubMed
Right arrow PubMed Citation
Right arrow Articles by Gog, S.
Right arrow Articles by Ohlebusch, E.
Social Bookmarking
 Add to CiteULike   Add to Connotea   Add to Del.icio.us  
What's this?

© The Author 2008. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oxfordjournals.org

GENESIS: genome evolution scenarios

Simon Gog *, Martin Bader and Enno Ohlebusch

Institute of Theoretical Computer Science, University of Ulm, D-89069 Ulm, Germany

*To whom correspondence should be addressed.


    ABSTRACT
 TOP
 ABSTRACT
 1 INTRODUCTION
 2 METHODS
 REFERENCES
 

Summary: We implemented a software tool called GENESIS for three different genome rearrangement problems: Sorting a unichromosomal genome by weighted reversals and transpositions (SwRT), sorting a multichromosomal genome by reversals, translocations, fusions and fissions (SRTl), and sorting a multichromosomal genome by weighted reversals, translocations, fusions, fissions and transpositions (SwRTTl).

Availability: Source code can be obtained by the authors, or use the web interface http://www.uni-ulm.de/in/theo/research/genesis.html

Contact: simon.gog{at}uni-ulm.de


    1 INTRODUCTION
 TOP
 ABSTRACT
 1 INTRODUCTION
 2 METHODS
 REFERENCES
 
During evolution, DNA molecules are subject to local and global mutations. Local mutations (point mutations) consist of the substitution, insertion or deletion of single nucleotides, while global mutations (genome rearrangements) change the DNA molecules on a large scale. In unichromosomal genomes, the most common rearrangements are inversions (also called reversals in bioinformatics), where—from a mathematical point of view—a section of the genome is excised, reversed in orientation and re-inserted. Biologically, inversions can be caused by replication errors. But also large-scale duplications, deletions (gene loss), insertions (e.g. horizontal gene transfer) and transpositions play a role. In a transposition, a section of the genome is excised and inserted at a new position in the genome; this may or may not also involve an inversion. In genomes with multiple chromosomes further genome rearrangements are translocations (in a reciprocal translocation, two non-homologous chromosomes break and exchange fragments), fusions (where two chromosomes fuse) and fissions (where a chromosome breaks into two parts).

Since the 1980s, computer scientists have developed several algorithms for reconstructing genome rearrangement scenarios that transform one genome into another genome (or equivalently, to sort a signed permutation into the identity permutation). These algorithms can be categorized according to the genome rearrangement operations they can deal with, and as to whether they take multiple chromosomes into account. For unichromosomal genomes, the following results are known. If only reversals are allowed, Hannenhalli and Pevzner's (1999) algorithm yields an exact solution to the problem. The currently best algorithm for transpositions is a 1.375-approximation (Elias and Hartman, 2006). For equally weighted reversals and transpositions, Hartman and Sharan (2005) devised a 1.5-approximation algorithm. Bader and Ohlebusch [2007] extended their algorithm to a 1.5-approximation algorithm for any weight ratio between 1:1 and 1:2 (reversals:transpositions). Another program for the weighted case is DERANGE (Blanchette et al., 1996), but the authors of Blacchette et al. (1996) did not provide a guaranteed approximation ratio.

For multichromosomal genomes, less results have been obtained so far. The most realistic solution to the problem was given by Hannenhalli and Pevzner (1995). Their algorithm takes reversals, translocations, fusions and fissions into account, and returns an exact solution. To the best of our knowlegde, the new algorithm presented below is the first that augments their algorithm with transpositions.

We have implemented the following three algorithms:

  1. The algorithm for SwRT by Bader and Ohlebusch (2007) with quadratic running time. Moreover, the combination of this algorithm with a greedy strategy resulted in a practicable method. The price to be paid for this improvement is a cubic running time.
  2. The algorithm for SRTl by Hannenhalli and Pevzner (1995) with the improvements of Tesler (2002) and of Ozery-Flato and Shamir (2003). The running time is quadratic.
  3. An algorithm for SwRTTl, which is a combination of the two algorithms above. The running time of the algorithm is cubic. Our experiments show that it produces good results for biologically reasonable weights. To guarantee an approximation ratio of 2, we further combined the new algorithm with the strategy presented by Yancopoulus et al., 2005, albeit for a different set of rearrangement operations.

GENESIS consists of two parts: The programs themselves and the web interface. The web interface is a simple front-end: One can choose the algorithm, set source and target genome, and get the resulting rearrangement scenario. For the sake of readability, the web interface limits the size of permutations to 80, whereas the main programs can handle permutations of several thousands elements.


    2 METHODS
 TOP
 ABSTRACT
 1 INTRODUCTION
 2 METHODS
 REFERENCES
 
All our algorithms work with the reality-desire diagram (also called breakpoint graph) as described in Bafna and Perzner (1996). Let c({pi}) denote the number of cycles in the reality-desire diagram, codd({pi}) the number of cycles with an odd number of reality-edges (called odd cycles), and ceven({pi}) the number of cycles with an even number of reality-edges (called even cycles). For the weighted algorithms, let wr be the weight of reversals, translocations, fusions and fissions, and let wt be the weight of transpositions. We make the assumption that wr ≤ wt ≤ 2wr.

The algorithm for SwRT is an implementation of the algorithm devised in Bader and Ohlebusch (2007). The score of a permutation is defined by {sigma}({pi}) = codd({pi}) + (2 – 2 · wr / wt)ceven({pi}). In each step, the algorithm searches for a small starting sequence op1, ... , opk with k ≤ 4 and weight w and an increment {Delta} {sigma} = {sigma}(opk(... (op1 ({pi})...))) – {sigma}({pi}) of the score such that {Delta} {sigma}/w ≥ 4/(3wt), and applies this starting sequence. This step is repeated until the permutation is sorted. As the score is maximized for the identity permutation and the maximum possible gain in score per weight is 2/wt, this yields a 1.5 approximation. We combined the algorithm with a greedy strategy that generates different starting sequences, out of which the one with the best gain in score per weight is applied. A further improvement of the output can be obtained if the greedy strategy is provided with a lookahead, which additionally considers the gain in score per weight of following starting sequences. Our experiments showed that the resulting method with limited lookahead is practicable.

The algorithm for SRTL is an efficient implementation of the algorithm proposed in Hannenhalli and Pevzner (1995). First, the reality-desire diagram is built. Elements corresponding to chromosome boundaries are connected such that the distance between the genomes remains the same. Then, in each step the algorithm applies a reversal or translocation that reduces the distance Formula by one. In the formula, n is the number of genes, m the number of chromosomes, and the remaining parameters as well as details are explained in Ozery-Flato and Shamir (2003).

The algorithm for SwRTTl is a mixture of both algorithms. First, it creates the reality-desire diagram. Elements corresponding to chromosome boundaries are connected such that {sigma}({pi}) is maximized. Then, the algorithm generates possible starting sequences as in the greedy strategy of the SwRT-algorithm. However, some of these sequences may contain forbidden transpositions because chromosome boundaries are not allowed in transposed segments. These starting sequences are ignored, and out of the remaining sequences the one with the best increment of score per weight is applied. If all the starting sequences are forbidden, then there must be a translocation that increases the number of cycles by one, and the algorithm applies this translocation. Again, these steps are repeated until the permutation is sorted.

Figure 1 exemplifies the application of our programs to two mitochondrial genomes. A more complex example consists of the whole genomes of man and mouse (data is available on our website). In this case SRTl produces a scenario with 35 operations, while our new algorithm SwRTTl finds a scenario with only 33 operations.


Figure 1
View larger version (30K):
[in this window]
[in a new window]
[Download PowerPoint slide]
 
Fig. 1. Rearrangement scenario between the mitochondrial genomes of Drosophila melanogaster (Garesse, 1988) and Anopheles quadrimaculatus (michell et al., 1993). The SwRT/SwRTTI-algorithm has found a scenario consisting of one reversal and one transpositions, whereas the SRTI algorithm calculates a scenario consisting of four reversals.

 

    Acknowledgments
 
Conflict of Interest: none declared.


    FOOTNOTES
 
Associate Editor: Chris Stoeckert

Received on November 23, 2007; revised on January 14, 2008; accepted on January 14, 2008

    REFERENCES
 TOP
 ABSTRACT
 1 INTRODUCTION
 2 METHODS
 REFERENCES
 

    Bader M, Ohlebusch E. Sorting by weighted reversals, transpositions, and inverted transpositions. J. Comput. Biol (2007) 14:615–636.[CrossRef][Medline]

    Bafna V, Pevzner PA. Genome rearrangements and sorting by reversals. SIAM J. Comput (1996) 25:272–289.[CrossRef]

    Blanchette M, et al. Parametric genome rearrangement. Gene (1996) 172:GC 11–17.[CrossRef][Medline]

    Elias I, Hartman T. A 1.375-approximation algorithm for sorting by transpositions. IEEE/ACM Trans. Comput. Biol. Bioinformatics (2006) 3:369–379.[CrossRef]

    Garesse R. Drosophila melanogaster mitochondrial DNA: Gene organization and evolutionary considerations. Genetics (1988) 118:649–663.[Abstract/Free Full Text]

    Hannenhalli S, Pevzner PA. Transforming men into mice (polynomial algorithm for genetic distance problem). In: Proc. 36th Annual IEEE Symposium on Foundations of Computer Science. (1995) 581–592.

    Hannenhalli S, Pevzner PA. Transforming cabbage into turnip: polynomial algorithm for sorting signed permutations by reversals. J. ACM (1999) 46:1–27.[Medline]

    Hartman T, Sharan R. A 1.5-approximation algorithm for sorting by transpositions and transreversals. J. Comput. Syst. Sci (2005) 70:300–320.[CrossRef]

    Michell S, et al. The mitochondrial genome of Anopheles quadrimaculatus species: complete nucleotide sequence and gene organization. Genome (1993) 36:1058–1073.[Medline]

    Ozery-Flato M, Shamir R. Two notes on genome rearrangement. J. Bioinformatics Comput. Biol (2003) 1:71–94.[CrossRef]

    Tesler G. Efficient algorithms for multichromosomal genome rearrangements. J. Comput. Syst. Sci (2002) 65:587–609.[CrossRef]

    Yancopoulos S, et al. Efficient sorting of genomic permutations by translocation, inversion and block interchange. Bioinformatics (2005) 21:3340–3346.[Abstract/Free Full Text]


Add to CiteULike CiteULike   Add to Connotea Connotea   Add to Del.icio.us Del.icio.us    What's this?



This Article
Right arrow Abstract Freely available
Right arrow FREE Full Text (Print PDF) Freely available
Right arrow All Versions of this Article:
24/5/711    most recent
btn026v1
Right arrow Comments: Submit a response
Right arrow Alert me when this article is cited
Right arrow Alert me when Comments are posted
Right arrow Alert me if a correction is posted
Services
Right arrow Email this article to a friend
Right arrow Similar articles in this journal
Right arrow Similar articles in ISI Web of Science
Right arrow Similar articles in PubMed
Right arrow Alert me to new issues of the journal
Right arrow Add to My Personal Archive
Right arrow Download to citation manager
Right arrowRequest Permissions
Google Scholar
Right arrow Articles by Gog, S.
Right arrow Articles by Ohlebusch, E.
Right arrow Search for Related Content
PubMed
Right arrow PubMed Citation
Right arrow Articles by Gog, S.
Right arrow Articles by Ohlebusch, E.
Social Bookmarking
 Add to CiteULike   Add to Connotea   Add to Del.icio.us  
What's this?