Skip Navigation


Bioinformatics Advance Access originally published online on September 25, 2007
Bioinformatics 2007 23(21):2957-2958; doi:10.1093/bioinformatics/btm468
This Article
Right arrow Abstract Freely available
Right arrow FREE Full Text (Print PDF) Freely available
Right arrow All Versions of this Article:
23/21/2957    most recent
btm468v1
Right arrow Comments: Submit a response
Right arrow Alert me when this article is cited
Right arrow Alert me when Comments are posted
Right arrow Alert me if a correction is posted
Services
Right arrow Email this article to a friend
Right arrow Similar articles in this journal
Right arrow Similar articles in ISI Web of Science
Right arrow Similar articles in PubMed
Right arrow Alert me to new issues of the journal
Right arrow Add to My Personal Archive
Right arrow Download to citation manager
Right arrow Search for citing articles in:
ISI Web of Science (3)
Right arrowRequest Permissions
Google Scholar
Right arrow Articles by Bernt, M.
Right arrow Articles by Middendorf, M.
Right arrow Search for Related Content
PubMed
Right arrow PubMed Citation
Right arrow Articles by Bernt, M.
Right arrow Articles by Middendorf, M.
Social Bookmarking
 Add to CiteULike   Add to Connotea   Add to Del.icio.us  
What's this?

© The Author 2007. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oxfordjournals.org

CREx: inferring genomic rearrangements based on common intervals

Matthias Bernt 1,*, Daniel Merkle 1, Kai Ramsch 1, Guido Fritzsch 2,3, Marleen Perseke 4, Detlef Bernhard 4, Martin Schlegel 4, Peter F. Stadler 2,3,5,6,7 and Martin Middendorf 1

1Parallel Computing and Complex Systems Group, 2Bioinformatics Group, Department of Computer Science, 3Interdisciplinary Center for Bioinformatics, 4Institute of Biology II, University of Leipzig, 5Fraunhofer Institute IZI, Leipzig, Germany, 6Department of Theoretical Chemistry, University of Vienna, Austria and 7Santa Fe Institute, 1399 Hyde Park Road, Santa Fe, NM 87501, USA

*To whom correspondence should be addressed.


    ABSTRACT
 TOP
 ABSTRACT
 1 INTRODUCTION
 2 METHODS
 3 CREX
 4 CASE STUDY AND...
 ACKNOWLEDGEMENT
 REFERENCES
 

Summary: We present the web-based program CREx for heuristically determining pairwise rearrangement events in unichromosomal genomes. CREx considers transpositions, reverse transpositions, reversals and tandem-duplication-random-loss (TDRL) events. It supports the user in finding parsimonious rearrangement scenarios given a phylogenetic hypothesis. CREx is based on common intervals, which reflect genes that appear consecutively in several of the input gene orders.

Availability: CREx is freely available at http://pacosy.informatik.uni-leipzig.de/crex

Contact: merkle{at}informatik.uni-leipzig.de


    1 INTRODUCTION
 TOP
 ABSTRACT
 1 INTRODUCTION
 2 METHODS
 3 CREX
 4 CASE STUDY AND...
 ACKNOWLEDGEMENT
 REFERENCES
 
Genomic rearrangement operations present a powerful approach to determine the phylogenetic relationship of species. In this context, unichromosomal genomes are usually encoded as signed permutations, in which each element represents a gene and the sign determines the orientation of the gene. Inversions (mostly referred to as ‘reversals’ in the bioinformatics literature) are by far the best-studied gene order rearrangement operations. Given a set of gene orders, software packages such as GRAPPA (Moret et al., 2005) and amGRP (Bernt et al., 2007a) try to find (i) a topology of a binary tree with the gene orders as leaf nodes and (ii) gene orders for the inner nodes of that tree, such that the overall reversal distance of all edges is minimal.

Phylogenetic reconstruction methods based on more general sets of rearrangement operations are rare, presumably because for some of the operations of interest the problem of computing an exact distance is unsolved or computationally hard. On the other hand, there is ample evidence in the biological literature that—particularly in mitochondrial genomes—rearrangements are the consequence of multiple mechanisms (Boore 1999). Biological findings thus strongly influenced methods for phylogenetic reconstruction methods; it is well known, for instance, that gene groups are often preserved during evolution. Algorithmically, this property is reflected by so-called common intervals. Strong common interval trees, which are close relatives of PQ-trees (Booth and Lueker, 1976), are a data structure that is particularly well suited for inferring rearrangement scenarios (Bérard et al., 2007). Algorithms for gene order rearrangements that preserve common intervals are also discussed by Bernt et al., (2007b). Strong common intervals, on the other hand, were already used to reconstruct rearrangement scenarios with the focus on reversals and transpositions (Parida 2006). CREx is based on these approaches and extends them to a larger set of operations.


    2 METHODS
 TOP
 ABSTRACT
 1 INTRODUCTION
 2 METHODS
 3 CREX
 4 CASE STUDY AND...
 ACKNOWLEDGEMENT
 REFERENCES
 
Formally, a common interval is a subset of genes that appear consecutively in two (or more) input gene orders (Bérard et al., 2007). Two intervals I and J are said to commute if either I sub J, or I sup J, or I {cap} J = Ø holds. A common interval is a strong interval if it commutes with every common interval. A strong interval tree (SIT) for two gene orders, finally, is a rooted tree that has exactly one leaf for each gene and exactly one inner node for each strong common interval. The edges of the tree are defined by the inclusion order of the set of strong intervals. There exist two types of inner nodes. An inner node is called (increasing or decreasing) linear if its child nodes are in a left-to-right or right-to-left order, otherwise the child nodes are not ordered and the inner node is called prime.

The basic principle for computing heuristic gene order rearrangement scenarios with CREx is to detect patterns in the SITs that reflect the corresponding genome rearrangement operations. For example, reversals are very simple to detect because this operation is reflected as a sign difference of parent node and child nodes in the SIT. Prime nodes are good indicators for TDRL events. Both transpositions and reverse transpositions also lead to recognizable patterns in the SIT (for details see the webpage of CREx). CREx uses a stepwise approach to suggesting a genome rearrangement scenario: first it identifies transpositions and reverse transpositions, then reversals are identified based on the sign differences between connected nodes in the SIT. In these first two steps, CREx only operates on linear nodes. In the third step, the prime nodes are analyzed to identify combinations of reversal and TDRL operations (including transpositions), which can explain the corresponding prime nodes.

TDRLs are of particular interest for phylogenetic analysis because the distance measure between gene orders that is based on the minimum number of TDRLs is not symmetric (Chaudhuri et al., 2006). In particular, in many cases a rearrangement can be explained by a single TDRL in one direction, while reversing the rearrangement would require more than a single operation. It follows that TDRLs imply the evolutionary direction of the rearrangement in many cases and hence allow the re-construction of the ancestral state from a comparison of two gene orders without considering an outgroup. In contrast, reversals, transpositions and reverse transpositions are inherently symmetric, hence ancestral states cannot be reconstructed without outgroup information.


    3 CREX
 TOP
 ABSTRACT
 1 INTRODUCTION
 2 METHODS
 3 CREX
 4 CASE STUDY AND...
 ACKNOWLEDGEMENT
 REFERENCES
 
CREx is a web-based application for analyzing gene orders based on the application server Zope. The algorithms handling the common interval data structures and for computing scenarios were implemented in C++. They were integrated into Zope via Python modules. For drawing publication ready downloadable versions of the SITs ReportLab was used.

After uploading the gene order data in FASTA format to CREx, a matrix is computed and displayed, with elements colored according to an evolutionary distance measure, such that gene orders with a small evolutionary distance can be easily identified. Therefore, the breakpoint distance, the reversal distance or the number of common intervals is utilized. Columns, rows and individual elements of the matrix can be selected; for the selected elements the SITs are computed and displayed as a tree or as a family diagram (Bergeron and Stoye, 2006). As briefly described in Section 2, the structure of the SIT can be used to infer genomic rearrangement operations connecting a pair of input gene orders. CREx uses its heuristic approach to suggest a rearrangement scenario based on common intervals. CREx allows the user to select individual operations of the scenario to highlight the affected common intervals.


    4 CASE STUDY AND DISCUSSION
 TOP
 ABSTRACT
 1 INTRODUCTION
 2 METHODS
 3 CREX
 4 CASE STUDY AND...
 ACKNOWLEDGEMENT
 REFERENCES
 
While testing CREx we found that, to the best of our knowledge, all reported cases of TDRL events in mitochondrial gene orders in the literature were correctly identified (Arndt and Smith 1998; Inoue et al., 2003). We furthermore used CREx to analyze the genomic rearrangement history of the complete mitochondrial genomes of echinoderms (including the newly sequenced mitogenomes of the crinoid Antedon mediterranea and the ophiuroid Ophiura albida), see Perseke (et al., 2007).

For the sake of space, we can discuss here only a small example to illustrate the functionality of CREx. We analyze the mitochondrial gene order rearrangement history of Asteroidea (A), Echinoidea (E), and Holothuroidea (H) (within these groups all sequenced species have the same gene order). CREx identifies three TDRL events for the evolution H -> E, and one TDRL event, which was analyzed in detail by Arndt and Smith (1998), for E -> H. This implies that the ancestral state conforms E. For A {leftrightarrow} E a single reversal is needed. As CREx suggests the same reversal and the same TDRL for A {leftrightarrow} H (not shown here), we can immediately infer the genome rearrangement events for the phylogeny in Figure 1a. A more detailed analysis including all published echinoderm mitochondrial genomes and different phylogenetic hypotheses is presented in Perseke et al. (2007). The SIT for E ->H is given in Figure 1b, the corresponding TDRL is depicted in Figure 1c.


Figure 1
View larger version (28K):
[in this window]
[in a new window]
[Download PowerPoint slide]
 
Fig. 1. (a) Mitochondrial gene order rearrangement scenario inferred by CREx for the given phylogeny of Asteroidea (A) Echinoidea (E) and Holothuroidea (H) showing one reversal (REV) and one TDRL; (b) SIT for E -> H and (c) TDRL of E ->H as suggested by CREx; (b) and (c) are exported from CREx.

 

    ACKNOWLEDGEMENT
 TOP
 ABSTRACT
 1 INTRODUCTION
 2 METHODS
 3 CREX
 4 CASE STUDY AND...
 ACKNOWLEDGEMENT
 REFERENCES
 
This work was supported by the DFG grant STA 850/3-1 and SCHL 229/14-1 as part of the SPP-1174 (Metazoan Deep Phylogeny).

Conflict of Interest: none declared.


    FOOTNOTES
 
Associate Editor: Keith Crandall

Received on June 27, 2007; accepted on September 7, 2007

    REFERENCES
 TOP
 ABSTRACT
 1 INTRODUCTION
 2 METHODS
 3 CREX
 4 CASE STUDY AND...
 ACKNOWLEDGEMENT
 REFERENCES
 

    Arndt A, Smith MJ. Mitochondrial gene rearrangement in the sea cucumber genus cucumaria. Mol. Biol. Evol. (1998) 15:9–16.[Medline]

    Bérard S, et al. Perfect sorting by reversals is not always difficult. IEEE/ACM Trans. Comput. Biol. Bioinform. (2007) 4:4–16.[CrossRef]

    Bergeron A, Stoye J. On the similarity of sets of permutations and its applications to genome comparison. J. Comput. Biol. (2006) 13:1345–1354.

    Bernt M, et al. Using median sets for inferring phylogenetic trees. Bioinformatics (2007a) 23:e129–e135.[Abstract/Free Full Text]

    Bernt M, et al. A fast and exact algorithm for the perfect reversal median problem. In: Bioinformatics Research and Applications, Proc. of ISBRA 2007, number 4463 in LNBI (2007b) Berlin/Heidelberg: Springer. 305–316.

    Boore JL. Survey and summary animal mitochondrial genomes. Nucleic Acids Res. (1999) 27:1767–1780.[Abstract/Free Full Text]

    Booth KS, Lueker GS. Testing for the consecutive ones property, interval graphs, and planarity using pq-tree algorithms. J. Comput. Syst. Sci. (1976) 13:335–379.

    Chaudhuri K, et al. On the tandem duplication-random loss model of genome rearrangement. Proceedings of the Seventeenth Annual ACM-SIAM Symposium on Discrete Algorithm. ACM Press, New York, NY (2006) pp. 564–570.

    Inoue JG, et al. Evolution of the deep-sea gulper eel mitochondrial genomes: large-scale gene rearrangements originated within the eels. Mol. Biol. Evol. (2003) 20:1917–1924.[Abstract/Free Full Text]

    Moret B, et al. Reconstructing phylogenies from gene-content and gene-order data. Mathematics of Evolution and Phylogenic, Chapter 12 (2005) pp. 321–352.

    Parida L. Using pq structures for genomic rearrangement phylogeny. J. Comput. Biol. (2006) 13:1685–1700.[CrossRef][Web of Science][Medline]

    Perseke M, et al. Evolution of mitochondrial gene orders in echinoderms (submitted) (2007).


Add to CiteULike CiteULike   Add to Connotea Connotea   Add to Del.icio.us Del.icio.us    What's this?



This Article
Right arrow Abstract Freely available
Right arrow FREE Full Text (Print PDF) Freely available
Right arrow All Versions of this Article:
23/21/2957    most recent
btm468v1
Right arrow Comments: Submit a response
Right arrow Alert me when this article is cited
Right arrow Alert me when Comments are posted
Right arrow Alert me if a correction is posted
Services
Right arrow Email this article to a friend
Right arrow Similar articles in this journal
Right arrow Similar articles in ISI Web of Science
Right arrow Similar articles in PubMed
Right arrow Alert me to new issues of the journal
Right arrow Add to My Personal Archive
Right arrow Download to citation manager
Right arrow Search for citing articles in:
ISI Web of Science (3)
Right arrowRequest Permissions
Google Scholar
Right arrow Articles by Bernt, M.
Right arrow Articles by Middendorf, M.
Right arrow Search for Related Content
PubMed
Right arrow PubMed Citation
Right arrow Articles by Bernt, M.
Right arrow Articles by Middendorf, M.
Social Bookmarking
 Add to CiteULike   Add to Connotea   Add to Del.icio.us  
What's this?