Bioinformatics Advance Access originally published online on July 16, 2008
Bioinformatics 2008 24(20):2399-2400; doi:10.1093/bioinformatics/btn364

© The Author 2008. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oxfordjournals.org

EPoS: a modular software framework for phylogenetic analysis

Thasso Griebel *, Malte Brinkmeyer and Sebastian Böcker

Faculty of Mathematics and Computer Science, Friedrich-Schiller-University Jena, 07743 Jena, Germany

*To whom correspondence should be addressed.


    ABSTRACT

Summary: Estimating Phylogenies of Species (EPoS) is a modular software framework for phylogenetic analysis, visualization and data management. It provides a plugin-based system that integrates a storage facility, a rich user interface and the ability to easily incorporate new methods, functions and visualizations. EPoS ships with persistent data management, a set of well-known phylogenetic algorithms and a multitude of tree visualization methods and layouts. Implemented algorithms cover distance-based tree construction, consensus trees and various graph-based supertree methods. The rendering system can be customized for, say, different edge and node styles.

Availability: Executables and source code are available under the LGPL license at http://www.bio.informatik.uni-jena.de/epos.

Contact: thasso{at}minet.uni-jena.de

Supplementary information: The homepage contains tutorials and documentation for both users and programmers who want to develop plugins and extensions.


    1 INTRODUCTION
 
Estimating Phylogenies of Species (EPoS) is a modular software framework for phylogenetic analysis that supports data management, computational methods and visualizations. A wide variety of tools for phylogenetic analysis exists, but most of them show significant problems regarding usability, data handling and data exchange. Algorithmic packages are often command-line based and require a good understanding of the software environment. Visualization tools, on the other hand, usually offer poor or no support for computational methods. Most programs rely on their own unique file formats, which makes data exchange between programs difficult. Even within a single phylogenetic analysis, a user has to adapt to a multitude of different interfaces and convert data formats manually.

EPoS fills this gap by combining a powerful graphical user interface (GUI) with a plugin system that allows simple integration of new algorithms, visualizations and data structures. It offers a simple way to incorporate new modules into the framework. In fact, the system itself is built from a set of core modules, which allows extensions in all directions. Limitations only concern the GUI and the interaction model. The consistent EPoS GUI is used to manage and store all data and to start the available computational methods. Thus, the phylogenetic analysis workflow is decoupled from the data and the applied methods. EPoS ensures that new computational methods never disrupt existing workflows. Visualizations, in contrast, can be extended in any direction.


Fig. 1. Screenshot of the EPoS user interface with project tree (left), a distance matrix and several phylogenetic trees (right).

 

    2 VISUALIZATIONS
 
EPoS contains views for trees, alignments and matrices. The alignment view allows manual manipulation by modifying gaps, and the comprehensive tree view offers different layouts, colorizations, annotations and export functions. New views for various data types, such as new tree layouts, can easily be integrated into the framework. The built-in tree view module focuses on interactive tree analysis and can display large trees with several thousand leaves without losing the ability to interact smoothly with the view. When two trees are compared side by side, interacting with one tree can trigger actions in the second view, such as highlighting the best corresponding node (Munzner et al., 2003); see Section 4.
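To make the notion of a best corresponding node concrete, the following Python sketch matches a node of one tree to the node of a second tree whose leaf set is most similar. It is an illustration only and is not taken from the EPoS code base: the representation of a tree as a mapping from node names to leaf sets, the function names and the use of the Jaccard index |A ∩ B| / |A ∪ B| as the similarity score are assumptions made for this example.

```python
from typing import Dict, FrozenSet, Tuple

# Hypothetical representation: node name -> set of leaf labels below that node.
LeafSets = Dict[str, FrozenSet[str]]


def jaccard(a: FrozenSet[str], b: FrozenSet[str]) -> float:
    """Similarity of two leaf sets: |a & b| / |a | b| (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def best_corresponding_node(
    node: str, tree: LeafSets, other: LeafSets
) -> Tuple[str, float]:
    """Return the node of `other` whose leaf set is most similar to `node`'s,
    together with the similarity score."""
    leaves = tree[node]
    best = max(other, key=lambda candidate: jaccard(leaves, other[candidate]))
    return best, jaccard(leaves, other[best])


# Two small example trees over the leaves A, B, C and D.
tree_a: LeafSets = {
    "root": frozenset("ABCD"),
    "x": frozenset("AB"),
    "y": frozenset("CD"),
}
tree_b: LeafSets = {
    "root": frozenset("ABCD"),
    "u": frozenset("ABC"),
    "v": frozenset("AB"),
}

if __name__ == "__main__":
    for name in tree_a:
        print(name, "->", best_corresponding_node(name, tree_a, tree_b))
```

In this example, node x of the first tree is matched to node v of the second tree with score 1.0, because both cover exactly the leaves A and B. Node y has no exact counterpart and is matched to the root with score 0.5.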


    3 DATA MANAGEMENT
 
To simplify data handling, EPoS creates a persistent workspace that contains all data using a transparent and extendable back-end module. Changes in, say, the visualization of a tree (such as colors or layout) are persistently stored in a tree visualization object. EPoS uses an embedded database as default storage location, but there is no need for the user to manually interact with the database. Data can also be stored on a remote database server.

EPoS’ Application Programming Interface (API) allows data objects to carry private data and supplementary properties. For example, web services can be used to obtain additional information on an object without modifying the object's implementation. This feature can also be used by computational methods that need supplementary information besides the tree structure: the Ranked Tree algorithm (Bryant et al., 2004) requires information about divergence dates in the input trees (see Section 4). Such data is simply added to the trees as a supplementary property.
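The idea of supplementary properties can be illustrated with a small Python sketch. It does not reproduce the EPoS API: the class `TreeNode`, the property key `divergence_date` and the helper functions are hypothetical names chosen for this example. The point is that extra information is attached to a node through a generic property map, so the node class itself does not change when a method such as Ranked Tree needs additional data.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Iterator, List, Optional


@dataclass
class TreeNode:
    """A tree node with a generic map for supplementary properties (hypothetical)."""
    label: Optional[str] = None
    children: List["TreeNode"] = field(default_factory=list)
    # Supplementary data lives here, not in dedicated attributes of the class.
    properties: Dict[str, Any] = field(default_factory=dict)


def internal_nodes(node: TreeNode) -> Iterator[TreeNode]:
    """Yield every node that has at least one child, in preorder."""
    if node.children:
        yield node
        for child in node.children:
            yield from internal_nodes(child)


def has_divergence_dates(root: TreeNode) -> bool:
    """Check whether every internal node carries a 'divergence_date' property."""
    return all("divergence_date" in n.properties for n in internal_nodes(root))


# The tree ((A,B),C) without any supplementary data.
inner = TreeNode(children=[TreeNode("A"), TreeNode("B")])
root = TreeNode(children=[inner, TreeNode("C")])
dated_before = has_divergence_dates(root)

# Attach divergence dates (arbitrary example values) as supplementary properties.
inner.properties["divergence_date"] = 4.0
root.properties["divergence_date"] = 10.0
dated_after = has_divergence_dates(root)

if __name__ == "__main__":
    print(dated_before, dated_after)
```

A method that depends on divergence dates could call a check like `has_divergence_dates` before it runs, while methods that only need the tree structure ignore the property map.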


    4 METHODS
 
All computational methods are integrated into a pipeline system. This allows combinations of methods to be executed sequentially, where the data flow is handled automatically by the system. EPoS provides pipelines for different computational methods. It supports distance-based tree reconstruction methods including Neighbor Joining (Saitou and Nei, 1987) and Agglomerative Clustering, consensus construction such as Adams- and N-Consensus and several supertree methods that merge trees with overlapping leaf sets. EPoS directly supports Aho's Build (Aho et al., 1981), MinCut (Semple and Steel, 2000), modified MinCut (Page, 2002), Ranked Tree (Bryant et al., 2004) and Ancestral Build (Berry and Semple, 2006) as graph-based supertree algorithms. In addition, we implemented a tree comparison method based on the best corresponding node of Munzner et al. (2003). This method matches each node of one tree to a corresponding node in the other tree by comparing the leaf sets below the two nodes. We extended the method to handle internal labels and to propagate subtree scores upwards. No external software packages have to be installed to use any of these algorithms. New methods can easily be integrated into EPoS, as explained in the web tutorial.
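As an illustration of the graph-based approach, the following Python sketch implements the basic Build procedure of Aho et al. (1981) on rooted triples. It is a textbook-style sketch and not the EPoS implementation; the function names, the encoding of a triple ab|c as the tuple (a, b, c) and the Newick-style output string are assumptions made for this example. The procedure connects a and b for every triple ab|c on the current leaf set, splits the leaves into the connected components of that graph and recurses on each component. If the graph on more than one leaf is connected, the triples are incompatible.

```python
from typing import Dict, Iterable, List, Optional, Set, Tuple

# (a, b, c) encodes the rooted triple ab|c: a and b are closer to each other than to c.
Triple = Tuple[str, str, str]


def components(leaves: Set[str], triples: List[Triple]) -> List[Set[str]]:
    """Connected components of the graph with an edge a-b for every triple ab|c."""
    adjacent: Dict[str, Set[str]] = {leaf: set() for leaf in leaves}
    for a, b, _ in triples:
        adjacent[a].add(b)
        adjacent[b].add(a)
    seen: Set[str] = set()
    result: List[Set[str]] = []
    for start in sorted(leaves):  # sorted for a deterministic component order
        if start in seen:
            continue
        component: Set[str] = set()
        stack = [start]
        while stack:
            leaf = stack.pop()
            if leaf not in component:
                component.add(leaf)
                stack.extend(adjacent[leaf] - component)
        seen |= component
        result.append(component)
    return result


def build(leaves: Set[str], triples: Iterable[Triple]) -> Optional[str]:
    """Return a Newick-style string displaying all triples, or None if they conflict."""
    if len(leaves) == 1:
        return next(iter(leaves))
    # Only triples whose three leaves all lie in the current leaf set matter.
    local = [t for t in triples if set(t) <= leaves]
    parts = components(leaves, local)
    if len(parts) == 1:
        return None  # connected graph: no tree displays all triples
    subtrees = [build(part, local) for part in parts]
    if any(subtree is None for subtree in subtrees):
        return None
    return "(" + ",".join(subtrees) + ")"


compatible = build({"A", "B", "C", "D"}, [("A", "B", "C"), ("C", "D", "A")])
conflicting = build({"A", "B", "C"}, [("A", "B", "C"), ("A", "C", "B")])

if __name__ == "__main__":
    print(compatible)   # ((A,B),(C,D))
    print(conflicting)  # None
```

Supertree construction with this procedure would first decompose each input tree into its rooted triples and then call `build` on the union of all leaves and triples; that decomposition step is omitted here.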

The execution environment is another extendable part within the framework. In this way, EPoS is not limited to the local machine for executing pipelines. In the future, this will allow data and jobs to be moved to other machines or compute grids.
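The separation between a pipeline and the environment that executes it can be sketched as follows. The classes `Pipeline`, `Executor` and `LocalExecutor` and the two toy steps are hypothetical and do not correspond to the EPoS API. The sketch only shows the design idea: the pipeline describes which steps run in which order, while an exchangeable executor decides where they run, so that a remote or grid-based executor could later replace the local one without changes to the pipeline.

```python
from typing import Any, Callable, Dict, Optional, Protocol, Sequence, Tuple

Step = Callable[[Any], Any]


class Executor(Protocol):
    """Anything that can run a sequence of steps on some input data."""

    def run(self, steps: Sequence[Step], data: Any) -> Any: ...


class LocalExecutor:
    """Runs every step in the current process, feeding each result to the next step."""

    def run(self, steps: Sequence[Step], data: Any) -> Any:
        for step in steps:
            data = step(data)
        return data


class Pipeline:
    """An ordered list of steps plus the executor that runs them."""

    def __init__(self, steps: Sequence[Step], executor: Optional[Executor] = None):
        self.steps = list(steps)
        self.executor: Executor = executor or LocalExecutor()

    def execute(self, data: Any) -> Any:
        return self.executor.run(self.steps, data)


def hamming_distances(alignment: Dict[str, str]) -> Dict[Tuple[str, str], int]:
    """Toy step: pairwise number of mismatching columns in an alignment."""
    names = sorted(alignment)
    return {
        (a, b): sum(x != y for x, y in zip(alignment[a], alignment[b]))
        for i, a in enumerate(names)
        for b in names[i + 1:]
    }


def closest_pair(distances: Dict[Tuple[str, str], int]) -> Tuple[str, str]:
    """Toy step: the pair of taxa with the smallest distance."""
    return min(distances, key=lambda pair: distances[pair])


alignment = {"A": "ACGT", "B": "ACGA", "C": "TCTA"}
pipeline = Pipeline([hamming_distances, closest_pair])
result = pipeline.execute(alignment)

if __name__ == "__main__":
    print(result)  # ('A', 'B')
```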


    5 CONCLUSION
 
EPoS provides a scalable and extendable software framework for phylogenetic analysis. EPoS combines computational methods, data visualization tools and data management into one environment. The simplicity of the underlying plugin mechanism allows developers to easily integrate their own tools and algorithms into the framework, and to benefit from methods provided by others. The process of connecting algorithms to data and data to visualizations is completely covered by the system. Developers do not have to worry about persistence and data integrity. Users can access new computational methods without adapting to a new software environment.

Conflict of Interest: none declared.


    FOOTNOTES
 
Associate Editor: Martin Bishop

Received on April 11, 2008; revised on June 20, 2008; accepted on July 14, 2008

    REFERENCES

    Aho AV, et al. Inferring a tree from lowest common ancestors with an application to the optimization of relational expressions. SIAM J. Comput. (1981) 10:405–421.

    Berry V, Semple C. Fast computation of supertrees for compatible phylogenies with nested taxa. Syst. Biol. (2006) 55:270–288.

    Bryant D, et al. Supertree methods for ancestral divergence dates and other applications (2004) Kluwer: Computational Biology Series. 129–150.

    Munzner T, et al. Treejuxtaposer: scalable tree comparison using focus+context with guaranteed visibility. ACM Trans. Graph. (2003) 22:453–462.

    Page RDM. Modified mincut supertrees. In: Proceedings of Workshop on Algorithms in Bioinformatics (WABI 2002) (2002) Springer. 537–552. Vol. 2452 of LNCS.

    Saitou N, Nei M. The neighbor-joining method: a new method for reconstructing phylogenetic trees. Mol. Biol. Evol. (1987) 4:406–425.

    Semple C, Steel M. A supertree method for rooted trees. Discrete Appl. Math. (2000) 105:147–158.

