Bioinformatics Advance Access originally published online on July 14, 2006
Bioinformatics 2006 22(18):2308-2309; doi:10.1093/bioinformatics/btl389
SynView: a GBrowse-compatible approach to visualizing comparative genome data
1 Center for Tropical and Emerging Global Diseases, University of Georgia, Athens, GA 30602-2606, USA
2 Department of Genetics, University of Georgia, Athens, GA 30602-7223, USA
3 Department of Computer Science, University of Georgia, Athens, GA 30602-7404, USA
4 Penn Genomics Institute, University of Pennsylvania, Philadelphia, PA 19104-6017, USA
*To whom correspondence should be addressed.
| ABSTRACT |
|---|
Summary: We present SynView, a simple and generic approach to dynamically visualize multi-species comparative genome data. It is a light-weight application, written in Perl, that runs on top of the popular and configurable web-based GBrowse framework. It can be used with a variety of databases and provides the user with a high degree of interactivity. It is in use in the PlasmoDB (http://www.PlasmoDB.org) and CryptoDB (http://www.CryptoDB.org) projects and can be easily integrated into other cross-species comparative genome projects.
Availability: The program and instructions are freely available at http://www.ApiDB.org/apps/SynView/
Contact: jkissing{at}uga.edu
| INTRODUCTION |
|---|
Comparative genome analyses of multiple species separated by varying evolutionary distances currently serve many roles. They are an essential means to identify genes and gene regulatory elements, to facilitate annotation, and to study genome evolution. Visualization tools are useful in performing a systematic comparison of genomic sequences. The Generic Genome Browser, also known as GBrowse (Stein et al., 2002), is a portable, flexible, web-based application for displaying genomic annotations and features. It has been widely adopted by major model organism genome projects. However, its lack of support for multi-species genome visualization leaves GBrowse unsuitable for comparative genome projects. Current comparative genome viewers, such as SynBrowse (Pan et al., 2005), SYBIL (http://sybil.sourceforge.net) and VISTA (Frazer et al., 2004), require independent setup, customization, and system maintenance, even when a project has already deployed the well-developed GBrowse. Therefore, to exploit the rich annotation and features provided by GBrowse and to avoid duplicating effort in setting up and maintaining a separate synteny visualization tool, we developed SynView, a light-weight, interactive and customizable comparative genomic visualization tool based on the popular GBrowse framework. It can display both the genome comparison and its associated functional annotations in the same working environment. It is seamlessly integrated with GBrowse and inherits GBrowse functionality, such as semantic zooming and feature filtering. SynView can be easily adopted by other cross-species genome comparison projects owing to its flexibility and versatility in system customization and database connection.
| IMPLEMENTATION |
|---|
By default, GBrowse does not support drawing connections between features on different tracks. However, the Bio::Graphics::Panel (http://www.bioperl.org) module used by GBrowse accepts a postgrid argument that allows drawing callback-generated images onto the panel's background after the grid lines are drawn. Taking advantage of this feature, SynView draws trapezoids highlighting relationships between any pair of features within a single GBrowse panel.
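The geometry behind such a trapezoid can be sketched as follows. This is an illustration in Python rather than SynView's Perl code, and it does not use the Bio::Graphics API. The function names, the `(start, end)` feature tuples and the track y-offsets are all hypothetical, and the sketch assumes both features are already expressed in the coordinate system of the displayed region.

```python
from typing import List, Tuple

Point = Tuple[int, int]


def base_to_pixel(pos: int, view_start: int, view_end: int, panel_width: int) -> int:
    """Map a base-pair position to a horizontal pixel, clamped to the panel."""
    span = view_end - view_start
    if span <= 0:
        raise ValueError("view_end must be greater than view_start")
    x = round((pos - view_start) * panel_width / span)
    return max(0, min(panel_width, x))


def trapezoid(
    ref_feature: Tuple[int, int],
    query_feature: Tuple[int, int],
    view_start: int,
    view_end: int,
    panel_width: int,
    ref_bottom_y: int,
    query_top_y: int,
) -> List[Point]:
    """Return the four corners of a polygon joining two related features.

    The top edge sits under the reference feature and the bottom edge sits
    above the query feature; corners are listed clockwise from the top left.
    """
    ref_left = base_to_pixel(ref_feature[0], view_start, view_end, panel_width)
    ref_right = base_to_pixel(ref_feature[1], view_start, view_end, panel_width)
    qry_left = base_to_pixel(query_feature[0], view_start, view_end, panel_width)
    qry_right = base_to_pixel(query_feature[1], view_start, view_end, panel_width)
    return [
        (ref_left, ref_bottom_y),
        (ref_right, ref_bottom_y),
        (qry_right, query_top_y),
        (qry_left, query_top_y),
    ]


# Hypothetical features in a 1 kb window rendered on a 500-pixel-wide panel.
corners = trapezoid((1200, 1400), (1300, 1600), 1000, 2000, 500, 40, 80)
```

A background-drawing callback would then fill this polygon before the feature glyphs are rendered, so the shaded region appears behind both tracks.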
Mapping data containing the features to be compared can be generated by a variety of mechanisms. To map a query genome to a reference genome, it is possible to use MUMmer (Kurtz et al., 2004) or BLASTN (Altschul et al., 1997), among others. To identify orthologous groups of proteins, we use OrthoMCL (Chen et al., 2006) because it provides a scalable method for constructing orthologous groups across multiple taxa, using a Markov Clustering algorithm to group orthologs and paralogs.
SynView reads mapping data and allows a user to compare multiple genomes at several scales. Figure 1 illustrates several features of SynView. In this example, we compare a portion of two eukaryotic genomes, Homo sapiens and Fugu rubripes. H.sapiens chromosome 12 is the reference genome and F.rubripes, Ensembl build version 4.0, is the query genome. Genome alignments are generated by BLASTN; based on these alignments, the coordinates of the genes on the query genome are updated and mapped to the reference genome. In this example, SynView reads the mapping data generated by OrthoMCL. It draws a pink-shaded trapezoid between each pair of orthologs to highlight the relationship. Trapezoids can be drawn based on gene orthology, similarity span or exon, as illustrated in Figure 1A and B. A set of customizable colors, from light to dark pink, is used to differentiate levels of similarity (Figure 1C). SynView highlights the background of a gene (orange in this example) if its orthologous gene is not in the currently displayed frame. SynView can also display comparisons of sequences from multiple species. Figure 1D illustrates a comparison of three Plasmodium species.
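The similarity shading and the out-of-view highlight described above can be illustrated with a minimal Python sketch. It is not taken from SynView: the RGB values, the number of shade levels, the 0 to 1 similarity scale and the function names are all assumptions chosen for illustration.

```python
from typing import Tuple

RGB = Tuple[int, int, int]

# Hypothetical endpoints of the light-to-dark pink ramp.
LIGHT_PINK: RGB = (255, 220, 230)
DARK_PINK: RGB = (200, 80, 120)


def similarity_color(similarity: float, levels: int = 4,
                     light: RGB = LIGHT_PINK, dark: RGB = DARK_PINK) -> RGB:
    """Bin a similarity score in [0, 1] into one of `levels` pink shades."""
    s = max(0.0, min(1.0, similarity))
    # The highest score (1.0) falls into the last bin rather than a new one.
    level = min(levels - 1, int(s * levels))
    t = level / (levels - 1) if levels > 1 else 0.0
    return tuple(round(a + (b - a) * t) for a, b in zip(light, dark))


def ortholog_out_of_view(ortholog_start: int, ortholog_end: int,
                         view_start: int, view_end: int) -> bool:
    """True when the ortholog lies entirely outside the displayed region."""
    return ortholog_end < view_start or ortholog_start > view_end


# A weak match gets the lightest shade, a strong match the darkest.
weak = similarity_color(0.1)
strong = similarity_color(0.95)
# An ortholog at 5000-6000 is not visible in a 1000-2000 window, so the
# gene's background would be highlighted instead of drawing a trapezoid.
flag = ortholog_out_of_view(5000, 6000, 1000, 2000)
```

Binning into a small number of discrete shades, rather than using a continuous gradient, keeps neighbouring trapezoids visually distinguishable; whether SynView bins or interpolates is not stated in the text.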
| DISCUSSION |
|---|
SynView was developed for use in the CryptoDB project (Heiges et al., 2006), and has since been adopted by the PlasmoDB project (Bahl et al., 2003; Kissinger et al., 2002) to compare several genomes that are more divergent than Cryptosporidium. SynView differs from other comparative genomics visualization tools in that it is compatible with, and runs on top of, GBrowse. As a result, it can easily be integrated with other projects.
We find SynView's colored trapezoids to be much less overwhelming and more straightforward to interpret than the display of SynBrowse, which draws a complicated spider-web of lines between every shared exon/intron boundary. Interactivity is another noteworthy feature of SynView. Rather than presenting an overwhelmingly large amount of information to the user at once, SynView gives the user control over the items to be displayed. Additional data, including the exact location of syntenic genes, can be retrieved at will via a mouse-over feature. Furthermore, each trapezoid region can be configured to be clickable and to link to any specified description page.
As the number of genomes to be compared increases, the complexity of the comparison will consequently skyrocket in terms of both computational time and visualization demands. Visualization alone will provide rather limited support to large-scale comparison of gene orders and detection of rearrangements. Therefore, we are in the process of developing scripts to automatically compare and detect potential synteny breakpoints and gene order patterns. This feature will facilitate the progress of genome comparisons and sequence annotation.
| Acknowledgments |
|---|
The authors thank Steve Fischer and Mark Heiges for their valuable suggestions and comments. The authors are grateful to the collaborative ApiDB efforts directed towards database integration. This project has been funded in whole or in part with Federal funds from the National Institute of Allergy and Infectious Diseases, National Institutes of Health, Department of Health and Human Services, under Contract No. HHSN266200400037C.
Conflict of Interest: none declared.
| FOOTNOTES |
|---|
Associate Editor: Nikolaus Rajewsky
Received on March 8, 2006; revised on July 8, 2006; accepted on July 8, 2006
| REFERENCES |
|---|
Altschul, S.F., et al. (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res., 25, 3389–3402.
Bahl, A., et al. (2003) PlasmoDB: the Plasmodium genome resource. A database integrating experimental and computational data. Nucleic Acids Res., 31, 212–215.
Chen, F., et al. (2006) OrthoMCL-DB: querying a comprehensive multi-species collection of ortholog groups. Nucleic Acids Res., 34, D363–368.
Frazer, K.A., et al. (2004) VISTA: computational tools for comparative genomics. Nucleic Acids Res., 32, W273–279.
Heiges, M., et al. (2006) CryptoDB: a Cryptosporidium bioinformatics resource update. Nucleic Acids Res., 34, D419–D422.
Kissinger, J.C., et al. (2002) The Plasmodium genome database. Nature, 419, 490–492.
Kurtz, S., et al. (2004) Versatile and open software for comparing large genomes. Genome Biol., 5, R12.
Pan, X., et al. (2005) SynBrowse: a synteny browser for comparative sequence analysis. Bioinformatics, 21, 3461–3468.
Stein, L.D., et al. (2002) The generic genome browser: a building block for a model organism system database. Genome Res., 12, 1599–1610.