Bioinformatics Advance Access originally published online on January 28, 2008
Bioinformatics 2008 24(4):584-585; doi:10.1093/bioinformatics/btm627

© The Author 2008. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oxfordjournals.org

Utility library for structural bioinformatics

Dominik Gront * and Andrzej Kolinski

University of Warsaw, Faculty of Chemistry, Pasteura 1, 02-093 Warsaw, Poland

*To whom correspondence should be addressed.


    ABSTRACT

Summary: In this Note we present a new software library for structural bioinformatics. The library contains programs that compute sequence- and profile-based alignments and perform a variety of structural calculations, with user-friendly handling of various data formats. The software organization is very flexible: the algorithms are written in Java and may be used by Java programs, and the modules can also be accessed from Jython scripts (Jython is the Python scripting language implemented in Java). Finally, the new version of BioShell delivers several utility programs that perform typical bioinformatics tasks from the command line.

Availability: The software is available for download free of charge from its website: http://bioshell.chem.uw.edu.pl. The website also provides numerous examples, code snippets and API documentation.

Contact: dgront@chem.uw.edu.pl


    1 INTRODUCTION
BioShell (Gront and Kolinski, 2006), a suite of programs published a few years ago, was designed as an extension of the Unix shell with commands for common bioinformatics tasks. In the course of its development it became obvious that a set of independent programs is not the most efficient way to perform a variety of computational tasks. Adding new programs and extending their lists of command-line options was not enough to meet the needs of all users, both inside and outside our laboratory. The idea of BioShell has therefore been radically modified: the package has been rewritten in a highly modular, object-oriented fashion. After two years of development, BioShell is moving toward a general biomodeling scripting language.


    2 BIOSHELL OVERVIEW
In this article we present a large library of modules written in Java. The library is not intended solely for Java programmers. The new BioShell functionality follows a novel approach to toolkit construction and may be accessed in three ways:

  • as a set of command-line tools. This is unchanged from the previous BioShell distribution, apart from a few new commands (executable programs) that were added recently. From this point of view BioShell resembles command-line packages such as EMBOSS (Rice et al., 2000).
  • as a library of modules for the Python language. Java classes may be called directly from Python, provided that the Python interpreter is itself implemented in Java, as Jython is (see footnote 1). This way of using BioShell follows Bio* projects such as BioPython (Hamelryck and Manderick, 2003) and BioPerl (Stajich et al., 2002). Compared to these two scripting packages, BioShell offers a wider range of possible applications, focusing on various aspects of structure analysis rather than solely on sequence-based bioinformatics. Scripting is intended to be the main way to access BioShell modules.
  • finally, the jbcl (Java BioComputing Library), which holds over 80% of the BioShell code, can be used directly to develop Java programs. In this respect BioShell resembles Java-oriented libraries such as BioJava (Pocock et al., 2000).
The software architecture outlined above is very helpful for those learning BioShell. Part of the daily work may be done with command-line tools without any programming. Larger projects usually involve preparing a script. Finally, when the investigated idea has been proved and the script found to be useful, that script may be compiled into Java bytecode (a class file) by the jythonc utility from the Jython package.

BioShell is designed to be easy to learn and to use. In particular, the authors devoted a lot of attention to avoiding an explosion of different kinds of objects, data structures, etc. Wherever possible, BioShell methods return a basic data type (such as double, integer or String), so the number of object types that a user must learn is small. BioShell itself does not use any external packages or libraries. The only components that must be installed to use BioShell are the Java Runtime Environment and the Jython interpreter.

2.1 BioShell modules
The package integrates some of our previous projects: T-Pile (Gront and Kolinski, 2007), BBQ (Gront et al., 2007) and HCPM (Gront and Kolinski, 2005). There are also many new modules. Currently the source code comprises almost 40,000 lines of Java code in 260 classes. We must therefore limit ourselves to listing only the most important (and probably the most frequently used) packages:

jbcl.data.dict: dictionaries holding various constants, facts, etc., for example formulas of ligands common in PDB files, van der Waals atomic radii or the standard geometry of amino acid residues (bond lengths and planar angles).
jbcl.data.formats: parsers for the most popular file formats, such as PDB, FASTA, DSSP, PIR, MOL2 and others.
jbcl.data.types: classes that represent objects typical for bioinformatics: proteins, residues, sequence profiles, etc.
jbcl.calc.alignment: classes that calculate sequence alignments of various flavors.
jbcl.calc.structural: classes that calculate structural properties, such as protein–protein similarity (crmsd, drmsd, GDT-TS, TM-score), torsion and planar angles, and interatomic distances.
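
To make the structural quantities named above concrete, the following is a minimal sketch in plain Python of an interatomic distance, a planar angle, a torsion angle and a distance RMSD (drmsd). It does not use BioShell: all function names are hypothetical and chosen for illustration only, and the drmsd shown here is the usual root-mean-square difference over all pairwise distances, which is assumed rather than taken from the jbcl implementation.

```python
import math

# Illustrative helpers only; these names are NOT part of the jbcl API.

def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def distance(a, b):
    """Euclidean distance between two points given as (x, y, z)."""
    d = _sub(a, b)
    return math.sqrt(_dot(d, d))

def planar_angle(a, b, c):
    """Angle a-b-c in degrees, with b at the vertex."""
    u, v = _sub(a, b), _sub(c, b)
    cos_t = _dot(u, v) / (math.sqrt(_dot(u, u)) * math.sqrt(_dot(v, v)))
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_t))))

def torsion_angle(a, b, c, d):
    """Dihedral angle a-b-c-d in degrees (0 for cis, +/-180 for trans)."""
    b1, b2, b3 = _sub(b, a), _sub(c, b), _sub(d, c)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))

def drmsd(conf_a, conf_b):
    """Distance RMSD between two conformations with the same atom order."""
    if len(conf_a) != len(conf_b):
        raise ValueError("conformations must have the same number of atoms")
    n = len(conf_a)
    if n < 2:
        raise ValueError("at least two atoms are required")
    total = 0.0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = distance(conf_a[i], conf_a[j]) - distance(conf_b[i], conf_b[j])
            total += diff * diff
    return math.sqrt(total / (n * (n - 1) / 2))
```

Because drmsd compares internal distances only, it needs no superposition of the two structures; crmsd, by contrast, requires an optimal rigid-body fit first and is therefore omitted from this sketch.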


Fig. 1. Graphical representation of the jbcl utility library, showing the components of the BioShell package: the calc module, which calculates alignments and structural properties and also contains various numerical and statistical utilities; commands and algorithms, which are used by the BioShell command-line applications; the data modules, which provide core objects, dictionaries and file format parsers; and modeling, which contains tools for Monte Carlo sampling. Each branch level corresponds to a level of the module hierarchy. For example, a PDB reader class may be found in the jbcl.data.formats package, and NormalKernel, used in kernel density estimation, is located in jbcl.calc.statistics.kernels.

 
2.2 BioShell examples
Because of the limited space in Bioinformatics Application Notes, we cannot include a detailed example here. The project's website provides four very detailed tutorials:
  1. Parsing PDB (Berman et al., 2000) files. Our PDB parser reads both *.ent and *.ent.gz files. It is also possible to download a protein directly from the PDB website (see footnote 2). On average it takes ~1 s to read a protein from a gzipped file. BioShell also employs the Java serialization technique, which gives an additional speed-up: when it is applied to PDB I/O operations, reading a PDB entry takes ~0.2 s,
  2. Support for reduced-space modeling,
  3. Calculating various structural properties,
  4. Various kinds of sequence-based alignments, from the very basic to 1D threading.
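
As a rough illustration of what reading both *.ent and *.ent.gz files involves, here is a minimal sketch using only the Python standard library. It is not the BioShell parser: the Atom type and the functions parse_atom_line and read_atoms are hypothetical names, and only a handful of the fixed-width ATOM/HETATM fields of the PDB format are extracted.

```python
import gzip
from typing import List, NamedTuple

# Hypothetical record type for illustration; not a jbcl class.
class Atom(NamedTuple):
    serial: int
    name: str
    res_name: str
    chain: str
    res_seq: int
    x: float
    y: float
    z: float

def parse_atom_line(line: str) -> Atom:
    """Extract selected fixed-width fields from one ATOM/HETATM record."""
    return Atom(
        serial=int(line[6:11]),        # columns 7-11
        name=line[12:16].strip(),      # columns 13-16
        res_name=line[17:20].strip(),  # columns 18-20
        chain=line[21],                # column 22
        res_seq=int(line[22:26]),      # columns 23-26
        x=float(line[30:38]),          # columns 31-38
        y=float(line[38:46]),          # columns 39-46
        z=float(line[46:54]),          # columns 47-54
    )

def read_atoms(path: str) -> List[Atom]:
    """Read atoms from a plain or gzip-compressed PDB file."""
    # Choose the opener from the file extension, as with *.ent vs *.ent.gz.
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        return [parse_atom_line(line)
                for line in handle
                if line.startswith(("ATOM  ", "HETATM"))]
```

A call such as read_atoms("pdb1abc.ent.gz") (the file name is a placeholder) would return a list of Atom tuples; a real parser must also handle models, alternate locations, insertion codes and malformed records, which this sketch ignores.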
Moreover, the jbcl API documentation contains numerous code snippets and example scripts, with many more to come.
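
For readers unfamiliar with the "very basic" end of the sequence-based alignments mentioned in the tutorial list, the sketch below shows a textbook global alignment (Needleman-Wunsch) with a linear gap penalty in plain Python. It is an assumption-laden illustration, not BioShell code: the function name and the match, mismatch and gap scores are arbitrary choices made here, and the jbcl alignment classes may use different scoring schemes and algorithms.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Global alignment of strings a and b with a linear gap penalty.

    Returns (score, aligned_a, aligned_b). The scoring parameters are
    illustrative defaults, not values taken from BioShell.
    """
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + pair,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover one optimal alignment.
    top, bottom = [], []
    i, j = n, m
    while i > 0 or j > 0:
        pair = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair:
            top.append(a[i - 1]); bottom.append(b[j - 1])
            i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            top.append(a[i - 1]); bottom.append("-")
            i -= 1
        else:
            top.append("-"); bottom.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(top)), "".join(reversed(bottom))
```

Protein alignments in practice replace the match/mismatch scores with a substitution matrix and usually use affine gap penalties; profile-based alignment and 1D threading generalize the same dynamic-programming recurrence to position-specific scores.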


    3 SUMMARY
In this contribution we present a software package for bioinformatics calculations that can be used as a set of stand-alone applications, as a Java utility library or as a set of modules for Jython. In contrast to other existing libraries such as BioPython and BioJava, BioShell offers higher flexibility and a wider scope of functionality, ranging from sequence-based to structure-based bioinformatics. The current Java implementation of BioShell (the previous version was written in C++) makes the package platform independent. It has been successfully tested on Linux, MacOS, Windows XP and Windows Vista systems. We hope that BioShell modules will be a valuable addition to the publicly available biocomputing software.

Conflict of Interest: none declared.


    FOOTNOTES
 
Associate Editor: Anna Tramontano

1 http://www.jython.org

2 http://www.rcsb.org/

Received on November 20, 2007; revised on November 20, 2007; accepted on December 15, 2007

    REFERENCES

    Berman HM, et al. The Protein Data Bank. Nucleic Acids Res (2000) 28:235–242.

    Gront D, Kolinski A. HCPM – program for hierarchical clustering of protein models. Bioinformatics (2005) 21:3179–3180.

    Gront D, Kolinski A. BioShell – a package of tools for structural biology computations. Bioinformatics (2006) 22:621–622.

    Gront D, Kolinski A. T-Pile – a package for thermodynamic calculations for biomolecules. Bioinformatics (2007) 23:1840–1842.

    Gront D, et al. Backbone building from quadrilaterals: a fast and accurate algorithm for protein backbone reconstruction from alpha carbon coordinates. J Comput Chem (2007) 28:1593–1597.

    Hamelryck T, Manderick B. PDB file parser and structure class implemented in Python. Bioinformatics (2003) 19:2308–2310.

    Pocock M, et al. BioJava: open source components for bioinformatics. ACM SIGBIO Newsletter (2000) 20:10–12.

    Rice P, et al. EMBOSS: the European Molecular Biology Open Software Suite. Trends Genet (2000) 16:276–277.

    Stajich JE, et al. The Bioperl toolkit: Perl modules for the life sciences. Genome Res (2002) 12:1611–1618.

