Bioinformatics Advance Access originally published online on April 12, 2005
Bioinformatics 2005 21(12):2925-2926; doi:10.1093/bioinformatics/bti437

© The Author 2005. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions{at}oupjournals.org

STARS: statistics on inter-atomic distances and torsion angles in protein secondary structures

Yu Zheng and Daiwen Yang *

Department of Biological Sciences and Department of Chemistry, National University of Singapore, 14 Science Drive 4, Singapore 117543

*To whom correspondence should be addressed.


    Abstract
 

Summary: A graphics package has been developed for performing statistics on interatomic distances and torsion angles in protein secondary structures (STARS) from a protein crystal structure database. It allows one to obtain both the graphical view and the text format of distributions of the distances and angles for atoms located in 10 types of protein secondary structures. STARS will facilitate assignment of ambiguous NOESY peaks, structure determination by nuclear magnetic resonance, structure validation and comparison of protein folds.

Availability: All data, documents and executable files are freely downloadable at http://stars.zhengyuhome.com. The software runs on Windows systems without any compilation or installation.

Contact: dbsydw{at}nus.edu.sg


    INTRODUCTION
 
Structure determination by nuclear magnetic resonance (NMR) and structure validation involve estimation of interatomic distances and dihedral angles. The atom–atom distances are often derived from nuclear Overhauser effects (NOEs), whereas dihedral angles are derived from J-coupling constants and chemical shifts. Assigning each NOE peak in NOE spectroscopy (NOESY) to a specific pair of atoms is a challenging task even for a small protein because of the chemical shift degeneracy of different protons. Knowledge of interatomic distances for atoms located in each type of secondary structure facilitates the assignment of NOEs that are ambiguous because of chemical shift degeneracy. The secondary structures themselves can be predicted with fair accuracy from chemical shifts, or from the amino acid sequence using computational techniques alone. Conversely, if some of the NOE assignments are available (e.g. sequential NOEs), the distance knowledge also helps in determining protein secondary structures. Similarly, knowledge of dihedral angles for different types of secondary structures is very useful for deriving structural constraints from J-coupling constants. It can also be used to build internal motional models based on experimental J-coupling data. Besides applications to NMR, information about interatomic distances and torsion angles may be used to validate protein structures and compare protein folds.

Statistics on distances and dihedral angles are often derived from many known protein structures, and it is tedious to obtain this information from a large number of proteins. To the best of our knowledge, no tool is available for computing such statistics, although many tools can calculate distances and dihedral angles for one given protein structure at a time. Here, we present a software tool for statistics on interatomic distances and dihedral angles in protein secondary structures (STARS). STARS provides highly interactive visualization of statistical results, and its window-based interface makes it easy to use.


    OVERVIEW OF STARS
 
Composition of database
With the aid of CullPDB (Hobohm et al., 1992), a non-redundant database of protein crystal structures was generated by extracting structural data from the Protein Data Bank. Hydrogen atoms were added using MOLMOL (Koradi et al., 1996). Proteins selected for our database meet the following criteria:

  1. sequence identity <20%,
  2. resolution ≤1.6 Å and R-factor ≤0.25 and
  3. more than 50 residues, with no non-standard amino acids and no chain breaks.
The resulting database consisted of 576 protein chains, containing 124 037 amino acid residues. Users can add further structures to the database for their own purposes.
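The per-chain criteria above (2 and 3) can be sketched as a simple filter. This is a minimal illustration, not the STARS implementation: the `ChainRecord` class and its field names are hypothetical, and criterion 1 (pairwise sequence identity) is assumed to have been handled already by the culling step.

```python
from dataclasses import dataclass


@dataclass
class ChainRecord:
    """Hypothetical per-chain metadata; all field names are illustrative."""
    name: str
    resolution: float       # crystallographic resolution in angstroms
    r_factor: float
    n_residues: int
    has_nonstandard: bool   # chain contains a non-standard amino acid
    has_chain_break: bool


def passes_criteria(chain: ChainRecord) -> bool:
    """Apply criteria 2 and 3 from the list above to a single chain."""
    return (
        chain.resolution <= 1.6
        and chain.r_factor <= 0.25
        and chain.n_residues > 50
        and not chain.has_nonstandard
        and not chain.has_chain_break
    )


def select_chains(chains):
    """Keep only the chains that satisfy the per-chain criteria."""
    return [c for c in chains if passes_criteria(c)]
```

Sequence-identity culling compares chains against each other, so it cannot be expressed as a per-chain test like the ones here.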

Definition
The definitions and identifiers of amino acids, atoms and torsion angles used in STARS comply with the 1998 IUPAC recommendations (Markley et al., 1998). Secondary structure and chirality were assigned automatically for all proteins in the database using the DSSP method (Kabsch and Sander, 1983). To suit biologists' preferences, however, β-sheets were subdivided into three types. In total, 10 types of secondary structure were defined: α-helix, 3₁₀-helix, π-helix, antiparallel β-sheet, parallel β-sheet, the combination of these two sheets, turn, bend, β-bridge and random coil. To obtain statistics on atom–atom distances and torsion angles, only relative positions among atoms in a protein chain are required. When the first and second atoms are located at residues i and i + n, respectively (where i is a positive integer and n is any integer), the relative position of the second atom with respect to the first one is denoted as n. The residues i, J, K, j and k in a β-sheet are defined in Figure 1; the relative positions of the second atoms in residues J + n, K + n, j + n and k + n with respect to the first atom in residue i are referred to as J + n, K + n, j + n and k + n.
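As an illustration of the relative-position convention, the sketch below collects distances between a first atom in residue i and a second atom in residue i + n along one chain. It is a simplified sketch under stated assumptions, not the STARS code: the residue representation (a list of dicts with an `ss` label and an `atoms` mapping) and the function name are hypothetical, both residues are required to lie in the selected secondary structure types, and the β-sheet partner positions (J, K, j, k), which need bridge-partner information from the secondary structure assignment, are not handled.

```python
import math
from collections import defaultdict


def distances_by_offset(residues, atom1, atom2, offsets, ss_types=None):
    """Collect atom1(i)-atom2(i + n) distances for each offset n.

    residues : list of dicts, in chain order, each with
               'ss'    -> secondary structure label (hypothetical codes)
               'atoms' -> mapping of atom name to (x, y, z) coordinates
    ss_types : optional set of labels; when given, both residues must
               carry one of these labels for the pair to be counted.
    """
    result = defaultdict(list)
    for i, first in enumerate(residues):
        if ss_types is not None and first["ss"] not in ss_types:
            continue
        if atom1 not in first["atoms"]:
            continue
        for n in offsets:
            j = i + n
            if j < 0 or j >= len(residues):
                continue  # residue i + n falls outside the chain
            second = residues[j]
            if ss_types is not None and second["ss"] not in ss_types:
                continue
            if atom2 not in second["atoms"]:
                continue
            d = math.dist(first["atoms"][atom1], second["atoms"][atom2])
            result[n].append(d)
    return dict(result)
```

Pooling the lists returned for many chains gives, for each offset n, the sample from which a distance distribution can be built.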



Fig. 1 Definition of residues i, J, j, K, k in antiparallel (a), parallel (b) and mixed parallel and antiparallel (c and d) β-sheets. i is the residue under investigation, at which the first atom is located as shown in the main window (Fig. 2a); J is the β-bridge partner of residue i in an antiparallel ladder, with H-bond(i, J), where i and J are the hydrogen donor and acceptor residues respectively, and H-bond(J, i); j is the β-bridge partner of residue i in a parallel ladder, with H-bond(i, j – 1) and H-bond(j + 1, i); K is the β-bridge partner of residue i in an antiparallel ladder, with H-bond(i + 1, K – 1) and H-bond(K + 1, i – 1); k is the β-bridge partner of residue i in a parallel ladder, with H-bond(i + 1, k) and H-bond(k, i – 1).

 
User interface
The STARS interface is intentionally uncluttered (Fig. 2). The main window is shown in Figure 2a. Users can define the number of proteins used in the statistics and let the program select proteins randomly. Alternatively, the structures can be selected manually from a selection window (Fig. 2d) in which proteins can be sorted by name, resolution, chain length or R-value. The statistics can be computed over all residues, or over the residues in one or more specific secondary structures selected by the user (Fig. 2a). The relative position(s) of the second atom(s) with respect to the first atom can be specified by a single expression (e.g. 2 or J – 1), a series of expressions (e.g. –2, 0, 1, J – 1, K + 1), a range of numbers (e.g. –2 ~ 2 or j – 2 ~ j + 2), or a combination of different expressions (e.g. –3, –1 –1, k–2 ~ k+1). If some of the specified atoms are not located in the selected secondary structure(s), the output will not contain distances or angles involving these atoms. The statistics can be obtained in a single (Fig. 2a and b) or batch mode (Fig. 2c). Because the batch mode uses a parallel processing algorithm, it is ~10 times faster than the single mode for obtaining the same amount of information. With a job editor, jobs can be created, saved, loaded, edited, sorted, deleted or moved easily in the job list, and submitted at the user's convenience. The statistical results are displayed in a 3D color-bar-style chart in the result analysis window (Fig. 2d). Almost all features of the chart can be reset by users in terms of color, zoom, mark, label, rotation, range, grid, etc. The software allows users to view, compare, select, sort, save or load statistics through a result display window. All data files are saved in a common ASCII format that can be read by any ordinary text editor. A detailed manual is accessible by clicking the help button in the main window or pressing the F1 key.
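The relative-position expressions described above (single expressions, comma-separated series and `~` ranges) could be parsed along the following lines. This is a hypothetical sketch of one possible grammar inferred from the examples in the text, not the parser used by STARS; the function names and the `(anchor, offset)` output format are assumptions, with `anchor` set to `None` for positions counted from residue i itself.

```python
import re

# Either an anchor letter with an optional signed offset (e.g. "J-1", "k"),
# or a plain signed integer (e.g. "-2", "3").
_TERM = re.compile(r"^(?:([JKjk])([+-]\d+)?|([+-]?\d+))$")


def parse_term(text):
    """Parse one expression into (anchor, offset)."""
    # Normalize typographic dashes/minus signs and drop all spaces.
    cleaned = text.replace("\u2013", "-").replace("\u2212", "-").replace(" ", "")
    match = _TERM.match(cleaned)
    if match is None:
        raise ValueError(f"cannot parse position expression: {text!r}")
    anchor, anchor_offset, plain = match.groups()
    if anchor is not None:
        return anchor, int(anchor_offset) if anchor_offset else 0
    return None, int(plain)


def parse_positions(spec):
    """Expand a comma-separated specification into a list of positions."""
    positions = []
    for item in spec.split(","):
        if "~" in item:
            low_text, high_text = item.split("~")
            low_anchor, low = parse_term(low_text)
            high_anchor, high = parse_term(high_text)
            if low_anchor != high_anchor or low > high:
                raise ValueError(f"invalid range: {item!r}")
            positions.extend((low_anchor, n) for n in range(low, high + 1))
        else:
            positions.append(parse_term(item))
    return positions
```

For example, `parse_positions("j-2 ~ j+2")` expands to the five positions j – 2 through j + 2, each paired with the anchor `"j"`.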



Fig. 2 STARS user interface. (a) Main window with the page for interatomic distance statistics in a single mode. (b) Pages for torsion angle statistics in a single mode and result analysis. (c) Pages for statistics in a batch mode. (d) Windows for selection of protein structures and display of results.

 


    Acknowledgments
 
This research was supported by a grant from the Biomedical Research Council (BMRC) of the Agency for Science, Technology and Research (A*STAR) of Singapore.

Received on February 28, 2005; revised on March 28, 2005; accepted on April 5, 2005

    REFERENCES
 

    Hobohm, U., et al. (1992) Selection of representative protein data sets. Protein Sci., 1, 409–417.

    Kabsch, W. and Sander, C. (1983) Dictionary of protein secondary structure: pattern recognition of hydrogen-bonded and geometrical features. Biopolymers, 22, 2577–2637.

    Koradi, R., et al. (1996) MOLMOL: a program for display and analysis of macromolecular structures. J. Mol. Graphics, 14, 51–55.

    Markley, J., et al. (1998) Recommendations for the presentation of NMR structures of proteins and nucleic acids. Pure Appl. Chem., 70, 117–142.

