Bioinformatics Advance Access originally published online on January 19, 2007
Bioinformatics 2007 23(5):637-638; doi:10.1093/bioinformatics/btl679

© 2007 The Author(s)
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

SMotif: a server for structural motifs in proteins

Ganesan Pugalenthi 1, P. N. Suganthan 1,*, R. Sowdhamini 2,* and Saikat Chakrabarti 3,*

1 School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore 639798, 2 National Centre for Biological Sciences, Bangalore 560 065, India and 3 National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, USA

*To whom correspondence should be addressed.


    ABSTRACT

Summary: SMotif is a server that identifies important structural segments, or motifs, in one or more given protein structures based on conservation of both sequence and important structural features such as solvent inaccessibility, secondary structural content, hydrogen bonding pattern and residue packing. The server also provides three-dimensional orientation patterns of the identified motifs in terms of inter-motif distances and torsion angles. These motifs may form the common core and can therefore also be employed to design and rationalize protein engineering and folding experiments.

Availability: The SMotif server is available at http://caps.ncbs.res.in/SMotif/index.html.

Contact: chakraba{at}mail.nih.gov, mini{at}ncbs.res.in or EPNSugan{at}ntu.edu.sg

Supplementary information: Supplementary data are available at Bioinformatics online.


    1 INTRODUCTION
 
Previous studies (Farber and Petsko, 1990; Kannan et al., 2001) have pointed to a small number of structural elements that are required for retention of the fold and function of a protein. Although subsequences forming similar substructures do not always show high sequence similarity, these common substructures contain conserved key amino acid positions and have important implications for protein folding (Friedberg and Margalit, 2002). Sequence-based representations, however, only approximate the underlying structural and functional information. Motifs identified at the level of three-dimensional structure therefore provide significant and reliable information.

Here we present a web server, SMotif, that identifies a set of important structural segments, or motifs, in one or more given protein structures based on conservation of both sequence and important structural features (Chakrabarti et al., 2003; Chakrabarti and Sowdhamini, 2004). Such motifs among structurally aligned proteins are recognized by the conservation of amino acid preference and solvent inaccessibility, and are then examined for the conservation of other important structural features such as secondary structural content, hydrogen-bonding pattern and residue packing. Spatial orientations of the motifs, in terms of inter-motif distances and torsion angles, are also examined. These motifs may form the common core by maintaining a particular spatial pattern when compared across different proteins belonging to the same family or superfamily. Such motifs can also be employed to design and rationalize protein engineering and folding experiments.


    2 Methodology
 
2.1 Identification of structural motifs
Structural motifs are identified by the presence of at least three consecutive solvent-buried (inaccessible) residues that have high amino acid exchange scores. Conservation of further structural parameters, such as secondary structural content, hydrogen bonding and residue packing (Ooi number; Nishikawa and Ooi, 1986), is also examined among the structurally aligned proteins. The SMotif server identifies structural motifs following the same principle as described in Chakrabarti et al. (2003).
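
The selection rule described above can be sketched in a few lines of Python. This is a minimal illustration and not the server's own code: the function name `find_motif_runs`, the two per-position boolean lists and the `min_len` parameter (defaulting to 3, after the "at least three consecutive residues" rule) are assumptions made for the example. How a position is judged buried or conserved is left to the caller.

```python
from typing import List, Tuple


def find_motif_runs(
    buried: List[bool],
    conserved: List[bool],
    min_len: int = 3,
) -> List[Tuple[int, int]]:
    """Return (start, end) index pairs, end inclusive, for every run of at
    least `min_len` consecutive alignment positions that are flagged as both
    solvent buried and conserved.

    Hypothetical helper: the inputs are one boolean per alignment position.
    """
    if len(buried) != len(conserved):
        raise ValueError("buried and conserved must have the same length")

    runs: List[Tuple[int, int]] = []
    start = None  # index where the current qualifying run began
    for i, (is_buried, is_conserved) in enumerate(zip(buried, conserved)):
        if is_buried and is_conserved:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_len:
                runs.append((start, i - 1))
            start = None
    # Close a run that extends to the end of the alignment.
    if start is not None and len(buried) - start >= min_len:
        runs.append((start, len(buried) - 1))
    return runs
```

Shorter stretches are discarded, so an isolated buried and conserved position, or a pair of them, does not form a motif on its own.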

In the SMotif algorithm, solvent accessibility is measured using the PSA program from the JOY4.0 suite (Mizuguchi et al., 1998). Residues with an accessible surface area of less than 7% are treated as solvent buried (inaccessible). At every alignment position, all possible pairs of proteins and their observed amino acids are scored using a standard 20 × 20 substitution matrix (Johnson and Overington, 1993) derived from structure-based sequence alignments of homologous protein families. The SSTRUC program, part of the JOY4.0 suite, is used to identify secondary structural positions, and the HBOND program, also part of the suite, is used to identify hydrogen bonds. Residue packing is measured in terms of the Ooi number, which gives the number of residues surrounding the Cα atom of each residue in a protein. Higher Ooi numbers indicate that the residue is in a well-packed environment.
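
The per-position measures described above can be illustrated with a short Python sketch. It is not the JOY or SMotif implementation. The function names, the use of a mean over residue pairs, the skipping of gap characters, the 8 Å counting radius and the exclusion of the residue itself from its own Ooi count are all assumptions made for the example. Only the 7% burial cutoff and the all-pairs scoring idea come from the text. The substitution matrix is passed in as a plain dictionary, so any 20 × 20 matrix can be supplied.

```python
import math
from itertools import combinations
from typing import Dict, List, Sequence, Tuple

Coord = Tuple[float, float, float]


def is_buried(accessibility_percent: float, cutoff: float = 7.0) -> bool:
    """True if the accessibility (in percent) is below the cutoff."""
    return accessibility_percent < cutoff


def column_exchange_score(
    column: Sequence[str],
    matrix: Dict[Tuple[str, str], float],
) -> float:
    """Mean substitution score over all pairs of residues observed in one
    alignment column. Gap characters ('-') are skipped (an assumption)."""
    residues = [r for r in column if r != "-"]
    pairs = list(combinations(residues, 2))
    if not pairs:
        return 0.0
    total = 0.0
    for a, b in pairs:
        # Accept matrices that store only one ordering of each residue pair.
        total += matrix[(a, b)] if (a, b) in matrix else matrix[(b, a)]
    return total / len(pairs)


def ooi_numbers(ca_coords: Sequence[Coord], radius: float = 8.0) -> List[int]:
    """For each C-alpha atom, count the other C-alpha atoms lying within
    `radius` angstroms. The default radius is an assumed value."""
    counts = []
    for i, ci in enumerate(ca_coords):
        n = sum(
            1
            for j, cj in enumerate(ca_coords)
            if j != i and math.dist(ci, cj) <= radius
        )
        counts.append(n)
    return counts
```

A column score computed this way could then be compared against a threshold to decide whether a position counts as conserved; the threshold itself is not stated in the text and would have to be chosen separately.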

2.2 Input options
Structural motifs are identified from the alignment submitted by the user. Alternatively, users can upload only a protein sequence or structure, in which case homologous protein sequences and structures are retrieved by running PSI-BLAST (Altschul et al., 1997) against the SWISS-PROT sequence database (Apweiler et al., 1997) and a structure database (PDB; Berman et al., 2000). Homologous structures are superimposed using the program STAMP (Russell and Barton, 1994) and the resulting structural alignment is used to identify motif regions.

2.3 Output options
2.3.1 Display of structural motifs on alignment
Structural motifs are projected on the alignment using different color codes for visual clarity. Important structural features are also marked on the alignment and provided as an additional output file.

2.3.2 3D graphical display of structural motifs
Interactive 3D views of the structural motifs on the individual and superposed protein structures are displayed for better understanding and visualization.

2.3.3 Spatial orientation patterns of the motifs
Spatial orientations are represented in terms of inter-motif distances and angular orientations of the identified motif regions. Structural motifs are converted into vector representation and the distances and virtual torsion angles between all possible pairs of motifs are calculated using standard vector algebra.
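
A rough Python sketch of such a calculation is given below. The text does not specify how a motif is turned into a vector or which four points define the virtual torsion angle, so the choices here are assumptions: each motif is represented by the vector from its first to its last Cα atom, the inter-motif distance is taken between the midpoints of those vectors, and the torsion angle is the dihedral defined by the end of motif A, the midpoint of A, the midpoint of B and the end of B. All function names are hypothetical.

```python
import math
from typing import Sequence, Tuple

Vec = Tuple[float, float, float]


def sub(a: Vec, b: Vec) -> Vec:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def scale(a: Vec, s: float) -> Vec:
    return (a[0] * s, a[1] * s, a[2] * s)


def dot(a: Vec, b: Vec) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def cross(a: Vec, b: Vec) -> Vec:
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def midpoint(a: Vec, b: Vec) -> Vec:
    return ((a[0] + b[0]) / 2, (a[1] + b[1]) / 2, (a[2] + b[2]) / 2)


def dihedral(p0: Vec, p1: Vec, p2: Vec, p3: Vec) -> float:
    """Dihedral angle in degrees, in the range -180 to 180, about p1 -> p2."""
    b0 = sub(p0, p1)
    b1 = sub(p2, p1)
    b2 = sub(p3, p2)
    norm = math.sqrt(dot(b1, b1))
    if norm == 0.0:
        raise ValueError("central points coincide; torsion is undefined")
    b1 = scale(b1, 1.0 / norm)
    # Project the outer bonds onto the plane perpendicular to the axis.
    v = sub(b0, scale(b1, dot(b0, b1)))
    w = sub(b2, scale(b1, dot(b2, b1)))
    return math.degrees(math.atan2(dot(cross(b1, v), w), dot(v, w)))


def motif_geometry(
    motif_a: Sequence[Vec], motif_b: Sequence[Vec]
) -> Tuple[float, float]:
    """Return (distance, torsion in degrees) for two motifs, each given as
    an ordered list of C-alpha coordinates."""
    a_mid = midpoint(motif_a[0], motif_a[-1])
    b_mid = midpoint(motif_b[0], motif_b[-1])
    distance = math.dist(a_mid, b_mid)
    torsion = dihedral(motif_a[-1], a_mid, b_mid, motif_b[-1])
    return distance, torsion
```

Applying `motif_geometry` to every pair of motifs in a structure gives a table of distances and angles that can be compared across members of a family. A least-squares axis fitted through all Cα atoms would be a reasonable alternative to the simple end-to-end vector used here.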


    3 Results
 
The SMotif algorithm has been benchmarked against alignments of proteins that are related at the superfamily level. About 52 such structural alignments are used as a test set, for which structural motifs have previously been identified by careful manual intervention (Chakrabarti and Sowdhamini, 2004). Results from the benchmarking study (see Supplementary Materials for details) suggest high sensitivity (~75%) and accuracy (~82%) for the SMotif algorithm.
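
For readers who want to run a similar comparison on their own alignments, the sketch below shows one common way to compute such figures at the level of alignment positions. The exact definitions used in the benchmark are given in the Supplementary Materials and may differ. Here sensitivity is assumed to be TP / (TP + FN) and accuracy (TP + TN) / N, and the function name and the set-based inputs are hypothetical.

```python
from typing import Set, Tuple


def position_metrics(
    predicted: Set[int],
    reference: Set[int],
    n_positions: int,
) -> Tuple[float, float]:
    """Compare predicted motif positions with a manually curated reference.

    `predicted` and `reference` hold alignment position indices in the range
    0 .. n_positions - 1. Returns (sensitivity, accuracy) under the assumed
    definitions stated above.
    """
    tp = len(predicted & reference)   # motif positions found
    fp = len(predicted - reference)   # positions wrongly called motif
    fn = len(reference - predicted)   # motif positions missed
    tn = n_positions - tp - fp - fn   # non-motif positions left alone
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    accuracy = (tp + tn) / n_positions if n_positions else 0.0
    return sensitivity, accuracy
```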

The web server can extract structurally important regions of protein folds rapidly and effectively. For example, the average sequence identity between members of the Transglutaminase superfamily can be as low as 9.5% (over the full-length alignment) and 11% (within the structural-motif regions), yet the SMotif server is still able to extract the structurally conserved regions (shown in Fig. SM1, Supplementary Materials).


    4 Conclusions
 
Structural motifs identified on the basis of conservation of important structural properties such as solvent inaccessibility, secondary structure content, hydrogen-bonding interactions and compactness of residues can provide useful information about the homologous core of similar protein structures. SMotif provides a fast and interactive interface to identify and visualize such structural segments and can therefore be a useful tool to design and rationalize protein engineering and folding experiments.


    ACKNOWLEDGEMENTS
 
We thank Shameer Khadar for his help in installing the SMotif server. G.P. and P.N.S. acknowledge the financial support offered by A*Star (Agency for Science, Technology and Research). S.C. acknowledges the Intramural Research Program of the National Library of Medicine at NIH/DHHS. R.S. acknowledges the National Centre for Biological Sciences (TIFR) for infrastructural support. Funding to pay the Open Access publication charges was provided by the Wellcome Trust, UK, as part of the Senior Research Fellowship of R.S.

Conflict of Interest: none declared.


    FOOTNOTES
 
Associate Editor: Dmitrij Frishman

Received on October 27, 2006; revised on December 15, 2006; accepted on January 5, 2007

    REFERENCES

    Altschul SF, et al. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. (1997) 25:3389–3402.

    Apweiler R, et al. Protein sequence annotation in the genome era: the annotation concept of SWISS-PROT, TREMBL. Proceedings of the 5th International Conference on ISMB (1997) 33–43.

    Berman HM, et al. The protein data bank. Nucleic Acids Res. (2000) 28:235–242.

    Chakrabarti S, Sowdhamini R. Regions of minimal structural variation among members of protein domain superfamilies: application to remote homology detection and modeling using distant relationships. FEBS Lett. (2004) 569:31–36.

    Chakrabarti S, et al. SMoS: a database of structural motifs of protein superfamilies. Protein Eng. (2003) 16:791–793.

    Farber GK, Petsko GA. The evolution of alpha/beta barrel enzymes. Trends Biochem. Sci. (1990) 15:228–234.

    Friedberg I, Margalit H. Persistently conserved positions in structurally similar, sequence dissimilar proteins: roles in preserving protein fold and function. Protein Sci. (2002) 11:350–360.

    Johnson MS, Overington JP. A structural basis for sequence comparisons. An evaluation of scoring methodologies. J. Mol. Biol. (1993) 233:716–738.

    Kannan N, et al. Clusters in alpha/beta barrel proteins: implications for protein structure, function, and folding: a graph theoretical approach. Proteins (2001) 43:103–112.

    Mizuguchi K, et al. JOY: protein sequence-structure representation and analysis. Bioinformatics (1998) 14:617–623.

    Nishikawa K, Ooi TJ. Radial locations of amino acid residues in a globular protein: correlation with the sequence. J. Biochem. (Tokyo) (1986) 100:1043–1047.

    Russell RB, Barton GJ. Structural features can be unconserved in proteins with similar folds. An analysis of side-chain to side-chain contacts secondary structure and accessibility. J. Mol. Biol. (1994) 244:332–350.

