

Bioinformatics Advance Access originally published online on January 2, 2008
Bioinformatics 2008 24(3):426-427; doi:10.1093/bioinformatics/btm622

© 2008 The Author(s)
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

A note on difficult structure alignment problems

Manfred J. Sippl * and Markus Wiederstein

Center of Applied Molecular Engineering, Division of Bioinformatics, Department of Molecular Biology, University of Salzburg, Hellbrunnerstr. 34, 5020 Salzburg, Austria

*To whom correspondence should be addressed.


    ABSTRACT

Summary: Progress in structural biology depends on several key technologies. In particular, tools for the alignment and superposition of protein structures are indispensable. Here we describe the use of the TopMatch web service, an effective computational tool for protein structure alignment, for the visualization of structural similarities, and for highlighting relationships found in protein classifications. We provide several instructive examples.

Availability: TopMatch is available as a public web service at http://services.came.sbg.ac.at

Contact: sippl@came.sbg.ac.at

Today we face an explosion of newly determined protein structures, in part fueled by the various protein structure initiatives. As a result, the public repository (PDB) will soon surpass 50 000 entries (Berman et al., 2000). This database represents our knowledge of protein molecules, but the amount of information is overwhelming. To make progress, the structures need to be organized, classified and quantified in various ways. For this task, and for the subsequent retrieval, analysis and visualization of the often intricate relationships, structure comparison techniques are indispensable.

Michael Levitt and coworkers (Kolodny et al., 2005) recently presented a comprehensive analysis of the major structure alignment programs. They remark that comparing the various programs is a delicate task and, by highlighting the limitations of existing methods, they conclude that there is a need for better structural alignment methods. It is indeed surprising that, after half a century of protein structure research, no generally accepted standards for protein structure alignment have emerged.

A particular difficulty is that, as long as an existing structural similarity remains undetected, we cannot check whether or not any particular method is able to recognize that relationship. According to Kolodny et al. (2005), such difficult examples may be found in existing protein structure classifications by searching for similarities among distinct SCOP (Andreeva et al., 2007) folds or distinct CATH (Greene et al., 2007) architectures or topologies. Here we take up this suggestion and provide a small selection of examples drawn from ongoing classification projects. In these projects we make extensive use of a suite of structure alignment techniques called TopMatch. TopMatch is the successor of ProSup, a program previously used in several large-scale structure comparison projects (e.g. Sippl et al., 2001).

We have now completed a web service to make the TopMatch program accessible to the structural biology community. The quality of alignments is essential but ease of use, speed and in particular proper visualization are important ingredients in the interpretation and analysis of structure alignments. The chief goal of this communication is to demonstrate the use of this service by a set of instructive examples drawn from ongoing structure classification initiatives (Suhrer et al., 2007a, b).

In the description of alignments we call the first structure the query (q) and the second structure the target (t). In general, a query and a target can be aligned in many different ways (Feng and Sippl, 1996). Hence, TopMatch reports a ranked list of alignments. Each alignment is characterized by a small set of parameters. The most significant of these is the length of the alignment (the number of residue pairs that are structurally equivalent). We call this the absolute similarity S(q,t). From the alignment we compute a sequence score using a structure-derived substitution matrix (Prlic et al., 2000). If this score is positive it is added to S(q,t), and this combined score is used to rank the alignments. Additional useful parameters are the root-mean-square error of superposition (RMS), the percentage of sequence identity (Identity), the relative similarity s(q,t) = 100 × 2S(q,t)/(L_q + L_t), and the relative query and target cover, defined as c_q = 100 × S(q,t)/L_q and c_t = 100 × S(q,t)/L_t, respectively (here L_q and L_t are the respective sequence lengths). Relative similarity and relative cover are simple and intuitive measures describing the extent of mutual similarity between two structures.
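The parameters defined above are simple enough to compute by hand, but a short sketch may help make the definitions concrete. The following Python fragment is only an illustration under stated assumptions: the class and function names are hypothetical, the numbers in the usage note are invented, and nothing here reflects how TopMatch itself is implemented. It computes the relative similarity and the relative covers from S(q,t) and the two sequence lengths, and ranks alignments by S(q,t) plus the sequence score when that score is positive.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Alignment:
    """Hypothetical container for one alignment of a query and a target."""
    n_equivalent: int      # absolute similarity S(q,t): structurally equivalent residue pairs
    sequence_score: float  # score from a structure-derived substitution matrix


def relative_similarity(s_abs: float, len_q: int, len_t: int) -> float:
    """s(q,t) = 100 * 2 * S(q,t) / (L_q + L_t)."""
    return 100.0 * 2.0 * s_abs / (len_q + len_t)


def query_cover(s_abs: float, len_q: int) -> float:
    """c_q = 100 * S(q,t) / L_q."""
    return 100.0 * s_abs / len_q


def target_cover(s_abs: float, len_t: int) -> float:
    """c_t = 100 * S(q,t) / L_t."""
    return 100.0 * s_abs / len_t


def ranking_score(alignment: Alignment) -> float:
    """S(q,t) plus the sequence score, but only if the sequence score is positive."""
    return alignment.n_equivalent + max(alignment.sequence_score, 0.0)


def rank_alignments(alignments: List[Alignment]) -> List[Alignment]:
    """Return the alignments sorted from highest to lowest ranking score."""
    return sorted(alignments, key=ranking_score, reverse=True)
```

For instance, with the invented values S(q,t) = 100, L_q = 200 and L_t = 300, `relative_similarity(100, 200, 300)` gives 40.0 and `query_cover(100, 200)` gives 50.0. Note that a shorter alignment with a positive sequence score can outrank a longer one whose sequence score is negative.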

Figure 1 illustrates the application of TopMatch using a small set of examples. We first demonstrate that, for the investigation of structural similarities, it is often necessary but also convenient to take into account the manifold of distinct alignments. We then present several examples that may be considered difficult in the sense of Kolodny et al. (2005), where the respective structures reside in distinct SCOP folds and CATH topologies although they share extensive structural similarity.


 
Fig. 1. Structure alignments of SCOP and CATH domains. The figure shows five structural alignments (a–e) that are difficult in the sense that in SCOP or CATH they are assigned to distinct folds or topologies. A counterexample is (f), where the two folds are in the same SCOP superfamily although the similarity is comparatively low. The query is always in blue, the target in green and the regions of similar structure are colored red (query) and orange (target). Table 1 shows the parameters for the respective alignments. Panels (a) and (b) show two distinct solutions for the structural alignment of 1eud-A and 1ccw-A. The first alignment (a) relates 1ccw-A to the C-terminal part of 1eud-A, the second (b) relates 1ccw-A to the N-terminal part of 1eud-A. This implies considerable structural similarity within 1eud-A. This is indeed the case, as shown in (c): in SCOP, 1eud-A is represented by two domains, d1euda1 and d1euda2, corresponding to these regions, which are classified as distinct folds (classification codes c.2.1.8 and c.23.4.1). (d) The SCOP domains d1gt8a4 and d1mo9a1 are classified as two distinct folds (c.4.1.1 and c.3.1.5, respectively). This has to be contrasted with (f), where two domains of considerably less similarity are classified within the same superfamily. (e) Superposition of CATH domains 1te2B02 and 1zolA02. The two domains belong to the two distinct topologies 1.10.150 (1te2B02) and 1.10.164 (1zolA02). (f) Superposition of SCOP domains d1lt3a_ and d1efya2. The two domains reside in the same SCOP superfamily called ADP-ribosylation (d.166.1) but in the two distinct SCOP families called ADP-ribosylating toxins (d.166.1.1) and Poly-ADP-ribose polymerase, C-terminal domain (d.166.1.2), respectively.

 


 
Table 1. Parameters for alignments shown in Figure 1

 
We note that the 2D projections shown in Figure 1 do not fully reveal the often complex, intricate, or obscure relationships. We therefore encourage the interested reader to examine these examples in 3D using the TopMatch service. We have made considerable efforts to make the use of this service as convenient as possible. For example, whereas the computation and visualization of structural alignments of SCOP and CATH domains generally requires that the domain definitions are supplied by the user, TopMatch recognizes the domain names automatically. Additional information on the efficient use of TopMatch and the proper interpretation of the results is provided by the web service.
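To illustrate what recognizing domain names could involve, the sketch below classifies the three kinds of identifiers used in this note: SCOP domain names such as d1euda1, CATH domain names such as 1te2B02, and PDB chains written as 1eud-A. It is a minimal sketch and not a description of how TopMatch works; the function name, the result type and the regular expressions are assumptions based on the common naming conventions of SCOP and CATH, and resolving an identifier to the actual residue ranges of a domain is not covered.

```python
import re
from typing import NamedTuple, Optional


class DomainRef(NamedTuple):
    """Hypothetical result type describing a recognized identifier."""
    scheme: str            # "SCOP", "CATH" or "PDB chain"
    pdb_id: str            # four-character PDB code, lower case
    chain: str             # chain identifier as written in the name
    domain: Optional[str]  # domain label, or None for a whole chain


# Assumed naming conventions (not taken from TopMatch):
#   SCOP:      'd' + PDB code + chain character + domain character, e.g. d1euda1, d1lt3a_
#   CATH:      PDB code + chain character + two-digit domain number, e.g. 1te2B02
#   PDB chain: PDB code + '-' + chain character, e.g. 1eud-A
_PATTERNS = [
    ("SCOP", re.compile(r"^d(\d[a-z0-9]{3})([a-z0-9_.])([a-z0-9_])$")),
    ("CATH", re.compile(r"^(\d[A-Za-z0-9]{3})([A-Za-z0-9])(\d{2})$")),
    ("PDB chain", re.compile(r"^(\d[A-Za-z0-9]{3})-([A-Za-z0-9])$")),
]


def classify_identifier(name: str) -> Optional[DomainRef]:
    """Return a DomainRef for a recognized name, or None if no pattern matches."""
    for scheme, pattern in _PATTERNS:
        match = pattern.match(name)
        if match:
            groups = match.groups()
            # The PDB chain pattern has no domain group.
            domain = groups[2] if len(groups) > 2 else None
            return DomainRef(scheme, groups[0].lower(), groups[1], domain)
    return None
```

Under these assumptions, `classify_identifier("d1euda1")` is reported as a SCOP domain of PDB entry 1eud, `classify_identifier("1te2B02")` as a CATH domain of 1te2, and `classify_identifier("1eud-A")` as chain A of 1eud.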


    ACKNOWLEDGEMENTS
The structure superposition program TopMatch is provided by Proceryon GmbH. Figure 1 was prepared using PyMOL (http://www.pymol.org).

Conflict of Interest: none declared.


    FOOTNOTES
 
Associate Editor: Burkhard Rost

Received on November 23, 2007; revised on December 12, 2007; accepted on December 13, 2007

    REFERENCES

    Andreeva A, et al. Data growth and its impact on the SCOP database: new developments. Nucleic Acids Res (2007) doi:10.1093/nar/gkm993.

    Berman HM, et al. The Protein Data Bank. Nucleic Acids Res (2000) 28:235–242.

    Feng ZK, Sippl MJ. Optimum superimposition of protein structures: ambiguities and implications. Fold. Des (1996) 1:123–132.

    Greene LH, et al. The CATH domain structure database: new protocols and classification levels give a more comprehensive resource for exploring evolution. Nucleic Acids Res (2007) 35:D291–D297.

    Kolodny R, et al. Comprehensive evaluation of protein structure alignment methods: scoring by geometric measures. J. Mol. Biol (2005) 346:1173–1188.

    Prlic A, et al. Structure-derived substitution matrices for alignment of distantly related sequences. Protein Eng (2000) 13:545–550.

    Sippl MJ, et al. Assessment of the CASP4 Fold Recognition Category. Proteins (2001) 45:55–67.

    Suhrer SJ, et al. QSCOP-BLAST–fast retrieval of quantified structural information for protein sequences of unknown structure. Nucleic Acids Res (2007a) 35(Web Server issue):W411–W415.

    Suhrer SJ, et al. QSCOP–SCOP quantified by structural relationships. Bioinformatics (2007b) 23:513–514.





