Bioinformatics Advance Access originally published online on January 31, 2007
Bioinformatics 2007 23(7):789-792; doi:10.1093/bioinformatics/btm018

© 2007 The Author(s)
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

A simple shape characteristic of protein–protein recognition

George Nicola¹ and Ilya A. Vakser²,*

¹Department of Molecular Biology, The Scripps Research Institute, 10550 North Torrey Pines Rd, La Jolla, CA 92037, USA and ²Center for Bioinformatics and Department of Molecular Biosciences, The University of Kansas, 2030 Becker Drive, Lawrence, KS 66047, USA

*To whom correspondence should be addressed.


    ABSTRACT

Motivation: Observation of co-crystallized protein–protein complexes and low-resolution protein–protein docking studies suggest the existence of a binding-related anisotropic shape characteristic of protein–protein complexes.

Results: Our study systematically assessed the global shape of proteins in a non-redundant database of co-crystallized protein–protein complexes by measuring the distance of the surface residues to the protein's center of mass. The results show that on average the binding site residues are closer to the center of mass than the non-binding surface residues. Thus, the study directly detects an important and simple binding-related characteristic of protein shapes. The results provide an insight into one of the fundamental properties of protein structure and association.

Contact: vakser@ku.edu


    1 INTRODUCTION
Protein–protein interactions play a key role in life processes. Structural information on these interactions is necessary for understanding them, for explaining the fundamental principles of molecular recognition and the mechanisms of protein association, and for exploring practical implications for structure-based drug design and other applications.

The foundation of knowledge about structures of protein–protein complexes is provided by experimental studies, primarily X-ray crystallography. The rapidly expanding Protein Data Bank (PDB) (Berman et al., 2000) contains an increasing number of co-crystallized protein–protein complexes, which serve as a unique resource for studying protein interfaces and other structural and physicochemical characteristics of protein interactions. If one defines a protein–protein complex as two separate chains associated through a ‘biological’ (not crystal packing) interface, the current number of complexes runs up to tens of thousands (Douguet et al., 2006), depending on the biological interface criteria. Excluding homologous complexes, the number of non-redundant protein–protein pairs is typically reduced to several hundred, depending on the widely ranging criteria for non-redundancy (Douguet et al., 2006).

Along with experimental determination of protein–protein structures, computational approaches are increasingly important as a source of structures and as a means of studying the structures. The field of protein–protein structure prediction (docking) techniques is rapidly developing (Gray, 2006; Marshall and Vakser, 2005; Szilagyi et al., 2005; Vajda and Camacho, 2004), taking advantage of better algorithms as well as expanding data sets of experimentally determined protein interfaces.

Knowledge of the binding site is a byproduct of protein–protein docking. However, docking predictions are often unreliable. Moreover, in many cases the binding partner or its structure is unknown, which makes docking impossible. Prediction of protein binding sites independently of docking is therefore important. Such predictions are based on a variety of considerations, including evolutionary conservation (Glaser et al., 2003; Pazos and Sternberg, 2004; Res and Lichtarge, 2005) and physicochemical characteristics (Keskin et al., 2004; Larsen et al., 1998; Lijnzaad and Argos, 1997; Ofran and Rost, 2003; Young et al., 1994; Zhou and Shan, 2001). Along with these, geometry is an important determinant of the binding site. Studies suggest that cavities in the protein surface correlate with small ligand binding sites (del Sol et al., 2006; Ho and Marshall, 1990; Nayal and Honig, 2006), as well as protein binding sites (Binkowski et al., 2005; Rajamani et al., 2004). Moreover, characteristics of the entire protein shape, like principal axes of inertia, are correlated with ligand binding (Foote and Raman, 2000).

In the current study, we relate a simple protein shape characteristic to observed protein–protein binding modes. The results can be interpreted in terms of low-resolution protein–protein recognition.


    2 METHODS
The non-redundant database of 475 protein–protein complexes (Glaser et al., 2001) from the PDB contains independent chains from co-crystallized protein pairs. The selection criteria were protein size ≥30 residues and interface area ≥1000 Å². Non-redundancy was achieved by requiring that no two complexes have both proteins homologous (≥30% sequence identity). The database has been extensively used in systematic studies of protein–protein recognition (e.g. Gray et al., 2003; Papoian and Wolynes, 2003; Tovchigrechko and Vakser, 2001; Tovchigrechko et al., 2002; Vakser et al., 1999).
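The selection criteria above can be illustrated with a minimal Python sketch. It is not the procedure used to build the database: the function names, the greedy order of selection and the `identity` callback (assumed to return a precomputed fractional sequence identity for two chains) are all hypothetical and are introduced only for illustration.

```python
def passes_size_filters(n_res_a, n_res_b, interface_area):
    """Both proteins have >= 30 residues and the interface area is >= 1000 A^2."""
    return min(n_res_a, n_res_b) >= 30 and interface_area >= 1000.0


def is_redundant(pair, kept_pair, identity, threshold=0.30):
    """Two complexes are treated as redundant when BOTH proteins of one are
    homologous to the proteins of the other, in either pairing order."""
    (a, b), (c, d) = pair, kept_pair
    direct = identity(a, c) >= threshold and identity(b, d) >= threshold
    swapped = identity(a, d) >= threshold and identity(b, c) >= threshold
    return direct or swapped


def non_redundant(pairs, identity, threshold=0.30):
    """Greedy filter (an assumption): keep a complex unless it is redundant
    with one that has already been kept."""
    kept = []
    for pair in pairs:
        if not any(is_redundant(pair, k, identity, threshold) for k in kept):
            kept.append(pair)
    return kept


# Toy data: chains in the same "family" are treated as 100% identical.
families = {"A": 1, "A2": 1, "A3": 1, "B": 2, "B2": 2, "C": 3}


def toy_identity(x, y):
    return 1.0 if families[x] == families[y] else 0.0


selected = non_redundant([("A", "B"), ("A2", "B2"), ("A3", "C")], toy_identity)
```

In this toy case the second complex is dropped because both of its chains are homologous to those of the first, while the third is kept because only one of its chains has a homolog among the complexes already selected.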

The surface residues were detected by the PSA program (Sali and Blundell, 1993). A residue was considered to be at the interface if its Cβ–Cβ (Cα in the case of Gly) distance from any residue of the other protein was ≤7 Å. The position of the Cα atom was used to calculate the distance between the residue and the center of mass of the protein.
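The interface definition and the distance measurement can be sketched as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, surface detection (done with PSA in the study) is not reproduced, and the center of mass is computed with equal masses unless masses are supplied, which is an assumption since the text does not state which atoms or weights were used.

```python
import math


def centroid(points, masses=None):
    """Mass-weighted mean position; equal masses are assumed if none are given."""
    if masses is None:
        masses = [1.0] * len(points)
    total = sum(masses)
    return tuple(
        sum(m * p[k] for m, p in zip(masses, points)) / total for k in range(3)
    )


def interface_flags(cb_a, cb_b, cutoff=7.0):
    """For each residue of protein A, True if its C-beta (C-alpha for Gly) is
    within `cutoff` angstroms of the corresponding atom of any residue in B."""
    return [any(math.dist(a, b) <= cutoff for b in cb_b) for a in cb_a]


def distances_to_center(ca_coords, center):
    """Distance of each residue's C-alpha atom to the given center."""
    return [math.dist(c, center) for c in ca_coords]


# Toy coordinates in angstroms (not taken from any real structure).
protein_a = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0), (20.0, 0.0, 0.0)]
protein_b = [(0.0, 5.0, 0.0)]

flags = interface_flags(protein_a, protein_b)
center = centroid(protein_a)
dists = distances_to_center(protein_a, center)
```

The all-against-all loop in `interface_flags` is quadratic in the number of residues, which is adequate for a sketch; a spatial grid or k-d tree would be the usual choice for a full data set.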


    3 RESULTS AND DISCUSSION
3.1 Rationale
The size asymmetry in small ligand (organic compound)–receptor (protein) interaction typically plays out in the binding site on the receptor being a cavity (del Sol et al., 2006; Ho and Marshall, 1990; Nayal and Honig, 2006). A binding cavity on a small ligand is obviously geometrically impossible; however, geometry imposes no such restriction on the macromolecular receptor. Geometrically, the binding site on the receptor can be of any type—concave, flat, or convex. The fact that the observed binding sites are concave in all likelihood has to do with the free energy aspects of ligand–receptor interaction (which are beyond the scope of this study).

Transitioning to protein–protein complexes, one still tends to think of the larger protein as the ‘receptor’ and the smaller one as the ‘ligand’. This largely unspoken tradition is common in protein–protein docking, where the larger protein is often assigned to be ‘stationary’ and the smaller one ‘moves’ to dock with the larger one. Beyond the semantics of this issue and the fact that in some docking algorithms a smaller ‘moving’ molecule saves computer time [e.g. FFT approaches (Katchalski-Katzir et al., 1992)], casual observation of co-crystallized protein–protein complexes suggests that the smaller protein often binds in the cavity of the bigger one (e.g. enzyme–protein inhibitor complexes). One important exception would be the antigen–antibody complexes, where regardless of the antibody's target size, the antigenic site is typically convex (Novotny et al., 1986).

Quantitatively, the concave character of the binding site on the larger protein was suggested by earlier low-resolution docking studies (Tovchigrechko and Vakser, 2001; Vakser et al., 1999). These geometry-only studies, in which all structural details smaller than ~7 Å are deleted, showed that within a complex the smaller protein typically has more freedom in angular orientation than the larger protein, suggesting the existence of a prominent binding-related anisotropic shape characteristic of the larger protein (e.g. a binding cavity). Often the anisotropic character of the larger protein docking orientation may be explained by a large flat interface rather than a cavity. In any case, the low-resolution docking, along with studies of binding site geometry, principal axes of inertia, etc. (see Introduction), conclusively points to the existence of large shape characteristics of proteins that distinguish the binding site. The current study directly detects a simple geometric characteristic of the binding site on the larger protein that facilitates protein–protein recognition.

3.2 Binding site statistics
The non-redundant data set of 475 protein–protein complexes (see Methods) was used to calculate the distance of surface residues to the center of mass in the larger protein in a complex. In the case of equal-size homodimers, the ‘larger’ protein was chosen randomly. For each complex, the shape of the larger protein was assessed according to the formula d = ⟨d_i⟩/⟨d_o⟩, where ⟨d_i⟩ is the average distance of the interface residues to the protein center of mass, and ⟨d_o⟩ is that of the non-interface surface residues. Thus, if d < 1 the binding site is on average closer to the center of mass than the non-interface surface and, if d > 1, farther from it. The number of complexes with different d-values is shown in Figure 1 for the entire database (see Methods), as well as for small (1000–2000 Å²) and large (>4000 Å²) interfaces. The data clearly show a tendency of the interface residues to be closer than average to the center of mass. The effect is not detectable for the small interfaces, but increases dramatically for the large interfaces.
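As a worked example of the ratio defined above, the following Python sketch computes d for one protein from per-residue distances and interface flags, and assigns an interface to the size classes used in the text. The function names and the toy numbers are hypothetical. The label "intermediate" for areas between 2000 and 4000 Å² is an assumption, since the text reports only the small and large classes separately.

```python
from statistics import mean


def shape_ratio(distances, is_interface):
    """d = <d_i> / <d_o> over the surface residues of one protein.

    distances    -- distance of each surface residue to the center of mass
    is_interface -- parallel list of booleans marking interface residues
    """
    d_i = [x for x, flag in zip(distances, is_interface) if flag]
    d_o = [x for x, flag in zip(distances, is_interface) if not flag]
    if not d_i or not d_o:
        raise ValueError("need both interface and non-interface surface residues")
    return mean(d_i) / mean(d_o)


def interface_class(area):
    """Size class by interface area in square angstroms."""
    if 1000 <= area <= 2000:
        return "small"
    if area > 4000:
        return "large"
    return "intermediate"  # label assumed; not named in the text


# Toy distances in angstroms: two interface residues, three other surface residues.
d = shape_ratio([12, 14, 20, 22, 18], [True, True, False, False, False])
```

Here ⟨d_i⟩ = 13 Å and ⟨d_o⟩ = 20 Å, so d = 0.65, which would place this toy protein among the complexes with the binding site closer to the center of mass than the rest of the surface.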


Fig. 1. Number of complexes with different average distances of the interface residues to the center of mass of the larger protein. The data are shown for (A) all complexes in the database, and separately for (B) small interfaces (1000–2000 Å²) and (C) large interfaces (>4000 Å²). Complexes with d < 1 have the binding site closer to the center of mass than the non-binding surface, and complexes with d > 1 have the binding site farther from the center of mass than the non-binding surface. See text for details.

 
The paradigm is illustrated in Figure 2. Examples of actual interfaces are shown in Figure 3. Arguably, a small interface is geometrically less likely than a large one to have a deep concavity or significant flatness detectable by a simple measure of the average distance to the protein center of mass. On the other hand, a large interface on the larger protein within a complex can geometrically be of any type: concave, convex or flat (Fig. 2). The fact that it is far more likely to be close to the protein center of mass than the rest of the surface does not follow from geometry, but is rather due to free energy aspects of protein binding/folding. The analysis of such possible reasons is beyond the scope of this article, which simply describes quantitative phenomenological detection of this prominent shape characteristic.
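How the three interface types affect d can be illustrated with a two-dimensional toy model in Python. This is a simplified sketch, not an analysis from the study: the ‘protein’ is a unit circle, the interface is an arc of ±45° whose radius follows a chosen profile, and the center of mass is assumed to remain at the circle's center even though the deformation would shift it slightly. All names and parameter values are hypothetical.

```python
import math
from statistics import mean


def toy_d(profile, half_angle=math.pi / 4, n=360):
    """d for a unit-circle 'protein' sampled at n boundary points.

    Points with |angle| <= half_angle form the interface and lie at radius
    profile(angle); all other surface points lie at radius 1. The center of
    mass is assumed (as a simplification) to stay at the origin.
    """
    inside, outside = [], []
    for k in range(n):
        angle = -math.pi + 2.0 * math.pi * k / n
        if abs(angle) <= half_angle:
            inside.append(profile(angle))
        else:
            outside.append(1.0)
    return mean(inside) / mean(outside)


HALF = math.pi / 4


def flat(angle):
    """Chord cutting off the arc: a flat interface."""
    return math.cos(HALF) / math.cos(angle)


def concave(angle):
    """Smooth dent, 0.3 deep at the middle of the interface."""
    return 1.0 - 0.3 * math.cos(angle * math.pi / (2.0 * HALF))


def convex(angle):
    """Smooth bump, 0.3 high at the middle of the interface."""
    return 1.0 + 0.3 * math.cos(angle * math.pi / (2.0 * HALF))


d_flat, d_concave, d_convex = toy_d(flat), toy_d(concave), toy_d(convex)
```

Under these assumptions both the flat and the concave interface give d < 1 and the convex one gives d > 1, which is consistent with the statement that a simple distance-to-center measure registers large flat interfaces as well as cavities.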


Fig. 2. Schematic illustration of some possible protein–protein complex geometries. The distances from the center of mass (arrows) illustrate (A) more likely geometries (binding site on average is closer to the center of mass than the non-binding surface) and (B) less likely geometries. For simplicity, the illustration shows proteins of different size. However, the same paradigm of binding site close to the center of mass applies to homodimers.

 

Fig. 3. Examples of protein–protein interfaces. A cross section through the structure shows (A) a small interface with undetectable binding-related shape anisotropy (1138 Å²), (B) a large flat interface (7004 Å²), and (C) a large concave interface (4055 Å²).

 
The anisotropic character of the protein shape, in principle, can be used for the binding site prediction. However, our estimates (data not shown) indicate that the simple d measure alone may not be sensitive enough for a useful predictive procedure. The significance of this study is rather in discovery of an important (and simple) characteristic of protein shapes. The results provide an insight into one of the fundamental properties of protein structure and association.

3.3 Future directions
The current study will be developed further in four directions. First, a distinction will be made between the obligate complexes (where the components exist in the co-crystallized folds only within the complex) and the non-obligate ones. One possibility is that the binding site concavity may be more pronounced in the non-obligate complexes (at least those with large interfaces), whereas the interfaces in multisubunit (presumably obligate) complexes may be flatter. Second, different complex types, according to their function (e.g. enzyme-inhibitor, electron transfer, etc.), will be tested separately. Third, a more detailed subdivision of the characteristic geometry types (Fig. 2) will be explored. Fourth, more sophisticated geometric determinants (e.g. describing the global shape, capturing local curvature, detecting the surface roughness) will be studied with regard to their capability to correlate with the binding site position.


    ACKNOWLEDGEMENTS
The authors wish to thank Andrey Tovchigrechko for helpful comments. The study was supported by NIH R01 GM074255. Funding to pay the Open Access publication charges was provided by NIH.

Conflict of Interest: none declared.


    FOOTNOTES
 
Associate Editor: Anna Tramontano

Received on December 18, 2006; revised on January 15, 2007; accepted on January 17, 2007

    REFERENCES

    Berman HM, et al. The Protein Data Bank. Nucleic Acids Res (2000) 28:235–242.

    Binkowski TA, et al. Protein surface analysis for function annotation in high-throughput structural genomics pipeline. Protein Sci (2005) 14:2972–2981.

    del Sol A, et al. Residue centrality, functionally important residues, and active site shape: analysis of enzyme and non-enzyme families. Protein Sci (2006) 15:2120–2128.

    Douguet D, et al. DOCKGROUND resource for studying protein-protein interfaces. Bioinformatics (2006) 22:2612–2618.

    Foote J, Raman A. A relation between the principal axes of inertia and ligand binding. Proc. Natl. Acad. Sci. USA (2000) 97:978–983.

    Glaser F, et al. ConSurf: identification of functional regions in proteins by surface-mapping of phylogenetic information. Bioinformatics (2003) 19:163–164.

    Glaser F, et al. Residue frequencies and pairing preferences at protein-protein interfaces. Proteins (2001) 43:89–102.

    Gray JJ. High-resolution protein–protein docking. Curr. Opin. Struct. Biol (2006) 16:183–193.

    Gray JJ, et al. Protein–protein docking with simultaneous optimization of rigid-body displacement and side-chain conformations. J. Mol. Biol (2003) 331:281–299.

    Ho CMW, Marshall GR. Cavity search: an algorithm for the isolation and display of cavity-like binding regions. J. Comput. Aided Mol. Des (1990) 4:337–354.

    Katchalski-Katzir E, et al. Molecular surface recognition: determination of geometric fit between proteins and their ligands by correlation techniques. Proc. Natl. Acad. Sci. USA (1992) 89:2195–2199.

    Keskin O, et al. A new, structurally nonredundant, diverse data set of protein–protein interfaces and its implications. Protein Sci (2004) 13:1043–1055.

    Larsen TA, et al. Morphology of protein-protein interfaces. Structure (1998) 6:421–427.

    Lijnzaad P, Argos P. Hydrophobic patches on protein subunit interfaces: characteristics and prediction. Proteins (1997) 28:333–343.

    Marshall GR, Vakser IA. Protein-protein docking methods. In: Proteomics and Protein-Protein Interaction: Biology, Chemistry, Bioinformatics, and Drug Design. Waksman G, ed. (2005) New York: Springer. 115–146.

    Nayal M, Honig B. On the nature of cavities on protein surfaces: application to the identification of drug-binding sites. Proteins (2006) 63:892–906.

    Novotny J, et al. Antigenic determinants in proteins coincide with surface regions accessible to large probes (antibody domains). Proc. Natl. Acad. Sci. USA (1986) 83:226–230.

    Ofran Y, Rost B. Analysing six types of protein–protein interfaces. J. Mol. Biol (2003) 325:377–387.

    Papoian GA, Wolynes PG. The physics and bioinformatics of binding and folding – an energy landscape perspective. Biopolymers (2003) 68:333–349.

    Pazos F, Sternberg MJE. Automated prediction of protein function and detection of functional sites from structure. Proc. Natl. Acad. Sci. USA (2004) 101:14754–14759.

    Rajamani D, et al. Anchor residues in protein–protein interactions. Proc. Natl. Acad. Sci. USA (2004) 101:11287–11292.

    Res I, Lichtarge O. Character and evolution of protein–protein interfaces. Phys. Biol (2005) 2:S36–S43.

    Sali A, Blundell TL. Comparative protein modeling by satisfaction of spatial restraints. J. Mol. Biol (1993) 234:779–815.

    Szilagyi A, et al. Prediction of physical protein-protein interactions. Phys. Biol (2005) 2:S1–S16.

    Tovchigrechko A, Vakser IA. How common is the funnel-like energy landscape in protein-protein interactions? Protein Sci (2001) 10:1572–1583.

    Tovchigrechko A, et al. Docking of protein models. Protein Sci (2002) 11:1888–1896.

    Vajda S, Camacho CJ. Protein–protein docking: is the glass half-full or half-empty? Trends Biotechnol (2004) 22:110–116.

    Vakser IA, et al. A systematic study of low-resolution recognition in protein-protein complexes. Proc. Natl. Acad. Sci. USA (1999) 96:8477–8482.

    Young L, et al. A role for surface hydrophobicity in protein-protein recognition. Protein Sci (1994) 3:717–729.

    Zhou HX, Shan Y. Prediction of protein interaction sites from sequence profile and residue neighbor list. Proteins (2001) 44:336–343.

