Bioinformatics Advance Access originally published online on September 23, 2008
Bioinformatics 2008 24(22):2634-2635; doi:10.1093/bioinformatics/btn497

© The Author 2008. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oxfordjournals.org

DOCKGROUND protein–protein docking decoy set

Shiyong Liu 1, Ying Gao 1 and Ilya A. Vakser 1,2,*

1 Center for Bioinformatics and 2 Department of Molecular Biosciences, The University of Kansas, 2030 Becker Drive, Lawrence, KS 66047, USA

*To whom correspondence should be addressed.


    ABSTRACT

Summary: A protein–protein docking decoy set is built for the DOCKGROUND unbound benchmark set. The GRAMM-X docking scan was used to generate 100 non-native and at least one near-native match per complex for 61 complexes. The set is a publicly available resource for the development of scoring functions and knowledge-based potentials for protein docking methodologies.

Availability: The decoys are freely available for download at http://dockground.bioinformatics.ku.edu/UNBOUND/decoy/decoy.php

Contact: vakser@ku.edu


    1 INTRODUCTION
Computational techniques for structural modeling of protein–protein interactions are rapidly developing, both in terms of methodology and computing power (Gray, 2006; Vajda and Camacho, 2004). An important activity in the field of protein–protein docking is the community-wide Critical Assessment of Predicted Interactions (CAPRI; http://capri.ebi.ac.uk; Wodak, 2007), which allows comparison of different computational methods on a set of prediction targets.

A number of databases of protein–protein complexes have been compiled and used to investigate physicochemical and structural preferences at protein–protein interfaces (Davis and Sali, 2005; Douguet et al., 2006; Gao et al., 2007; Keskin et al., 2004; Kundrotas and Alexov, 2007; Lu et al., 2003). It is essential for protein–protein databases to be comprehensive, automatically updated and fully queryable, like the ones in the DOCKGROUND project (Douguet et al., 2006; Gao et al., 2007).

Benchmark sets of complexes with both bound and unbound structures have been developed for validation of docking approaches (Gao et al., 2007; Mintseris et al., 2005). The sets contain ~100 crystallographically determined pairs of proteins. Decoy sets of structures (false positive matches) are an important part of developing intermolecular potentials and scoring functions, because reliable docking procedures have to distinguish between decoys and correct matches. Development of protein–protein docking decoys started in our lab in 1998. The number of decoys was further expanded by Sternberg and co-workers, and then by Baker, Gray and co-workers (RosettaDock, http://depts.washington.edu/bakerpg), Weng and co-workers (ZDOCK, http://zlab.bu.edu) and others. Currently available decoy sets are typically ranked by scoring functions that involve force field terms, statistical potentials, etc. The ZDOCK set contains tens of thousands of matches per complex, which complicates testing and optimization of computationally expensive scoring functions. The RosettaDock set consists of minimized structures with replaced side chains, targeted for high-resolution (post-refinement) scoring, which may be inappropriate for low-resolution scoring of post-scan/pre-refinement complexes with structural clashes and gaps. Some complexes in the above sets do not contain near-native matches. The decoy set presented in this article, built within the DOCKGROUND project (http://dockground.bioinformatics.ku.edu), involves post-scan matches based on shape complementarity alone and contains 100 decoys per complex plus near-native matches for each complex. Thus, it is an unbiased set that is optimally suited for testing and optimization of post-scan scoring functions.


    2 METHODS
The docking was performed by our GRAMM-X FFT docking procedure (Tovchigrechko and Vakser, 2005). The procedure performs exhaustive sampling of the translation/rotation space with the soft Lennard–Jones potential, based on our GRAMM algorithm, which has been extensively published and validated over the years (Katchalski-Katzir et al., 1992; Vakser, 1995, 1997; Vakser et al., 1999). The scan stage grid translation step was 1.5 Å and the rotation step was 6°.
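The idea of an exhaustive translational scan on a grid can be illustrated with a toy example. The sketch below is not GRAMM-X: it uses a simplified step-function shape score in the spirit of Katchalski-Katzir et al. (1992) rather than the soft Lennard–Jones potential, works in 2D, and evaluates the correlation by direct summation, whereas an FFT yields the same correlation values much faster. The grids, the interior penalty `RHO` and all function names are hypothetical, chosen only for illustration; one grid cell plays the role of one translation step.

```python
RHO = -15  # assumed penalty for overlapping the receptor interior


def correlate(receptor, ligand, shift):
    """Correlation score for the ligand grid translated by shift=(dy, dx).

    Both grids are square and of equal size; translation wraps around
    cyclically, as it does in an FFT-based correlation.
    """
    n = len(receptor)
    dy, dx = shift
    score = 0
    for y in range(n):
        for x in range(n):
            score += receptor[y][x] * ligand[(y - dy) % n][(x - dx) % n]
    return score


def scan_translations(receptor, ligand):
    """Score every grid translation and return (best_shift, best_score)."""
    n = len(receptor)
    scores = {(dy, dx): correlate(receptor, ligand, (dy, dx))
              for dy in range(n) for dx in range(n)}
    best = max(scores, key=scores.get)
    return best, scores[best]


# Receptor: surface cells score 1, the interior cell carries the penalty RHO.
receptor = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0],
    [0, 1, RHO, 1, 0, 0],
    [0, 1, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
]

# Ligand: two occupied cells placed at the grid origin.
ligand = [[0] * 6 for _ in range(6)]
ligand[0][0] = 1
ligand[0][1] = 1

best_shift, best_score = scan_translations(receptor, ligand)
clash_score = correlate(receptor, ligand, (2, 1))  # ligand overlaps the interior
```

In a full scan, a translational search of this kind would be repeated for each sampled ligand rotation.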

The DOCKGROUND project is an expanding resource for the development of docking techniques and studies of protein interfaces (http://dockground.bioinformatics.ku.edu; Douguet et al., 2006; Gao et al., 2007). The docking decoys were built for the unbound docking benchmark set Version 2, which contains structures with crystallographically determined bound (co-crystallized) and unbound (crystallized separately) forms. The set was built based on the following selection criteria: sequence identity between bound and unbound structures >97%, sequence identity between complexes <30%, exclusion of homomultimers (sequence identity between chains <70%), and exclusion of crystal packing complexes and structures in the wrong format. The total number of complexes in the set was 99.

The GRAMM-X scan was applied to the set to build docking decoys. The following characteristics from the CAPRI evaluation protocol were computed for 500 000 matches per complex: RMSD of the backbone atoms of the ligand (the smaller component of the complex, the receptor being the larger one), RMSD of the backbone atoms of the interface residues, the number of native residue–residue contacts in the predicted complex divided by the number of contacts in the native complex, and the number of non-native residue–residue contacts in the predicted complex divided by the total number of contacts in the complex. Matches with ligand RMSD <5.0 Å were defined as near-native. The set contains the 100 lowest energy non-native structures and at least one near-native structure per complex. The decoy set contains 61 complexes, namely those for which at least one near-native match was found.
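The evaluation and selection steps described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' pipeline: coordinates are plain (x, y, z) tuples with the receptor frames already aligned, contacts are given as residue-pair tuples, the non-native fraction is taken relative to the contacts of the predicted complex, and all near-native matches are retained. The function names and the dictionary keys `energy` and `ligand_rmsd` are hypothetical.

```python
import math

NEAR_NATIVE_CUTOFF = 5.0  # Å, ligand RMSD threshold stated in the text


def rmsd(coords_a, coords_b):
    """RMSD between two equal-length lists of (x, y, z) points.

    No superposition is performed: the receptor frames are assumed to be
    aligned already, so the deviation reflects ligand placement only.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    total = sum((a - b) ** 2
                for p, q in zip(coords_a, coords_b)
                for a, b in zip(p, q))
    return math.sqrt(total / len(coords_a))


def contact_fractions(native_contacts, predicted_contacts):
    """Return (fraction of native contacts recovered, fraction of non-native contacts).

    Assumption: the non-native fraction is computed relative to the number
    of contacts in the predicted complex.
    """
    native = set(native_contacts)
    predicted = set(predicted_contacts)
    f_nat = len(native & predicted) / len(native) if native else 0.0
    f_non_nat = len(predicted - native) / len(predicted) if predicted else 0.0
    return f_nat, f_non_nat


def build_decoy_set(matches, n_non_native=100):
    """Split matches into near-native ones and the lowest-energy non-native ones.

    Each match is a dict with 'energy' and 'ligand_rmsd' (Å). Returns None
    when no near-native match exists, mirroring the exclusion of such
    complexes from the set.
    """
    near = [m for m in matches if m["ligand_rmsd"] < NEAR_NATIVE_CUTOFF]
    if not near:
        return None
    non_native = sorted(
        (m for m in matches if m["ligand_rmsd"] >= NEAR_NATIVE_CUTOFF),
        key=lambda m: m["energy"],
    )[:n_non_native]
    return near, non_native
```

For example, `build_decoy_set(matches)` applied to the scored matches of one complex would return either the two lists that make up its decoys or `None` if the complex should be dropped.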


    3 RESULTS
The RMSD between bound and unbound structures reflects the degree of conformational change upon complex formation. Table 1 shows the average statistics for the three groups of complexes. The average RMSDs between bound and unbound structures are rather small. This is consistent with earlier estimates indicating that the majority of protein complexes have small backbone conformational change between bound and unbound forms (Gao et al., 2007).


Table 1. Average statistics on protein–protein docking decoys

 
GRAMM-X was unable to detect near-native matches in complexes with large conformational changes (primarily due to domain shifts). Thus, such complexes are not present in the decoy set.

The native structures, as opposed to the near-native ones, were deliberately excluded from the set because they are never achievable in practical docking and thus would be an unrealistic reference point for the development of docking methodologies. An example of docking decoys for a particular complex is shown in Figure 1. Application of the popular scoring functions ZRANK (http://zdock.bu.edu/software.php) and DFIRE (http://sparks.informatics.iupui.edu) placed the near-native structure in the top 10 matches in 40–50% of complexes.
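A top-10 success rate of the kind quoted above can be computed as sketched below. This is a hedged illustration, not the evaluation code used for ZRANK or DFIRE: it assumes that each decoy carries a score for which lower means better and a flag marking near-native matches, and the function names and data layout are hypothetical.

```python
def near_native_in_top_n(decoys, n=10):
    """True if any near-native decoy is among the n best-scored decoys.

    decoys: list of (score, is_near_native) pairs; lower score is assumed
    to be better.
    """
    ranked = sorted(decoys, key=lambda d: d[0])
    return any(is_near_native for _, is_near_native in ranked[:n])


def success_rate(complexes, n=10):
    """Fraction of complexes with a near-native decoy ranked in the top n.

    complexes: dict mapping a complex identifier to its list of decoys.
    """
    if not complexes:
        raise ValueError("no complexes given")
    hits = sum(near_native_in_top_n(decoys, n) for decoys in complexes.values())
    return hits / len(complexes)
```

Calling `success_rate(scored_complexes, n=10)` on the scored decoys of all complexes would give the fraction that corresponds to the percentage reported in the text.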


Fig. 1. Example of docking decoys. Matches represented by the ligand's center of mass are shown for the 1e96 enzyme-inhibitor complex. The receptor (in green) and the ligand (in cyan) are shown in the co-crystallized configuration. The native match is in yellow (not part of the decoy set), 10 near-native matches are in red and 100 non-native matches are in blue.

 

    4 CONCLUSION
A protein–protein docking decoy set is built for the DOCKGROUND unbound benchmark set. The GRAMM-X docking scan was used to generate 100 non-native and at least one near-native match per complex for 61 complexes. The set is a publicly available resource for the development of scoring functions and knowledge-based potentials for protein docking methodologies.


    Funding
National Institutes of Health (grant R01 GM074255).

Conflict of Interest: none declared.


    ACKNOWLEDGEMENTS
The authors wish to thank Andrey Tovchigrechko for assistance with GRAMM-X docking.


    FOOTNOTES
 
Associate Editor: Burkhard Rost

Received on April 9, 2008; revised on September 12, 2008; accepted on September 16, 2008

    REFERENCES

    Davis FP, Sali A. PIBASE: a comprehensive database of structurally defined protein interfaces. Bioinformatics (2005) 21:1901–1907.

    Douguet D, et al. DOCKGROUND resource for studying protein-protein interfaces. Bioinformatics (2006) 22:2612–2618.

    Gao Y, et al. DOCKGROUND system of databases for protein recognition studies: unbound structures for docking. Proteins (2007) 69:845–851.

    Gray JJ. High-resolution protein–protein docking. Curr. Opin. Struct. Biol. (2006) 16:183–193.

    Katchalski-Katzir E, et al. Molecular surface recognition: determination of geometric fit between proteins and their ligands by correlation techniques. Proc. Natl Acad. Sci. USA (1992) 89:2195–2199.

    Keskin O, et al. A new, structurally nonredundant, diverse data set of protein–protein interfaces and its implications. Protein Sci. (2004) 13:1043–1055.

    Kundrotas PJ, Alexov E. PROTCOM: searchable database of protein complexes enhanced with domain–domain structures. Nucleic Acids Res. (2007) 35:D575–D579.

    Lu H, et al. Development of unified statistical potentials describing protein-protein interactions. Biophys. J. (2003) 84:1895–1901.

    Mintseris J, et al. Protein-protein docking benchmark 2.0: an update. Proteins (2005) 60:214–216.

    Tovchigrechko A, Vakser IA. Development and testing of an automated approach to protein docking. Proteins (2005) 60:296–301.

    Vajda S, Camacho CJ. Protein–protein docking: is the glass half-full or half-empty? Trends Biotechnol. (2004) 22:110–116.

    Vakser IA. Protein docking for low-resolution structures. Protein Eng. (1995) 8:371–377.

    Vakser IA. Evaluation of GRAMM low-resolution docking methodology on the hemagglutinin-antibody complex. Proteins (1997) (Suppl. 1):226–230.

    Vakser IA, et al. A systematic study of low-resolution recognition in protein-protein complexes. Proc. Natl Acad. Sci. USA (1999) 96:8477–8482.

    Wodak SJ. From the Mediterranean coast to the shores of Lake Ontario: CAPRI's premiere on the American continent. Proteins (2007) 69:697–698.

