Bioinformatics Advance Access originally published online on October 17, 2006
Bioinformatics 2006 22(24):3101-3102; doi:10.1093/bioinformatics/btl530

© The Author 2006. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oxfordjournals.org

sGAL: a computational method for finding surface exposed sites in proteins suitable for Cys-mediated cross-linking

G. Andrew Woolley *, En-shiun Lee and Fuzhong Zhang

Department of Chemistry, University of Toronto 80 St. George Street, Toronto, ON M5S 3H6, Canada

*To whom correspondence should be addressed.


    ABSTRACT

sGAL is a computer program designed to find pairs of sites suitable for introducing chemical cross-links into proteins. sGAL takes a protein structure file in PDB format as input, truncates each residue sequentially to its gamma side chain atom to mimic mutation to Cys, and calculates the exposed surface area of the gamma atom. The user then inputs the minimum and maximum lengths of the cross-linker. sGAL provides as output pairs of residues that would have exposed gamma atom separations that fall within this range. Furthermore, if a line joining the pair of gamma atoms contacts more than a given number of buried atoms, that pair is discarded. In this way, sites for which the protein would sterically interfere with cross-linking are avoided.

Availability: http://www.chem.utoronto.ca/staff/GAW/links.html. Surface Racer is also required; see http://monte.biochem.wisc.edu/~tsodikov/surface.html.

Contact: awoolley{at}chem.utoronto.ca

Chemical cross-links can be introduced into proteins in order to alter or stabilize particular protein structures (Kluger et al., 2004). Examples include cross-linking of hemoglobin to prevent subunit dissociation, thereby allowing it to function as a blood substitute (Roach et al., 2004), and cross-linking combined with mass spectrometry to obtain moderate-resolution structural information (Jacobsen et al., 2006). Chemical cross-linkers intended to stabilize, and not distort, proteins must target exposed surface sites. In addition, the protein structure must not interfere with the reaction of the cross-linker. For instance, two reactive sites may be an appropriate distance apart but on opposite faces of a protein. Since the cross-link cannot pass through the protein without some distortion, such sites would not be suitable for cross-linking. Cys residues are often chosen as reactive sites since they can be targeted with high chemical selectivity, have low natural abundance, and can be readily installed by site-directed mutagenesis. The assumption is usually made that replacement of a surface-exposed residue with Cys does not lead to conformational rearrangements at other sites.

We have introduced a series of thiol-reactive cross-linkers intended for intramolecular cross-linking of peptides and proteins that carry the additional feature of photo-switchability (Woolley, 2005). These cross-linkers can be photo-switched between cis and trans conformations, effectively changing their end-to-end distance. Isomerization of the cross-linker can be used to promote helix folding or unfolding in peptides and proteins (Burns et al., 2004; Woolley, 2005). In principle, such photo-switchable cross-linkers could be used to promote other types of protein conformational change. For example, in a beta sheet protein, cross-linker isomerization may be used to promote protein folding or unfolding by imposing constraints on strand-strand distances. To explore such possibilities, it is useful to have a systematic means of identifying potential sites for cross-linking. Although such sites can often be identified by visual inspection of a protein 3D structure, this procedure is time-consuming and unlikely to be comprehensive.

The sGAL algorithm starts with a protein structure file in PDB format (Berman et al., 2000). As an example, to facilitate description of the program, we will take the coordinate file for the Fyn tyrosine kinase SH3 domain (PDB accession code: 1SHF). The file is first modified by removing coordinates for water molecules or other solvent molecules using a standard viewer, such as Accelrys DSVisualizer or DeepView (Guex et al., 1997). For the 1SHF example, there are two protein molecules in the unit cell; we delete molecule B. The input file is then saved in standard PDB format. In this case, residue numbering runs from V84 at the N-terminus to D162 at the C-terminus.

Beginning with V84, sGAL removes any atoms beyond the gamma atom of the residue. This mimics mutating the residue to Cys for which the nucleophilic S atom is the gamma atom. For residues with two gamma atoms (Val, Ile and Thr) both are retained. Residues without a gamma atom (Ala and Gly) are left unchanged together with Pro. The conformational properties of Pro and Gly may lead to unwanted conformational changes if these residues are replaced by Cys and so these residues are not often chosen for mutation. Evaluating a gamma atom position for an Ala mutation would require building the side chain; this would involve assumptions about rotamer preference as well as tests for steric clash. All other residues have a unique gamma atom.
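The truncation step can be illustrated with a minimal Python sketch. This is not the sGAL source code (sGAL itself is a Java applet); the function name `truncate_residue` and the set names are hypothetical, the atom names follow standard PDB nomenclature, and the sketch assumes a file without hydrogen atoms or alternate locations.

```python
# Atom names kept after truncation: backbone, beta carbon and gamma-position atoms.
# (Assumption: standard PDB atom names; hydrogens are not present in the file.)
BACKBONE = {"N", "CA", "C", "O", "OXT"}
GAMMA = {"CG", "CG1", "CG2", "OG", "OG1", "SG"}
KEEP = BACKBONE | {"CB"} | GAMMA

# Residues left unchanged, as described in the text.
UNCHANGED = {"ALA", "GLY", "PRO"}


def truncate_residue(pdb_lines, chain, resseq):
    """Return a copy of pdb_lines with one residue cut back to its gamma atom(s)."""
    out = []
    for line in pdb_lines:
        if line.startswith("ATOM"):
            name = line[12:16].strip()      # atom name, PDB columns 13-16
            resname = line[17:20].strip()   # residue name, columns 18-20
            is_target = line[21] == chain and int(line[22:26]) == resseq
            if is_target and resname not in UNCHANGED and name not in KEEP:
                continue  # drop delta atoms and anything further out
        out.append(line)
    return out
```

Because both gamma-position names of Val, Ile and Thr appear in `GAMMA`, those residues keep two gamma atoms, matching the behaviour described above.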

The modified PDB file is then sent to Surface Racer to calculate an atom-by-atom solvent-accessible surface area. Surface Racer was written by Tsodikov, O. V., Record, M. T. Jr. and Sergeev, Y. V. for fast exact calculation of accessible and molecular surface areas as well as average surface curvature (Tsodikov et al., 2002). It is available at http://monte.biochem.wisc.edu/~tsodikov/surface.html. We use the Richards van der Waals radii set and a probe radius of 1.4 Å to simulate an aqueous environment (Lee et al., 1971). The output of Surface Racer is then used to annotate the original PDB file with the exposed surface area of the gamma atom of the residue in question.

After annotating the PDB file with the surface areas of the atoms associated with the first residue, sGAL reads the full PDB file again (with no truncated residues) and moves to the next residue in sequence to perform the truncation and surface area calculation again. Surface Racer is therefore called once for each residue in the sequence (except Ala, Pro and Gly). Note that one cannot simply calculate the exposed surface areas of gamma atoms without doing the truncation procedure, since most gamma atoms are shielded from solvent by delta atoms (e.g. the two delta methyl groups of Leu shield the gamma atom effectively). The annotated PDB file produced by sGAL thus provides data on residues which, if mutated to Cys, would have the S atom exposed to react with a cross-linker. For the Fyn SH3 domain example, using a >20 Å² cutoff for surface exposure, sGAL returns the sites shown as spheres in Figure 1A.
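The residue-by-residue loop can be sketched as follows. This is an illustration under stated assumptions rather than the sGAL implementation: `truncate` and `sasa` are hypothetical callables supplied by the caller, where `sasa` stands in for the external surface-area program and is not Surface Racer's actual interface.

```python
GAMMA = {"CG", "CG1", "CG2", "OG", "OG1", "SG"}
SKIPPED = {"ALA", "GLY", "PRO"}


def residues_in_order(pdb_lines):
    """Yield (chain, resseq, resname) once per residue, in file order."""
    seen = set()
    for line in pdb_lines:
        if line.startswith("ATOM"):
            key = (line[21], int(line[22:26]))
            if key not in seen:
                seen.add(key)
                yield key[0], key[1], line[17:20].strip()


def scan_exposed_gamma(pdb_lines, truncate, sasa, cutoff=20.0):
    """Return {(chain, resseq, atom_name): area} for gamma atoms above cutoff.

    truncate(lines, chain, resseq) -> lines with that residue truncated
    sasa(lines) -> {(chain, resseq, atom_name): area in square angstroms}
    Both callables are hypothetical placeholders for this sketch.
    """
    exposed = {}
    for chain, resseq, resname in residues_in_order(pdb_lines):
        if resname in SKIPPED:
            continue
        # Always start from the full structure so only one residue is truncated.
        areas = sasa(truncate(pdb_lines, chain, resseq))
        for (c, r, name), area in areas.items():
            if (c, r) == (chain, resseq) and name in GAMMA and area > cutoff:
                exposed[(c, r, name)] = area
    return exposed
```

Passing the untruncated `pdb_lines` to `truncate` on every iteration mirrors the re-reading of the full PDB file described above, so the area recorded for each gamma atom reflects a single-site mutation.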


Fig. 1 (A) All surface exposed gamma atoms (>20 Å²); (B) All surface exposed gamma atom pairs between 16 and 17 Å apart (joined by black lines); (C) All surface exposed gamma atom pairs between 16 and 17 Å apart after internal atoms correction.

 
After carrying out the surface exposed area calculations, sGAL requests user input of the desired maximum and minimum distances between gamma atoms. A distance range of at least 3 Å is recommended if one wishes to account for all possible combinations of Cys rotamers that may occur experimentally. If the cross-linker to be used is flexible, a broader range may be required. Pairwise distances are calculated using the Cartesian coordinates of all gamma atoms that meet the criterion for being surface exposed (e.g. >20 Å²; user selected). For each pairwise distance that falls between the minimum and maximum distances specified, sGAL gives the coordinates of the pair together with the distance between the atoms. Using a distance range from 16 to 17 Å with the coordinates from 1SHF and >20 Å² exposed surface area, we find the gamma atom pairs shown in Figure 1B.
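The distance filter amounts to a pairwise Euclidean distance test over the exposed gamma atoms. A minimal sketch, assuming the exposed atoms are held in a dictionary from a label to Cartesian coordinates (the function name `pairs_in_range`, the labels and the coordinates in the usage note are illustrative, not sGAL output):

```python
from itertools import combinations
from math import dist


def pairs_in_range(gamma_atoms, d_min, d_max):
    """Return [(label_a, label_b, distance)] for pairs with d_min <= distance <= d_max.

    gamma_atoms maps a label (e.g. "A:97:CG") to an (x, y, z) tuple in angstroms.
    """
    hits = []
    for (a, pa), (b, pb) in combinations(gamma_atoms.items(), 2):
        d = dist(pa, pb)  # Euclidean distance between the two gamma atoms
        if d_min <= d <= d_max:
            hits.append((a, b, d))
    return hits
```

For example, `pairs_in_range({"s1": (0.0, 0.0, 0.0), "s2": (16.5, 0.0, 0.0)}, 16.0, 17.0)` keeps the single pair, while narrowing the range to 10 to 12 Å returns an empty list.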

sGAL performs a further check to evaluate whether installing a cross-linker linking the pair of gamma atoms in question would be likely to lead to steric clashes with the rest of the protein. To check for steric clashes, the distance between all protein atoms and the line connecting the selected gamma atom pair is calculated. For each gamma atom pair, the number of internal atoms (with zero exposed surface area) that are near (<3 Å) the line is calculated. The user inputs a tolerance for this value (e.g. exclude gamma atom pairs with >9 internal atoms near the line linking them). sGAL then outputs a list of gamma atom pairs that satisfy the criteria. For example, using the Fyn SH3 domain, a distance range from 16 to 17 Å, and excluding pairs with >9 internal atoms, we get the pairs shown in Figure 1C. By varying the number of excluded atoms and the value for surface exposure, one can select only highly exposed pairs that would be compatible with a sterically bulky cross-linker, or more highly buried pairs.
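The steric check reduces to a point-to-line distance calculation. The sketch below is one way to write it and is not taken from the sGAL source; it assumes that "the line" means the finite segment between the two gamma atoms (distances are clamped to the segment ends), and the function names are hypothetical.

```python
from math import dist


def point_segment_distance(p, a, b):
    """Shortest distance from point p to the segment joining a and b."""
    ab = [bi - ai for ai, bi in zip(a, b)]
    ap = [pi - ai for ai, pi in zip(a, p)]
    denom = sum(c * c for c in ab)
    # Projection parameter along the segment, clamped to [0, 1].
    t = 0.0 if denom == 0 else sum(x * y for x, y in zip(ap, ab)) / denom
    t = max(0.0, min(1.0, t))
    closest = [ai + t * c for ai, c in zip(a, ab)]
    return dist(p, closest)


def count_internal_near_line(a, b, atoms, radius=3.0):
    """Count atoms with zero exposed area lying closer than radius to segment a-b.

    atoms is an iterable of ((x, y, z), exposed_area) tuples.
    """
    return sum(
        1
        for xyz, area in atoms
        if area == 0.0 and point_segment_distance(xyz, a, b) < radius
    )


def passes_clash_check(a, b, atoms, max_internal=9, radius=3.0):
    """Keep the pair unless more than max_internal buried atoms are near the line."""
    return count_internal_near_line(a, b, atoms, radius) <= max_internal
```

The defaults of 3 Å and 9 atoms follow the example values in the text; both are user-selected in sGAL.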

We have used sGAL as an aid for choosing sites suitable for introducing cross-linkers in a variety of proteins. In the example above, sGAL predicts that a 16–17 Å long cross-linker will link Cys residues installed at positions 97 and 116 in the Fyn SH3 protein; we have confirmed this experimentally. Furthermore, by adding the length of an extended Lys side chain to the cross-linker length one inputs in sGAL, Lys-Lys or Cys-Lys cross-linking sites can be identified. In this manner, sGAL can be used to predict the Lys67-Cys316 cross-link produced when rhodopsin is treated with chemical cross-linkers (Jacobsen et al., 2006) and the βLys82-β'Lys82 cross-link produced when hemoglobin is treated (Kluger et al., 2004). The sGAL output is meant to be used as a guide so that the protein engineer can take the residues identified by sGAL, mutate them to Cys using a standard viewer and visually inspect the proposed cross-linking site. Sites may be deemed unsuitable because, for instance, one residue is part of a highly mobile loop or disordered terminus. The power of sGAL resides in the fact that the user can be sure that all sites meeting the chosen criteria have been identified before making a choice about which ones to pursue experimentally.

sGAL is written as a Java applet using j2sdk1.4.2_01 and uses Visual Basic Script to automate Surface Racer. The Surface Racer executable is available from http://monte.biochem.wisc.edu/~tsodikov/surface.html. sGAL requires Surface Racer 3.0 to be installed in the same directory in order to run. The Java jar file is signed to accommodate the Java web security framework. sGAL was developed on Windows XP with Internet Explorer and Mozilla Firefox. The executable and source code are available from http://www.chem.utoronto.ca/staff/GAW/links.html.


    Acknowledgments
 
This work has been supported by the Canadian Institutes of Health Research Training Program in Protein Folding and by the Natural Sciences and Engineering Research Council (Canada).

Conflict of Interest: none declared.


    FOOTNOTES
 
Associate Editor: Anna Tramontano

Received on August 18, 2006; revised on September 25, 2006; accepted on October 10, 2006

    REFERENCES

    Berman, H.M., et al. (2000) The Protein Data Bank. Nucleic Acids Res., 28, 235–242.

    Burns, D.C., et al. (2004) Origins of helix-coil switching in a light-sensitive peptide. Biochemistry, 43, 15329–15338.

    Guex, N., et al. (1997) SWISS-MODEL and the Swiss-PdbViewer: an environment for comparative protein modeling. Electrophoresis, 18, 2714–2723.

    Jacobsen, R.B., et al. (2006) Structure and dynamics of dark-state bovine rhodopsin revealed by chemical cross-linking and high-resolution mass spectrometry. Protein Sci., 15, 1303–1317.

    Kluger, R., et al. (2004) Chemical cross-linking and protein-protein interactions—a review with illustrative protocols. Bioorg. Chem., 32, 451–472.

    Lee, B., et al. (1971) The interpretation of protein structures: estimation of static accessibility. J. Mol. Biol., 55, 379–400.

    Roach, T.A., et al. (2004) A novel site-directed affinity reagent for cross-linking human hemoglobin: bis[2-(4-phosphonooxyphenoxy)carbonylethyl]phosphinic acid. J. Med. Chem., 47, 5847–5859.

    Tsodikov, O., et al. (2002) A novel computer program for fast exact calculation of accessible and molecular surface areas and average surface curvature. J. Comput. Chem., 23, 600–609.

    Woolley, G.A. (2005) Photocontrolling peptide alpha helices. Acc. Chem. Res., 38, 486–493.

