Bioinformatics Advance Access originally published online on April 12, 2005
Bioinformatics 2005 21(12):2856-2860; doi:10.1093/bioinformatics/bti444

© The Author 2005. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions{at}oupjournals.org

Three-dimensional computation of atom depth in complex molecular structures

Daniele Varrazzo 1, Andrea Bernini 1,2, Ottavia Spiga 1,2, Arianna Ciutti 2, Stefano Chiellini 2, Vincenzo Venditti 1, Luisa Bracci 1 and Neri Niccolai 1,2,*

1Biomolecular Structure Research Center and Department of Molecular Biology, Università di Siena I-53100 Siena, Italy
2SienaBioGrafix Srl I-53100 Siena, Italy

*To whom correspondence should be addressed.


    Abstract
 TOP
 Abstract
 INTRODUCTION
 ALGORITHM
 IMPLEMENTATION
 RESULTS AND DISCUSSION
 REFERENCES
 

Motivation: For a complex molecular system, the delineation of atom–atom contacts, exposed surface and binding sites is a fundamental step in predicting its interactions with solvent, ligands and other molecules. Recently, atom depth has also been considered as an additional structural descriptor to correlate protein structure with folding and functional properties. The distance between an atom and the nearest water molecule or the closest surface dot has been proposed as a measure of atom depth but, in both cases, the 3D character of depth is largely lost. In the present study, a new approach is proposed to calculate atom depths in a way that takes the molecular shape into account.

Results: An algorithm has been developed to calculate intersections between the molecular volume and spheres centered on the atoms whose depth has to be quantified. Many proteins of different size and shape have been chosen to compare the results obtained from distance-based and volume-based depth calculations. From the wealth of experimental data available for hen egg white lysozyme, H/D exchange rates and TEMPOL-induced paramagnetic perturbations have been analyzed in terms of both depth indexes and atom distances to the solvent accessible surface. The algorithm proposed here yields better correlations between experimental data and atom depth, particularly for atoms located near the protein surface.

Availability: Instructions to obtain source code and the executable program are available either from http://sienabiografix.com or http://sadic.sourceforge.net

Contact: niccolai{at}unisi.it

Supplementary information: http://www.sienabiografix.com/publication


    INTRODUCTION
Structural biology is growing rapidly, owing to the synergistic post-genomic effect of major developments in X-ray crystallography, nuclear magnetic resonance (NMR) and bioinformatics. In the Protein Data Bank (PDB) (Berman et al., 2000), a wealth of resolved and predicted protein structures is available and, on this basis, structural descriptors have been developed to correlate accessible molecular surface (Lee and Richards, 1971; Richmond, 1984; Quillin and Matthews, 2000; Totrov and Abagyan, 1996; Gerstein et al., 1995), molecular volumes (Richmond, 1984) and potential binding sites (Lo Conte et al., 1999; Shulman-Peleg et al., 2004; Tsuchiya et al., 2004; Innis et al., 2004; Gutteridge et al., 2003) with functional features, protein folding and structural stability (Serrano et al., 1992).

Recently, the calculation of atom depth from the protein surface has been proposed as an additional criterion to define protein structures more accurately (Pintar et al., 2003b; Chakravarty and Varadarajan, 1999) by exploring the interior of the molecule, as the strength of van der Waals (VdW) and electrostatic interactions might be dependent on the distance from the molecular surface (Chakravarty and Varadarajan, 1999; Richards, 1977). Moreover, once the deepest molecular moieties can be defined, a systematic analysis of their properties can be carried out to gain information on molecular structure and stability.

Atom depth can be defined as the distance between an atom and the nearest surface water molecule, either experimentally defined or hypothetically present. However, by evaluating the closest distance between an atom and a dot of the solvent accessible surface (Chakravarty and Varadarajan, 1999) or the distance between an atom and its closest solvent accessible neighbor (Pintar et al., 2003a), contributions from the 3D molecular shape to the actual atom depth are largely lost.

Hence, to estimate atom depth, a new approach reflecting the molecular shape is proposed here: it measures intersections between the molecular volume and a sphere of suitable radius, centered on the atom whose depth has to be quantified. It is apparent, indeed, that the smaller the exposed volume, the deeper the 3D insertion of the investigated atom inside the molecular structure. Since, in general, depth is a relative quantity that depends on the overall size and shape of the object under discussion, an atom depth index is suggested as a more appropriate parameter to discuss atom insertion within each investigated molecular system. These depth indexes, calculated by using the SADIC (Simple Atom Depth Index Calculator) algorithm, could, for instance, constitute a new rational basis to reanalyze inner and outer amino acid compositions of proteins or to improve the analysis of depth-related physical phenomena. Among the latter, H/D isotopic exchange of protein amide hydrogens is particularly relevant, being commonly related to molecular surface exposure. Exchange rates are very frequently determined from NMR (Roder et al., 1985) or mass spectrometry (Miranker et al., 1996) studies to explore protein conformations and dynamics. It has also been shown that NMR studies of through-space interactions, occurring between paramagnetic probes and protein nuclei, can be interpreted in terms of protein surface exposure (Niccolai et al., 2001, 2003; Pintacuda and Otting, 2002).

In the present report, volume-based and distance-based atom depths have been evaluated and compared for proteins of different size and shape. Calculated depths have also been correlated with H/D exchange and paramagnetic perturbation data available for hen egg white lysozyme (HEWL).


    ALGORITHM
The SADIC algorithm is based on the simple idea of sampling the space around each atom of a given molecule by evaluating, for selected distances from the atom center, the portion of volume that is external to any protein atom. In other words, such volume, henceforth called the exposed volume, represents the space external to the molecular surface comprised within a distance r in all directions around the atom. Therefore, the size of the exposed volume is a direct measure of atom depth with respect to the molecular surface: the smaller the exposed volume, the deeper the atom within the molecular structure. When dealing with exposed volumes instead of linear distances, as previously proposed for depth calculations (Chakravarty and Varadarajan, 1999; Pintar et al., 2003b), the information on surface shape is taken into account. The SADIC algorithm can yield an accurate indication of atom depth, since distances from the atom center to the solvent exposed surface are simultaneously evaluated in all directions. It follows that atoms located in protruding loops have exposed volumes greater than those exhibited by atoms which are equally close to the surface but located at the bottom of a pocket.

In principle, this algorithm can be used to analyze local depth for objects having any size and shape, provided that only an ‘inside’ and an ‘outside’ can be unambiguously assigned. The 3D model of a molecule, as an assembly of sphere shaped atoms, satisfies this requirement, since all the points located inside one of these spheres, i.e. closer to an atom center than the VdW atom radius, can be considered ‘inside’ the molecule.

In order to approximate the volume of the intersection between the molecule and a sphere with a given radius r and center C, the sphere interior is split into units whose volume is known. For each volume unit a representative point is taken: we approximate the exposed volume by testing the sampling points against the molecule and summing the volume relative to all the points outside the molecular model.
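The sampling idea in the preceding paragraph can be illustrated with a minimal Python sketch. It is not the SADIC implementation: it assumes uniform random sampling (so every sample represents an equal share of the sphere volume) rather than the deterministic pattern used by SADIC, and the function names, the `(x, y, z, radius)` atom tuples and the sample count are all hypothetical choices made for illustration.

```python
import math
import random

def inside_molecule(p, atoms):
    """True if point p lies inside the VdW sphere of any atom (x, y, z, radius)."""
    for x, y, z, radius in atoms:
        if (p[0] - x) ** 2 + (p[1] - y) ** 2 + (p[2] - z) ** 2 < radius * radius:
            return True
    return False

def exposed_volume(center, r, atoms, n_samples=20000, seed=0):
    """Estimate the part of the sphere (center, r) lying outside every atom.

    Assumption: uniform random samples, each standing for an equal share of
    the sphere volume; SADIC itself uses a deterministic sampling pattern.
    """
    rng = random.Random(seed)
    sphere_volume = 4.0 / 3.0 * math.pi * r ** 3
    outside = 0
    for _ in range(n_samples):
        # Rejection sampling: draw in the bounding cube, keep points in the sphere.
        while True:
            d = (rng.uniform(-r, r), rng.uniform(-r, r), rng.uniform(-r, r))
            if d[0] ** 2 + d[1] ** 2 + d[2] ** 2 <= r * r:
                break
        p = (center[0] + d[0], center[1] + d[1], center[2] + d[2])
        if not inside_molecule(p, atoms):
            outside += 1
    # Sum of the volume shares of all sampling points outside the model.
    return sphere_volume * outside / n_samples
```

For a single isolated atom the estimate should approach the analytic value 4/3·π·(r³ − R³), where R is the atom radius, which gives a simple sanity check of the sampling.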

The choice of r is of critical importance, since too small or too large r values would yield null or large exposed volumes, respectively, to a very similar extent for all the atoms of the investigated molecule. It should be noted that the values obtained by sampling inside a sphere of radius r can also be used to calculate the exposure for any sphere of radius r' ≤ r. To better exploit this possibility, sampling points are chosen over concentric spheres with growing radii r0, …, rn = r.
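The reuse of one sampling run for every smaller radius can be sketched as follows. This is an illustrative assumption, not SADIC's pattern: each shell between consecutive radii is sampled at its mid-radius with a fixed number of golden-spiral directions, each shell's samples are weighted by the shell volume, and the running sum gives the exposed volume within each r_k. All names and the `(x, y, z, radius)` atom tuples are hypothetical.

```python
import math

def fibonacci_directions(n):
    """Return n unit vectors spread roughly evenly over the sphere."""
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    directions = []
    for k in range(n):
        z = 1.0 - (2.0 * k + 1.0) / n          # evenly spaced heights in (-1, 1)
        rho = math.sqrt(1.0 - z * z)
        phi = golden_angle * k
        directions.append((rho * math.cos(phi), rho * math.sin(phi), z))
    return directions

def is_outside(p, atoms):
    """True if p is outside the VdW sphere of every atom (x, y, z, radius)."""
    return all((p[0] - x) ** 2 + (p[1] - y) ** 2 + (p[2] - z) ** 2 >= rad * rad
               for x, y, z, rad in atoms)

def exposed_volume_profile(center, radii, atoms, points_per_shell=400):
    """Return [(r_k, exposed volume within r_k)] for increasing radii r_k."""
    directions = fibonacci_directions(points_per_shell)
    profile, total, inner = [], 0.0, 0.0
    for outer in radii:
        mid = 0.5 * (inner + outer)            # representative radius of the shell
        shell_volume = 4.0 / 3.0 * math.pi * (outer ** 3 - inner ** 3)
        outside = sum(
            is_outside((center[0] + mid * dx, center[1] + mid * dy,
                        center[2] + mid * dz), atoms)
            for dx, dy, dz in directions)
        total += shell_volume * outside / points_per_shell
        profile.append((outer, total))         # cumulative: valid for any r' = r_k
        inner = outer
    return profile
```

A single pass up to the largest radius therefore yields the whole curve of exposed volume against r, which is what is needed to inspect how the depth index evolves with the sphere radius.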

A simplistic pattern to sample a sphere interior consists of a regular grid in spherical coordinates. This method has the drawback of producing the same number of samples at each radius, thus yielding more densely packed points toward the poles and the center, and coarser points toward the equator and the outside. In order to overcome this problem, a different sampling pattern is used by SADIC, as described in detail in the Supplementary information section.
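The drawback of the regular spherical grid can be made concrete by computing the volume each grid cell represents. The sketch below only quantifies the unevenness and does not reproduce the SADIC sampling pattern; the grid resolution and function names are hypothetical, and the 9 Å radius is used simply because it is the value discussed later for HEWL.

```python
import math

def cell_volume(r_lo, r_hi, theta_lo, theta_hi, dphi):
    """Exact volume of one cell of a spherical-coordinate grid."""
    return ((r_hi ** 3 - r_lo ** 3) / 3.0
            * (math.cos(theta_lo) - math.cos(theta_hi)) * dphi)

def grid_cell_volumes(radius, n_r, n_theta, n_phi):
    """Cell volumes of a regular (r, theta, phi) grid, indexed [radial][polar].

    Cells differing only in phi have the same volume, so one value is stored
    per (radial, polar) pair.
    """
    dr = radius / n_r
    dtheta = math.pi / n_theta
    dphi = 2.0 * math.pi / n_phi
    return [[cell_volume(i * dr, (i + 1) * dr, j * dtheta, (j + 1) * dtheta, dphi)
             for j in range(n_theta)]
            for i in range(n_r)]

volumes = grid_cell_volumes(radius=9.0, n_r=9, n_theta=18, n_phi=36)
innermost_polar = volumes[0][0]
outermost_equatorial = volumes[-1][len(volumes[0]) // 2]
# Ratio between the volume represented by the coarsest and the finest sample.
print(outermost_equatorial / innermost_polar)
```

With one sample per cell, a sample near the outer equator stands for a volume thousands of times larger than one near the center and the pole, which is the imbalance a better sampling pattern has to remove.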


    IMPLEMENTATION
The current SADIC implementation is written in the Python and C programming languages (see Supplementary information). The program consists of an object-oriented library providing classes responsible for generating sampling patterns, modeling solid objects (they can be subclassed to add new capabilities to the framework) and parsing PDB (Berman et al., 2000) entity files. An executable with a command line interface is provided: the program can read PDB entity files either from a local file system or from an external database through its URL (http, ftp and file protocols are supported) or its PDB ID code. The user can perform a molecule sampling on a list of given points in space or on a selection of entity atoms, either referred to absolutely by serial number or selected by atom name (e.g. using ‘CA’ to refer to the protein backbone α-carbon), residue number or chain identifier. If the entity file contains more than one structure, as is the case for NMR-determined structures, the sampling can be performed separately on a selection of structures. In this case, the average and SD of the results may be automatically calculated.
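The kind of atom selection described above can be sketched in a few lines of Python. This is not the SADIC library or its command line interface, whose class and option names are not given here; `parse_atoms` and `select_atoms` are hypothetical helpers that rely only on the fixed-column layout of PDB ATOM/HETATM records.

```python
def parse_atoms(pdb_text):
    """Parse ATOM/HETATM records using the fixed-column PDB layout."""
    atoms = []
    for line in pdb_text.splitlines():
        if not line.startswith(("ATOM  ", "HETATM")):
            continue
        atoms.append({
            "serial": int(line[6:11]),       # columns 7-11
            "name": line[12:16].strip(),     # columns 13-16, e.g. "CA"
            "resname": line[17:20].strip(),  # columns 18-20
            "chain": line[21],               # column 22
            "resseq": int(line[22:26]),      # columns 23-26
            "xyz": (float(line[30:38]),      # columns 31-54
                    float(line[38:46]),
                    float(line[46:54])),
        })
    return atoms

def select_atoms(atoms, name=None, chain=None, resseq=None):
    """Keep the atoms matching every criterion that is not None."""
    return [a for a in atoms
            if (name is None or a["name"] == name)
            and (chain is None or a["chain"] == chain)
            and (resseq is None or a["resseq"] == resseq)]
```

For example, `select_atoms(parse_atoms(text), name="CA", chain="A")` would return the backbone α-carbons of chain A, whose coordinates can then be used as sphere centers for the sampling.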


    RESULTS AND DISCUSSION
The SADIC program has been developed as a new tool for the structural characterization of complex molecules, such as proteins and nucleic acids, by considering atom depth. Since depth is a characteristic which can be meaningfully discussed only in relation to the size of each investigated system, SADIC outputs are more conveniently analyzed as atom depth indexes, D, rather than absolute exposed volumes.

Thus, for an atom i of a given molecule and a sampling radius r, a depth index Di,r may be defined as

$$D_{i,r} = \frac{2\,V_{i,r}}{V_{0,r}} \qquad (1)$$

where Vi,r is the exposed volume of a sphere of radius r centered on atom i and V0,r is the exposed volume of the same sphere when centered on an isolated atom.
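A small Python sketch of the depth index follows. It assumes the index is twice the ratio Vi,r/V0,r, a normalization consistent with the range of values (0 to 2) reported for the index in this section, and it assumes that V0,r can be computed analytically as the volume of the sampling sphere minus the volume of a single atom of VdW radius R. Both function names and the analytic shortcut are illustrative assumptions, not taken from the SADIC source.

```python
import math

def isolated_exposed_volume(r, vdw_radius):
    """Exposed volume V0,r of a sphere of radius r centered on an isolated atom.

    Assumption: the atom is a single sphere of radius vdw_radius, so the
    exposed part is the sampling sphere minus the atom itself.
    """
    if r <= vdw_radius:
        return 0.0
    return 4.0 / 3.0 * math.pi * (r ** 3 - vdw_radius ** 3)

def depth_index(v_exposed, r, vdw_radius):
    """Depth index: 0 for a fully buried atom, 2 for an isolated one."""
    v0 = isolated_exposed_volume(r, vdw_radius)
    if v0 == 0.0:
        raise ValueError("sampling radius must exceed the VdW radius")
    return 2.0 * v_exposed / v0
```

With this normalization, an atom lying on an ideally flat surface, for which about half of the sampling sphere is exposed, has an index close to 1; larger values indicate a convex surrounding.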

As already pointed out in the Algorithm section, the selection of the r value is a critical step to avoid flattening of the algorithm outputs towards similar Di,r values. Figure 1 shows the evolution of Di,r for a representative selection of HEWL Cα carbons: the Thr47 and Ile58 Cα atoms are equally close to the solvent exposed surface, but lie in convex and concave molecular regions, respectively. The Trp28 and Ser50 Cα atoms are, instead, each deeply inserted in one of the two HEWL domains. For small and large r values, all Di,r values converge to 0 and 2, respectively, and the atom depth index calculated by SADIC loses its structural information. Conversely, in an intermediate region of r values, centered for HEWL at ~9 Å, a large dispersion of Di,r values can be observed. Thus, to analyze atom depths conveniently, it seems appropriate to choose the largest sphere radius for which the condition Dn,r = 0 holds for only one atom n, the innermost one. In the case of HEWL this condition is met by the Trp28 Cα carbon, which is therefore the most internal atom, at an r value of 9 Å.
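The radius-selection rule stated above (the largest r at which exactly one atom still has a zero depth index) can be sketched as a simple scan over precomputed exposed volumes. The function name, the dictionary layout and the tolerance are hypothetical; how SADIC itself selects or reports the radius is not described here.

```python
def pick_radius(exposed_by_radius, tolerance=0.0):
    """Return (r, atom_id) for the largest r at which exactly one atom is buried.

    exposed_by_radius maps each sampling radius to {atom_id: exposed volume}.
    An atom counts as buried when its exposed volume is <= tolerance, i.e.
    when its depth index is 0. Returns None if no radius qualifies.
    """
    best = None
    for r in sorted(exposed_by_radius):
        buried = [atom for atom, volume in exposed_by_radius[r].items()
                  if volume <= tolerance]
        if len(buried) == 1:
            best = (r, buried[0])   # later (larger) radii overwrite earlier ones
    return best
```

The input table could be assembled from one exposed-volume-versus-radius profile per atom, so a single sampling run up to the largest candidate radius would be enough to apply the rule.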



Fig. 1 The evolution of Di,r values with the sphere radius r calculated for selected backbone Cα carbons of HEWL (Trp28: crosses, Thr47: squares, Ser50: filled circles and Ile58: triangles).

 
Thus, once a suitable sphere radius has been chosen, the calculated Di,r values readily describe the topology of each atom, as values close to 0 or 1 identify inner or outer atoms, respectively (Fig. 2). Furthermore, the Di,r > 1 condition defines atoms which are very close to a convex molecular surface, as in the case of the HEWL α carbons of Thr47, Asp48 and Gly117, whose Di,9 values are 1.22, 1.10 and 1.15, respectively.



Fig. 2 Space fill representation of lysozyme (pdb file ID code 4lzt); the enzyme is halved into two complementary moieties to show some of the inner heavy atoms colored according to their Di,9 values.

 
The validity of the proposed algorithm has been tested on many proteins by comparing SADIC outputs with distance-based atom depths. Thus, as shown in Figure 3, the Di,r values of a small spherical protein and of a large oblate one are compared with atom distances calculated from the closest exposed neighbor, dpxi (Pintar et al., 2003b), and from the nearest surface water molecule, dnwi (Chakravarty and Varadarajan, 1999). A good agreement exists among the different sets of data, as the Cα carbons exhibiting the highest Di,r values correspond to the shortest dpxi and dnwi values. Conversely, for the Cα carbons having the longest dpxi and dnwi values, Di,r values close to 0 are found. It is also evident that SADIC describes atom depth in greater detail, particularly for atoms located near the protein surface. This feature derives directly from the fact that only the Di,r values depend on both surface distance and molecular shape, so that atoms equally distant from the surface, but close to concave or convex surface regions, exhibit very different depth indexes.



Fig. 3 Comparison of the depth index of atom i, Di,r, with the distance between atom i and its closest solvent accessible neighbor, dpxi, or the nearest surface water molecule, dnwi, along the protein sequence positions. These different kinds of atom depth, calculated by using SADIC, the remote server http://hydra.icgeb.trieste.it/dpx/ and the software package MolMol (Koradi et al., 1996), are shown against the sequence positions of the backbone Cα of (a) human neutrophil collagenase, a small spherical protein (pdb ID code 1BZS), and (b) acetylcholinesterase, a large oblate protein (pdb ID code 1EEA). Protein shapes have been classified on the basis of the corresponding moments of inertia, as estimated with the program EdPDB (Zhang and Brian, 1995).

 
To check how Di,r values can be useful in the structural interpretation of experimental data, reported H/D exchange rates of HEWL amide protons (Pedersen et al., 1993) and paramagnetic perturbations of NMR signals (Niccolai et al., 2003) have been analyzed in terms of atom depth. As shown in Figure 4, the Di,9, dpxi and dnwi values correlate similarly with H/D exchange rates, Kexi, and paramagnetic signal attenuations, Ai, only in the case of the innermost HEWL atoms. For the outer ones, the Kexi and Ai values are, indeed, all grouped in a very narrow range of both dpxi and dnwi values. By simple inspection of Figure 4, it is apparent that the Di,9 values generally exhibit a higher correlation with the experimental data than distance-based depths, as all the slowest exchange rates of HEWL amide hydrogens are found for atoms with Di,9 < 0.6. It should be noted that for the latter amide hydrogens a large variety of distance-based atom depths is derived. Furthermore, the overlap of the experimentally derived parameters observed at the closest surface distances is largely resolved, while the slow exchange rates measured for the Ala10, Phe34 and Leu83 amide groups are more consistent with the corresponding Di,9 values.



Fig. 4 Correlations of TEMPOL induced paramagnetic attenuations, Ai, of HEWL 13C-1H HSQC NMR signals of Cα carbons with (a) Di,9, (b) dpxi and (c) dnwi. Correlations between Kexi of HEWL backbone amide protons and (d) Di,9, (e) dpxi and (f) dnwi. Identical dpxi and dnwi values are exhibited by the most exposed atoms: the latter atoms are highlighted with triangles in the Di,9 plot. Labels refer to amino acid residue positions in the protein sequence.

 
A linear or higher level dependence of Di,9 on the HEWL Kexi and Ai values cannot be delineated, as fast H/D exchange rates were measured for the deeply inserted Thr40 and Cys94 amide hydrogens, in spite of their close proximity to a concave surface. Moreover, a small Ai value is exhibited by the surface exposed Asp48 Cα carbon, while for the buried Ile98 a strong paramagnetic attenuation is observed. These four cases represent the most evident discrepancies, but many other anomalous behaviors of Kexi and Ai versus both volume-based and distance-based depth can be seen in the data shown in Figure 4. In this respect, it should be stressed that both experimental parameters depend on atom depth only to a first approximation and that a more detailed discussion of exchange rates and paramagnetic perturbations in terms of atom depth would be needed. The H/D exchange process, determined by the dynamics of the hydrogen bond network within the protein and its hydration shell, is commonly related to solvent accessibility. The fact that SADIC outputs are more consistent with the Kexi values than atom depths obtained from distance-based calculations suggests that a step forward in the structural interpretation of the latter experimental parameter might be achieved. On the other hand, the weaker correlation observed between paramagnetic perturbations induced by TEMPOL and atom depths confirms that complex dynamics control the interaction of protein surfaces with paramagnetic probes (Niccolai et al., 2003). On the basis of the B factors reported in the crystal structure of HEWL with PDB (Berman et al., 2000) ID code 4lzt, it is apparent that local structural flexibility is not responsible for the limited correlation between atom depth and accessibility dependent experimental data.

It can be concluded that the use of the SADIC algorithm might favor improved depth-oriented discussions of experimental data, possibly enhancing our understanding of the structural stability and dynamics of complex molecules.


    Acknowledgments
 
Thanks are due to grants from the Italian Ministry of University PRIN03-059395 and from the University of Siena (PAR 2002). Special thanks are also due to Francesco Niccolai for technical assistance.

Received on February 9, 2005; revised on March 30, 2005; accepted on April 7, 2005

    REFERENCES

    Berman, H.M., et al. (2000) The Protein Data Bank. Nucleic Acids Res., 28, 235–242.

    Chakravarty, S. and Varadarajan, R. (1999) Residue depth: a novel parameter for the analysis of protein structure and stability. Structure Fold. Des., 7, 723–732.

    Gerstein, M., et al. (1995) The volume of atoms on the protein surface: calculated from simulation, using Voronoi polyhedra. J. Mol. Biol., 249, 955–966.

    Gutteridge, A., et al. (2003) Using a neural network and spatial clustering to predict the location of active sites in enzymes. J. Mol. Biol., 330, 719–734.

    Innis, C.A., et al. (2004) Prediction of functional sites in proteins using conserved functional group analysis. J. Mol. Biol., 337, 1053–1068.

    Koradi, R., et al. (1996) MOLMOL: a program for display and analysis of macromolecular structures. J. Mol. Graph., 14, 51–55, 29–32.

    Lee, B. and Richards, F.M. (1971) The interpretation of protein structures: estimation of static accessibility. J. Mol. Biol., 55, 379–400.

    Lo Conte, L., et al. (1999) The atomic structure of protein–protein recognition sites. J. Mol. Biol., 285, 2177–2198.

    Miranker, A., et al. (1996) Investigation of protein folding by mass spectrometry. FASEB J., 10, 93–101.

    Niccolai, N., et al. (2001) NMR studies of protein surface accessibility. J. Biol. Chem., 276, 42455–42461.

    Niccolai, N., et al. (2003) NMR studies of protein hydration and TEMPOL accessibility. J. Mol. Biol., 332, 437–447.

    Pedersen, T.G., et al. (1993) Determination of the rate constants k1 and k2 of the Linderstrom–Lang model for protein amide hydrogen exchange. A study of the individual amides in hen egg-white lysozyme. J. Mol. Biol., 230, 651–660.

    Pintacuda, G. and Otting, G. (2002) Identification of protein surfaces by NMR measurements with a paramagnetic Gd(III) chelate. J. Am. Chem. Soc., 124, 372–373.

    Pintar, A., et al. (2003a) Atom depth as a descriptor of the protein interior. Biophys. J., 84, 2553–2561.

    Pintar, A., et al. (2003b) DPX: for the analysis of the protein core. Bioinformatics, 19, 313–314.

    Quillin, M.L. and Matthews, B.W. (2000) Accurate calculation of the density of proteins. Acta Crystallogr. D Biol. Crystallogr., 56, 791–794.

    Richards, F.M. (1977) Areas, volumes, packing and protein structure. Annu. Rev. Biophys. Bioeng., 6, 151–176.

    Richmond, T.J. (1984) Solvent accessible surface area and excluded volume in proteins. Analytical equations for overlapping spheres and implications for the hydrophobic effect. J. Mol. Biol., 178, 63–89.

    Roder, H., et al. (1985) Individual amide proton exchange rates in thermally unfolded basic pancreatic trypsin inhibitor. Biochemistry, 24, 7407–7411.

    Serrano, L., et al. (1992) The folding of an enzyme. II. Substructure of barnase and the contribution of different interactions to protein stability. J. Mol. Biol., 224, 783–804.

    Shulman-Peleg, A., et al. (2004) Recognition of functional sites in protein structures. J. Mol. Biol., 339, 607–633.

    Totrov, M. and Abagyan, R. (1996) The contour-buildup algorithm to calculate the analytical molecular surface. J. Struct. Biol., 116, 138–143.

    Tsuchiya, Y., et al. (2004) Structure-based prediction of DNA-binding sites on proteins using the empirical preference of electrostatic potential and the shape of molecular surfaces. Proteins, 55, 885–894.

    Zhang, X. and Brian, B.W. (1995) EdPDB: a multifunctional tool for protein structure analysis. J. Appl. Cryst., 28, 624–630.





