Anisotropic fluctuations of amino acids in protein structures: insights from X-ray crystallography and elastic network models
1Department of Computational Biology, School of Medicine, University of Pittsburgh. Suite 3064, Biomedical Science Tower 3, 3051 Fifth Ave., Pittsburgh, PA 15213, USA and 2Institute of Molecular and Cellular Biosciences, University of Tokyo, R107, 1-1-1 Yayoi, Bunkyo-ku, Tokyo 113-0032, Japan
*To whom correspondence should be addressed.
| ABSTRACT |
|---|
Motivation: A common practice in X-ray crystallographic structure refinement has been to model atomic displacements or thermal fluctuations as isotropic motions. Recent high-resolution data reveal, however, significant departures from isotropy, described by anisotropic displacement parameters (ADPs) modeled for individual atoms. Yet ADPs are currently reported for only a limited set of structures.
Results: We present a comparative analysis of the experimentally reported ADPs and those theoretically predicted by the anisotropic network model (ANM) for a representative set of structures. The relative sizes of fluctuations along different directions are shown to agree well between experiments and theory, while the cross-correlations between the (x-, y- and z-) components of the fluctuations show considerable deviations. Secondary structure elements and protein cores exhibit more robust anisotropic characteristics compared to disordered or flexible regions. The deviations between experimental and theoretical data are comparable to those between sets of experimental ADPs reported for the same protein in different crystal forms. These results draw attention to the effects of crystal form and refinement procedure on experimental ADPs and highlight the potential utility of ANM calculations for consolidating experimental data or assessing ADPs in the absence of experimental data.
Availability: The ANM server at http://www.ccbb.pitt.edu/anm is upgraded to permit users to compute and visualize the theoretical ADPs for any PDB structure, thus providing insights into the anisotropic motions intrinsically preferred by equilibrium structures.
Contact: bahar{at}ccbb.pitt.edu
Supplementary information: Two Supplementary Material files can be accessed at the journal website. The first presents the tabulated results from computations (Pearson correlations and KL distances with respect to experimental ADPs) reported for each of the 93 proteins in Set I (the averages over all proteins are presented above in Table 3). The second file consists of three sections: (A) detailed derivation of Equation (7), (B) analysis of the effect of ANM parameters on computed ADPs and identification of parameters that achieve optimal correlation with experiments and (C) description of the method for computing the tangential and radial components of equilibrium fluctuations.
| 1 INTRODUCTION AND THEORY |
|---|
1.1 Anisotropic displacement parameters (ADPs) from X-ray crystallography
An essential step in the resolution and refinement of X-ray structures is the determination of model parameters that optimally describe the uncertainties and/or displacements in atomic positions (Willis and Pryor, 1975). These displacements are usually described by Debye-Waller or B-factors, for each atom, assuming the atomic fluctuations about their mean positions to be Gaussianly distributed and isotropic. In cases where sufficiently high-resolution diffraction data are available, on the other hand, a set of six anisotropic displacement parameters (ADPs) have been reported per atom with the structural data deposited in the Protein Data Bank (PDB) (Berman et al., 2000).
The ADPs describe the mean-square displacements of atoms along three directions as well as their cross-correlations (Dunitz et al., 1988; Merritt, 1999). As such, they provide a much more detailed description than the single isotropic parameters, B-factors, assigned to each atom. However, the extraction of such 6-fold more detailed data also requires the determination of the structure at sufficiently high resolution (e.g. higher than 1.2 Å), which has not been readily achievable for biological molecules, or large asymmetrical macromolecules until recently. More than 1600 such structures can now be found in the PDB, compared to only about 30 in 1998 (Merritt, 1999). Yet, these structures currently amount to only 5% of those collected in the PDB.
The ADPs represent the six distinctive elements (three diagonal and three off-diagonal) of the 3 x 3 covariance matrix C associated with the Gaussian probability distribution of atomic fluctuations in space (see Methods section). Apart from describing the atomic fluctuations, the ADPs also convey information on the collective motions of proteins (Rosenfield et al., 1978; Schomaker and Trueblood, 1968). In view of the importance of understanding collective dynamics for inferring biomolecular function, methods for improving the accuracy of ADPs in macromolecular refinements have been developed. A useful approach has been to resort to translation-libration-screw (TLS) model parameters (Painter and Merritt, 2006; Schomaker and Trueblood, 1968), and to represent the structure as a pseudo-rigid body composed of multiple TLS groups, instead of modeling the six parameters for each individual atom separately (Painter and Merritt, 2006; Winn et al., 2001).
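For readers working directly with PDB files, the six ADPs of an atom are stored in its ANISOU record as integers scaled by 10^4 (in Å^2), in the order U11, U22, U33, U12, U13, U23. The following minimal Python sketch assembles them into the symmetric 3 x 3 matrix. The function name `parse_anisou` and the numerical values of the example record are illustrative assumptions and are not taken from this study.

```python
def parse_anisou(record):
    """Return the 3x3 ADP matrix (in Å^2) from one fixed-width PDB ANISOU record."""
    if not record.startswith("ANISOU"):
        raise ValueError("not an ANISOU record")
    # Columns 29-70 hold six 7-character integers (U11, U22, U33, U12, U13, U23),
    # each scaled by 10^4 in the PDB format.
    fields = [int(record[start:start + 7]) for start in range(28, 70, 7)]
    u11, u22, u33, u12, u13, u23 = (value / 1.0e4 for value in fields)
    return [[u11, u12, u13],
            [u12, u22, u23],
            [u13, u23, u33]]


# Hypothetical record, assembled field by field so that the fixed-width
# columns are guaranteed to line up (the values are made up for illustration).
example_record = (
    "ANISOU" + f"{107:5d}" + " " + " N  " + " " + "GLY" + " " + "A"
    + f"{13:4d}" + "  "
    + "".join(f"{u:7d}" for u in (2406, 1892, 1614, 198, 519, -328))
)
adp_matrix = parse_anisou(example_record)
```

The returned matrix is symmetric by construction, so it can be passed directly to any routine that expects a covariance matrix.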
How anisotropic are atomic fluctuations? A simple measure is the mean anisotropy A, i.e. the ratio of the shortest axis to the longest one, when representing the volume swept during atomic fluctuations as an ellipsoid. Examination of about 30 PDB structures for which the ADPs were available in 1999 yielded an average anisotropy A of 0.4–0.5 (Merritt, 1999; Merritt et al., 1998). This is a significant departure from isotropic fluctuations (A = 1). The significance of the anisotropic nature of atomic motions has also been evidenced by the improvement in R factors achieved upon adoption of ADPs (Schneider, 1996; Winn et al., 2001).
Although the anisotropy of atomic fluctuations is recognized, the difficulty of assessing ADPs has meant that B-factors have been widely examined and compared with predictions from theoretical approaches as a measure of the mobility of atoms in folded structures. Agreement with B-factors has indeed been used as a criterion for benchmarking theoretical models, including in particular the elastic network (EN) models used in conjunction with normal mode analysis (NMA) (Eyal et al., 2006; Kundu et al., 2002; Yang et al., 2006). A more stringent approach would, however, be to consider the ADPs. Examination of ADPs and comparison with theoretical predictions is particularly timely and important, given the current availability of a sizeable set of such refined structures in the PDB, and the recent success of EN models for describing functional motions (Alexandrov et al., 2005; Bahar and Rader, 2005; Cui and Bahar, 2006; Krebs et al., 2002; Ma, 2005; Nicolay and Sanejouand, 2006; Tama and Brooks, 2006; Yang et al., 2006).
The article is organized as follows: First we present an analysis of existing experimental data. We considered, to this aim, the reproducibility of the ADP data for the same protein in different crystal forms, or in the same crystal form resolved independently in different experiments, which showed that the parameterized displacements are affected by the crystal geometry. Next, we computed the theoretical predictions based on the anisotropic network model (ANM) (Atilgan et al., 2001; Doruker et al., 2000). The level of agreement between theory and experiments is shown to be comparable to that between the two sets of experimental data for the same proteins under different crystal forms. The theoretical results thus present an effective means of consolidating experimental data when available. We have now implemented our algorithm and its visualization features in the ANM web server, so that users may have access to theoretical ADPs and corresponding graphical interfaces, for any query structure.
1.2 Covariance matrix and ADPs
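The covariance matrix, C, is a 3N x 3N symmetric matrix for a structure of N atoms, the elements of which are 3 x 3 submatrices of the form

$$
\mathbf{C}_{ij} =
\begin{pmatrix}
\langle \Delta X_i \Delta X_j \rangle & \langle \Delta X_i \Delta Y_j \rangle & \langle \Delta X_i \Delta Z_j \rangle \\
\langle \Delta Y_i \Delta X_j \rangle & \langle \Delta Y_i \Delta Y_j \rangle & \langle \Delta Y_i \Delta Z_j \rangle \\
\langle \Delta Z_i \Delta X_j \rangle & \langle \Delta Z_i \Delta Y_j \rangle & \langle \Delta Z_i \Delta Z_j \rangle
\end{pmatrix}
\tag{1}
$$

where $1 \le i, j \le N$; $\Delta X_i$, $\Delta Y_i$ and $\Delta Z_i$ are the components of the fluctuation vector $\Delta\mathbf{R}_i$ associated with the displacements of atom i away from its mean position, and the angular brackets refer to expected (or average) values. The sum over the diagonal elements of $\mathbf{C}_{ij}$, i.e. the trace (tr) of $\mathbf{C}_{ij}$, describes the cross-correlation between the fluctuations of atoms i and j, i.e.

$$
\mathrm{tr}(\mathbf{C}_{ij}) = \langle \Delta X_i \Delta X_j \rangle + \langle \Delta Y_i \Delta Y_j \rangle + \langle \Delta Z_i \Delta Z_j \rangle = \langle \Delta\mathbf{R}_i \cdot \Delta\mathbf{R}_j \rangle
\tag{2}
$$

is the average scalar product of $\Delta\mathbf{R}_i$ and $\Delta\mathbf{R}_j$. The mean-square fluctuations and the cross-correlations between the components of $\Delta\mathbf{R}_i$, on the other hand, are given by the respective diagonal and off-diagonal elements of $\mathbf{C}_{ii}$ [i.e. the submatrices of C where i = j in Equation (1)], such that

$$
\mathbf{C}_{ii} =
\begin{pmatrix}
\langle \Delta X_i^2 \rangle & \langle \Delta X_i \Delta Y_i \rangle & \langle \Delta X_i \Delta Z_i \rangle \\
\langle \Delta X_i \Delta Y_i \rangle & \langle \Delta Y_i^2 \rangle & \langle \Delta Y_i \Delta Z_i \rangle \\
\langle \Delta X_i \Delta Z_i \rangle & \langle \Delta Y_i \Delta Z_i \rangle & \langle \Delta Z_i^2 \rangle
\end{pmatrix}
\tag{3}
$$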
The anisotropic displacements, or the space sampled by a fluctuating atom, can be represented by an ellipsoid, assuming that the fluctuations along each direction follow a Gaussian distribution. The size of the ellipsoids depends on the selected probability/confidence levels, and the relative sizes of the three axes are found by diagonalizing the ADP matrix $\mathbf{C}_{ii}$. The resulting eigenvectors define the three principal axes of the ellipsoids and the eigenvalues ($d_{i1} > d_{i2} > d_{i3}$) describe the mean-square fluctuations (sizes) along these directions. Ellipsoids of axial lengths equal to $\sqrt{d_{i1}}$, $\sqrt{d_{i2}}$ and $\sqrt{d_{i3}}$ enclose the space in which atom i is found with a probability of 0.683 (i.e. one SD away from the mean of the Gaussian distribution). The diagrams presented in this study refer to this size of ellipsoids. Doubling the axes lengths encapsulates 95.4% of the distribution. The ratio between the smallest and largest eigenvalues of $\mathbf{C}_{ii}$ defines the anisotropy (Merritt, 1999; Trueblood et al., 1996), $A_i = d_{i3}/d_{i1}$. $A_i$ ($0 \le A_i \le 1$) serves as a quantitative measure of anisotropy for the displacements of atom i, the lower and upper limits corresponding to planar and isotropic motions, respectively. The size of the atomic fluctuations, on the other hand, is described by the corresponding ellipsoid volume $V_i = 4\pi (d_{i1} d_{i2} d_{i3})^{1/2}/3$.
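The quantities defined in this section can be obtained from a single eigenvalue decomposition. The sketch below, assuming NumPy is available, computes the ordered eigenvalues, the anisotropy (smallest over largest eigenvalue) and the ellipsoid volume for one 3 x 3 ADP matrix. The function name `ellipsoid_descriptors` and the example matrix are hypothetical and serve only as an illustration.

```python
import numpy as np


def ellipsoid_descriptors(adp):
    """Return (eigenvalues in descending order, anisotropy, volume) of a 3x3 ADP matrix."""
    adp = np.asarray(adp, dtype=float)
    # eigvalsh returns the eigenvalues of a symmetric matrix in ascending order.
    d = np.linalg.eigvalsh(adp)[::-1]
    anisotropy = d[2] / d[0]  # smallest over largest eigenvalue
    volume = 4.0 * np.pi * np.sqrt(d[0] * d[1] * d[2]) / 3.0
    return d, anisotropy, volume


# Hypothetical ADP matrix in Å^2 (not taken from any structure in this study).
example_adp = np.array([[0.20, 0.02, 0.00],
                        [0.02, 0.10, 0.01],
                        [0.00, 0.01, 0.05]])
eigenvalues, anisotropy, volume = ellipsoid_descriptors(example_adp)
```

An isotropic matrix gives an anisotropy of exactly 1, whereas a matrix with one vanishing eigenvalue gives 0.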
1.3 Anisotropic network model (ANM)
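The concept of an elastic network (EN) for modeling the equilibrium dynamics of proteins was introduced a decade ago (Bahar et al., 1997; Haliloglu et al., 1997), inspired by an NMA with uniform harmonic potentials performed by Tirion (1996). The applicability of an EN–NMA at residue level was first shown by Hinsen (1998), followed by others (Atilgan et al., 2001; Doruker et al., 2000; Tama and Sanejouand, 2001), as recently reviewed (Bahar and Rader, 2005; Chennubhotla et al., 2005; Ma, 2005; Tama and Brooks, 2006). In the ANM (Atilgan et al., 2001; Doruker et al., 2000), the network nodes are identified with the positions of the α-carbons, and uniform elastic springs with force constant $\gamma$ connect the nodes located within a cutoff distance $r_c$, such that the underlying potential is

$$
V_{ANM} = \frac{\gamma}{2} \sum_{i<j} \Gamma_{ij} \left( R_{ij} - R_{ij}^0 \right)^2
\tag{4}
$$

where $R_{ij}$ and $R_{ij}^0$ are the instantaneous and equilibrium distances between nodes i and j, and $\Gamma_{ij}$ is the ijth element of the Kirchhoff matrix $\mathbf{\Gamma}$ of inter-residue contacts, equal to 1 if nodes i and j are connected by a spring and zero otherwise. The NMA of the ANM requires the eigenvalue decomposition of the Hessian H, a 3N x 3N matrix composed of N x N super elements $\mathbf{H}_{ij}$ ($i \ne j$) of the form

$$
\mathbf{H}_{ij} = -\frac{\gamma\,\Gamma_{ij}}{(R_{ij}^0)^2}
\begin{pmatrix}
X_{ij} X_{ij} & X_{ij} Y_{ij} & X_{ij} Z_{ij} \\
Y_{ij} X_{ij} & Y_{ij} Y_{ij} & Y_{ij} Z_{ij} \\
Z_{ij} X_{ij} & Z_{ij} Y_{ij} & Z_{ij} Z_{ij}
\end{pmatrix}
\tag{5}
$$

and $\mathbf{H}_{ii} = -\sum_{j, j \ne i} \mathbf{H}_{ij}$. Here $X_{ij}$, $Y_{ij}$ and $Z_{ij}$ are the components of the distance vector $\mathbf{R}_{ij}^0$ between nodes i and j, each node being placed at an α-carbon, consistent with the ANM. $\mathbf{C}^{ANM}$ can be expressed in terms of the 3N-6 non-zero eigenvalues $\lambda_k$ and corresponding eigenvectors $\mathbf{u}_k$ of H as

$$
\mathbf{C}^{ANM} = \sum_{k=1}^{3N-6} \lambda_k^{-1}\, \mathbf{u}_k \mathbf{u}_k^T
\tag{6}
$$

The B-factor predicted by the ANM for residue i is calculated from the trace of $\mathbf{C}_{ii}^{ANM}$ using $B_i = (8\pi^2 k_B T/3)\,\mathrm{tr}(\mathbf{C}_{ii}^{ANM})$, where $k_B$ is the Boltzmann constant and T is the absolute temperature. The value of $\gamma$ is determined a posteriori if experimental data are available, and does not affect the fluctuation profile of residues.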
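The ANM procedure described in this section (build the Hessian from α-carbon coordinates, discard the six zero-eigenvalue rigid-body modes, and sum the remaining modes weighted by their inverse eigenvalues) can be sketched as follows with NumPy. This is a minimal illustration under stated assumptions rather than the implementation used on the ANM server: the cutoff of 15 Å, the unit force constant, the function names and the idealized helical test coordinates are all assumptions introduced here.

```python
import numpy as np


def anm_hessian(coords, cutoff=15.0, gamma=1.0):
    """Build the 3N x 3N ANM Hessian; cutoff (Å) and gamma are assumed values."""
    coords = np.asarray(coords, dtype=float)
    n = len(coords)
    hessian = np.zeros((3 * n, 3 * n))
    for i in range(n):
        for j in range(i + 1, n):
            r = coords[j] - coords[i]
            dist2 = r @ r
            if dist2 > cutoff ** 2:
                continue  # no spring between nodes farther apart than the cutoff
            block = -gamma * np.outer(r, r) / dist2  # off-diagonal super element
            hessian[3 * i:3 * i + 3, 3 * j:3 * j + 3] = block
            hessian[3 * j:3 * j + 3, 3 * i:3 * i + 3] = block
            # Diagonal super elements are minus the sum of the off-diagonal ones.
            hessian[3 * i:3 * i + 3, 3 * i:3 * i + 3] -= block
            hessian[3 * j:3 * j + 3, 3 * j:3 * j + 3] -= block
    return hessian


def anm_covariance(hessian, n_zero=6):
    """Sum u_k u_k^T / lambda_k over the non-zero modes (six zero modes assumed)."""
    eigvals, eigvecs = np.linalg.eigh(hessian)  # ascending eigenvalues
    lam = eigvals[n_zero:]
    u = eigvecs[:, n_zero:]
    return (u / lam) @ u.T


def anm_bfactors(covariance, kT=1.0):
    """B_i = (8 pi^2 kT / 3) tr(C_ii); kT = 1 gives relative values only."""
    n = covariance.shape[0] // 3
    traces = np.array([np.trace(covariance[3 * i:3 * i + 3, 3 * i:3 * i + 3])
                       for i in range(n)])
    return 8.0 * np.pi ** 2 * kT * traces / 3.0


# Hypothetical test structure: 12 nodes on an idealized helix (not a real protein).
turn = np.deg2rad(100.0) * np.arange(12)
coords = np.column_stack([2.3 * np.cos(turn), 2.3 * np.sin(turn), 1.5 * np.arange(12)])
hessian = anm_hessian(coords)
covariance = anm_covariance(hessian)
b_factors = anm_bfactors(covariance)
```

The 3 x 3 diagonal blocks of `covariance` are the theoretical ADP matrices of the individual nodes. Because the force constant only rescales all of them uniformly, it can be fitted afterwards against experimental B-factors without changing the fluctuation profile.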
| 2 METHODS |
|---|
2.1 Datasets
We extracted 1432 PDB entries for which the ADPs were reported as of January 2006, and eliminated the redundant structures in this set using PISCES (Wang and Dunbrack, 2005) such that, in the final set, no protein pair had more than 15% sequence identity. Furthermore, only proteins with resolution higher than 1.5 Å and R-factor smaller than 0.3 were retained. Finally, PDB files for which ADPs were assigned to the majority of the atoms were selected. The final set includes 93 proteins listed in the Supplementary Material Table S1. This representative set of non-redundant high-resolution structures with available ADPs will be referred to as Set I. Two additional datasets were compiled for proteins having at least two independent X-ray structures with ADPs. The first (19 pairs) comprises pairs of structures having similar crystal forms, i.e. it includes proteins with two crystal structures based on the same space group and unit cell dimensions. The second (8 pairs) comprises pairs of structures (same protein in each pair) resolved in different crystal forms (different space group or unit cell dimensions). These two sets are listed in Tables 1 and 2, respectively.
2.2 Measuring similarities between ADPs
The similarities between ADPs were measured in terms of (i) the Pearson correlation $\rho(s_A, s_B)$ between the N (or 6N)-dimensional arrays $s_A$ and $s_B$ that describe the ADPs of the N amino acids (or atoms) in the structures A and B, and (ii) the Kullback–Leibler (KL) distances D between the ADP matrices corresponding to the structures A and B. The arrays $s_A$ and $s_B$ include the anisotropies $\{A_i\}$, the volumes $\{V_i\}$, or the diagonal and/or off-diagonal elements of $\mathbf{C}_{ii}$, for $1 \le i \le N$. For simplicity, we adopt the notation $\rho(s)$, the argument designating the array.
In the following, we will compare the ADPs for α-carbons at the same sequence position (e.g. i) for a pair of structures (that are otherwise structurally aligned). Two types of comparisons will be performed: (i) pairs of PDB structures A and B determined for the same protein in different settings, termed experiments versus experiments comparisons, and (ii) pairs of experimental and computational ADP data for the same PDB entry, termed experiments versus theory.
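As an illustration of the Pearson comparison, the sketch below flattens N ADP matrices into a 6N-dimensional array and correlates two such arrays with NumPy. The ordering of the six elements per residue (xx, yy, zz, xy, xz, yz), the function names and the example matrices are assumptions made for this example; any ordering works as long as it is the same for both structures.

```python
import numpy as np


def adp_array(adp_matrices):
    """Flatten N ADP matrices (N x 3 x 3) into a 6N array: xx, yy, zz, xy, xz, yz."""
    m = np.asarray(adp_matrices, dtype=float)
    diagonal = m[:, [0, 1, 2], [0, 1, 2]]
    off_diagonal = m[:, [0, 0, 1], [1, 2, 2]]
    return np.hstack([diagonal, off_diagonal]).ravel()


def pearson(s_a, s_b):
    """Pearson correlation between two equally long arrays."""
    return float(np.corrcoef(s_a, s_b)[0, 1])


# Hypothetical ADP matrices for two residues in structures A and B (Å^2).
adps_a = np.array([[[0.20, 0.02, 0.00], [0.02, 0.10, 0.01], [0.00, 0.01, 0.05]],
                   [[0.30, -0.01, 0.02], [-0.01, 0.25, 0.00], [0.02, 0.00, 0.15]]])
adps_b = 1.5 * adps_a + 0.01 * np.eye(3)
rho_all = pearson(adp_array(adps_a), adp_array(adps_b))
```

Correlating only the first three or the last three entries of each residue gives the separate correlations for the diagonal and off-diagonal elements.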
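The KL distance between the trivariate (x-, y- and z-) Gaussian probability distributions a and b, associated with the anisotropic fluctuations of atom i, is expressed in terms of the respective eigenvalues ($d_k^a$, $d_k^b$, $1 \le k \le 3$) and eigenvectors ($\mathbf{v}_k^a$, $\mathbf{v}_k^b$, $1 \le k \le 3$) of the corresponding ADP matrices $\mathbf{C}_{ii}^a$ and $\mathbf{C}_{ii}^b$ as

$$
D_i(a\|b) = \frac{1}{2} \left[ \sum_{k=1}^{3} \ln\frac{d_k^b}{d_k^a} + \sum_{k=1}^{3} \sum_{l=1}^{3} \frac{d_k^a}{d_l^b} \left( \mathbf{v}_k^a \cdot \mathbf{v}_l^b \right)^2 - 3 \right]
\tag{7}
$$

See Supplementary Material for a complete derivation. As $D_i(a\|b)$ is asymmetrical, we use the arithmetic average $D_i = [D_i(a\|b) + D_i(b\|a)]/2$ for evaluating the distance between $\mathbf{C}_{ii}^a$ and $\mathbf{C}_{ii}^b$. $D_i$ equals zero for two identical distributions, and increases with the divergence between them.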
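A minimal numerical sketch of the symmetrized KL distance between two ADP matrices follows, assuming zero-mean trivariate Gaussians and natural logarithms (the choice of logarithm base is an assumption here; the complete derivation is in the Supplementary Material). It uses the eigenvalues and eigenvectors of the two matrices and is equivalent to the standard closed form $\frac{1}{2}[\mathrm{tr}(\mathbf{B}^{-1}\mathbf{A}) - 3 + \ln(\det\mathbf{B}/\det\mathbf{A})]$. The function names and example matrices are hypothetical.

```python
import numpy as np


def kl_gaussian(cov_a, cov_b):
    """KL divergence D(a||b), in nats, between zero-mean 3D Gaussians."""
    d_a, v_a = np.linalg.eigh(np.asarray(cov_a, dtype=float))
    d_b, v_b = np.linalg.eigh(np.asarray(cov_b, dtype=float))
    overlap = (v_a.T @ v_b) ** 2  # overlap[k, l] = (v_k^a . v_l^b)^2
    trace_term = np.sum(np.outer(d_a, 1.0 / d_b) * overlap)
    log_term = np.sum(np.log(d_b)) - np.sum(np.log(d_a))
    return 0.5 * (log_term + trace_term - 3.0)


def kl_distance(cov_a, cov_b):
    """Arithmetic average of the two asymmetric divergences."""
    return 0.5 * (kl_gaussian(cov_a, cov_b) + kl_gaussian(cov_b, cov_a))


# Hypothetical ADP matrices (Å^2) for the same atom in two data sets.
cov_a = np.array([[0.20, 0.02, 0.00], [0.02, 0.10, 0.01], [0.00, 0.01, 0.05]])
cov_b = np.array([[0.15, 0.00, 0.01], [0.00, 0.12, 0.00], [0.01, 0.00, 0.08]])
distance_ab = kl_distance(cov_a, cov_b)
```

The distance is zero when the two matrices coincide and is unchanged when the two arguments are swapped. Both matrices must be positive definite for the logarithms to be defined.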
2.3 Visualization and implementation in ANM server
The ANM website (http://ccbb.pitt.edu/anm) (Eyal et al., 2006) has been further developed to compute, from knowledge of structural coordinates, the Cα ADPs for PDB entries or user-supplied structures submitted in PDB format, and display the ellipsoids created by Rastep (Merritt and Bacon, 1997; Merritt and Murphy, 1994) or Povscript (Fenn et al., 2003). If isotropic temperature factors are available in the input file, the server offers an option to use them for rescaling the absolute sizes of atomic fluctuations and evaluating the corresponding ADPs.
2.4 Solvent accessibilities and secondary structures
Solvent accessibilities were computed using the Voronoi polyhedra method (McConkey et al., 2002). The accessibility of residue X is defined as the ratio of its total solvent-accessible surface in the native structure to that in the peptide GGXGG with the same backbone conformation. Residues with accessibility of less than 0.1 are accepted as buried, those between 0.1 and 0.5 are assumed to be partially exposed and those above 0.5 are taken as exposed. Secondary structures are assigned using DSSP (Kabsch and Sander, 1983).
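The three exposure classes can be expressed as a simple threshold rule. In the sketch below, the function name is hypothetical and the treatment of the boundary values (0.1 and 0.5 counted as partially exposed) is an assumption, since the text does not specify it.

```python
def exposure_class(relative_accessibility):
    """Classify a residue from its relative solvent accessibility (0 to 1 scale)."""
    if relative_accessibility < 0.1:
        return "buried"
    if relative_accessibility <= 0.5:  # boundary handling is an assumption
        return "partially exposed"
    return "exposed"


# Hypothetical accessibilities for three residues.
classes = [exposure_class(value) for value in (0.05, 0.30, 0.80)]
```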
| 3 RESULTS AND DISCUSSION |
|---|
3.1 Experiments versus experiments comparisons of ADPs
We first examined the extent of anisotropy observed in different structural elements. Figure 1 displays the distribution of anisotropies $A_i$ for α-carbons, evaluated for buried residues (panel A) and solvent-exposed residues (panel B), obtained from the statistical examination of the Set I of 93 high-resolution structures. Clearly a broad spectrum of anisotropies is observed, consistent with the bell-shaped distribution obtained earlier with a smaller set (Merritt, 1999; Merritt et al., 1998). The anisotropies are more pronounced (skewed towards smaller $A_i$ values) in the case of exposed residues. The average anisotropies are
Next, we proceed to a quantitative assessment of the degree of correlation between the ADPs reported for the same proteins under similar or different conditions/crystal forms. Table 1 refers to pairs of structures A and B determined in isomorphous crystals. Columns 1 and 2 list the PDB codes for each pair, and column 3 gives the size (N) of the corresponding protein. Columns 4–9 list the Pearson correlations $\rho(s)$ corresponding to s = $\{V_i\}$, $\{A_i\}$, $\{\mathrm{tr}(\mathbf{C}_{ii})\}$, all six ADPs, and the diagonal and off-diagonal elements of $\mathbf{C}_{ii}$ taken separately, for $1 \le i \le N$. The last column lists the KL distance $D_i$ averaged over the residues $1 \le i \le N$ of the two proteins. The values
We now proceed to the set of structures resolved for the same protein, but under different crystal space group and unit cell dimensions. Table 2 lists the results for eight such pairs retrieved from the PDB. There is a significant decrease in the correlation coefficients, and an increase in KL distance, revealing differences between the ADPs reported for the same protein resolved/modeled with different structure determination/refinement protocols. The correlation $\rho$ between the $\mathrm{tr}(\mathbf{C}_{ii}) = B_i$ values in this case is 0.569, that between the diagonal elements of $\mathbf{C}_{ii}$ is 0.48 and that between the off-diagonal elements is 0.42. The mean correlations between the ellipsoid volumes, $\rho(\{V_i\})$, and between the anisotropies, $\rho(\{A_i\})$, are given in Table 2. The mean KL distance increases to 0.43. These results provide us with estimates for the levels of agreement one might expect to reach between theory and experiments. In particular, the low $\rho(\{A_i\})$ implies that the anisotropy is a very sensitive measure, highly dependent on experimental conditions and techniques.
3.2 Experiments versus theory comparisons
ANM calculations have been performed for the 93 proteins (Set I) extracted from the PDB as explained above. Figure 2 illustrates the types of calculations and comparative analysis performed for an example protein, the antifungal protein EAFP2 from the tree Eucommia ulmoides Oliver. Panel A displays all the predicted (by the ANM) and experimental (deposited in the PDB) ADPs, grouped in two subsets consisting of the diagonal ($\langle \Delta X_i^2 \rangle$, $\langle \Delta Y_i^2 \rangle$, $\langle \Delta Z_i^2 \rangle$) and off-diagonal ($\langle \Delta X_i \Delta Y_i \rangle$, $\langle \Delta X_i \Delta Z_i \rangle$, $\langle \Delta Y_i \Delta Z_i \rangle$) elements of $\mathbf{C}_{ii}$ for $1 \le i \le N$. The correlation between the two sets of ADPs is 0.85 in this case. The correlations for the subsets of diagonal and off-diagonal elements, computed separately, are 0.61 and 0.51, respectively. Panel B compares the predicted and experimental mean-square fluctuations along the x-, y- and z-directions, as well as the overall mean-square fluctuations $\langle (\Delta R_i)^2 \rangle$. In panel C, we display the respective experimental (upper) and theoretical (lower) ADPs as color-coded ellipsoids, blue and red referring to the smallest and largest size fluctuations, and the orientation of the beads indicating the anisotropic directions of the fluctuations.
Similar calculations performed for the complete set of 93 proteins in Set I led to the average results presented in Table 3. For detailed results, similar to Tables 1 and 2, the reader is referred to Table S1 in the Supplementary Material. The first row shows the correlation (0.77) between the two sets of 6N ADPs, from the PDB and ANM predictions, averaged over all proteins (i.e. by repeating for all proteins the analysis illustrated in Fig. 2A for EAFP2). This is a good agreement in view of the uncertainties in experiments and approximations in the theory. The corresponding average correlations for the diagonal and off-diagonal elements of the ADP matrices, considered separately, are listed in the second and third rows. Separation of the ADPs into these two populations lowers the overall correlation, because the relative sizes of the auto- and cross-terms of the ADP matrices, which are presumably in good agreement between theory and experiments, are not taken into consideration when the two subsets are treated as independent entities.
The fourth row in Table 3 lists the correlations between the traces of the ADP matrices. The average correlation obtained, 0.57, is similar to that reported in previous studies of B-factors (Hamacher and McCammon, 2006). The better agreement between theoretical and experimental
The low $\rho(\{A_i\})$ further strengthens the view that $A_i$ values should be interpreted with caution, whereas $V_i$ provides a reasonable description of the fluctuation volumes, consistent with $B_i$. Finally, the KL distance averaged over all proteins (last row) is also comparable to the one obtained in Table 2, overall lending support to the view that the two sets of data, experimental and theoretical, do not differ from each other more than those obtained experimentally for the same proteins under different settings.
3.3 How do results depend on the coarse-graining of the theoretical model?
The ANM predictions presented above are obtained with a residue-level representation of the proteins, each residue position on the network being identified by that of its α-carbon. Of interest is to assess the effect of the selected coarse-grained model on theoretical results. Does the absence of atomic details in the model give rise to inaccuracies in predicting directional fluctuations? How would theory and experiments compare if fluctuations at atomic scale were considered in detail?
To answer these questions, we repeated the computations using an atomic-level EN model (Tirion, 1996). A cutoff interaction distance of 5 Å between atoms (Sen et al., 2006) was adopted for defining the interacting pairs of atoms. The calculations were done for 60 proteins in the set, the sizes of which are smaller than 250 residues. While the calculations yield data on the fluctuations of all atoms, for direct comparison with our residue-level results we examined the fluctuation dynamics of Cα atoms, derived from this full atomic description. The entries in the middle column of Table 3 list the results obtained in this case. As can be clearly seen, the results are comparable to those found with the residue-level model. The coarse-graining is clearly not a major source of additional differences between the residue-level predictions and the experimentally refined ADPs.
We also examined the fluctuations obtained using the GNM (Bahar et al., 1997), an isotropic network model. The Cα ADP matrices based on the GNM are simply diagonal matrices with identical mean-square fluctuations along all three directions. Strikingly, we found (right column of Table 3) that the agreement between GNM results and experimental data is comparable to that obtained with the ANM. This is presumably due to the more accurate prediction of fluctuation sizes by the GNM compared to the ANM (Bahar et al., 2007; Chennubhotla et al., 2005), which more than compensates for the discrepancies arising from the lack of anisotropy.
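To make the structure of the GNM-based ADP matrices concrete, the following sketch computes relative mean-square fluctuations from the pseudo-inverse of a contact (Kirchhoff) matrix and distributes each of them equally over the three directions. It is an illustration under stated assumptions: the graph-Laplacian sign convention (off-diagonal elements of -1 for contacts, as is customary for the GNM), the 7 Å cutoff, the function name and the idealized helical coordinates are all introduced here and are not taken from this study.

```python
import numpy as np


def gnm_adps(coords, cutoff=7.0):
    """Return N isotropic 3x3 ADP matrices (relative units) from a GNM; cutoff assumed."""
    coords = np.asarray(coords, dtype=float)
    dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    contacts = (dist <= cutoff) & ~np.eye(len(coords), dtype=bool)
    kirchhoff = np.diag(contacts.sum(axis=1)) - contacts.astype(float)
    eigvals, eigvecs = np.linalg.eigh(kirchhoff)
    # Skip the single zero mode; this assumes a connected network.
    inv_diag = np.sum(eigvecs[:, 1:] ** 2 / eigvals[1:], axis=1)
    # inv_diag is proportional to the mean-square fluctuation of each node,
    # which is shared equally among x, y and z in an isotropic model.
    return np.array([np.eye(3) * (msf / 3.0) for msf in inv_diag])


# Hypothetical test structure: 12 nodes on an idealized helix (not a real protein).
turn = np.deg2rad(100.0) * np.arange(12)
helix = np.column_stack([2.3 * np.cos(turn), 2.3 * np.sin(turn), 1.5 * np.arange(12)])
isotropic_adps = gnm_adps(helix)
```

Every matrix returned has three equal eigenvalues, so its anisotropy is 1 by construction. Only the sizes of the fluctuations vary from residue to residue.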
3.4 KL distance: a measure of assessing the robustness of ADPs
We now concentrate on KL divergence as a measure of the probability distributions of residue displacements in 3D space, or the entire shape of ellipsoids (rather than a single parameter per ADP matrix like the anisotropy which is shown to be highly sensitive to model parameters, see Supplementary Material). The KL divergences are proposed here to serve as a metric for estimating the robustness of the ADPs.
As mentioned above, the KL distance between two distributions varies in the range $0 \le D_i < \infty$, with the lower limit corresponding to identical distributions; $D_i$ increases with the dissimilarity in the shape of the ADPs corresponding to the ith atom in the two sets of data. To view how KL divergences incorporate information different from (and additional to) that typically conveyed by isotropic fluctuations, let us first consider two quantities: (i) the KL divergence $D_i$ between the ADP matrix $\mathbf{C}_{ii}^{ANM}$ computed by the ANM for atom i and its experimental counterpart $\mathbf{C}_{ii}$, and (ii) the corresponding quantity derived from the diagonal elements of the ADP matrices alone. Figure 3 compares these two quantities for the α-carbons in the protein EAFP2 analyzed above (Fig. 2). The $D_i$ and diagonal-only profiles are presented as dotted and solid curves, respectively, with the respective scales shown along the right and left ordinates. Significant differences between the two quantities are observed at particular residues. These originate from the shape of the ellipsoids and the cross-correlations between the fluctuations along different directions at those particular residues, which are not accounted for by the diagonal-only quantity but are included in $D_i$. For example, the C-terminus appears to diverge between the two sets of data according to the $D_i$ values, while the diagonal-only quantity is negligibly small, i.e. the sizes of the motions are comparable, while their spatial distributions differ. Therefore, the differences (between the theoretical and experimental data) overlooked by the diagonal-only quantity are captured by the relatively high $D_i$ values.
Notably, minima in the $D_i$ profile indicate the structural regions whose experimental and theoretical ADPs are in agreement. The KL divergence between theoretical and experimental data thus provides information on the regions whose ADPs are more robustly defined. $D_i$ values may even serve as a metric for assessing the confidence levels of the ADPs. Figure 4 shows such results for different types of secondary structural elements, or different extents of solvent exposure. The KL distances are lower at the protein interior, and at ordered regions (α-helix, β-strand). This difference between regular/ordered and disordered regions (such as loops, coils and turns) becomes more pronounced when the residues are solvent exposed.
Figure 5 illustrates, for hen egg white lysozyme (HEWL), how the KL distances may be used for identifying structural regions or residues distinguished by their well-defined (robust) anisotropic fluctuations. Panel A displays the experimentally assigned ADPs for α-carbons in three high-resolution structures of HEWL (PDB files 3lzt, 4lzt and 1iee; Sauter et al., 2001; Walsh et al., 1998), and those predicted by the ANM for 3lzt, as color/size-coded ellipsoids. Here 3lzt and 4lzt correspond to isomorphous crystals of HEWL, and the pair 3lzt and 1iee refers to different crystal forms. Panel B displays the KL distances between the ADPs for the pairs listed in the inset. Maxima point to regions where the two sets of data exhibit the strongest departure, and minima indicate the regions where the two sets of data concur. Interestingly, the N-terminal segment of about 40 residues is observed in all cases to exhibit the lowest divergence, pointing to the robustness of the fluctuation dynamics at this region. Additionally, the helical segment 80–100 appears to have well-defined preferred motions. If we focus in particular on the structure (3lzt) for which ANM calculations have been performed (the blue curve in Figure 5B), we also find that the β-strand region (residues 50–65) exhibits minimal divergence.
3.5 Tangential versus radial displacements
The good agreement between the ADPs corresponding to pairs of isomorphous structures and the departures between the ADPs from structures based on different crystal forms were originally noted by Merritt (1999). Rigid body motions were suggested to be responsible for these differences. If rigid body rotations of proteins in the crystal are the main contributors to ADPs, the atoms located far away from the mass center would be expected to exhibit larger tangential fluctuations compared to the atoms that are closer to the mass center. For the radial fluctuations, this trend would not be expected (Schneider, 1996). Analysis of the experimental fluctuations for many proteins in our set reveals a clear trend in which tangential displacements indeed become larger with increasing distance from the protein center, while the radial displacements are nearly unaffected. However, a very similar trend is obtained for the theoretical fluctuations calculated by the ANM as well, although rigid body motions are by definition eliminated in the ANM fluctuations. Figure 6 illustrates this for oxy-myoglobin (Vojtechovsky et al., 1999). The detection of this trend by a model that is based solely on the internal motions of the protein suggests that it is also an internal phenomenon (probably a by-product of the large-scale slow mode motions) and should not be explained only by rigid body motions in the crystal. It is likely that other effects such as crystal contacts also make crystal-dependent contributions to the ADPs. Differences in the refinement protocols also seem to contribute, as suggested by the small differences between the pairs of isomorphous structures. The agreement between the experimental and theoretical values is found to be better for tangential fluctuations, as shown for oxy-myoglobin in Figure 6 panel C.
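One plausible way to separate the two components, offered here as an assumption rather than as the method of the Supplementary Material, is to project each ADP matrix onto the unit vector pointing from the mass center to the atom: the projected variance is the radial mean-square displacement, and the remainder of the trace is the tangential part. The sketch below assumes equal masses for all atoms; the function name and the two-atom example are hypothetical.

```python
import numpy as np


def radial_tangential(coords, adps):
    """Return (distance from center, radial MSD, tangential MSD) for each atom."""
    coords = np.asarray(coords, dtype=float)
    adps = np.asarray(adps, dtype=float)
    center = coords.mean(axis=0)  # equal masses assumed
    radial_vectors = coords - center
    distance = np.linalg.norm(radial_vectors, axis=1)
    # An atom located exactly at the center has no defined radial direction
    # and would cause a division by zero here.
    unit = radial_vectors / distance[:, None]
    radial = np.einsum("ni,nij,nj->n", unit, adps, unit)
    tangential = np.trace(adps, axis1=1, axis2=2) - radial
    return distance, radial, tangential


# Hypothetical example: two atoms on the x-axis with identical ADP matrices.
example_coords = np.array([[2.0, 0.0, 0.0], [-2.0, 0.0, 0.0]])
example_adps = np.array([np.diag([1.0, 2.0, 3.0])] * 2)
distance, radial, tangential = radial_tangential(example_coords, example_adps)
```

Plotting the radial and tangential values against the distance for all residues of a protein would reproduce the kind of trend discussed above, for experimental and theoretical ADPs alike.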
| 4 CONCLUSION |
|---|
The degree to which experimental ADPs represent internal dynamics, rigid body motions or static disorder is still debated for both proteins and small molecules (Dunitz et al., 1988; Harata et al., 1998; Merritt, 1999). It is also not clear how much crystal contacts affect the internal dynamics of proteins in crystals, but such effects surely exist (Kundu et al., 2002). Comparative analysis of the experimental data for a given protein structurally resolved in different crystal forms has been shown earlier (Diamond, 1990; Kidera and Go, 1992; Winn et al., 2001), and confirmed here with a larger set (Table 2), to display poor-to-moderate agreement with regard to the anisotropic nature of residue fluctuations, raising questions about the physical meaning of the ADPs, or the criteria for assessing the levels of agreement between different datasets. Anisotropies ($A_i$) of individual amino acids emerge as a poor metric, even for pairs of structures resolved in isomorphous crystals (Table 1).
Comparison of theoretical ANM results with experimental data yielded overall correlations, or KL divergences, comparable to those observed in Table 2. Such an agreement between theory and experiments is satisfactory in view of several possible sources of discrepancy between the two sets: approximations in the ANM, experimental artifacts due to static disorder induced by the slightly different location of atoms in different unit cells, crystal contacts or inaccuracies and errors in the refinement process such as interpretation of alternative conformations as a single structure with large displacement parameters.
Our analysis suggests that there are subsets of residues that tend to have more robust preferences with regard to the spatial distributions of their fluctuations/displacements than others. Here, to identify and characterize such amino acids, we propose to use the residue-based KL divergences between $\mathbf{C}_{ii}$ and $\mathbf{C}_{ii}^{ANM}$. KL divergences vary in the range $0 \le D_i < \infty$. The $D_i$ profiles for individual proteins, illustrated in Figure 5B, provide an assessment of the structural regions that exhibit consistent fluctuations between theory and experiments (e.g. $D_i \le 0.3$, which corresponds to 58% of all 17 308 residues examined in our entire dataset). In this context, the ADPs at the protein core or at regular structures appear to be more accurately measured/predicted than other regions in general (Fig. 4).
Finally, the examination of residue displacements along radial and tangential directions shows that the amplitude of tangential motions increases with the distance from the mass center, while radial motions remain unaffected. This feature, in close accord with experimental data, has been largely attributed to rigid body rotational motions of the proteins in the crystal, resulting in larger tangential displacements of exterior residues. Yet, the same behavior obtained here with the ANM, which describes purely internal conformational fluctuations in the absence of any rigid body motions, suggests that this interpretation may not be complete. In fact, internal motions also lead to larger tangential displacements at distant positions, and this is a natural consequence of the dominant effect of slow modes that drive global/domain motions with respect to hinge centers or domain interfaces. We also note that an extensive study of X-ray crystallographic B-factors by Phillips and coworkers (Kundu et al., 2002) using the TLS model and an elastic network model demonstrated that the elastic network models yield better agreement with experiments, compared to TLS models. This is also consistent with the proven success of normal mode analysis in assisting structural refinement (Delarue and Dumas, 2004; Diamond, 1990; Kidera and Go, 1990; Suhre and Sanejouand, 2004).
Therefore, we conclude that the ADPs reported for highly refined PDB structures do convey information on the directional preferences of individual residues. While they are subject to several perturbing effects, a reasonable correlation is observed with the ANM predictions that are based solely on internal conformational motions, particularly at core regions and regular structural elements. Combined examination of experimental and computed ADPs may thus assist in identifying the amino acids that possess strong structure-encoded preferences to fluctuate in specific directions. Notably, in the absence of experimental data, theoretical ADPs can serve as a first estimate of the anisotropic displacements of amino acids. The features of the ANM server (www.ccbb.pitt.edu/anm) have now been augmented to permit users to calculate and visualize the ADPs for query structures.
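For readers who wish to obtain theoretical ADPs without the server, the sketch below outlines one way of computing residue-level 3 × 3 fluctuation tensors with a basic ANM. It is an illustrative approximation, not the server implementation: the function name, the 15 Å cutoff and the uniform spring constant are assumptions, and the absolute scale of the output is arbitrary because it depends on the unknown ratio of temperature to spring constant.

```python
import numpy as np

def anm_residue_covariances(coords, cutoff=15.0, gamma=1.0):
    """Return an (N, 3, 3) array of per-node fluctuation tensors from a basic ANM.

    coords : (N, 3) array of node (e.g. C-alpha) coordinates
    cutoff : interaction cutoff distance (assumed value)
    gamma  : uniform spring constant (assumed; sets an arbitrary overall scale)
    """
    coords = np.asarray(coords, dtype=float)
    n = len(coords)
    hessian = np.zeros((3 * n, 3 * n))
    for i in range(n):
        for j in range(i + 1, n):
            d = coords[j] - coords[i]
            dist2 = d @ d
            if dist2 > cutoff ** 2:
                continue
            block = -gamma * np.outer(d, d) / dist2   # off-diagonal super-element
            hessian[3*i:3*i+3, 3*j:3*j+3] = block
            hessian[3*j:3*j+3, 3*i:3*i+3] = block
            hessian[3*i:3*i+3, 3*i:3*i+3] -= block    # diagonal = -sum of off-diagonals
            hessian[3*j:3*j+3, 3*j:3*j+3] -= block
    # Discard the six zero modes (rigid-body translations and rotations); this
    # assumes the network is connected and rigid, so that exactly six exist.
    vals, vecs = np.linalg.eigh(hessian)
    vals, vecs = vals[6:], vecs[:, 6:]
    cov = (vecs / vals) @ vecs.T                      # pseudo-inverse of the Hessian
    return np.array([cov[3*i:3*i+3, 3*i:3*i+3] for i in range(n)])

# Hypothetical six-node structure, for demonstration only.
demo = [[0, 0, 0], [3.8, 0, 0], [5, 3.5, 0], [2, 5, 2.5], [-1, 3, 4], [1, 1, 6]]
tensors = anm_residue_covariances(demo)
print(tensors.shape)
```

Each returned 3 × 3 block plays the role of Cii for one residue. Removing the zero modes explicitly, rather than relying on a numerical tolerance, keeps rigid-body contributions out of the computed tensors.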
| ACKNOWLEDGEMENTS |
|---|
Support by NIH R33 GM068400-01A2 is gratefully acknowledged by I.B. While publishing this work, we were alerted to a new related paper comparing anisotropic temperature factors with theoretical models (Kondrashov et al., 2007).
Conflict of Interest: none declared.
| REFERENCES |
|---|
Alexandrov V, et al. Normal modes for predicting protein motions: a comprehensive database assessment and associated Web tool. Protein Sci (2005) 14:633–643.
Atilgan A, et al. Anisotropy of fluctuation dynamics of proteins with an elastic network model. Biophys. J (2001) 80:505–515.
Bahar I, et al. Direct evaluation of thermal fluctuations in proteins using a single-parameter harmonic potential. Fold Des (1997) 2:173–181.
Bahar I, et al. On the theoretical foundations of the Gaussian network model and its applications to proteins. Phys. Biol (2007) 4:64–65.
Bahar I, et al. Coarse-grained normal mode analysis in structural biology. Curr. Opin. Struct. Biol (2005) 15:586–592.
Berman H, et al. The Protein Data Bank. Nucleic Acids Res (2000) 28:235–242.
Chennubhotla C, et al. Elastic network models for understanding biomolecular machinery: from enzymes to supramolecular assemblies. Phys. Biol (2005) 2:S173–S180.
Cui Q, et al. Normal Mode Analysis: Theory and Applications to Biological and Chemical Systems (2006) Boca Raton, FL, USA: Chapman & Hall/CRC.
Delarue M, et al. On the use of low-frequency normal modes to enforce collective movements in refining macromolecular structural models. Proc. Natl Acad. Sci. USA (2004) 101:6957–6962.
Diamond R. On the use of normal modes in thermal parameter refinement: theory and application in the bovine pancreatic trypsin inhibitor. Acta. Cryst (1990) A46:425–435.
Doruker P, et al. Dynamics of proteins predicted by molecular dynamics simulations and analytical approaches: application to amylase inhibitor. Proteins (2000) 40:512–524.
Dunitz J, et al. Atomic motions in molecular crystals from diffraction measurements. Angew. Chem. Int. Ed. Engl (1988) 27:880–895.
Eyal E, et al. Anisotropic Network Model: systematic evaluation and a new web interface. Bioinformatics (2006) 22:2619–2627.
Fenn T, et al. POVScript+: a program for model and data visualization using persistence of vision ray-tracing. J. Appl. Cryst (2003) 36:944–947.
Haliloglu T, et al. Gaussian dynamics of folded proteins. Phys. Rev. Lett (1997) 79:3090–3093.
Hamacher K, et al. Computing the amino acid specificity of fluctuations in biomolecular systems. J. Chem. Theory Comput (2006) 2:873–878.
Harata K, et al. Full-matrix least-square refinement of lysozymes and analysis of anisotropic thermal motion. Proteins (1998) 30:232–243.
Hinsen K. Analysis of domain motions by approximate normal mode calculations. Proteins (1998) 33:417–429.
Kabsch W, et al. Dictionary of protein secondary structure: pattern recognition of hydrogen-bonded and geometrical features. Biopolymers (1983) 22:2577–2637.
Kidera A, et al. Refinement of protein dynamics structure: normal mode refinement. Proc. Natl Acad. Sci. USA (1990) 87:3718–3722.
Kidera A, et al. Normal mode refinement: crystallographic refinement of protein dynamic structure. I. Theory and test by simulated diffraction data. J. Mol. Biol (1992) 225:457–475.
Kondrashov DA, et al. Protein structural variation in computational models and crystallographic data. Structure (2007) 15:169–177.
Krebs W, et al. Normal mode analysis of macromolecular motions in a database framework: developing mode concentration as a useful classifying statistic. Proteins (2002) 48:682–695.
Kundu S, et al. Dynamics of proteins in crystals: comparison of experiment with simple models. Biophys. J (2002) 83:723–732.
Ma J. Usefulness and limitations of normal mode analysis in modeling dynamics of biomolecular complexes. Structure (2005) 13:373–380.
McConkey BJ, et al. Quantification of protein surfaces, volumes and atom-atom contacts using a constrained Voronoi procedure. Bioinformatics (2002) 18:1365–1373.
Merritt E. Comparing anisotropic displacement parameters in protein structures. Acta. Cryst (1999) D55:1997–2004.
Merritt E. Expanding the model: anisotropic displacement parameters in protein structure refinement. Acta. Cryst (1999) D55:1109–1117.
Merritt E, et al. Raster3D Photorealistic Molecular Graphics. Meth. Enzymol (1997) 277:505–524.
Merritt E, et al. The 1.25 Å resolution refinement of the cholera toxin B-pentamer: evidence of peptide backbone strain at the receptor-binding site. J. Mol. Biol (1998) 282:1043–1059.
Merritt E, et al. Raster3D Version 2.0 – a program for photorealistic molecular graphics. Acta. Cryst (1994) D50:869–873.
Nicolay S, et al. Functional modes of proteins are among the most robust. Phys. Rev. Lett (2006) 96:078104.
Painter J, et al. Optimal description of protein structure in terms of multiple groups undergoing TLS motion. Acta. Cryst (2006) D62:439–450.
Rosenfield R, et al. A test for rigid vibrations, based on generalisation of Hirshfeld's "rigid bond" postulate. Acta. Cryst (1978) A34:828–829.
Sauter C, et al. Structure of tetragonal hen egg-white lysozyme at 0.94 Å from crystals grown by the counter-diffusion method. Acta. Cryst (2001) D57:1119–1126.
Schneider T. What can we learn from anisotropic temperature factors? (1996) Proceedings of the CCP4 study weekend. Daresbury UK: SERC Daresbury Laboratory. 133–144.
Schomaker V, et al. On the rigid-body motion of molecules in crystals. Acta. Cryst (1968) B24:63–76.
Sen TZ, et al. The extent of cooperativity of protein motions observed with elastic network models is similar for atomic and coarser-grained models. J. Chem. Theory Comput (2006) 2:696–704.
Suhre K, et al. On the potential of normal-mode analysis for solving difficult molecular-replacement problems. Acta. Cryst (2004) D60:796–799.
Tama F, et al. Symmetry, form, and shape: guiding principles for robustness in macromolecular machines. Annu. Rev. Biophys. Biomol. Struct (2006) 35:115–133.
Tama F, et al. Conformational changes of proteins arising from normal mode calculations. Protein Eng (2001) 14:1–6.
Tirion M. Large amplitude elastic motions in proteins from a single-parameter, atomic analysis. Phys. Rev. Lett (1996) 77:1905–1908.
Trueblood K, et al. Atomic displacement parameters nomenclature. Report of a subcommittee on atomic displacement parameter nomenclature. Acta. Cryst (1996) A52:770–781.
Vojtechovsky J, et al. Crystal structures of myoglobin-ligand complexes at near-atomic resolution. Biophys. J (1999) 77:2153–2174.
Walsh M, et al. Refinement of triclinic hen egg-white lysozyme at atomic resolution. Acta. Cryst (1998) D54:522–546.
Wang G, et al. PISCES: recent improvements to a PDB sequence culling server. Nucleic Acids Res (2005) 33:W94–W98.
Willis B, et al. Thermal Vibrations in Crystallography (1975) London: University press.
Winn M, et al. Use of TLS parameters to model anisotropic displacement in macromolecular refinement. Acta. Cryst (2001) D57:122–133.
Xiang Y, et al. Crystal structure of a novel antifungal protein distinct with five disulfide bridges from Eucommia ulmoides Oliver at an atomic resolution. J. Struct. Biol (2004) 148:86–97.
Yang LW, et al. oGNM: online computation of structural dynamics using the Gaussian network model. Nucleic Acids Res (2006) 34:W24–W31.