Bioinformatics Advance Access originally published online on October 5, 2007
Bioinformatics 2007 23(24):3400-3402; doi:10.1093/bioinformatics/btm476
© The Author 2007. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oxfordjournals.org

Identification and visualization of cage-shaped proteins

Min Hu 1,*, Junhui Wang 2 and Qunsheng Peng 1

1State Key Laboratory of CAD & CG and 2College of Life Sciences, Zhejiang University, Hangzhou, 310058, China

*To whom correspondence should be addressed.


    ABSTRACT
 

Summary: Cage-shaped proteins, with their distinctive structure, may have potential applications in biomedicine and nanotechnology. We developed a program, CSPro (Cage-Shaped Protein), for efficient identification of cage-shaped proteins based on quaternary structure. CSPro reveals the cage-shaped feature more clearly and quickly than traditional visualization tools. Using CSPro, we searched the full Protein Data Bank (PDB) and retrieved three types of proteins with notably large central cavities. CSPro can also be used in molecular simulation to check whether the quaternary structure of a protein is cage shaped.

Availability: http://www.cad.zju.edu.cn/home/humin

Contact: humin{at}cad.zju.edu.cn

Supplementary information: Supplementary data are available at Bioinformatics online.

The quaternary structural arrangement of a protein is typically responsible for complex cellular functions. Many quaternary structures have been determined experimentally and deposited in the Protein Data Bank (PDB) (Berman et al., 2000), allowing researchers to study the global structural features of proteins comprehensively. One interesting problem is how to explore, automatically, the global shape features of the quaternary structures available in the large PDB. Owing to the complexity of quaternary structure, few computational approaches are currently available. Structural visualization tools such as VMD (Humphrey et al., 1996), MOLSCRIPT (Kraulis, 1991) and RasMol (see http://www.rasmol.org) let users inspect quaternary structure conveniently in multiple display modes. However, one important shape feature of some proteins, a large hollow core such as that observed in the metal transport protein (PDB ID 1qgh) (Ilari et al., 2000), is not clearly shown by traditional display modes. In Figure 1A and B, the spatial arrangement of the peptide chains can be seen plainly, but the vacant space inside the protein body is hidden by its exterior components. The hidden hollow core may be observed by displaying the cavities of the protein with LSMS (Can et al., 2006) or CAVITY SEARCH (Ho and Marshall, 1990), or by visualizing the electrostatic potential of the protein with GRASP (Nicholls and Honig, 1991), but these methods require user intervention. Simple and efficient visualization tools for displaying the cage shape of proteins would be of value to the research community.

Ferritins, apoferritins and DPS (DNA-binding proteins from starved cells) are all cage-shaped proteins involved in essential cellular events, including iron homeostasis regulation and redox stress protection (Ilari et al., 2000; Yang et al., 2000). These proteins may also be useful for synthesizing nanoparticles (Tsukamoto et al., 2005). In addition, some other types of protein complexes have been shown to be cage shaped, for instance the chaperonin nanocage required for protein folding (Ellis, 2006; Tang and Chang, 2006) and the proteasome complex required for protein degradation (Meng et al., 1999). Disorders of protein folding and protein degradation remain two major concerns in medicine. It would therefore be valuable to identify cage-shaped proteins automatically when analyzing and mining information from large-scale structural datasets.


Fig. 1. Comparison of traditional display modes with CSPro. (A-C) display the same protein, PDB ID 1qgh (a dodecameric ferritin from Listeria with 12 peptide chains), which has a spherical shape, and (D and E) display the same protein, PDB ID 2fak (a 20S proteasome of yeast with 28 subunits), which has a cylindrical shape. (A) and (B) are drawn by VMD in cartoon mode (A) and solid surface mode (B), respectively. (D) is drawn by RasMol in backbone mode. (C) and (E) are drawn by CSPro. The central cavity inside the cage-shaped protein is clearly revealed in (C) and (E), but invisible in (A), (B) and (D).

 
We developed CSPro (Cage-Shaped Protein), a tool that automatically identifies and visualizes the shape features of a protein based on its quaternary structure. In our algorithm, a cage-shaped protein is one that encloses a large central cavity, or hollow core.

Using the molecular surface model of the protein (Lee and Richards, 1971; Liang et al., 1998), we transform the structural data in PDB format into a uniform voxel representation (Stouch and Jurs, 1986). This discrete spatial representation makes our algorithm independent of the large number of atoms in the protein. If a voxel is occupied by an atom, its value is set to 1; otherwise, its value is set to 0. A binary 3D image is thus created. Taking the voxel at a corner of the image as the seed, we find the zero-valued voxels locally connected to it (Kong and Rosenfeld, 1989; Kronheimer, 1992) and reset their values to -1. This marks the background voxels in the image of the protein. The image now contains three parts: (1) cavities, composed of voxels labeled 0, (2) background regions, with voxels labeled -1 and (3) the protein body, with voxels labeled 1 (Fig. 2A). Among the cavities, we focus on the one that contains the protein centroid; for the protein body, we are interested only in its surface (Fig. 2B).
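The voxel labeling described above can be illustrated with a minimal Python sketch. It is not the CSPro implementation: the function names (voxelize, flood_fill, label_background) are hypothetical, atoms are treated as plain spheres rather than through the molecular surface model, and "locally connected" is assumed to mean 6-connectivity.

```python
from collections import deque
from itertools import product

# Assumption: "locally connected" is taken as 6-connectivity (face neighbours).
STEPS = ((1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1))


def voxelize(atoms, n, voxel_size):
    """Build an n x n x n binary image: 1 if a voxel centre lies inside an atom.

    atoms is an iterable of (x, y, z, radius); the grid origin is (0, 0, 0).
    Simplification: plain spheres stand in for the molecular surface model.
    """
    grid = [[[0] * n for _ in range(n)] for _ in range(n)]
    for x, y, z, r in atoms:
        # Only visit voxels inside the bounding box of the atom.
        lo = [max(0, int((c - r) / voxel_size)) for c in (x, y, z)]
        hi = [min(n - 1, int((c + r) / voxel_size)) for c in (x, y, z)]
        for i, j, k in product(*(range(a, b + 1) for a, b in zip(lo, hi))):
            cx, cy, cz = ((v + 0.5) * voxel_size for v in (i, j, k))
            if (cx - x) ** 2 + (cy - y) ** 2 + (cz - z) ** 2 <= r * r:
                grid[i][j][k] = 1
    return grid


def flood_fill(grid, seed, old, new):
    """Relabel the connected region of `old` voxels containing seed; return its size."""
    n = len(grid)
    i, j, k = seed
    if grid[i][j][k] != old:
        return 0
    grid[i][j][k] = new
    queue, count = deque([seed]), 1
    while queue:
        i, j, k = queue.popleft()
        for di, dj, dk in STEPS:
            a, b, c = i + di, j + dj, k + dk
            if 0 <= a < n and 0 <= b < n and 0 <= c < n and grid[a][b][c] == old:
                grid[a][b][c] = new
                queue.append((a, b, c))
                count += 1
    return count


def label_background(grid):
    """Mark the background (-1), seeding at a corner assumed to be empty."""
    return flood_fill(grid, (0, 0, 0), 0, -1)
```

After label_background, any voxel still labeled 0 is enclosed by the protein body and therefore belongs to a cavity.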


Fig. 2. Identification and visualization of a cage-shaped protein. (A) depicts the voxel classification of its 3D image and (B) illustrates the detected central cavity and the protein surface.

 
Let Pc be the voxel located at the protein centroid. If Pc is labeled 0, we choose Pc as the seed and find its locally connected region in the image. This region is the central cavity of the protein. The volume of the central cavity can be estimated efficiently by counting the voxels inside the cavity. Similarly, the protein volume can be computed by finding all of the locally connected voxels that are labeled 1. We then compute the ratio of the volume of the central cavity to that of the protein, denoted by R. If R exceeds a threshold T (for example, T = 0.08), the protein is reported as having a notable cage-shaped feature. The threshold T can be set by users in CSPro, which makes it possible to screen proteins with different R values.
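The ratio test can be sketched in Python as follows. This is an illustration under stated assumptions, not the CSPro code: the names region, cage_ratio, is_cage and hollow_box are hypothetical, the grid is assumed to be already labeled (-1 background, 0 cavity, 1 protein), connectivity is assumed to be 6-connected, and the protein volume is simplified to the count of all voxels labeled 1.

```python
from collections import deque

# Assumption: 6-connectivity between voxels.
STEPS = ((1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1))


def region(grid, seed, label):
    """Return the set of connected voxels with value `label` reachable from seed."""
    n = len(grid)
    if grid[seed[0]][seed[1]][seed[2]] != label:
        return set()
    seen, queue = {seed}, deque([seed])
    while queue:
        i, j, k = queue.popleft()
        for di, dj, dk in STEPS:
            p = (i + di, j + dj, k + dk)
            if (all(0 <= c < n for c in p) and p not in seen
                    and grid[p[0]][p[1]][p[2]] == label):
                seen.add(p)
                queue.append(p)
    return seen


def cage_ratio(grid):
    """R = central-cavity voxels / protein voxels; 0.0 if Pc is not labeled 0."""
    n = len(grid)
    body = [(i, j, k) for i in range(n) for j in range(n) for k in range(n)
            if grid[i][j][k] == 1]
    if not body:
        return 0.0
    # Pc: the voxel nearest to the centroid of the protein voxels.
    pc = tuple(int(round(sum(p[a] for p in body) / len(body))) for a in range(3))
    return len(region(grid, pc, 0)) / len(body)


def is_cage(grid, threshold=0.08):
    """Apply the threshold test R > T, with T = 0.08 as the example value."""
    return cage_ratio(grid) > threshold


def hollow_box(n=7):
    """Toy labeled image: a one-voxel-thick cubic shell around an empty core."""
    grid = [[[-1] * n for _ in range(n)] for _ in range(n)]
    for i in range(1, n - 1):
        for j in range(1, n - 1):
            for k in range(1, n - 1):
                core = all(2 <= c <= n - 3 for c in (i, j, k))
                grid[i][j][k] = 0 if core else 1
    return grid
```

For the toy shell with n = 7, the core holds 27 voxels and the shell 98, so R = 27/98, which is well above the example threshold of 0.08.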

According to the theorem on discrete multidimensional Jordan surfaces, a closed surface separates the discrete space into two parts, a connected inside and a connected outside (Herman, 1992). There are two Jordan surfaces in Figure 2B. One is the outer surface, the boundary between the protein and the background; the other is the inner surface, the boundary between the central cavity and the protein. By rendering the outer and inner surfaces simultaneously, we can show the cage-shaped feature clearly (Fig. 1C and E), where the blue portion shows the large central cavity and the green portion represents the outer surface of the protein. The shape, which is closely linked to function, is now visible. However, some cage-shaped proteins have tunnels penetrating the outer surface (such as PDB ID 1fnt) or a deep hole on the protein surface (such as PDB ID 1svt). In these cases, the inner surface disappears (see the Supplementary Figures), and we show only the outer surface of the detected protein.
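As a rough illustration of the two boundaries, the Python sketch below collects the protein voxels that touch the background (outer surface) and those that touch the cavity (inner surface). It is an approximation under stated assumptions: the names surfaces and thick_shell are hypothetical, the boundaries are represented by boundary voxels rather than by the voxel faces of a true discrete Jordan surface, connectivity is 6-connected, and every voxel labeled 0 is assumed to belong to the central cavity.

```python
# Assumption: 6-connectivity between voxels.
STEPS = ((1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1))


def surfaces(grid):
    """Return (outer, inner) sets of protein voxels bordering background / cavity.

    Labels: -1 background, 0 cavity, 1 protein. Positions outside the grid
    are treated as background.
    """
    n = len(grid)
    outer, inner = set(), set()
    for i in range(n):
        for j in range(n):
            for k in range(n):
                if grid[i][j][k] != 1:
                    continue
                for di, dj, dk in STEPS:
                    a, b, c = i + di, j + dj, k + dk
                    inside = 0 <= a < n and 0 <= b < n and 0 <= c < n
                    value = grid[a][b][c] if inside else -1
                    if value == -1:
                        outer.add((i, j, k))
                    elif value == 0:
                        inner.add((i, j, k))
    return outer, inner


def thick_shell(n=9):
    """Toy labeled image: a two-voxel-thick cubic shell around an empty core."""
    grid = [[[-1] * n for _ in range(n)] for _ in range(n)]
    for i in range(1, n - 1):
        for j in range(1, n - 1):
            for k in range(1, n - 1):
                core = all(3 <= c <= n - 4 for c in (i, j, k))
                grid[i][j][k] = 0 if core else 1
    return grid


outer, inner = surfaces(thick_shell())
print(len(outer), len(inner))
```

If a tunnel connects the core to the outside, the core voxels would be labeled -1 by the background fill, so the inner set would be empty and only the outer surface would remain to be drawn.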

CSPro runs at interactive rates. We conducted experiments on a PC with a 3.00 GHz Pentium-4 CPU and 2 GB of main memory. Identifying and displaying a cage-shaped protein takes <1 s at a 3D image resolution of 64 x 64 x 64. Using CSPro, we searched the structural data of all proteins listed in the PDB (see http://www.rcsb.org/pdb) released by the end of 2006. Three types of quaternary structures were retrieved with notable cage-shaped features. The volume (V) of the delineated cages and the R values were evaluated. Some representative retrieved proteins are listed in Table 1. The maximum volume ratio, found for ferritin, reaches 31%. The beneficial properties of cage-shaped proteins, including their large spatial capacity and the stability of their quaternary structure, make them attractive candidates as carriers of pharmacological molecules.


 
Table 1. Cage-shaped proteins

 
This article makes two contributions: (1) an efficient algorithm for automatic identification of cage-shaped protein complexes and (2) a fast visualization tool for displaying the cage-shaped feature of the quaternary structure.

CSPro supplies a complementary means for analyzing and visualizing the structural features of complicated protein complexes. It can be used for validation, for example to check whether a synthetic protein has a large hollow core inside. We believe that CSPro will assist in finding more cage-shaped proteins in a wide range of species.


    ACKNOWLEDGEMENT
 
This work was supported by the NSFC Key Project under grant No. 60533050 and the NSFZJC project under grant No. R304098.

Conflict of Interest: none declared.


    FOOTNOTES
 
Associate Editor: Anna Tramontano

Received on June 24, 2007; revised on September 4, 2007; accepted on September 17, 2007

    REFERENCES
 

    Berman HM, et al. The Protein data bank. Nucl. Acids Res (2000) 28:235–242.

    Can T, et al. Efficient molecular surface generation using level-set methods. J. Mol. Graph. Model (2006) 25:442–454.

    Ellis RJ. Protein folding inside the cage. Nature (2006) 442:360–362.

    Herman GT. Discrete multidimensional Jordan surfaces. CVGIP-Graph. Model. Image process (1992) 54:507–515.

    Ho CMW, Marshall GR. Cavity search: an algorithm for the isolation and display of cavity-like binding regions. J. Comput. Aided. Mol. Des (1990) 4:337–354.

    Humphrey W, et al. VMD – Visual Molecular Dynamics. J. Mol. Graph (1996) 14:33–38.

    Ilari A, et al. The dodecameric ferritin from Listeria innocua contains a novel intersubunit iron-binding site. Nat. Struct. Biol (2000) 7:38–43.

    Kong TY, Rosenfeld A. Digital topology: introduction and survey. Comput. Vision Graph (1989) 48:357–393.

    Kraulis PJ. MOLSCRIPT: a program to produce both detailed and schematic plots of protein structures. J. Appl. Cryst (1991) 24:946–950.

    Kronheimer EH. The topology of digital images. Topol. Appl (1992) 46:279–303.

    Lee B, Richards FM. The interpretation of protein structures: estimation of static accessibility. J. Mol. Biol (1971) 55:379–400.

    Liang J, et al. Analytical shape computation of macromolecules: I. molecular area and volume through alpha shape. Proteins (1998) 33:1–17.

    Meng L, et al. Epoxomicin, a potent and selective proteasome inhibitor, exhibits in vivo antiinflammatory activity. Proc. Natl Acad. Sci. USA (1999) 96:10403–10408.

    Nicholls A, Honig B. A rapid finite difference algorithm, utilising successive over relaxation to solve the Poisson–Boltzmann equation. J. Comput. Chem (1991) 12:435–445.

    Stouch TR, Jurs PC. A simple method for the representation, quantification, and comparison of the volumes and shapes of chemical compounds. J. Chem. Inf. Comput. Sci (1986) 26:4–12.

    Tang Y-C, Chang H-C. Structural features of the GroEL-GroES nano-cage required for rapid folding of encapsulated protein. Cell (2006) 125:903–914.

    Tsukamoto R, et al. Synthesis of Co3O4 nanoparticles using the cage-shaped protein, Apoferritin. Bull. Chem. Soc. Jpn (2005) 78:2075–2081.

    Yang XE, et al. Iron oxidation and hydrolysis reactions of a novel ferritin from Listeria innocua. Biochem. J (2000) 349:783–786.

