Bioinformatics Advance Access originally published online on June 30, 2005
Bioinformatics 2005 21(17):3570-3571; doi:10.1093/bioinformatics/bti561

© The Author 2005. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions{at}oupjournals.org

P-cats: prediction of catalytic residues in proteins from their tertiary structures

Kengo Kinoshita 1,2,* and Motonori Ota 3

1 Institute of Medical Science, University of Tokyo, Shirokanedai, Minato-ku, Tokyo 108-8639, Japan
2 SORST, JST, Honcho, Kawaguchi, Saitama 332-0012, Japan
3 Global Scientific Information and Computing Center, Tokyo Institute of Technology, O-okayama, Meguro-ku, Tokyo 152-8550, Japan

*To whom correspondence should be addressed.


    Abstract
 

Summary: P-cats is a web server that predicts the catalytic residues in proteins from the atomic coordinates. P-cats receives a coordinate file of the tertiary structure and sends out analytical results via e-mail. The reply contains a summary and two URLs to allow the user to examine the conserved residues: one for interactive images of the prediction results and the other for a graphical view of the multiple sequence alignment.

Availability: P-cats is freely available at http://p-cats.hgc.jp/p-cats

Contact: kino{at}ims.u-tokyo.ac.jp

Progress in several genome projects has yielded the genomic sequences of >200 prokaryotes and 20 eukaryotes. However, a significant number of the proteins encoded by these genomes lack functional annotations. To obtain clues to their functions, several structural genomics projects are currently being carried out to determine their structures. Based on the structural information, similarity searches against the structures in the Protein Data Bank (PDB) (Berman et al., 2000) are carried out in terms of protein folds, spatial configuration of atoms and the molecular surface of proteins (Handa et al., 2003; Hwang et al., 1999; Kinoshita and Nakamura, 2003). These methods, however, do not always produce successful results, because the relationship between protein structure and function is insufficiently known. Proteins of novel folds tend to be especially hypothetical, making further analyses extremely hard.

Even in these cases, prediction of functionally important residues may offer useful suggestions for the functions, which lead to experimental elucidation. Catalytic residues are often situated at conserved positions that are unfavorable for structural stability (Elcock, 2001; Ota et al., 1997; Shoichet et al., 1995) and are also likely to be found within holes, cavities and clefts on the protein surface (Laskowski et al., 1996). Making use of these preferences along with the sequence conservation (Casari et al., 1995; Zvelebil and Sternberg, 1988), we have developed a method to locate catalytic residues within proteins (Ota et al., 2003). Conserved residues are selected by conservation numbers and their local and spatial averages (Zvelebil and Sternberg, 1988) with the use of multiple sequence alignments (Higgins et al., 1996). These residues are classified as either catalytic or non-catalytic based on theoretical stability changes caused by all possible mutations, and the statistical preferences for geometrical features at the molecular surface of proteins, with the final decision being made using a simple k-nearest neighbor method (Ota et al., 2003). The stability change was estimated using a knowledge-based potential consisting of three terms, i.e. side-chain packing, hydration and local conformation (Ota et al., 1997). Using 98 catalytic residues annotated in the SWISS-PROT database (Boeckmann et al., 2003) among 49 proteins, the sensitivity and specificity were determined by maximizing the correlation coefficient between the correct answers and the outputs under the constraint that sensitivity should be >55%. The sensitivity and specificity achieved by our method are 56% and 27%, respectively, and are comparable to those produced by other methods [e.g. 66% sensitivity and 21% specificity (Aloy et al., 2001)], but are superior to those attained by conventional methods based only on sequence conservation (Ota et al., 2003; Zvelebil and Sternberg, 1988).
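The reported measures and the correlation coefficient can be computed from per-residue labels, as in the following minimal Python sketch. It assumes that 'specificity' is used here in the sense of the fraction of predicted residues that are truly catalytic (often called precision) and that the correlation coefficient is the Matthews correlation coefficient. Both are assumptions rather than statements from the text, and the function name and toy labels are hypothetical.

```python
import math


def evaluate(actual, predicted):
    """Compare per-residue labels (True = catalytic) with predictions.

    Returns sensitivity, precision and the Matthews correlation coefficient.
    Assumption: 'specificity' in the text corresponds to precision here.
    """
    if len(actual) != len(predicted):
        raise ValueError("label lists must have the same length")

    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a and p)          # catalytic, predicted
    fp = sum(1 for a, p in pairs if not a and p)      # non-catalytic, predicted
    fn = sum(1 for a, p in pairs if a and not p)      # catalytic, missed
    tn = sum(1 for a, p in pairs if not a and not p)  # non-catalytic, not predicted

    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"sensitivity": sensitivity, "precision": precision, "mcc": mcc}


if __name__ == "__main__":
    # Toy labels for eight residues (hypothetical, not data from the paper).
    actual = [True, True, False, False, False, True, False, False]
    predicted = [True, False, True, False, False, True, False, False]
    print(evaluate(actual, predicted))
```

Choosing a decision threshold "by maximizing the correlation coefficient under the constraint that sensitivity should be >55%" would then amount to calling such a function for each candidate threshold and keeping the one with the highest coefficient among those that satisfy the constraint.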

Our method has been implemented in the P-cats web server. Since details of the usage are described on the help page (http://p-cats.hgc.jp/p-cats/help.html), we only briefly outline the submission process here. In the first step, the server asks for a coordinate file of the structure in the PDB format and an e-mail address to which the results of the prediction will be sent. To retrieve the homologous sequences using BLAST, the user can select either SWISS-PROT or RefSeq (Pruitt et al., 2005) as the sequence database. We recommend that the user try SWISS-PROT first (default) because RefSeq contains a number of sequences of the ‘provisional’ curation level. The parameters selected in the first step are subsequently shown. On the same screen, the user should choose the chain identifier of the query molecule if the PDB file includes several chains. In the following step, pressing the ‘Execute BLAST’ button initiates the BLAST search and, after a short while, leads to a display of the results. If only a few homologous proteins are identified in SWISS-PROT, it may be prudent to switch to RefSeq, to relax the E-value threshold of the BLAST search from the default value (10 × 10^-10), or both. The user can do so by pressing the ‘turn back’ button of the browser twice. Finally, when an adequate number of homologs have been retrieved, clicking the ‘Submit’ button feeds the query structure and the homologous sequences into P-cats. It should be noted that if >150 homologs are identified by the BLAST search, P-cats automatically selects representative sequences using the BLASTCLUST program (Altschul et al., 1997), because a large number of sequences cannot be processed by the CLUSTAL-W multiple alignment program (Higgins et al., 1996) and the BOXSHADE alignment viewer (http://bioweb.pasteur.fr/seqanal/interfaces/boxshade.html). The other parameters used in the prediction have been optimized as described previously (Ota et al., 2003).
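The chain selection and the 150-homolog limit can be illustrated with a short Python sketch. It lists the chain identifiers found in a PDB-format file (the chain identifier occupies column 22 of ATOM and HETATM records) and reports whether a set of BLAST hits would exceed the limit that triggers the selection of representatives. The function names and the sample records are hypothetical and are not part of the P-cats server.

```python
MAX_HOMOLOGS = 150  # limit stated in the text; above it, representatives are selected


def chain_ids(pdb_text):
    """Return the chain identifiers of a PDB-format file in order of first appearance."""
    seen = []
    for line in pdb_text.splitlines():
        # Coordinate records start with ATOM or HETATM; the chain ID is column 22.
        if line.startswith(("ATOM  ", "HETATM")) and len(line) >= 22:
            chain = line[21]
            if chain not in seen:
                seen.append(chain)
    return seen


def needs_representatives(n_homologs, limit=MAX_HOMOLOGS):
    """True when the number of homologs exceeds the limit (>150 in the text)."""
    return n_homologs > limit


def atom_record(serial, name, res, chain, resseq):
    """Build the leading columns of a PDB ATOM record (illustrative helper only)."""
    return f"ATOM  {serial:5d} {name:<4s} {res:3s} {chain}{resseq:4d}"


if __name__ == "__main__":
    # Hypothetical two-chain fragment; coordinates are omitted for brevity.
    sample = "\n".join([
        atom_record(1, " N", "MET", "A", 1),
        atom_record(2, " CA", "MET", "A", 1),
        atom_record(3, " N", "GLY", "B", 1),
    ])
    print(chain_ids(sample))
    print(needs_representatives(151))
```

A file that yields more than one chain identifier corresponds to the case in which the user is asked to choose the chain of the query molecule.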

Upon completion of the calculation, P-cats sends the submitter an e-mail. The duration of the job depends largely on the number of other jobs and the size of the query protein. Although it takes <10 min to process a small protein of 200 residues in the absence of other jobs, it may require a few hours if the query protein is larger or when other jobs are running. The user can check the status of the P-cats queue upon submission. The e-mail response provides a summary of the prediction and two URLs that show the multiply aligned homologous sequences and a graphical representation of the predicted residues in the query structure. The spatial positions of the conserved and predicted residues can be seen at the indicated URL with Internet Explorer for Windows or with Safari for Macintosh OS X. The visual page is implemented with the latest version of the PDBjViewer (Kinoshita and Nakamura, 2004), which requires prior installation of Java and JOGL (http://www.pdbj.org/PDBjViewer/). In the PDBjViewer, we provide a ribbon model of the protein and a molecular surface with the electrostatic potential and hydrophobicity mapped onto it.


    Acknowledgments
 
The authors would like to thank Ken Nishikawa for discussions on the prediction scheme, Keiko Ichikawa for preparing the delightful feline drawings on the web pages, Haruki Nakamura for providing his program for electrostatic calculations and Keiichi Homma for a critical reading of the manuscript. This work was supported by grants-in-aid from the Ministry of Education, Culture, Sports, Science and Technology of Japan to K.K. and M.O.

Conflict of Interest: none declared.

Received on April 5, 2005; revised on June 20, 2005; accepted on June 27, 2005

    REFERENCES
 

    Aloy, P., et al. (2001) Automated structure-based prediction of functional sites in proteins: applications to assessing the validity of inheriting protein function from homology in genome annotation and to protein docking. J. Mol. Biol., 311, 395–408.

    Altschul, S.F., et al. (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res., 25, 3389–3402.

    Berman, H.M., et al. (2000) The Protein Data Bank. Nucleic Acids Res., 28, 235–242.

    Boeckmann, B., et al. (2003) The SWISS-PROT protein knowledgebase and its supplement TrEMBL in 2003. Nucleic Acids Res., 31, 365–370.

    Casari, G., et al. (1995) A method to predict functional residues in proteins. Nat. Struct. Biol., 2, 171–178.

    Elcock, A.H. (2001) Prediction of functionally important residues based solely on the computed energetics of protein structure. J. Mol. Biol., 312, 885–896.

    Handa, N., et al. (2003) Crystal structure of the conserved protein TT1542 from Thermus thermophilus HB8. Protein Sci., 12, 1621–1632.

    Higgins, D.G., et al. (1996) Using CLUSTAL for multiple sequence alignments. Methods Enzymol., 266, 383–402.

    Hwang, K.Y., et al. (1999) Structure-based identification of a novel NTPase from Methanococcus jannaschii. Nat. Struct. Biol., 6, 691–696.

    Kinoshita, K. and Nakamura, H. (2003) Protein informatics towards function identification. Curr. Opin. Struct. Biol., 13, 396–400.

    Kinoshita, K. and Nakamura, H. (2004) eF-site and PDBjViewer: database and viewer for protein functional sites. Bioinformatics, 20, 1329–1330.

    Laskowski, R.A., et al. (1996) Protein clefts in molecular recognition and function. Protein Sci., 5, 2438–2452.

    Ota, M., et al. (1997) Structural requirement of highly-conserved residues in globins. FEBS Lett., 415, 129–133.

    Ota, M., et al. (2001) Knowledge-based potential defined for a rotamer library to design protein sequences. Protein Eng., 14, 557–564.

    Ota, M., et al. (2003) Prediction of catalytic residues in enzymes based on known tertiary structure, stability profile, and sequence conservation. J. Mol. Biol., 327, 1053–1064.

    Pruitt, K.D., et al. (2005) NCBI Reference Sequence (RefSeq): a curated non-redundant sequence database of genomes, transcripts and proteins. Nucleic Acids Res., 33, D501–D504.

    Shoichet, B.K., et al. (1995) A relationship between protein stability and protein function. Proc. Natl Acad. Sci. USA, 92, 452–456.

    Zvelebil, M.J. and Sternberg, M.J. (1988) Analysis and prediction of the location of catalytic residues in enzymes. Protein Eng., 2, 127–138.





