Bioinformatics Advance Access published online on August 12, 2004
Bioinformatics, doi:10.1093/bioinformatics/bth466
Bioinformatics © Oxford University Press 2004; all rights reserved
1 Gordon Life Science Institute, San Diego, CA 92130, USA
* To whom correspondence should be addressed. E-mail: kchou{at}san.rr.com.
Article
Using amphiphilic pseudo amino acid composition to predict enzyme subfamily classes
Revised July 20, 2004
Accepted August 2, 2004
Abstract
Motivation: With protein sequences entering databanks at an explosive pace, it is important to determine in a timely manner the family or subfamily class of a newly found enzyme, because this class is directly related to the specific target the enzyme acts on, as well as to its catalytic process and biological function. Unfortunately, doing so by experiments alone is both time-consuming and costly. In a previous study, the covariant-discriminant algorithm was introduced to identify the 16 subfamily classes of oxidoreductases. Although the results were quite encouraging, the entire prediction process was based on the amino acid composition alone, without any sequence-order information. The problem is therefore worthy of further investigation.
Results: To incorporate sequence-order effects into the predictor, the "amphiphilic pseudo amino acid composition" is introduced to represent the statistical sample of a protein. The novel representation contains 20 + 2λ discrete numbers: the first 20 numbers are the components of the conventional amino acid composition; the next 2λ numbers are a set of correlation factors that reflect different hydrophobicity and hydrophilicity distribution patterns along a protein chain. Based on this concept and formulation scheme, a new predictor is developed. The self-consistency test, jackknife test, and independent dataset test all show that the success rates obtained by the new predictor are significantly higher than those of the previous predictors. The significant enhancement in success rates also implies that the distribution of hydrophobicity and hydrophilicity of the amino acid residues along a protein chain plays a very important role in its structure and function.
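The representation described in the abstract can be illustrated with a minimal sketch. The abstract does not give the formulas, so everything below is an assumption made for illustration: the correlation factors are taken as averaged products of standardized hydrophobicity and hydrophilicity values for residues k positions apart, the weight `weight` and the default `lam` are arbitrary, and the Kyte-Doolittle and Hopp-Woods scales are stand-ins that may differ from the scales used in the paper. The function name `amphiphilic_pseaac` is hypothetical.

```python
from collections import Counter
from statistics import mean, pstdev

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Stand-in scales (assumption): Kyte-Doolittle hydropathy and
# Hopp-Woods hydrophilicity. The paper may use different scales.
HYDROPHOBICITY = {
    "A": 1.8, "C": 2.5, "D": -3.5, "E": -3.5, "F": 2.8,
    "G": -0.4, "H": -3.2, "I": 4.5, "K": -3.9, "L": 3.8,
    "M": 1.9, "N": -3.5, "P": -1.6, "Q": -3.5, "R": -4.5,
    "S": -0.8, "T": -0.7, "V": 4.2, "W": -0.9, "Y": -1.3,
}
HYDROPHILICITY = {
    "A": -0.5, "C": -1.0, "D": 3.0, "E": 3.0, "F": -2.5,
    "G": 0.0, "H": -0.5, "I": -1.8, "K": 3.0, "L": -1.8,
    "M": -1.3, "N": 0.2, "P": 0.0, "Q": 0.2, "R": 3.0,
    "S": 0.3, "T": -0.4, "V": -1.5, "W": -3.4, "Y": -2.3,
}


def standardize(scale):
    """Shift and scale a property to zero mean and unit variance over the 20 residues."""
    values = [scale[a] for a in AMINO_ACIDS]
    mu, sigma = mean(values), pstdev(values)
    return {a: (scale[a] - mu) / sigma for a in AMINO_ACIDS}


def amphiphilic_pseaac(sequence, lam=3, weight=0.5):
    """Return a vector of 20 + 2*lam numbers for a protein sequence.

    The first 20 entries are the (normalized) amino acid composition; the
    next 2*lam entries are weighted correlation factors, alternating
    hydrophobicity and hydrophilicity for each gap k = 1..lam.
    """
    sequence = sequence.upper()
    unknown = set(sequence) - set(AMINO_ACIDS)
    if unknown:
        raise ValueError(f"unsupported residues: {sorted(unknown)}")
    n = len(sequence)
    if lam < 1 or lam >= n:
        raise ValueError("lam must satisfy 1 <= lam < len(sequence)")

    h1 = standardize(HYDROPHOBICITY)
    h2 = standardize(HYDROPHILICITY)

    counts = Counter(sequence)
    freqs = [counts[a] / n for a in AMINO_ACIDS]

    taus = []
    for k in range(1, lam + 1):
        pairs = [(sequence[i], sequence[i + k]) for i in range(n - k)]
        taus.append(sum(h1[a] * h1[b] for a, b in pairs) / (n - k))
        taus.append(sum(h2[a] * h2[b] for a, b in pairs) / (n - k))

    # Joint normalization so the whole vector sums to 1 (assumed form).
    denom = sum(freqs) + weight * sum(taus)
    return [f / denom for f in freqs] + [weight * t / denom for t in taus]


if __name__ == "__main__":
    vector = amphiphilic_pseaac("MKTAYIAKQRQISFVKSHFSRQ", lam=3)
    print(len(vector))  # 20 composition terms plus 2 * 3 correlation factors
```

With `lam=3` the vector has 20 + 2 × 3 = 26 components. A fixed-length vector of this kind is what lets sequences of different lengths be fed to a single classifier.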