Bioinformatics Advance Access published online on May 3, 2006
Bioinformatics, doi:10.1093/bioinformatics/btl170
1 Institute of Image Processing & Pattern Recognition, Shanghai Jiaotong University, Shanghai 200030, China
Motivation: Predicting protein folding patterns is one level deeper than predicting protein structural classes, and hence is much more complicated and difficult. To deal with this challenging problem, an ensemble classifier was introduced. It was formed from a set of basic classifiers, each trained in a different parameter system, such as predicted secondary structure, hydrophobicity, van der Waals volume, polarity, polarizability, as well as different dimensions of pseudo amino acid composition, all extracted from a training dataset. The operation engine for the constituent individual classifiers was the OET-KNN (Optimized Evidence-Theoretic K-Nearest Neighbors) rule. Their outcomes were combined through weighted voting to give a final determination for classifying a query protein. The recognition task was to find the true fold among the 27 possible patterns.

Results: The overall success rate thus obtained was 62% for a testing dataset in which most of the proteins have less than 25% sequence identity with the proteins used in training the classifier. This rate is 6-21% higher than the corresponding rates obtained by various existing NN (Neural Networks) and SVM (Support Vector Machines) approaches, implying that the ensemble classifier is very promising and might become a useful vehicle in protein science, as well as in proteomics and bioinformatics. The ensemble classifier, called PFP-Pred, is available as a web-server at http://www.pami.sjtu.edu.cn/people/hbshen for public usage.
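The weighted-voting step described above can be illustrated with a minimal sketch. It is not the PFP-Pred implementation: the function name `weighted_vote`, the example labels and weights, and the tie-breaking rule are all assumptions made for illustration, and the basic classifiers here are stand-ins for the OET-KNN classifiers rather than an implementation of them.

```python
from collections import defaultdict


def weighted_vote(predictions, weights):
    """Combine the fold labels predicted by several basic classifiers.

    predictions: one predicted fold label per basic classifier
    weights:     one non-negative weight per basic classifier
    Returns the label with the largest total weight.
    """
    if len(predictions) != len(weights):
        raise ValueError("predictions and weights must have the same length")
    if not predictions:
        raise ValueError("need at least one classifier output")

    # Accumulate the weight each candidate fold receives.
    scores = defaultdict(float)
    for label, weight in zip(predictions, weights):
        scores[label] += weight

    # Assumption: ties go to the smallest label so the result is deterministic.
    return min(scores, key=lambda label: (-scores[label], label))


# Hypothetical outputs of five basic classifiers for one query protein,
# with folds numbered 1..27, and hypothetical weights.
predicted_folds = [3, 7, 3, 12, 7]
classifier_weights = [0.5, 0.9, 0.3, 0.2, 0.4]
final_fold = weighted_vote(predicted_folds, classifier_weights)
```

In this example fold 3 collects a total weight of 0.8 and fold 7 collects 1.3, so fold 7 is chosen even though each is predicted by two classifiers. How the weights themselves are determined is not stated in the abstract.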
Received March 31, 2006
Revised April 26, 2006
Accepted April 27, 2006
Article
Ensemble classifier for protein fold pattern recognition
Hong-Bin Shen 1 and Kuo-Chen Chou 2
2 Institute of Image Processing & Pattern Recognition, Shanghai Jiaotong University, Shanghai 200030, China; Gordon Life Science Institute, San Diego, CA 92130, USA
Abstract
Associate Editor: Keith A Crandall



