Bioinformatics Vol. 17 no. 4 2001
Pages 349-358
© 2001 Oxford University Press


Original Paper

Multi-class protein fold recognition using support vector machines and neural networks

Chris H.Q. Ding * and Inna Dubchak

NERSC Division, Lawrence Berkeley National Laboratory, University of California, Berkeley, CA 94720, USA

Received on August 2, 2000; revised on November 4, 2000; accepted on November 16, 2000

Motivation: Protein fold recognition is an important approach to structure discovery without relying on sequence similarity. We studied this approach with new multi-class classification methods and examined many issues important for a practical recognition system.

Results: Most current discriminative methods for protein fold prediction use the one-against-others method, which has the well-known ‘False Positives’ problem. We investigated two new methods: the unique one-against-others and the all-against-all methods. Both improve prediction accuracy by 14–110% on a dataset containing 27 SCOP folds. We used the Support Vector Machine (SVM) and the Neural Network (NN) learning methods as base classifiers. SVMs converge fast and lead to high accuracy. When scores of multiple parameter datasets are combined, majority voting reduces noise and increases recognition accuracy. We examined many issues involved with a large number of classes, including the dependence of prediction accuracy on the number of folds and on the number of representatives in a fold. Overall, the recognition systems achieve 56% fold prediction accuracy on a protein test dataset, where most of the proteins have below 25% sequence identity with the proteins used in training.
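
The all-against-all idea can be illustrated with a minimal sketch: train one binary classifier for every pair of classes, let each cast a vote for a query, and predict the class with the most votes. The sketch makes several assumptions that are not taken from the paper: a nearest-centroid rule stands in for the SVM and NN base classifiers, the two-dimensional feature vectors and the labels `fold_A`, `fold_B`, `fold_C` are invented toy data, the function names are hypothetical, and resolving the pairwise outputs by vote counting is one common choice rather than a confirmed detail of the authors' procedure.

```python
from collections import Counter
from itertools import combinations


def centroid(points):
    """Component-wise mean of a list of equal-length feature vectors."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def train_pairwise(train):
    """Build one binary classifier per unordered pair of classes.

    `train` maps a class label to its list of feature vectors.
    Assumption: a nearest-centroid rule stands in for an SVM or NN.
    """
    centroids = {label: centroid(pts) for label, pts in train.items()}
    classifiers = {}
    for a, b in combinations(sorted(train), 2):
        ca, cb = centroids[a], centroids[b]
        # Default arguments bind the current pair inside the lambda.
        classifiers[(a, b)] = lambda x, a=a, b=b, ca=ca, cb=cb: (
            a if sq_dist(x, ca) <= sq_dist(x, cb) else b
        )
    return classifiers


def predict(classifiers, x):
    """Return the class that wins the most pairwise contests.

    Ties go to the class that received a vote first.
    """
    votes = Counter(clf(x) for clf in classifiers.values())
    return votes.most_common(1)[0][0]


# Toy data: three hypothetical folds with 2-D feature vectors.
train = {
    "fold_A": [[0.0, 0.0], [0.2, 0.1]],
    "fold_B": [[1.0, 1.0], [0.9, 1.2]],
    "fold_C": [[0.0, 1.0], [0.1, 0.9]],
}
classifiers = train_pairwise(train)
prediction = predict(classifiers, [0.95, 1.05])
print(len(classifiers), prediction)
```

With K classes this scheme needs K(K-1)/2 binary classifiers, so 3 for the toy data and 351 for 27 folds. Each classifier only has to separate two classes, which is a different trade-off from one-against-others, where a single classifier per class must reject all the remaining classes.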

Supplementary information: The protein parameter datasets used in this paper are available online (http://www.nersc.gov/~cding/protein).

Contact: chqding{at}lbl.gov; ildubchak{at}lbl.gov

* To whom correspondence should be addressed.

