Bioinformatics Vol. 18 no. 5 2002
Pages 689-696
© 2002 Oxford University Press
Support vector machines with selective kernel scaling for protein classification and identification of key amino acid positions
1 Argonne National Laboratory, 9700 S. Cass Avenue,
Argonne, IL 60439, USA
2 US Army Medical Research and Materiel Command,
504 Scott Street, Fort Detrick, MD 21702, USA
Received on May 25, 2001; revised on ; accepted on December 6, 2001
Motivation: Data that characterize primary and tertiary structures of proteins are now accumulating at a rapid and accelerating rate and require automated computational tools to extract critical information relating amino acid changes to the spectrum of functional attributes exhibited by a protein. We propose that immunoglobulin-type beta-domains, which are found in approximately 400 functionally distinct forms in humans alone, provide the immense genetic variation within limited conformational changes that might facilitate the development of new computational tools. As an initial step, we describe here an approach based on Support Vector Machine (SVM) technology to identify amino acid variations that contribute to the functional attribute of pathological self-assembly by some human antibody light chains produced during plasma cell diseases.
Results: We demonstrate that SVMs with selective kernel scaling are an
effective tool in discriminating between benign and pathologic
human immunoglobulin light chains. Initial results compare
favorably against manual classification performed by experts and
indicate the capability of SVMs to capture the underlying
structure of the data. The data set consists of 70 proteins of
human antibody κ1 light chains, each represented by
aligned sequences of 120 amino acids. We perform feature
selection based on a first-order adaptive scaling algorithm,
which confirms the importance of changes in certain amino acid
positions and identifies other positions that are key in the
characterization of protein function.
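To make the idea of kernel scaling concrete, the following is a minimal sketch of an RBF-style kernel on aligned sequences in which each position carries its own scale factor. It is an illustration only, not the authors' first-order adaptive scaling algorithm: the function name `scaled_rbf_kernel`, the `gamma` value, the mismatch-based distance, and the toy sequences are all assumptions introduced here. A position whose scale factor is driven to zero no longer affects the kernel, which is one way a scaling scheme can act as feature selection.

```python
import math

def scaled_rbf_kernel(seq_a, seq_b, scales, gamma=0.5):
    """RBF-style kernel on two aligned sequences, one scale factor per position.

    Each mismatching position j adds scales[j]**2 to the squared distance.
    This matches a one-hot encoding multiplied by scales[j], up to a constant
    factor that is absorbed into gamma (an assumption of this sketch).
    """
    if not (len(seq_a) == len(seq_b) == len(scales)):
        raise ValueError("sequences and scales must have the same length")
    sq_dist = sum(s * s for a, b, s in zip(seq_a, seq_b, scales) if a != b)
    return math.exp(-gamma * sq_dist)

# Toy aligned fragments (hypothetical, not taken from the paper's data set).
seq_1 = "DIQMT"
seq_2 = "DIVMS"  # differs from seq_1 at 0-based positions 2 and 4

uniform = [1.0] * 5                    # every position weighted equally
selective = [1.0, 1.0, 0.0, 1.0, 1.0]  # position 2 scaled out of the kernel

k_uniform = scaled_rbf_kernel(seq_1, seq_2, uniform)      # two mismatches count
k_selective = scaled_rbf_kernel(seq_1, seq_2, selective)  # only position 4 counts
```

With the selective scales, the two fragments look more similar because the mismatch at the scaled-out position is ignored. In a full pipeline, a kernel like this would be passed to an SVM trainer, and the scale factors would be tuned by whatever scaling procedure is in use.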
Contact: nelaz{at}ra.anl.gov, fstevens{at}anl.gov, jaques.reifman{at}amedd.army.mil
* To whom correspondence should be addressed.