Bioinformatics Advance Access originally published online on February 26, 2004
Bioinformatics 20(11) © Oxford University Press 2004; all rights reserved.
Protein homology detection using string alignment kernels
1 Bioinformatics Center, Institute for Chemical Research, Kyoto University, Uji, 611-0011, Japan and 2 Centre de Géostatistique, Ecole des Mines de Paris, 35 rue Saint-Honoré, Fontainebleau, 77300, France
Received on April 30, 2003; revised on December 9, 2003; accepted on January 8, 2004
Advance Access Publication February 26, 2004
Motivation: Remote homology detection between protein sequences is a central problem in computational biology. Discriminative methods involving support vector machines (SVMs) are currently the most effective methods for the problem of superfamily recognition in the Structural Classification Of Proteins (SCOP) database. The performance of SVMs depends critically on the kernel function used to quantify the similarity between sequences.
Results: We propose new kernels for strings adapted to biological sequences, which we call local alignment kernels. These kernels measure the similarity between two sequences by summing scores obtained from gapped local alignments of the two sequences. When tested in combination with SVMs on their ability to recognize SCOP superfamilies on a benchmark dataset, the new kernels outperform state-of-the-art methods for remote homology detection.
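To make the idea of summing over gapped local alignments concrete, the following is a minimal sketch in Python and not the authors' software. It assumes each alignment contributes exp(beta * score) to the sum, that gaps are scored with an affine penalty, and that residues are scored with a toy match/mismatch rule. The function name and all parameters (beta, match, mismatch, gap_open, gap_extend) are hypothetical choices for illustration; a real application would use an amino acid substitution matrix.

```python
import math


def local_alignment_kernel(x, y, beta=0.5, match=2.0, mismatch=-1.0,
                           gap_open=3.0, gap_extend=1.0):
    """Sum exp(beta * score) over all gapped local alignments of x and y.

    Illustrative assumptions: toy match/mismatch scoring and an affine gap
    penalty gap_open + gap_extend * (length - 1). The empty alignment
    contributes 1 to the sum.
    """
    n, m = len(x), len(y)
    open_w = math.exp(-beta * gap_open)      # weight for opening a gap
    ext_w = math.exp(-beta * gap_extend)     # weight for extending a gap

    def table():
        return [[0.0] * (m + 1) for _ in range(n + 1)]

    # M: alignments ending with x[i-1] aligned to y[j-1]
    # X, Y: alignments currently inside a gap (skipping letters of x or y)
    # X2, Y2: alignments already finished before position (i, j)
    M, X, Y, X2, Y2 = table(), table(), table(), table(), table()

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if x[i - 1] == y[j - 1] else mismatch
            M[i][j] = math.exp(beta * s) * (
                1.0 + X[i - 1][j - 1] + Y[i - 1][j - 1] + M[i - 1][j - 1])
            X[i][j] = open_w * M[i - 1][j] + ext_w * X[i - 1][j]
            Y[i][j] = (open_w * (M[i][j - 1] + X[i][j - 1])
                       + ext_w * Y[i][j - 1])
            X2[i][j] = M[i - 1][j] + X2[i - 1][j]
            Y2[i][j] = M[i][j - 1] + X2[i][j - 1] + Y2[i][j - 1]

    return 1.0 + X2[n][m] + Y2[n][m] + M[n][m]


if __name__ == "__main__":
    # Toy sequences, chosen only to show the call.
    print(local_alignment_kernel("HEAGAWGHEE", "PAWHEAE"))
```

Because every alignment contributes a positive term, the value grows quickly with sequence length, so in practice such a kernel would typically be normalized or rescaled before being passed to an SVM.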
Availability: Software and data available upon request.
Contact: Jean-Philippe.Vert@mines.org
* To whom correspondence should be addressed.