Skip Navigation



Bioinformatics Advance Access published online on July 12, 2006

Bioinformatics, doi:10.1093/bioinformatics/btl376
This Article
Right arrow Advance Access manuscript (PDF) Freely available
Right arrow All Versions of this Article:
22/18/2224    most recent
btl376v1
Right arrow Comments: Submit a response
Right arrow Alert me when this article is cited
Right arrow Alert me when Comments are posted
Right arrow Alert me if a correction is posted
Services
Right arrow Email this article to a friend
Right arrow Similar articles in this journal
Right arrow Similar articles in PubMed
Right arrow Alert me to new issues of the journal
Right arrow Add to My Personal Archive
Right arrow Download to citation manager
Right arrowRequest Permissions
Google Scholar
Right arrow Articles by Lingner, T.
Right arrow Articles by Meinicke, P.
Right arrow Search for Related Content
PubMed
Right arrow PubMed Citation
Right arrow Articles by Lingner, T.
Right arrow Articles by Meinicke, P.
Social Bookmarking
 Add to CiteULike   Add to Connotea   Add to Del.icio.us  
What's this?

© The Author (2006). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oxfordjournals.org
Received March 30, 2006
Revised June 20, 2006
Accepted July 5, 2006

Article

Remote homology detection based on oligomer distances

Thomas Lingner 1 * and Peter Meinicke 1

1 Abteilung Bioinformatik, Institut für Mikrobiologie und Genetik, Georg-August-Universität Göttingen, Goldschmidtstr. 1, 37077 Göttingen, Germany

* To whom correspondence should be addressed.
Thomas Lingner, E-mail: thomas{at}gobics.de


   Abstract

Motivation: Remote homology detection is among the most intensively researched problems in bioinformatics. Currently discriminative approaches, especially kernel-based methods, provide the most accurate results. However, kernel methods also show several drawbacks: in many cases prediction of new sequences is computationally expensive, often kernels lack an interpretable model for analysis of characteristic sequence features, and finally most approaches make use of so-called hyperparameters which complicate the application of methods across different data sets.

Results: We introduce a feature vector representation for protein sequences based on distances between short oligomers. The corresponding feature space arises from distance histograms for any possible pair of K-mers. Our distance-based approach shows important advantages in terms of computational speed while on common test data the prediction performance is highly competitive with state-of-the-art methods for protein remote homology detection. Furthermore the learnt model can easily be analyzed in terms of discriminative features and in contrast to other methods our representation does not require any tuning of kernel hyperparameters.

Availability: Normalized kernel matrices for the experimental setup can be downloaded at www.gobics.de/thomas. Matlab code for computing the kernel matrices is available upon request.


Associate Editor: Christos Ouzounis
Add to CiteULike CiteULike   Add to Connotea Connotea   Add to Del.icio.us Del.icio.us    What's this?


This article has been cited by other articles:


Home page
BioinformaticsHome page
T. Damoulas and M. A. Girolami
Probabilistic multi-class multi-kernel learning: on protein fold recognition and remote homology detection
Bioinformatics, May 15, 2008; 24(10): 1264 - 1270.
[Abstract] [Full Text] [PDF]


Home page
BioinformaticsHome page
A. R. Shah, C. S. Oehmen, and B.-J. Webb-Robertson
SVM-HUSTLE--an iterative semi-supervised machine learning approach for pairwise protein remote homology detection
Bioinformatics, March 15, 2008; 24(6): 783 - 790.
[Abstract] [Full Text] [PDF]


Home page
BioinformaticsHome page
S. Hochreiter, M. Heusel, and K. Obermayer
Fast model-based protein homology detection without alignment
Bioinformatics, July 15, 2007; 23(14): 1728 - 1736.
[Abstract] [Full Text] [PDF]



Disclaimer: Please note that abstracts for content published before 1996 were created through digital scanning and may therefore not exactly replicate the text of the original print issues. All efforts have been made to ensure accuracy, but the Publisher will not be held responsible for any remaining inaccuracies. If you require any further clarification, please contact our Customer Services Department.