

Bioinformatics Advance Access originally published online on February 22, 2008
Bioinformatics 2008 24(7):924-931; doi:10.1093/bioinformatics/btn069

© The Author 2008. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oxfordjournals.org

A comprehensive assessment of sequence-based and template-based methods for protein contact prediction

Sitao Wu and Yang Zhang *

Center for Bioinformatics and Department of Molecular Bioscience, University of Kansas, 2030 Becker Dr, Lawrence, KS 66047, USA

*To whom correspondence should be addressed.


Abstract

Motivation: Pair-wise residue-residue contacts in proteins can be predicted both from threading templates and by sequence-based machine learning. However, most structure modeling approaches use only the template-based contact predictions to guide the simulations, partly because sequence-based contact predictions are usually considered less accurate than those from threading. With the rapid progress in sequence databases and machine-learning techniques, a detailed and comprehensive assessment of contact-prediction methods under different template conditions is needed.

Results: We develop two methods for protein contact prediction: SVM-SEQ is a sequence-based machine-learning approach that trains a variety of sequence-derived features on contact maps; SVM-LOMETS collects consensus contact predictions from multiple threading templates. We test both methods on the same set of 554 proteins, which are categorized into ‘Easy’, ‘Medium’, ‘Hard’ and ‘Very Hard’ targets based on the evolutionary and structural distance between templates and targets. For the Easy and Medium targets, SVM-LOMETS clearly outperforms SVM-SEQ; but for the Hard and Very Hard targets, the accuracy of the SVM-SEQ predictions is higher than that of SVM-LOMETS by 12–25%. If we combine the SVM-SEQ and SVM-LOMETS predictions, the total number of correctly predicted contacts in the Hard proteins increases by more than 60% (or 70% for long-range contacts with a sequence separation ≥24) compared with SVM-LOMETS alone. The advantage of SVM-SEQ is also shown in the CASP7 free modeling targets, where SVM-SEQ is around four times more accurate than SVM-LOMETS in long-range contact prediction. These data demonstrate that state-of-the-art sequence-based contact prediction has reached a level that may help tertiary structure modeling for targets without close structural templates. The maximum yield should be obtained by combining sequence- and template-based predictions.
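
To make the evaluation quantities above concrete, the following is a minimal Python sketch, not the authors' code. It assumes contacts are represented as residue-index pairs, that accuracy means the fraction of predicted contacts that are also native contacts, and that combining two predictors means taking the union of their predicted pairs. All function names and the toy data are hypothetical; only the long-range threshold (sequence separation ≥24) comes from the text.

```python
from typing import Iterable, Set, Tuple

Contact = Tuple[int, int]


def normalize(pairs: Iterable[Contact]) -> Set[Contact]:
    """Store each contact as (i, j) with i < j and drop self-pairs."""
    return {(min(i, j), max(i, j)) for i, j in pairs if i != j}


def filter_by_separation(contacts: Set[Contact], min_sep: int = 24) -> Set[Contact]:
    """Keep contacts whose sequence separation j - i is at least min_sep."""
    return {(i, j) for i, j in contacts if j - i >= min_sep}


def accuracy(predicted: Set[Contact], native: Set[Contact]) -> float:
    """Assumed definition: correctly predicted contacts / all predicted contacts."""
    if not predicted:
        return 0.0
    return len(predicted & native) / len(predicted)


def combine(a: Set[Contact], b: Set[Contact]) -> Set[Contact]:
    """Assumed combination rule: plain union of the two prediction sets."""
    return a | b


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    native = normalize([(1, 30), (2, 40), (5, 60), (10, 12)])
    template_pred = normalize([(1, 30), (3, 50)])        # stands in for a template-based predictor
    sequence_pred = normalize([(2, 40), (5, 60), (7, 70)])  # stands in for a sequence-based predictor

    # Restrict everything to long-range contacts before scoring.
    native_lr = filter_by_separation(native)
    template_lr = filter_by_separation(template_pred)
    sequence_lr = filter_by_separation(sequence_pred)
    combined_lr = combine(template_lr, sequence_lr)

    for name, pred in [("template", template_lr),
                       ("sequence", sequence_lr),
                       ("combined", combined_lr)]:
        correct = len(pred & native_lr)
        print(f"{name}: {correct} correct, accuracy {accuracy(pred, native_lr):.2f}")
```

Under these assumptions, the gain from combining is measured as the increase in the number of correct contacts relative to the template-based set alone, which is the comparison reported for the Hard targets.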

Contact: yzhang{at}ku.edu

Supplementary information: Supplementary data are available at Bioinformatics online.

Associate Editor: Anna Tramontano


Received on December 13, 2007; revised on February 16, 2008; accepted on February 16, 2008





