Bioinformatics Advance Access published online on May 8, 2008
Bioinformatics, doi:10.1093/bioinformatics/btn222
HSEpred: predict Half-Sphere Exposure from protein sequences
1Bioinformatics Center, Institute for Chemical Research, Kyoto University, Uji, Kyoto 611-0011, Japan
2Caulfield School of Information Technology, Monash University, Caulfield, East VIC 3145, Australia
*To whom correspondence should be addressed: Dr. Jiangning Song and Tatsuya Akutsu. E-mail: sjn{at}kuicr.kyoto-u.ac.jp, takutsu{at}kuicr.kyoto-u.ac.jp
Abstract
Motivation: Half-Sphere Exposure (HSE) is a recently developed two-dimensional measure of solvent exposure. It conceptually divides the sphere around an amino acid in a protein structure into two half-spheres, which represent the residue's distinct spatial neighborhoods in the upward and downward directions. The resulting HSE-up and HSE-down measures perform better than other measures such as accessible surface area, residue depth and contact number. However, no method currently exists for predicting HSE measures from sequence data.
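As an illustration of the half-sphere idea, the following minimal sketch counts Cα neighbors on either side of the plane perpendicular to a residue's Cα-Cβ vector. It assumes that Cα and Cβ coordinates are already available for every residue (glycine would need a pseudo-Cβ); the function name `half_sphere_exposure` and the 13 Å radius are illustrative choices, not values taken from this abstract.

```python
import math


def half_sphere_exposure(ca, cb, radius=13.0):
    """Return a list of (hse_up, hse_down) counts, one pair per residue.

    ca, cb: lists of (x, y, z) tuples holding the C-alpha and C-beta
    coordinates of each residue (hypothetical input format).
    radius: sphere radius in angstroms (assumed default, see lead-in).
    """
    results = []
    for i, (ca_i, cb_i) in enumerate(zip(ca, cb)):
        # The C-alpha -> C-beta vector defines the "up" direction.
        direction = [b - a for a, b in zip(ca_i, cb_i)]
        up = down = 0
        for j, ca_j in enumerate(ca):
            if j == i:
                continue
            offset = [q - p for p, q in zip(ca_i, ca_j)]
            distance = math.sqrt(sum(c * c for c in offset))
            if distance > radius:
                continue  # outside the sphere, not a neighbor
            # A positive dot product means the neighbor lies in the
            # half-sphere on the side-chain side.
            dot = sum(d * o for d, o in zip(direction, offset))
            if dot > 0:
                up += 1
            else:
                down += 1
        results.append((up, down))
    return results
```

In this sketch the sum `up + down` for a residue is its Cα contact number within the same radius, since every neighbor inside the sphere is assigned to exactly one half-sphere.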
Results: In this article, we propose a novel approach that predicts the HSE measures from sequence and infers residue contact numbers from the predicted HSE values, using a carefully prepared non-homologous protein structure dataset. Specifically, we employ support vector regression to quantify the relationship between HSE measures and protein sequences, and we evaluate its prediction performance. We explore five sequence encoding schemes to examine their effects on prediction performance. Our method achieves correlation coefficients of 0.72 and 0.68 between the predicted and observed HSE-up and HSE-down measures, respectively. Moreover, contact number can be accurately predicted by summing the predicted HSE-up and HSE-down values, which broadens the applicability of the method. The successful application of support vector regression in this study suggests that it should be useful more generally for quantifying the protein sequence-structure relationship and for predicting structural property profiles from protein sequences.
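The evaluation described above can be sketched as follows, assuming that the reported correlation coefficient is the Pearson coefficient (the abstract does not specify which one). The helper names `pearson` and `contact_numbers` are hypothetical and are not part of HSEpred.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def contact_numbers(hse_up, hse_down):
    """Per-residue contact number as the sum of HSE-up and HSE-down."""
    return [u + d for u, d in zip(hse_up, hse_down)]
```

Given per-residue predictions `pred_up` and `pred_down` from any regressor, `pearson(pred_up, observed_up)` scores the HSE-up prediction, and `contact_numbers(pred_up, pred_down)` gives the derived contact number estimate.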
Availability: The prediction webserver and supplementary materials are accessible at http://sunflower.kuicr.kyoto-u.ac.jp/~sjn/hse/.
Contact: sjn{at}kuicr.kyoto-u.ac.jp; takutsu{at}kuicr.kyoto-u.ac.jp
Associate Editor: Prof. Anna Tramontano
Received on March 6, 2008; revised on April 20, 2008; accepted on May 3, 2008