Bioinformatics Vol. 19 Suppl. 1 2003
Pages i205-i211
© 2003 Oxford University Press
An ENSEMBLE machine learning approach for the prediction of all-alpha membrane proteins
Laboratory of Biocomputing, CIRB/Department of Biology, University of Bologna, via Irnerio 42, 40126 Bologna, Italy
Received on January 6, 2003; accepted on February 20, 2003
Motivation: All-alpha membrane proteins constitute a functionally relevant subset of the whole proteome. Estimates based on sequence comparison and specific predictive methods put them at about 10 to 30% of the proteins in a cell. Because few membrane proteins have been solved at atomic resolution, the training/testing sets of predictive methods for protein topography and topology routinely include very few well-solved structures mixed with a hundred proteins known only at low resolution. Moreover, available predictors fail to predict recently crystallised membrane proteins (Chen et al., 2002). At present, the set of well-solved membrane proteins comprises some 59 chains with low sequence homology. It is therefore possible to train and test predictors using only proteins known at atomic resolution, and to evaluate the performance of different methods more thoroughly.
Results: We implement a cascade neural network (NN), two different hidden Markov models (HMM), and their ensemble (ENSEMBLE) as a new method. We train and test the three methods and ENSEMBLE in cross-validation on the 59 well-resolved membrane proteins. ENSEMBLE achieves a per-protein accuracy of 90% for topography and 71% for topology, outperforming the best single method by 7 and 5 percentage points, respectively. When tested on a low-resolution set of 151 proteins with no homology to the 59 proteins, the per-protein accuracy of ENSEMBLE is 76% for topography and 68% for topology. Our results also indicate that ENSEMBLE performs better than the best predictors presently available on the Web.
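The abstract does not state how the three predictors are combined or how a protein is scored as correct, so the following is only an illustrative sketch under stated assumptions, not the method used in the paper. It assumes each predictor emits a per-residue label string (`T` for transmembrane, `L` for loop, both hypothetical labels), combines them by simple per-residue majority vote, and counts a protein's topography as correct when the predicted and observed segment counts agree and each predicted segment overlaps its observed counterpart. The function names and the `min_overlap` threshold are hypothetical.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine equal-length per-residue label strings by majority vote.

    ASSUMPTION: plain voting stands in for the (unspecified) ENSEMBLE rule.
    Ties are resolved in favour of the earliest predictor in the list.
    """
    length = len(predictions[0])
    if any(len(p) != length for p in predictions):
        raise ValueError("all predictions must have the same length")
    consensus = []
    for labels in zip(*predictions):
        counts = Counter(labels)
        top = max(counts.values())
        # First label, in predictor order, that reaches the top count.
        consensus.append(next(lab for lab in labels if counts[lab] == top))
    return "".join(consensus)


def segments(labels, state="T"):
    """Return half-open (start, end) intervals of consecutive `state` residues."""
    found, start = [], None
    for i, label in enumerate(labels):
        if label == state and start is None:
            start = i
        elif label != state and start is not None:
            found.append((start, i))
            start = None
    if start is not None:
        found.append((start, len(labels)))
    return found


def topography_correct(predicted, observed, min_overlap=1):
    """ASSUMED criterion: same segment count, each pair overlapping enough."""
    pred, obs = segments(predicted), segments(observed)
    if len(pred) != len(obs):
        return False
    return all(
        min(p_end, o_end) - max(p_start, o_start) >= min_overlap
        for (p_start, p_end), (o_start, o_end) in zip(pred, obs)
    )


def per_protein_accuracy(pairs):
    """Fraction of (predicted, observed) pairs with correct topography."""
    return sum(topography_correct(p, o) for p, o in pairs) / len(pairs)


if __name__ == "__main__":
    observed = "LLTTTTLL"
    consensus = majority_vote(["LLTTTTLL", "LTTTTLLL", "LLTTTLLL"])
    print(consensus)
    print(per_protein_accuracy([(consensus, observed), ("LLLLLLLL", observed)]))
```

Scoring topology in the same spirit would additionally require the predicted orientation of the loops (inside or outside) to match the observed one, which this sketch does not model.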
Contact: gigi@biocomp.unibo.it; http://www.biocomp.unibo.it
* To whom correspondence should be addressed.