Bioinformatics Vol. 18 no. 6 2002
Pages 788-801
© 2002 Oxford University Press
A Bayesian network model for protein fold and remote homologue recognition
1 Keck Graduate Institute of Applied Life Sciences,
535 Watson Drive, Claremont, CA 91711, USA
2 Gatsby Computational Neuroscience Unit,
University College London, Queen's Square, London WC1N 3AR, UK
3 Keck Graduate Institute of Applied Life Sciences,
535 Watson Drive, Claremont, CA 91711, USA
Received on December 1, 2000; revised on May 15, 2001 and December 12, 2001; accepted on December 18, 2001
Motivation: The Bayesian network approach is a framework that combines graphical representation with probability theory and includes hidden Markov models as a special case. Hidden Markov models trained on amino acid sequence or secondary structure data alone have shown potential for addressing the problem of protein fold and superfamily classification.
Results: This paper describes a novel implementation of a Bayesian network which simultaneously learns amino acid sequence, secondary structure and residue accessibility for proteins of known three-dimensional structure. An awareness of the errors inherent in predicted secondary structure may be incorporated into the model by means of a confusion matrix. Training and validation data have been derived for a number of protein superfamilies from the Structural Classification of Proteins (SCOP) database. Cross validation results using posterior probability classification demonstrate that the Bayesian network performs better in classifying proteins of known structural superfamily than a hidden Markov model trained on amino acid sequences alone.
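The two ideas named in the Results can be illustrated with a minimal sketch. It is not the authors' implementation, which the abstract does not specify. It assumes one trained model per superfamily that returns a log-likelihood for a query protein, and a confusion matrix whose rows give P(predicted label | true label) for secondary structure. All function names, superfamily names and numbers below are hypothetical.

```python
import math


def posterior_over_classes(log_likelihoods, log_priors=None):
    """Apply Bayes' rule over candidate class models, working in log space.

    log_likelihoods: dict mapping class name -> log P(query | class model).
    log_priors: optional dict of log P(class); uniform if omitted (an assumption).
    """
    names = list(log_likelihoods)
    if log_priors is None:
        log_priors = {n: -math.log(len(names)) for n in names}
    joint = {n: log_likelihoods[n] + log_priors[n] for n in names}
    # Log-sum-exp keeps the normalisation stable for very negative scores.
    m = max(joint.values())
    log_z = m + math.log(sum(math.exp(v - m) for v in joint.values()))
    return {n: math.exp(joint[n] - log_z) for n in names}


def classify(log_likelihoods, log_priors=None):
    """Return the class with the highest posterior, plus all posteriors."""
    post = posterior_over_classes(log_likelihoods, log_priors)
    best = max(post, key=post.get)
    return best, post


def predicted_emission(true_emission, confusion):
    """Fold prediction error into an emission distribution.

    true_emission[t] is P(true label t | state).
    confusion[t][p] is P(predicted label p | true label t).
    Returns P(predicted label p | state) = sum_t confusion[t][p] * true_emission[t].
    """
    predicted_labels = {p for row in confusion.values() for p in row}
    return {
        p: sum(true_emission[t] * confusion[t].get(p, 0.0) for t in true_emission)
        for p in predicted_labels
    }


# Hypothetical scores for one query under two superfamily models.
scores = {"superfamily_a": -100.0, "superfamily_b": -102.0}
best, post = classify(scores)

# Hypothetical confusion matrix over helix (H), strand (E) and coil (C).
confusion = {
    "H": {"H": 0.8, "E": 0.1, "C": 0.1},
    "E": {"H": 0.1, "E": 0.7, "C": 0.2},
    "C": {"H": 0.1, "E": 0.2, "C": 0.7},
}
emission = predicted_emission({"H": 1.0, "E": 0.0, "C": 0.0}, confusion)
```

With a uniform prior, the posterior depends only on the differences between the log-likelihoods, so a query scored two log units higher by one model is assigned to that model's superfamily. A state that emits only true helix yields the helix row of the confusion matrix as its distribution over predicted labels.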
Contact: alpan_raval{at}kgi.edu zoubin{at}gatsby.ucl.ac.uk david_wild{at}kgi.edu
* To whom correspondence should be addressed.