Bioinformatics Vol. 18 no. 90001 2002
Pages S5-S13
© 2002 Oxford University Press
Mining viral protease data to extract cleavage knowledge
School of Engineering and Computer Sciences (Bioinformatics Laboratory), Old Library, University of Exeter, Exeter EX4 4PT, UK
Received on January 19, 2002; revised on March 28, 2002; accepted on March 28, 2002
Motivation: The aim is to identify, through machine learning techniques, specific patterns in the amino acid residues of HIV and HCV viral polyproteins at the sites where viral protease cleaves the polyprotein as it leaves the ribosome. An understanding of viral protease specificity may help the development of future anti-viral drugs involving protease inhibitors by identifying specific features of protease activity for further experimental investigation. While viral sequence information is growing rapidly, there is still comparatively little understanding of how viral polyproteins are cut into their functional units. The work reported here investigates whether it is possible to generalise from known cleavage sites to unknown cleavage sites for two specific viruses, HIV and HCV. An understanding of proteolytic activity for specific viruses will contribute to our understanding of viral protease function in general, thereby leading to a greater understanding of protease families and their substrate characteristics.
Results: Our results show that artificial neural networks and symbolic learning techniques (See5) capture some fundamental and new substrate attributes, but neural networks outperform their symbolic counterpart.
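To make the learning task concrete, the following is a minimal sketch of one common way to present peptide windows to a neural network or decision-tree learner: each residue becomes a 20-bit indicator vector. The window width of 8, the function names, and the toy sequence are all assumptions made for illustration; the abstract does not state how the sequences were encoded.

```python
# Hypothetical illustration only: the window width and the one-hot encoding
# are assumptions, not details taken from the paper.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter codes


def one_hot_encode(peptide):
    """Map each residue to a 20-bit indicator vector and concatenate them."""
    vector = []
    for residue in peptide.upper():
        if residue not in AMINO_ACIDS:
            raise ValueError(f"unknown residue: {residue!r}")
        bits = [0] * len(AMINO_ACIDS)
        bits[AMINO_ACIDS.index(residue)] = 1
        vector.extend(bits)
    return vector


def sliding_windows(sequence, width=8):
    """Yield (start_index, window) for every window of the given width."""
    for start in range(len(sequence) - width + 1):
        yield start, sequence[start:start + width]


if __name__ == "__main__":
    toy_sequence = "ACDEFGHIKLMN"  # made-up sequence, not real viral data
    for start, window in sliding_windows(toy_sequence):
        features = one_hot_encode(window)
        # Each window yields width * 20 binary features.
        print(start, window, len(features))
```

Under these assumptions, every candidate window would be paired with a cleaved or not-cleaved label before being passed to a learner such as SNNS or See5.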
Availability: Publicly available software was used (Stuttgart Neural Network Simulator, http://www-ra.informatik.uni-tuebingen.de/SNNS/, and See5, http://www.rulequest.com). The datasets used (HIV, HCV) for See5 are available at: http://www.dcs.ex.ac.uk/~anarayan/bioinf/ismbdatasets/
Keywords: protease cleavage; protease inhibitors; machine learning; neural networks; decision trees.
Contact: a.narayanan{at}ex.ac.uk; wuxikun{at}yahoo.com; z.r.yang{at}ex.ac.uk