Bioinformatics Advance Access originally published online on October 29, 2008
Bioinformatics 2008 24(23):2691-2697; doi:10.1093/bioinformatics/btn538
Protease substrate site predictors derived from machine learning on multilevel substrate phage display data


1 Institute of Information Science, Academia Sinica, Taipei 115, 2 Institute of Bioinformatics, National Chiao Tung University, Hsin Chu 300, 3 Genomics Research Center, Academia Sinica, Taipei 115 and 4 Graduate Institute of Life Sciences, National Defense Medical University, Taipei 114, Taiwan
*To whom correspondence should be addressed.
Abstract
Motivation: Regulatory proteases modulate proteomic dynamics with a spectrum of specificities against substrate proteins. Predicting the substrate sites of these proteases across a proteome would facilitate understanding of their biological functions. High-throughput experiments can generate datasets suitable for machine learning to capture the complex relationships between substrate sequences and enzymatic specificities. However, the capability of predicting protease substrate sites by integrating machine learning algorithms with this experimental methodology has yet to be demonstrated.
Results: Factor Xa, a key regulatory protease in the blood coagulation system, was used as a model system, for which effective substrate site predictors were developed and benchmarked. The predictors were derived from bootstrap aggregation (machine learning) algorithms trained with data obtained from multilevel substrate phage display experiments. This combination of experimental sampling and computational learning of substrate specificities can be generalized to any protease whose active form is available for in vitro experiments.
Availability: http://asqa.iis.sinica.edu.tw/fXaWeb/
Contact: hsu{at}iis.sinica.edu.tw; yangas{at}gate.sinica.edu.tw
Supplementary information: Supplementary data are available at Bioinformatics online.
Associate Editor: Burkhard Rost
The authors wish it to be known that, in their opinion, the first two authors should be regarded as joint First Authors.
Received on August 27, 2008; revised on October 9, 2008; accepted on October 10, 2008
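As a rough illustration of the bootstrap aggregation idea named in the Results, the sketch below trains several simple scorers on resampled peptide sets and combines them by voting. It is not the authors' implementation. The base learner (a per-position log-odds scorer), the function names, the number of models and the four-residue toy peptides are all assumptions made for this example, and the peptides are invented rather than taken from the phage display data.

```python
import math
import random
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def train_position_scorer(peptides, labels, pseudocount=1.0):
    """Hypothetical base learner: per-position log-odds of each residue
    in cleaved (label 1) versus uncleaved (label 0) peptides."""
    length = len(peptides[0])
    pos = [Counter() for _ in range(length)]
    neg = [Counter() for _ in range(length)]
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    for pep, y in zip(peptides, labels):
        table = pos if y else neg
        for i, aa in enumerate(pep):
            table[i][aa] += 1
    denom_pos = n_pos + pseudocount * len(AMINO_ACIDS)
    denom_neg = n_neg + pseudocount * len(AMINO_ACIDS)
    weights = []
    for i in range(length):
        weights.append({
            aa: math.log(((pos[i][aa] + pseudocount) / denom_pos)
                         / ((neg[i][aa] + pseudocount) / denom_neg))
            for aa in AMINO_ACIDS
        })
    return weights


def score(weights, peptide):
    """Sum of per-position log-odds; a positive sum favours 'cleaved'."""
    return sum(w[aa] for w, aa in zip(weights, peptide))


def train_bagged(peptides, labels, n_models=25, seed=0):
    """Bootstrap aggregation: fit one base learner per resample
    drawn with replacement from the training set."""
    rng = random.Random(seed)
    n = len(peptides)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(train_position_scorer(
            [peptides[i] for i in idx], [labels[i] for i in idx]))
    return models


def predict(models, peptide):
    """Fraction of base learners that vote 'cleaved' for the peptide."""
    votes = sum(1 for m in models if score(m, peptide) > 0)
    return votes / len(models)


# Toy data, invented for illustration only (not experimental results).
cleaved = ["IEGR", "IDGR", "IEGR", "LEGR", "IDGR", "AEGR"]
uncleaved = ["KLMP", "WYTS", "GGAA", "PQNH", "RGEI", "TTVC"]
peptides = cleaved + uncleaved
labels = [1] * len(cleaved) + [0] * len(uncleaved)

models = train_bagged(peptides, labels)
print(predict(models, "IEGR"), predict(models, "KLMP"))
```

Averaging votes over resampled training sets is what makes this bagging: each base learner sees a slightly different sample, so the ensemble is less sensitive to any single peptide than one scorer trained on the full set.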