Bioinformatics Advance Access published online on May 6, 2009
Bioinformatics, doi:10.1093/bioinformatics/btp284
Approximate Bayesian feature selection on a large meta-dataset offers novel insights on factors that effect siRNA potency


1Department of Biochemistry, University of Oxford, Oxford, OX1 3QU, UK
2Department of Statistics, University of Oxford, Oxford, OX1 3TG, UK
*To whom correspondence should be addressed. Mr. Jochen Klingelhoefer, E-mail: jochen.klingelhoefer{at}dtc.ox.ac.uk
Abstract
Motivation: Short interfering RNA (siRNA)-induced RNA interference is an endogenous pathway for sequence-specific gene silencing. The potency with which different siRNAs inhibit a common target varies greatly, and the features affecting inhibition are of high current interest. The limited success reported so far in predicting siRNA potency could originate in the small number and heterogeneity of available datasets, as well as in the knowledge-driven, empirical basis on which features thought to affect siRNA potency are often chosen. We attempt to overcome these problems by first constructing a meta-dataset of 6483 publicly available siRNAs (targeting mammalian mRNA), the largest to date, and then applying a Bayesian analysis that accommodates feature set uncertainty. A stochastic logistic regression-based algorithm is designed to explore a vast model space of 497 compositional, structural and thermodynamic features, identifying associations with siRNA potency.
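As a rough illustration of what a stochastic, logistic regression-based search over feature subsets can look like, the sketch below runs a Metropolis-style walk over inclusion masks and reports how often each feature is included. It is not the authors' algorithm: the BIC approximation to the marginal likelihood, the uniform prior over models, the single-flip proposal, and the names `bic_score` and `stochastic_search` are all assumptions made for this example.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def linear(w, xi):
    """Linear predictor; w[0] is the intercept."""
    return w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))


def bic_score(X, y, steps=100, lr=0.5):
    """Fit logistic regression by gradient ascent and return a BIC-style
    approximation to the log marginal likelihood (assumed, not from the paper)."""
    n = len(y)
    p = len(X[0]) if X else 0
    w = [0.0] * (p + 1)
    for _ in range(steps):
        grad = [0.0] * (p + 1)
        for xi, yi in zip(X, y):
            r = yi - sigmoid(linear(w, xi))  # residual drives the gradient
            grad[0] += r
            for j, xj in enumerate(xi):
                grad[j + 1] += r * xj
        w = [wj + lr * g / n for wj, g in zip(w, grad)]
    loglik = 0.0
    for xi, yi in zip(X, y):
        q = min(max(sigmoid(linear(w, xi)), 1e-12), 1.0 - 1e-12)
        loglik += yi * math.log(q) + (1 - yi) * math.log(1.0 - q)
    return loglik - 0.5 * (p + 1) * math.log(n)


def stochastic_search(X, y, n_iter=100, seed=0):
    """Metropolis walk over feature subsets; returns inclusion frequencies."""
    rng = random.Random(seed)
    p = len(X[0])

    def score(mask):
        cols = [j for j in range(p) if mask[j]]
        return bic_score([[row[j] for j in cols] for row in X], y)

    included = [False] * p
    current = score(included)
    counts = [0] * p
    for _ in range(n_iter):
        j = rng.randrange(p)            # propose flipping one feature
        proposal = included[:]
        proposal[j] = not proposal[j]
        new = score(proposal)
        # Symmetric proposal and uniform model prior: accept on the score ratio.
        if math.log(max(rng.random(), 1e-300)) < new - current:
            included, current = proposal, new
        for k in range(p):
            counts[k] += included[k]
    return [c / n_iter for c in counts]
```

Calling `stochastic_search(X, y)` with `X` as a list of numeric feature rows and `y` as 0/1 potency labels returns one inclusion frequency per feature; features that are included in most visited models are the candidates for association with potency. A search over 497 features on 6483 siRNAs would need a far more efficient fitting routine than this pure-Python loop.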
Results: Our algorithm reveals a number of features associated with siRNA potency that are, to the best of our knowledge, either under-reported in the literature, such as the anti-sense 5'-3' motif 'UCU', or not reported at all, such as the anti-sense 5'-3' motif 'ACGA'. These findings should aid in improving future siRNA potency predictions and might offer further insights into the working of the RNA-induced silencing complex (RISC).
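To make the notion of a motif feature concrete, the following minimal sketch encodes the two motifs named above as binary indicators on an anti-sense strand read 5'-3'. It assumes position-independent presence anywhere in the strand, which the abstract does not specify; the function name `motif_features` and the example sequences are hypothetical.

```python
def motif_features(antisense, motifs=("UCU", "ACGA")):
    """Return a 0/1 indicator per motif for an anti-sense strand given 5'-3'.

    Assumption: a motif counts if it occurs anywhere in the strand.
    DNA-style input is accepted by mapping T to U.
    """
    seq = antisense.upper().replace("T", "U")
    return {m: int(m in seq) for m in motifs}


# Hypothetical 21-nt strands, used only to exercise the function.
print(motif_features("AUCUGGACGAUUCCAGGAAUU"))
print(motif_features("GGGGGAAAAACCCCCUUUUUU"))
```

Indicators of this kind could serve as compositional columns in a feature matrix alongside structural and thermodynamic features.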
Contact: cholmes{at}stats.ox.ac.uk
Supplementary Information: Supplementary data are available at Bioinformatics online and at the following web resource: http://portal.stats.ox.ac.uk/userdata/moutsian/siRNA/
Associate Editor: Dr. Alex Bateman
The authors wish it to be known that, in their opinion, the first two authors should be considered as joint First Authors.
Received on January 28, 2009; revised on April 19, 2009; accepted on April 22, 2009