Bioinformatics Advance Access published online on February 26, 2009
Bioinformatics, doi:10.1093/bioinformatics/btp110
A Robust Peak Detection Method for RNA Structure Inference by High-throughput Contact Mapping
School of Electrical Engineering, Korea University, Seoul 136-713, Korea
Department of Computer Science, Korea University, Seoul 136-713, Korea
Department of Statistics, Stanford University, Stanford, CA 94305, USA
Department of Electrical and Electronic Engineering, Yonsei University, Seoul 120-749, Korea
Departments of Biochemistry and Physics, Stanford University, Stanford, CA 94305, USA
*To whom correspondence should be addressed. Prof. Sungroh Yoon, E-mail: sryoon{at}korea.ac.kr
Abstract
Motivation: For high-throughput prediction of the helical arrangements of large RNA molecules, an innovative method termed multiplexed hydroxyl radical (·OH) cleavage analysis (MOHCA) has been proposed (Das et al., 2008). A key step in this promising technique is the accurate detection of peaks in noisy radioactivity profiles. Since manual peak finding is laborious and prone to error, an automated peak detection method is required to improve the accuracy and throughput of MOHCA. Existing methods were not applicable to MOHCA because of their high false positive rates.
Results: We developed a two-step computational method that can detect peaks from MOHCA profiles in a robust manner. The first step exploits an ensemble of linear and nonlinear signal processing techniques to find peak candidates. In the second step, a binary classifier trained with the characteristics of true and false peaks is used to eliminate false peaks from the candidates. We tested the proposed approach with 2002 MOHCA cleavage profiles and obtained median recall, precision, and F-measure values of 0.917, 0.750, and 0.830, respectively. Compared with the alternatives considered, the proposed method handled false peaks substantially better, resulting in 51.0–71.8% higher median values of precision and F-measure.
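For readers unfamiliar with the evaluation measures named above, the following is a minimal sketch of how recall, precision, and F-measure could be computed for a single profile. It is an illustration under stated assumptions, not the authors' evaluation code: the function name `peak_detection_scores`, the `tolerance` parameter, and the greedy one-to-one matching rule are all hypothetical, and the abstract does not say how detected peaks were matched to reference peaks. Because the reported medians are taken per measure across profiles, the median F-measure need not equal the harmonic mean of the median recall and median precision.

```python
def peak_detection_scores(detected, reference, tolerance=0):
    """Return (recall, precision, f_measure) for one profile.

    detected, reference: peak positions (e.g. profile indices).
    tolerance: assumed maximum distance at which a detected peak
    counts as a hit on a reference peak (hypothetical parameter).
    """
    unmatched = sorted(reference)
    true_positives = 0
    for position in sorted(detected):
        # Greedy matching: take the first still-unmatched reference
        # peak within the tolerance, so each one is counted once.
        match = next(
            (r for r in unmatched if abs(r - position) <= tolerance), None
        )
        if match is not None:
            unmatched.remove(match)
            true_positives += 1

    recall = true_positives / len(reference) if reference else 0.0
    precision = true_positives / len(detected) if detected else 0.0
    if recall + precision == 0:
        return recall, precision, 0.0
    # F-measure is the harmonic mean of precision and recall.
    f_measure = 2 * precision * recall / (precision + recall)
    return recall, precision, f_measure


# Made-up positions for illustration only (not data from the study).
reference_peaks = [10, 25, 40, 60]
detected_peaks = [10, 26, 41, 55, 80]
recall, precision, f_measure = peak_detection_scores(
    detected_peaks, reference_peaks, tolerance=1
)
print(recall, precision, f_measure)
```

In this toy case three of the four reference peaks are recovered and two of the five detections are false, so precision is lower than recall, which is the situation the second (classification) step is meant to improve.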
Availability: The software and supplemental data are available at http://dna.korea.ac.kr/pub/mohca.
Contact: sryoon{at}korea.ac.kr
Associate Editor: Prof. Anna Tramontano
Received on November 18, 2008; revised on January 30, 2009; accepted on February 22, 2009