Bioinformatics Advance Access published online on July 4, 2006
Bioinformatics, doi:10.1093/bioinformatics/btl355
1 Robert H. Lurie Comprehensive Cancer Center, Northwestern University, Chicago, IL, 60611, USA
* To whom correspondence should be addressed.
Motivation: A major problem for current peak detection algorithms is that noise in mass spectrometry (MS) spectra gives rise to a high rate of false positives. The false positive rate is especially problematic when detecting peaks with low amplitudes. Usually, various baseline correction algorithms and smoothing methods are applied before attempting peak detection. This approach is very sensitive to the amount of smoothing and the aggressiveness of the baseline correction, which makes peak detection results inconsistent between runs, instrumentation and analysis methods.

Results: Most peak detection algorithms simply identify peaks based on amplitude, ignoring the additional information present in the shape of the peaks in a spectrum. In our experience, true peaks have characteristic shapes, and a shape-matching function that yields a goodness-of-fit coefficient should provide a more robust peak identification method. Based on these observations, a Continuous Wavelet Transform (CWT)-based peak detection algorithm has been devised that identifies peaks with different scales and amplitudes. Transforming the spectrum into wavelet space simplifies the pattern-matching problem and additionally provides a powerful technique for identifying and separating the signal from spike noise and colored noise. This transformation, together with the additional information provided by the 2-D CWT coefficients, can greatly enhance the effective Signal-to-Noise Ratio (SNR). Furthermore, with this technique no baseline removal or peak smoothing preprocessing steps are required before peak detection, which improves the robustness of peak detection under a variety of conditions. The algorithm was evaluated with SELDI-TOF spectra with known polypeptide positions. Comparisons with two other popular algorithms were performed. The results show that the CWT-based algorithm can identify both strong and weak peaks while keeping the false positive rate low.
Availability: The algorithm is implemented in R and will be included as an open source module in the Bioconductor project. Supplementary material: http://basic.northwestern.edu/publications/peakdetection/.
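As an illustration of the idea in the abstract (matching peak shape in wavelet space instead of thresholding raw amplitude), here is a minimal sketch in Python. It is not the authors' R implementation: the Mexican hat mother wavelet, the edge handling, the scale set, the `min_coef` threshold, and the synthetic spectrum are all assumptions chosen for this example, and the function names are hypothetical.

```python
import math


def mexican_hat(t):
    """Mexican hat mother wavelet (normalizing constant omitted; assumed choice)."""
    return (1.0 - t * t) * math.exp(-0.5 * t * t)


def cwt_row(signal, scale):
    """CWT coefficients of `signal` at one scale, by direct convolution."""
    half = int(5 * scale)  # the wavelet is negligible beyond ~5 scale units
    kernel = [mexican_hat(k / scale) / math.sqrt(scale)
              for k in range(-half, half + 1)]
    n = len(signal)
    row = []
    for i in range(n):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = min(max(i + j - half, 0), n - 1)  # replicate edge samples
            acc += signal[idx] * w
        row.append(acc)
    return row


def detect_peaks(signal, scales, min_coef):
    """Return indices where the best coefficient over all scales is a local
    maximum and at least `min_coef` (a hypothetical, hand-tuned threshold)."""
    rows = [cwt_row(signal, s) for s in scales]
    best = [max(column) for column in zip(*rows)]
    return [i for i in range(1, len(best) - 1)
            if best[i] >= min_coef
            and best[i] > best[i - 1]
            and best[i] > best[i + 1]]


def synthetic_spectrum(n=400):
    """Made-up test data: decaying baseline, two Gaussian peaks, one spike."""
    spectrum = []
    for i in range(n):
        baseline = 5.0 * math.exp(-i / 300.0)
        strong = 10.0 * math.exp(-((i - 120) ** 2) / (2 * 4.0 ** 2))
        weak = 3.0 * math.exp(-((i - 280) ** 2) / (2 * 6.0 ** 2))
        spike = 4.0 if i == 200 else 0.0  # single-sample spike noise
        spectrum.append(baseline + strong + weak + spike)
    return spectrum


spectrum = synthetic_spectrum()
peaks = detect_peaks(spectrum, scales=[2, 4, 6, 8], min_coef=4.0)
print(peaks)
```

On this synthetic input the sketch reports the two Gaussian peaks (indices 120 and 280) and skips the spike at index 200, even though the spike is taller than the weak peak. Because the wavelet has zero mean, the slowly varying baseline contributes almost nothing to the coefficients, so no separate baseline removal step is needed here. The fixed threshold stands in for the SNR criterion mentioned in the abstract and would need a data-driven noise estimate on real spectra.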
Received April 24, 2006
Revised June 22, 2006
Accepted June 23, 2006
Article
Improved peak detection in mass spectrum by incorporating continuous wavelet transform-based pattern matching
Pan Du 1, Warren A. Kibbe 1 and Simon M. Lin 1 *
Simon M. Lin, E-mail: s-lin2{at}northwestern.edu
Abstract
Associate Editor: Chris Stoeckert
