Bioinformatics 20(Suppl. 1) © Oxford University Press 2004; all rights reserved.
Automatic Quality Assessment of Peptide Tandem Mass Spectra
1 Palo Alto Research Center, 3333 Coyote Hill Road, Palo Alto, CA 94304, USA and 2 The Scripps Research Institute, 10440 North Torrey Pines Road, La Jolla, CA 92037, USA
Received on January 15, 2004; accepted on March 1, 2004
Motivation: A powerful proteomics methodology couples high-performance liquid chromatography (HPLC) with tandem mass spectrometry and database-search software such as SEQUEST. Such a set-up, however, produces a large number of spectra, many of which are too poor in quality to be useful. Hence a filter that eliminates poor spectra before the database search can significantly improve throughput and robustness. Moreover, spectra that are judged to be of high quality but cannot be identified by database search are prime candidates for still more computationally intensive methods, such as de novo sequencing or wider database searches that include post-translational modifications.
Results: We report on two different approaches to assessing spectral quality prior to identification: binary classification, which predicts whether or not SEQUEST will be able to make an identification, and statistical regression, which predicts a more universal quality metric involving the number of b- and y-ion peaks. The best of our binary classifiers can eliminate over 75% of the unidentifiable spectra while losing only 10% of the identifiable spectra. Statistical regression can pick out spectra of modified peptides that can be identified by a de novo program but not by SEQUEST. In a section of independent interest, we discuss intensity normalization of mass spectra.
Contact: goldberg{at}parc.com
* To whom correspondence should be addressed.
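The trade-off quoted in the Results (discarding most unidentifiable spectra while losing a fixed fraction of identifiable ones) can be illustrated with a small sketch. It assumes each spectrum has already been given a scalar quality score by some classifier; the abstract does not specify the classifier. The function names, the 90% retention target as a parameter, and the toy scores are hypothetical and are not taken from the paper.

```python
import math

def threshold_for_retention(scores, identifiable, retain=0.90):
    """Largest threshold that keeps at least `retain` of the identifiable
    spectra, where a spectrum is kept if its score >= threshold."""
    good = sorted((s for s, y in zip(scores, identifiable) if y), reverse=True)
    if not good:
        raise ValueError("need at least one identifiable spectrum")
    # Number of identifiable spectra that must survive the filter;
    # the small epsilon guards against floating-point round-up.
    k = max(1, math.ceil(retain * len(good) - 1e-9))
    return good[k - 1]

def filter_rates(scores, identifiable, threshold):
    """Return (fraction of identifiable kept, fraction of unidentifiable removed)."""
    n_good = sum(1 for y in identifiable if y)
    n_bad = len(identifiable) - n_good
    kept_good = sum(1 for s, y in zip(scores, identifiable) if y and s >= threshold)
    removed_bad = sum(1 for s, y in zip(scores, identifiable) if not y and s < threshold)
    return kept_good / n_good, removed_bad / n_bad

# Toy data (made up for illustration): scores for 10 identifiable
# and 8 unidentifiable spectra.
good_scores = [0.9, 0.8, 0.85, 0.7, 0.95, 0.6, 0.75, 0.88, 0.92, 0.3]
bad_scores = [0.1, 0.2, 0.65, 0.4, 0.5, 0.15, 0.7, 0.05]
scores = good_scores + bad_scores
labels = [True] * len(good_scores) + [False] * len(bad_scores)

threshold = threshold_for_retention(scores, labels, retain=0.90)
kept, removed = filter_rates(scores, labels, threshold)
print(f"threshold={threshold}, identifiable kept={kept:.2f}, "
      f"unidentifiable removed={removed:.2f}")
```

In practice the threshold would be chosen on a labelled training set and the two rates reported on held-out spectra, so that the quoted figures are not optimistic.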