Skip Navigation

Bioinformatics 2008 24(16):i42-i48; doi:10.1093/bioinformatics/btn294
This Article
Right arrow Full Text Freely available
Right arrow FREE Full Text (Print PDF) Freely available
Right arrow Alert me when this article is cited
Right arrow Alert me if a correction is posted
Services
Right arrow Email this article to a friend
Right arrow Similar articles in this journal
Right arrow Similar articles in PubMed
Right arrow Alert me to new issues of the journal
Right arrow Add to My Personal Archive
Right arrow Download to citation manager
Right arrowRequest Permissions
Google Scholar
Right arrow Articles by Käll, L.
Right arrow Articles by Noble, W. S.
Right arrow Search for Related Content
PubMed
Right arrow PubMed Citation
Right arrow Articles by Käll, L.
Right arrow Articles by Noble, W. S.
Social Bookmarking
 Add to CiteULike   Add to Connotea   Add to Del.icio.us  
What's this?

© The Author 2008. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oxfordjournals.org

Non-parametric estimation of posterior error probabilities associated with peptides identified by tandem mass spectrometry

Lukas Käll 1, John D. Storey 1,2 and William Stafford Noble 1,3,*

1Department of Genome Sciences, University of Washington, Seattle, WA, 2Lewis-Sigler Institute, Princeton University, Princeton, NJ and 3Department of Computer Science and Engineering, University of Washington, Seattle, WA, USA

*To whom correspondence should be addressed.


   Abstract

Motivation: A mass spectrum produced via tandem mass spectrometry can be tentatively matched to a peptide sequence via database search. Here, we address the problem of assigning a posterior error probability (PEP) to a given peptide-spectrum match (PSM). This problem is considerably more difficult than the related problem of estimating the error rate associated with a large collection of PSMs. Existing methods for estimating PEPs rely on a parametric or semiparametric model of the underlying score distribution.

Results: We demonstrate how to apply non-parametric logistic regression to this problem. The method makes no explicit assumptions about the form of the underlying score distribution; instead, the method relies upon decoy PSMs, produced by searching the spectra against a decoy sequence database, to provide a model of the null score distribution. We show that our non-parametric logistic regression method produces accurate PEP estimates for six different commonly used PSM score functions. In particular, the estimates produced by our method are comparable in accuracy to those of PeptideProphet, which uses a parametric or semiparametric model designed specifically to work with SEQUEST. The advantage of the non-parametric approach is applicability and robustness to new score functions and new types of data.

Availability: C++ code implementing the method as well as supplementary information is available at http://noble.gs.washington.edu/proj/qvality

Contact: noble{at}gs.washington.edu



Add to CiteULike CiteULike   Add to Connotea Connotea   Add to Del.icio.us Del.icio.us    What's this?


This article has been cited by other articles:


Home page
BioinformaticsHome page
L. Kall, J. D. Storey, and W. S. Noble
QVALITY: non-parametric estimation of q-values and posterior error probabilities
Bioinformatics, April 1, 2009; 25(7): 964 - 966.
[Abstract] [Full Text] [PDF]



Disclaimer: Please note that abstracts for content published before 1996 were created through digital scanning and may therefore not exactly replicate the text of the original print issues. All efforts have been made to ensure accuracy, but the Publisher will not be held responsible for any remaining inaccuracies. If you require any further clarification, please contact our Customer Services Department.