Bioinformatics Advance Access originally published online on August 30, 2007
Bioinformatics 2007 23(22):3065-3072; doi:10.1093/bioinformatics/btm415
© The Author 2007. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oxfordjournals.org

Intersession reproducibility of mass spectrometry profiles and its effect on accuracy of multivariate classification models

Richard Pelikan 1,2,3,*, William L. Bigbee 4, David Malehorn 4, James Lyons-Weiler 3,4,5 and Milos Hauskrecht 1,2,3,4

1Department of Computer Science, 2Intelligent Systems Program, 3Department of Biomedical Informatics, 4University of Pittsburgh Cancer Institute and 5Genomics and Proteomics Core Laboratories, University of Pittsburgh, Pittsburgh, PA 15260, USA

*To whom correspondence should be addressed.


Abstract

Motivation: The ‘reproducibility’ of mass spectrometry proteomic profiling has become an intensely controversial topic. The mere mention of concern over the ‘reproducibility’ of data generated from any particular platform can lead to anxiety over the generalizability of its results and its role in the future of discovery proteomics. In this study, we examine the reproducibility of proteomic profiles generated by surface-enhanced laser desorption/ionization time-of-flight mass spectrometry (SELDI-TOF-MS) across multiple data-generation sessions. We analyze the problem in terms of the reproducibility of signals, the reproducibility of discriminative features and the reproducibility of multivariate classification models on profiles for serum samples from early lung cancer and healthy control subjects.

Results: Proteomic profiles in individual data-generation sessions experience within-session variability. We show that combining data from multiple sessions introduces additional (inter-session) noise. While additional noise can affect the discriminative analysis, we show that its average effect on profiles in our study is relatively small. Moreover, for the purposes of prediction on future (previously unseen) data, classifiers trained on multi-session data are able to adapt to inter-session noise and improve their classification accuracy.

Contact: milos{at}cs.pitt.edu

Associate Editor: Jonathan Wren


Received on May 25, 2007; revised on July 31, 2007; accepted on August 9, 2007
