Bioinformatics Advance Access originally published online on March 25, 2007
Bioinformatics 2007 23(11):1401-1409; doi:10.1093/bioinformatics/btm104
© The Author 2007. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oxfordjournals.org

Regression analysis and modelling of data acquisition for SELDI-TOF mass spectrometry

Martin Sköld 1,*, Tobias Rydén 1, Viktoria Samuelsson 2, Charlotte Bratt 3, Lars Ekblad 3, Håkan Olsson 3 and Bo Baldetorp 3

1Centre for Mathematical Sciences, Lund University, Box 118, SE-221 00 Lund, Sweden, 2Oncological Centre, University Hospital, Klinikgatan 22, SE-221 85 Lund, Sweden and 3Department of Oncology, Lund University, Barngatan 2:1, SE-221 85 Lund, Sweden

*To whom correspondence should be addressed.


Abstract

Motivation: Pre-processing of SELDI-TOF mass spectrometry data is currently performed on a largely ad hoc basis. This makes comparison of results from independent analyses troublesome and does not provide a framework for distinguishing different sources of variation in data.

Results: In this article, we consider the task of pooling a large number of single-shot spectra, a task commonly performed automatically by the instrument software. By viewing the underlying statistical problem as one of heteroscedastic linear regression, we provide a framework for introducing robust methods and for dealing with missing data resulting from the limited span of recordable intensity values provided by the instrument. Our framework provides an interpretation of currently used methods as a maximum-likelihood estimator and allows theoretical derivation of its variance. We observe that this variance depends crucially on the total number of ionic species, which can vary considerably between different pooled spectra. This variation in variance can potentially invalidate the results from naive methods of discrimination/classification, and we outline appropriate data transformations. Introducing methods from robust statistics did not improve the standard errors of the pooled samples. Imputing missing values with the EM algorithm, however, had a notable effect on the result: for our data, the pooled height of peaks that were frequently truncated increased by up to 30%.
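To illustrate why imputing truncated intensities can raise a pooled peak height, the following is a minimal sketch of EM for a single m/z bin. It is not the authors' model: it assumes the single-shot intensities in that bin are Gaussian with constant variance and that any reading at or above the instrument ceiling is right-censored. The function name `em_censored_normal` and the parameter `ceiling` are hypothetical names introduced here for illustration only.

```python
import math


def _normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def _normal_sf(z):
    """Standard normal upper tail probability, 1 - Phi(z)."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def em_censored_normal(values, ceiling, n_iter=200):
    """Estimate mean and standard deviation of right-censored Gaussian data.

    values  : single-shot intensities for one bin; readings >= ceiling are
              treated as censored (only "at least ceiling" is known).
    ceiling : assumed upper limit of the recordable intensity range.
    Returns (mu, sigma).
    """
    n = len(values)
    observed = [v for v in values if v < ceiling]
    n_cens = n - len(observed)
    s1 = sum(observed)
    s2 = sum(v * v for v in observed)

    # Start from the naive estimate that treats censored shots as == ceiling.
    clipped = [min(v, ceiling) for v in values]
    mu = sum(clipped) / n
    sigma = math.sqrt(max(sum((v - mu) ** 2 for v in clipped) / n, 1e-12))

    for _ in range(n_iter):
        # E-step: moments of a normal variable conditional on exceeding ceiling.
        alpha = (ceiling - mu) / sigma
        hazard = _normal_pdf(alpha) / max(_normal_sf(alpha), 1e-300)
        e1 = mu + sigma * hazard
        e2 = mu * mu + sigma * sigma + sigma * (ceiling + mu) * hazard

        # M-step: complete-data maximum-likelihood updates.
        mu_new = (s1 + n_cens * e1) / n
        var_new = (s2 + n_cens * e2) / n - mu_new * mu_new
        mu, sigma = mu_new, math.sqrt(max(var_new, 1e-12))

    return mu, sigma
```

Under these assumptions, the plain average of the clipped readings underestimates the bin mean whenever a sizeable share of shots hits the ceiling, while the EM estimate replaces each clipped reading by its conditional expectation above the ceiling. A heteroscedastic version, closer to the regression framework described above, would need a variance model that this sketch does not attempt.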

Contact: martins{at}maths.lth.se

Supplementary information: Supplementary data are available at Bioinformatics online.

Associate Editor: John Quackenbush


Received on August 31, 2006; revised on February 22, 2007; accepted on March 10, 2007




