Bioinformatics Advance Access originally published online on January 28, 2009
Bioinformatics 2009 25(6):758-764; doi:10.1093/bioinformatics/btp052
Retention time alignment algorithms for LC/MS data must consider non-linear shifts


1Fakultät Statistik, Technische Universität Dortmund, 44221 Dortmund, 2Zentrum für Angewandte Proteomik, Dortmund, 3Protagen AG, Otto-Hahn-Str. 15, 44227 Dortmund, 4Fakultät für Informatik, Technische Universität Dortmund, 44221 Dortmund and 5Medizinisches Proteom-Center (MPC), Ruhr-Universität Bochum, 44801 Bochum, Germany
*To whom correspondence should be addressed.
Abstract
Motivation: Proteomics has become highly relevant to biomarker discovery and drug development. In particular, the combination of liquid chromatography and mass spectrometry (LC/MS) has proven to be a powerful technique for analyzing protein mixtures. Clinically oriented proteomic studies will have to compare hundreds of LC/MS runs at a time. Comparing different runs requires sophisticated preprocessing steps, an important one being the retention time (rt) alignment of LC/MS runs. Non-linear shifts in the rt between pairs of LC/MS runs, in particular, make this a crucial and non-trivial problem.
Results: To demonstrate the importance of correcting non-linear rt shifts, we evaluate and compare different alignment algorithms. We present and analyze two versions of a new algorithm based on regression techniques: one assumes and estimates only linear shifts, and the other also allows the estimation of non-linear shifts. As an example of another type of alignment method, we use an established alignment algorithm based on shifting vectors, which we adapted so that it can also correct non-linear shifts. In a simulation study, we show that rt alignment procedures that can estimate non-linear shifts yield clearly better alignments. This holds even under mild non-linear deviations.
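The contrast between linear and non-linear shift estimation can be illustrated with a minimal sketch. It is not the authors' algorithm (their R code is at the URL given under Availability). It assumes that features have already been matched between a reference run and a second run, and it uses a plain polynomial least-squares fit as a stand-in for the regression step. The function names, the simulated shift, and the noise level are all hypothetical choices made for illustration.

```python
import numpy as np

def fit_shift(rt_ref, rt_run, degree):
    """Fit the rt shift (rt_run - rt_ref) as a polynomial in rt_run.

    degree=1 models a purely linear shift; degree=2 adds a quadratic
    term as a simple stand-in for a non-linear shift model.
    """
    coeffs = np.polyfit(rt_run, rt_run - rt_ref, degree)
    return np.poly1d(coeffs)

def align(rt_run, shift_model):
    """Subtract the estimated shift from the observed retention times."""
    return rt_run - shift_model(rt_run)

def simulate_pair(n=200, seed=0):
    """Simulate matched features from two runs (hypothetical parameters)."""
    rng = np.random.default_rng(seed)
    rt_ref = np.sort(rng.uniform(0.0, 100.0, n))
    # Assumed shift: constant offset + linear drift + mild quadratic term.
    shift = 1.0 + 0.02 * rt_ref + 0.0005 * (rt_ref - 50.0) ** 2
    rt_run = rt_ref + shift + rng.normal(0.0, 0.05, n)
    return rt_ref, rt_run

def rmse(a, b):
    """Root mean squared difference between two rt vectors."""
    return float(np.sqrt(np.mean((a - b) ** 2)))

rt_ref, rt_run = simulate_pair()
rmse_linear = rmse(align(rt_run, fit_shift(rt_ref, rt_run, 1)), rt_ref)
rmse_quadratic = rmse(align(rt_run, fit_shift(rt_ref, rt_run, 2)), rt_ref)
print(f"linear model RMSE:    {rmse_linear:.3f}")
print(f"quadratic model RMSE: {rmse_quadratic:.3f}")
```

Under these assumed settings, the linear model cannot absorb the quadratic component of the shift, so its residual error after alignment stays well above the noise level, while the quadratic model comes close to it. Real LC/MS data would additionally require a feature-matching step, which this sketch leaves out.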
Availability: R code for the regression-based alignment methods and simulated datasets are available at http://www.statistik.tu-dortmund.de/genetik-publikationen-alignment.html
Contact: katharina.podwojski{at}tu-dortmund.de
Supplementary information: Supplementary data are available at Bioinformatics online.
The authors wish it to be known that, in their opinion, the first two authors should be regarded as joint First Authors.
Associate Editor: John Quackenbush
Received on October 7, 2008; revised on January 22, 2009; accepted on January 22, 2009