Bioinformatics Advance Access published online on January 28, 2009
Bioinformatics, doi:10.1093/bioinformatics/btp056
Automated procedure for candidate compound selection in GC-MS metabolomics based on prediction of Kovats retention index
1Applied Bioinformatics, PRI, Droevendaalsesteeg 1, Wageningen, The Netherlands, 2Centre for BioSystems Genomics (CBSG), Droevendaalsesteeg 1, Wageningen, The Netherlands, and 3Laboratory of Bioinformatics, Wageningen University, Dreijenlaan 3, Wageningen, The Netherlands
*To whom correspondence should be addressed. Dr. Roeland van Ham, E-mail: roeland.vanham{at}wur.nl
Abstract
Motivation: Matching both the retention index (RI) and the mass spectrum of an unknown compound against a mass spectral reference library provides strong evidence for a correct identification of that compound. Data on retention indices are, however, available for only a small fraction of the compounds in such libraries. We propose a quantitative structure-retention index model that predicts the RI of candidate compounds, so that putative identifications can be ranked and those whose predicted RI falls outside a predefined window can be filtered out.
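The window-based filtering described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. The `Hit` structure, the function names, the symmetric window around the observed RI, the ranking by absolute RI deviation, and all numeric values are assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """One putative identification from a library search (hypothetical structure)."""
    name: str
    match_score: float   # spectral match score, higher is better
    predicted_ri: float  # RI predicted for the candidate structure by some model


def filter_hits(hits, observed_ri, window):
    """Keep hits whose predicted RI lies within +/- window of the observed RI.

    Assumption: the window is symmetric around the observed RI, and the
    surviving hits are ranked by absolute RI deviation, smallest first.
    """
    kept = [h for h in hits if abs(h.predicted_ri - observed_ri) <= window]
    return sorted(kept, key=lambda h: abs(h.predicted_ri - observed_ri))


def reduction_percent(n_before, n_after):
    """Percentage of the original hit list removed by the filter."""
    if n_before == 0:
        return 0.0
    return 100.0 * (n_before - n_after) / n_before


# Invented example values, not taken from the paper.
example_hits = [
    Hit("candidate_a", 0.91, 1210.0),
    Hit("candidate_b", 0.89, 1450.0),
    Hit("candidate_c", 0.85, 1195.0),
    Hit("candidate_d", 0.80, 980.0),
]
example_kept = filter_hits(example_hits, observed_ri=1200.0, window=50.0)
print([h.name for h in example_kept])
print(reduction_percent(len(example_hits), len(example_kept)))
```

In this sketch the spectral match score is carried along but not used for ranking. A combined ranking of spectral score and RI deviation would be an equally plausible design.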
Results: We constructed multiple linear regression and support vector regression (SVR) models using a set of descriptors obtained with a genetic algorithm as the variable selection method. The SVR model is a significant improvement over previous models built for structurally diverse compounds, as it covers a large range of RI values (360 to 4100) and gives better predictions for isomeric compounds. The hit list reduction varied from 41% to 60% and depended on the size of the original hit list. Large hit lists were reduced to a greater extent than small hit lists.
Contact: roeland.vanham{at}wur.nl
Software availability: http://appliedbioinformatics.wur.nl/GC-MS
Supplementary information: Supplementary data are available at Bioinformatics online.
Associate Editor: Dr. Trey Ideker
Received on July 21, 2008; revised on December 11, 2008; accepted on January 24, 2009