Bioinformatics Advance Access published online on March 24, 2009
Bioinformatics, doi:10.1093/bioinformatics/btp168
Integrating Shotgun Proteomics and mRNA expression data to Improve Protein Identification
1 Department of Computer Sciences, 1 University Station C0500, The University of Texas at Austin, Austin, TX 78712
2 Center for Systems and Synthetic Biology, Department of Chemistry and Biochemistry & Institute for Cellular and Molecular Biology, 2500 Speedway, The University of Texas at Austin, Austin, TX 78712
3 Children's Cancer Research Institute; The University of Texas Health Science Center at San Antonio; San Antonio, TX 78229
**To whom correspondence should be addressed. E-mail: marcotte{at}icmb.utexas.edu, miranker{at}cs.utexas.edu
Abstract
Motivation: Tandem mass spectrometry (MS/MS) offers fast and reliable characterization of complex protein mixtures, but suffers from low sensitivity in protein identification. In a typical shotgun-proteomics experiment, it is assumed that all proteins are equally likely to be present. However, other information is often available; for example, the probability of a protein's presence is likely to correlate with its mRNA concentration.
Results: We develop a Bayesian score that estimates the posterior probability of a protein's presence in the sample given its identification in an MS/MS experiment and its mRNA concentration measured under similar experimental conditions. Our method, MSpresso, substantially increases the number of proteins identified in an MS/MS experiment at the same error rate; for example, in yeast, MSpresso increases the number of proteins identified by 40%. We apply MSpresso to data from different MS/MS instruments, experimental conditions, and organisms (E. coli, human), and predict 19 to 63% more proteins across the different datasets. MSpresso demonstrates that incorporating prior knowledge of protein presence into shotgun-proteomics experiments can substantially improve protein identification scores.
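As a rough illustration of the idea behind such a Bayesian score, the sketch below re-weights an MS/MS identification probability with an mRNA-informed prior using Bayes' rule in odds form. It is not the MSpresso implementation: the function names, the logistic shape of the prior, the parameters `slope` and `midpoint`, and the assumption that the MS/MS probability was computed under a uniform prior of 0.5 are all hypothetical choices made for this example.

```python
import math

def mrna_prior(mrna_abundance, slope=1.0, midpoint=1.0):
    """Hypothetical prior P(protein present | mRNA): logistic in log10(abundance).

    slope and midpoint are illustrative values, not fitted parameters.
    """
    x = math.log10(mrna_abundance)
    return 1.0 / (1.0 + math.exp(-slope * (x - midpoint)))

def reweighted_posterior(p_msms, prior_new, prior_old=0.5, eps=1e-12):
    """Swap the prior behind an MS/MS probability using Bayes' rule in odds form.

    p_msms    : identification probability, assumed computed under prior_old
    prior_new : mRNA-informed prior probability of presence
    prior_old : prior assumed by the MS/MS scoring (uniform 0.5 here, an assumption)
    """
    # Clamp the MS/MS probability away from 0 and 1 to avoid division by zero.
    p = min(max(p_msms, eps), 1.0 - eps)
    # posterior odds = likelihood-driven odds * (new prior odds / old prior odds)
    odds = (p / (1.0 - p)) * (prior_new / (1.0 - prior_new)) / (prior_old / (1.0 - prior_old))
    return odds / (1.0 + odds)

if __name__ == "__main__":
    # A borderline identification for a transcript with high (made-up) abundance.
    prior = mrna_prior(1000.0)
    print(reweighted_posterior(0.6, prior))
```

Under these assumptions, a borderline identification is pushed up when the transcript is abundant and pushed down when it is scarce, which is the qualitative behaviour the abstract describes.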
Availability and Implementation: Software is available upon request from the authors. Mass spectrometry datasets are available from http://marcottelab.org/MSdata/.
Contact: marcotte{at}icmb.utexas.edu, miranker{at}cs.utexas.edu
Supplementary Information: Supplementary data website: http://www.marcottelab.org/MSpresso/.
Associate Editor: Prof. Martin Bishop
*Equally contributing authors.
Received on December 24, 2008; revised on February 19, 2009; accepted on March 18, 2009