Bioinformatics Vol. 18 no. 4 2002
Pages 585-596
© 2002 Oxford University Press
Analysis of mRNA expression and protein abundance data: an approach for the comparison of the enrichment of features in the cellular population of proteins and transcripts


1 Departments of Molecular Biophysics & Biochemistry, 2 Computer Science and 3 Genetics, 266 Whitney Avenue, Yale University, PO Box 208114, New Haven, CT 06520, USA
Received on July 2, 2001; revised on October 5, 2001; accepted on October 22, 2001
Motivation: Protein abundance is related to mRNA expression through many different cellular processes. Up to now, there have been conflicting results on how strongly the levels of these two quantities are correlated. Given that expression and abundance data are significantly more complex and noisy than the underlying genomic sequence information, it is reasonable to simplify and average them in terms of broad proteomic categories and features (e.g. functions or secondary structures) in order to understand their relationship. Furthermore, it will be essential to integrate, within a common framework, the results of many varied experiments by different investigators. This will allow one to survey the characteristics of highly expressed genes and proteins.
Results: To this end, we outline a formalism for merging and scaling many different gene expression and protein abundance data sets into a comprehensive reference set, and we develop an approach for analyzing this in terms of broad categories, such as composition, function, structure and localization. As the various experiments are not always done using the same set of genes, sampling bias becomes a central issue, and our formalism is designed to explicitly show this and correct for it. We apply our formalism to the currently available gene expression and protein abundance data for yeast. Overall, we found substantial agreement between gene expression and protein abundance, in terms of the enrichment of structural and functional categories. This agreement, which was considerably greater than the simple correlation between these quantities for individual genes, reflects the way broad categories collect many individual measurements into simple, robust averages. In particular, we found that in comparison to the population of genes in the yeast genome, the cellular populations of transcripts and proteins (weighted by their respective abundances, the transcriptome and what we dub the translatome) were both enriched in: (i) the small amino acids Val, Gly, and Ala; (ii) low molecular weight proteins; (iii) helices and sheets relative to coils; (iv) cytoplasmic proteins relative to nuclear ones; and (v) proteins involved in protein synthesis, cell structure, and energy production.
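As a rough illustration of the kind of calculation described above, the following Python sketch merges several data sets onto a common scale and then compares the abundance-weighted average of a feature with its unweighted average over the same genes. It is a minimal sketch under stated assumptions, not the formalism of the paper, whose details are not given in this abstract. The function names, the scaling to a common total, and the relative-difference definition of enrichment are all assumptions made for illustration.

```python
from statistics import mean


def scale_to_common_total(dataset, total=1.0):
    """Rescale one data set so that its values sum to `total`.

    Assumption: a simple sum normalization stands in for whatever
    scaling the reference set actually uses.
    """
    s = sum(dataset.values())
    return {gene: value * total / s for gene, value in dataset.items()}


def merge_datasets(datasets):
    """Average each gene's scaled value over the data sets that measured it.

    Genes missing from a data set are skipped for that data set rather than
    treated as zero, since the experiments do not all cover the same genes.
    """
    scaled = [scale_to_common_total(d) for d in datasets]
    genes = set().union(*scaled)
    return {g: mean(d[g] for d in scaled if g in d) for g in genes}


def weighted_average(feature, weights):
    """Abundance-weighted average of a per-gene feature value."""
    total = sum(weights.values())
    return sum(feature[g] * w for g, w in weights.items()) / total


def enrichment(feature, weights):
    """Relative difference between the weighted and unweighted averages.

    Both averages run over the SAME genes (those that have a weight), so a
    difference in gene coverage is not mistaken for enrichment.
    """
    unweighted = mean(feature[g] for g in weights)
    return (weighted_average(feature, weights) - unweighted) / unweighted


if __name__ == "__main__":
    # Hypothetical toy values; gene names and numbers are invented.
    expr_a = {"g1": 10.0, "g2": 30.0, "g3": 60.0}
    expr_b = {"g1": 2.0, "g2": 2.0, "g4": 4.0}
    reference = merge_datasets([expr_a, expr_b])

    # 1 if the (hypothetical) protein is cytoplasmic, 0 otherwise.
    is_cytoplasmic = {"g1": 0, "g2": 1, "g3": 1, "g4": 0}
    print(enrichment(is_cytoplasmic, reference))
```

A positive value means the feature is more common in the abundance-weighted population than among the genes themselves; a negative value means it is depleted. Restricting both averages to the genes that were actually measured is one simple way to keep sampling bias visible, in the spirit of the concern raised above.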
Supplementary information: http://genecensus.org/expression/translatome
Contact: mark.gerstein{at}yale.edu
* To whom correspondence should be addressed.
These authors contributed equally to this work.