Bioinformatics Advance Access published online on January 22, 2004
Bioinformatics, doi:10.1093/bioinformatics/btg469
Bioinformatics © Oxford University Press 2004; all rights reserved
1 Department of Biology, University of New Mexico, Albuquerque, NM 87106, USA
* To whom correspondence should be addressed. E-mail: mike{at}unm.edu.
Motivation: Accurately identifying protein function on a proteome-wide scale requires integrating data within and between high-throughput experiments. High-throughput proteomic datasets often have high error rates and thus yield incomplete and contradictory information. In this study we develop a simple statistical framework using Bayes' law to interpret such data and to combine information from different high-throughput experiments. To illustrate our approach we apply it to two protein complex purification datasets.

Results: Our approach shows how to use high-throughput data to accurately calculate the probability that two proteins are part of the same complex. Importantly, our approach does not need a reference set of verified protein interactions to determine the false positive and false negative error rates of protein association. We also demonstrate how to merge two separate protein purification datasets into a combined dataset that has greater coverage and accuracy than either dataset alone. Finally, we provide a technique for estimating the total number of proteins that can be detected using a particular experimental technique.

Availability: A suite of simple programs to accomplish some of the above tasks is available at www.unm.edu/~compbio/software/DatasetAssess
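As a rough illustration of the kind of calculation the abstract describes, the sketch below applies Bayes' law to the question of whether two proteins share a complex, given whether each dataset reported them together. It is not the authors' implementation. The function name, the prior, and the error rates are hypothetical values chosen for illustration, and the sketch assumes the error rates are already known and that the datasets err independently, whereas the paper estimates the error rates from the data themselves.

```python
def posterior_same_complex(prior, observations):
    """Posterior probability that two proteins are in the same complex.

    prior        -- assumed prior probability of co-complex membership
    observations -- list of (observed, tpr, fpr) tuples, one per dataset:
                    observed: True if the dataset reported the pair together
                    tpr: assumed P(reported | same complex)
                    fpr: assumed P(reported | different complexes)
    Datasets are treated as conditionally independent (an assumption).
    """
    like_same = 1.0  # likelihood of the observations if the pair is co-complexed
    like_diff = 1.0  # likelihood of the observations if it is not
    for observed, tpr, fpr in observations:
        like_same *= tpr if observed else (1.0 - tpr)
        like_diff *= fpr if observed else (1.0 - fpr)
    numerator = prior * like_same
    return numerator / (numerator + (1.0 - prior) * like_diff)


# Hypothetical numbers: a 1% prior, and two datasets that each detect a true
# association half the time and report a false one 1% of the time.
single = posterior_same_complex(0.01, [(True, 0.5, 0.01)])
combined = posterior_same_complex(0.01, [(True, 0.5, 0.01), (True, 0.5, 0.01)])
print(f"one dataset: {single:.3f}, both datasets: {combined:.3f}")
```

With these made-up rates, agreement between the two datasets raises the posterior well above what either gives alone, which is the intuition behind combining datasets to gain accuracy.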
Accepted September 25, 2003
Article
A statistical framework for combining and interpreting proteomic datasets
2 Department of Mathematics & Statistics, University of New Mexico, Albuquerque, NM 87106, USA