Bioinformatics Advance Access first published online on April 25, 2007
This version published online on June 7, 2007

Bioinformatics, doi:10.1093/bioinformatics/btm130
Volume 23, issue 12, page 1451 (most recent version)

© The Author (2007). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oxfordjournals.org

Simultaneous and exact interval estimates for the contrast of two groups based on an extremely high dimensional variable: Application to Mass Spec data

Yuhyun Park 1,2,†,*, Sean R. Downing 3,†, Dohyun Kim 4, William C. Hahn 3, Cheng Li 1,2, Philip W. Kantoff 3 and L.J. Wei 1,2

1Department of Biostatistics, Dana-Farber Cancer Institute, 2Department of Biostatistics, Harvard School of Public Health, 3Lank Center for Genitourinary Oncology, Department of Medical Oncology, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA, U.S.A., 4Department of Statistics, Seoul National University, Seoul, Korea.

*To whom correspondence should be addressed. Dr. Yuhyun Park, E-mail: parkyuhyun@gmail.com, Sean.Downing@childrens.harvard.edu


   Abstract

Motivation: Analysis of high-throughput proteomic and genomic data, in particular surface-enhanced laser desorption/ionization time-of-flight mass spectrometry (SELDI-TOF MS) data and microarray data, has led to a multitude of techniques aimed at identifying potential biomarkers. Most statistical techniques for comparing two groups rely on qualitative measures such as the p-value. A quantitative approach, such as interval estimation for the contrasts between two groups, is more appealing.

Results: We have devised a permutation-based simultaneous confidence band method that detects potential biomarkers discriminating two treatment groups in high-dimensional datasets while controlling the overall confidence coverage level. For SELDI-TOF MS data, for example, we treat the entire spectrum simultaneously and construct (1 − α) confidence bands for the mean differences between groups. Peaks are then identified from the maximal differences between the groups as determined by the confidence bands. The method yields both qualitative information (p-value) and quantitative information (magnitude of difference). We analyzed the Clinical Proteomics Programs Databank's ovarian cancer dataset and data from in-house samples containing known spiked-in proteins. We identified potential biomarkers similar to those described in previous analyses of the ovarian cancer data. However, although these markers differ with high statistical significance between the cancer and normal groups, our analysis indicated that the absolute difference between the two groups was minimal. We also found markers beyond those previously described, with greater differences in average intensities. The proposed confidence band method successfully detected the spiked-in peaks as well as secondary peaks generated by adducts and double-charged species. We also illustrate the method on paired gene expression data from a prostate cancer microarray experiment by constructing confidence bands for the fold changes between cancer and normal samples.
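To make the idea of a permutation-based simultaneous band concrete, the following is a minimal sketch and not the authors' implementation or the API of their R package. It assumes one common construction: the critical value is the (1 − α) quantile of the permutation distribution of the maximum absolute standardized mean difference across all features, and the band for each feature is the observed difference plus or minus that critical value times the feature's standard error. The function names (`mean_diff_and_se`, `simultaneous_bands`, `simulate_spectra`) and the simulated data are hypothetical.

```python
import math
import random
import statistics


def mean_diff_and_se(x, y):
    """Per-feature mean difference (x minus y) and its unpooled standard error."""
    diffs, ses = [], []
    for j in range(len(x[0])):
        xj = [row[j] for row in x]
        yj = [row[j] for row in y]
        diffs.append(statistics.fmean(xj) - statistics.fmean(yj))
        ses.append(math.sqrt(statistics.variance(xj) / len(xj)
                             + statistics.variance(yj) / len(yj)))
    return diffs, ses


def simultaneous_bands(x, y, alpha=0.05, n_perm=500, seed=0):
    """Return one (lower, difference, upper) triple per feature.

    Assumption (not taken from the paper): the critical value is the
    (1 - alpha) quantile of the permutation distribution of the maximum
    absolute standardized difference over all features.
    """
    rng = random.Random(seed)
    diffs, ses = mean_diff_and_se(x, y)
    pooled = list(x) + list(y)  # new list, so x and y are left untouched
    max_stats = []
    for _ in range(n_perm):
        rng.shuffle(pooled)  # reassign whole samples (rows) to groups at random
        d_star, se_star = mean_diff_and_se(pooled[:len(x)], pooled[len(x):])
        max_stats.append(max(
            (abs(d) / s for d, s in zip(d_star, se_star) if s > 0),
            default=0.0,
        ))
    max_stats.sort()
    # Index of the empirical (1 - alpha) quantile, clamped to a valid position.
    k = min(n_perm - 1, max(0, math.ceil((1 - alpha) * n_perm) - 1))
    c = max_stats[k]
    return [(d - c * s, d, d + c * s) for d, s in zip(diffs, ses)]


def simulate_spectra(n_per_group=20, n_features=30, spiked=5, shift=5.0, seed=1):
    """Hypothetical toy data: Gaussian noise with one spiked-in feature in group x."""
    rng = random.Random(seed)

    def sample(offset):
        return [rng.gauss(0.0, 1.0) + (offset if j == spiked else 0.0)
                for j in range(n_features)]

    x = [sample(shift) for _ in range(n_per_group)]
    y = [sample(0.0) for _ in range(n_per_group)]
    return x, y
```

Calling `simultaneous_bands(*simulate_spectra())` returns one triple per feature. Under the stated assumption, features whose band excludes zero are candidate peaks, and the distance of the band from zero gives a sense of the magnitude of the group difference rather than only its significance.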

Availability: The R package "seie.zip" (license: GNU GPL) is publicly available at http://research2.dfci.harvard.edu/dfci/MSP_spike-in_data/.

†These authors contributed equally to this work.

Associate Editor: Prof. Alfonso Valencia


Received on March 27, 2006; revised on March 12, 2007; accepted on March 28, 2007
