Bioinformatics Advance Access originally published online on August 17, 2009
Bioinformatics 2009 25(21):2764-2771; doi:10.1093/bioinformatics/btp491

© The Author 2009. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oxfordjournals.org

Correlating multiple SNPs and multiple disease phenotypes: penalized non-linear canonical correlation analysis

Sandra Waaijenborg * and Aeilko H. Zwinderman

Department of Clinical Epidemiology, Biostatistics and Bioinformatics, Academic Medical Center, University of Amsterdam, Meibergdreef 9, 1100 DD Amsterdam, The Netherlands

* To whom correspondence should be addressed.


Abstract

Motivation: Canonical correlation analysis (CCA) can be used to capture the underlying genetic background of a complex disease by associating two datasets containing a patient's phenotypic and genetic information. Genetic information is often measured on a qualitative scale, so ordinary CCA cannot be applied to such data. Moreover, the datasets in genetic studies can be enormous, which makes the results difficult to interpret.

Results: We developed a penalized non-linear CCA approach that can deal with qualitative data by transforming each qualitative variable into a continuous variable through optimal scaling. Additionally, sparse results were obtained by adapting soft-thresholding to this non-linear version of the CCA. By means of simulation studies, we show that our method is capable of extracting relevant variables out of high-dimensional sets. We applied our method to a genetic dataset containing 144 patients with glial cancer.
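The two ingredients named above, optimal scaling and soft-thresholding, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not give the algorithm's details, so the function names, the penalty parameter `lam` and the particular scaling rule (replacing each category by the within-category mean of a continuous target, such as a canonical variate of the phenotype set) are assumptions chosen for illustration.

```python
import numpy as np

def soft_threshold(w, lam):
    """Shrink each weight toward zero by lam; weights with |w| <= lam become 0."""
    w = np.asarray(w, dtype=float)
    return np.sign(w) * np.maximum(np.abs(w) - lam, 0.0)

def optimal_scale(categories, target):
    """Quantify a qualitative variable (assumed rule, for illustration only).

    Each category is replaced by the mean of `target` among the observations
    in that category, and the result is standardized to mean 0 and variance 1.
    """
    categories = np.asarray(categories)
    target = np.asarray(target, dtype=float)
    scaled = np.zeros(len(categories))
    for c in np.unique(categories):
        mask = categories == c
        scaled[mask] = target[mask].mean()
    scaled = scaled - scaled.mean()
    sd = scaled.std()
    return scaled / sd if sd > 0 else scaled

# Hypothetical data: one SNP coded by genotype and a continuous target.
snp = np.array([0, 0, 1, 1, 2, 2])
variate = np.array([1.0, 3.0, 5.0, 7.0, 9.0, 11.0])
snp_continuous = optimal_scale(snp, variate)

# Hypothetical canonical weights; small ones are set to zero by the penalty.
sparse_weights = soft_threshold([3.0, -0.5, 1.0, -2.0], lam=1.0)
```

In this sketch the transformed SNP takes one value per genotype, and the thresholding step removes variables whose weights do not exceed `lam`, which is what makes the result sparse.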

Contact: s.waaijenborg@amc.uva.nl

Associate Editor: David Rocke


Received on December 21, 2008; revised on August 13, 2009; accepted on August 13, 2009
