Bioinformatics Advance Access originally published online on July 14, 2009
Bioinformatics 2009 25(19):2595-2602; doi:10.1093/bioinformatics/btp428
A multi-dimensional evidence-based candidate gene prioritization approach for complex diseases–schizophrenia as a case


1Department of Biomedical Informatics and Department of Psychiatry, Vanderbilt University, Nashville, TN 37203, 2Virginia Institute for Psychiatric and Behavioral Genetics, Virginia Commonwealth University, Richmond, VA 23298, 3Washington VA Medical Center, Washington, DC 20422, 4Department of Pharmacy and 5Department of Human Genetics, Virginia Commonwealth University, Richmond, VA 23298, USA
*To whom correspondence should be addressed.
Abstract
Motivation: The past decade has seen exponential growth in the genetic data generated for complex disease studies. Across a variety of complex biological problems, there is now a strong trend towards integrating data from multiple sources. So far, candidate gene prioritization approaches have been designed for specific purposes, either utilizing only some of the available sources of genetic studies or using a simple weighting scheme. For psychiatric disorders specifically, no prioritization approach has fully utilized all major sources of experimental data.
Results: Here we present a multi-dimensional evidence-based candidate gene prioritization approach for complex diseases and demonstrate it in schizophrenia. In this approach, we first collect and curate genetic studies for schizophrenia from four major categories: association studies, linkage analyses, gene expression and literature search. Genes in these data sets are initially scored by category-specific scoring methods. Then, an optimal weight matrix is searched for using a two-step procedure (based on core genes and on unbiased P-values from independent genome-wide association studies). Finally, genes are prioritized by their combined scores under the optimal weight matrix. Our evaluation suggests that this approach generates prioritized candidate genes that are promising for further analysis or replication. The approach can be applied to other complex diseases.
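The combine-and-rank idea summarized above can be illustrated with a minimal sketch. It is not the authors' implementation. The gene names, per-category scores, candidate weight levels and the "core genes in the top N" criterion are all hypothetical, and the weight matrix is simplified to one weight per evidence category. The second step of the search (P-values from independent genome-wide association studies) is not modelled here.

```python
from itertools import product

# Hypothetical per-category scores in the order
# (association, linkage, expression, literature), each scaled to [0, 1].
# Gene names and values are illustrative only, not data from the study.
scores = {
    "GENE_A": (0.9, 0.2, 0.5, 0.8),
    "GENE_B": (0.1, 0.9, 0.3, 0.2),
    "GENE_C": (0.6, 0.6, 0.7, 0.4),
    "GENE_D": (0.2, 0.1, 0.1, 0.1),
}


def combined_score(gene_scores, weights):
    """Weighted sum of one gene's category scores."""
    return sum(w * s for w, s in zip(weights, gene_scores))


def prioritize(scores, weights):
    """Return gene names sorted by combined score, highest first."""
    return sorted(
        scores,
        key=lambda gene: combined_score(scores[gene], weights),
        reverse=True,
    )


def search_weights(scores, core_genes, levels=(1, 2, 3), top_n=2):
    """Exhaustively try weight combinations (assumed criterion).

    Keeps the first combination that places the most core genes
    among the top_n ranked genes.
    """
    best_weights, best_hits = None, -1
    for weights in product(levels, repeat=4):
        top = set(prioritize(scores, weights)[:top_n])
        hits = len(top & core_genes)
        if hits > best_hits:
            best_weights, best_hits = weights, hits
    return best_weights, best_hits


# Equal weights: rank genes by the plain sum of their category scores.
ranking = prioritize(scores, (1, 1, 1, 1))

# Search for weights that bring two hypothetical core genes to the top.
weights, hits = search_weights(scores, core_genes={"GENE_B", "GENE_C"})
```

An exhaustive grid is used only because the toy search space is tiny (3^4 combinations); a real weight search over many genes and finer weight levels would likely need a different strategy.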
Availability: The collected data, prioritized candidate genes, and gene prioritization tools are freely available at http://bioinfo.mc.vanderbilt.edu/SZGR/.
Contact: zhongming.zhao{at}vanderbilt.edu
Supplementary information: Supplementary data are available at Bioinformatics online.
Associate Editor: Jeffrey Barrett
The authors wish it to be known that, in their opinion, the first two authors should be regarded as joint First authors.
Received on March 20, 2009; revised on July 3, 2009; accepted on July 4, 2009