Bioinformatics Advance Access originally published online on July 30, 2008
Bioinformatics 2008 24(19):2215-2221; doi:10.1093/bioinformatics/btn406
Efficient whole-genome association mapping using local phylogenies for unphased genotype data
1Department of Computer Science, University of California, Davis, USA, 2Bioinformatics Research Center, University of Aarhus, Denmark and 3Computer Science Division and Department of Statistics, University of California, Berkeley, USA
*To whom correspondence should be addressed.
Abstract
Motivation: Recent advances in genotyping technology have made data acquisition for whole-genome association studies cost effective, and developing efficient methods to analyze such large-scale datasets is an active area of research. Most sophisticated association mapping methods currently available take phased haplotype data as input. However, phase information is not readily available from sequencing methods, and inferring the phase computationally is time-consuming, taking days to phase a single chromosome.
Results: In this article, we devise an efficient method for scanning unphased whole-genome data for association. Our approach combines a recently found linear-time algorithm for phasing genotypes on trees with a recently proposed tree-based method for association mapping. From unphased genotype data, our algorithm builds local phylogenies along the genome, and scores each tree according to the clustering of cases and controls. We assess the performance of our new method on both simulated and real biological datasets.
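To make the idea of scoring a tree by the clustering of cases and controls concrete, the following is a minimal sketch only. It is not the scoring function used in the article: the nested-tuple tree representation, the choice of a Pearson chi-square statistic on each clade, and the names `chi_square_2x2` and `best_clade_score` are all assumptions made for illustration.

```python
# Illustrative sketch (assumed representation, not the article's method):
# a tree is either a leaf label ("case" or "control") or a tuple of subtrees.

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square for the table [[a, b], [c, d]], no continuity correction."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        # An empty row or column carries no association signal.
        return 0.0
    return n * (a * d - b * c) ** 2 / denom


def best_clade_score(tree):
    """Return the highest chi-square over all clades (subtrees) of the tree.

    Each clade is compared against the rest of the tree: cases and controls
    inside the clade versus cases and controls outside it.
    """
    best = 0.0
    clade_counts = []  # (cases, controls) for every node, filled in one pass

    def visit(node):
        if isinstance(node, str):
            counts = (1, 0) if node == "case" else (0, 1)
        else:
            children = [visit(child) for child in node]
            counts = (sum(c for c, _ in children), sum(k for _, k in children))
        clade_counts.append(counts)
        return counts

    total_cases, total_controls = visit(tree)
    for cases_in, controls_in in clade_counts:
        score = chi_square_2x2(
            cases_in,
            controls_in,
            total_cases - cases_in,
            total_controls - controls_in,
        )
        best = max(best, score)
    return best


# Hypothetical toy trees: one where cases share a clade, one where they do not.
clustered = (("case", "case"), ("control", "control"))
mixed = (("case", "control"), ("case", "control"))
```

Under this toy statistic, a tree in which the cases fall into a single clade scores higher than one in which cases and controls are interleaved, which is the kind of signal a tree-based scan along the genome looks for.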
Availability: The software described in this article is available at http://www.daimi.au.dk/~mailund/Blossoc and is distributed under the GNU General Public License.
Contact: mailund{at}birc.au.dk
Associate Editor: Martin Bishop
Received on May 1, 2008; revised on July 25, 2008; accepted on July 29, 2008