Bioinformatics Advance Access published online on July 30, 2008
Bioinformatics, doi:10.1093/bioinformatics/btn406
Efficient Whole-Genome Association Mapping using Local Phylogenies for Unphased Genotype Data
1 Department of Computer Science, University of California, Davis, USA
2 Bioinformatics Research Center, University of Aarhus, Denmark
3 Computer Science Division and Department of Statistics, University of California, Berkeley, USA
*To whom correspondence should be addressed. Thomas Mailund, E-mail: mailund{at}birc.au.dk
Abstract
Motivation: Recent advances in genotyping technology have made data acquisition for whole-genome association studies cost-effective, and developing efficient methods to analyze such large-scale data sets is an active area of research. Most sophisticated association mapping methods currently available take phased haplotype data as input. However, phase information is not readily available from sequencing methods, and inferring the phase computationally is time-consuming, taking days to phase a single chromosome.
Results: In this paper, we devise an efficient method for scanning unphased whole-genome data for association. Our approach combines a recently developed linear-time algorithm for phasing genotypes on trees with a recently proposed tree-based method for association mapping. From unphased genotype data, our algorithm builds local phylogenies along the genome and scores each tree according to the clustering of cases and controls. We assess the performance of our new method on both simulated and real biological data sets.
Availability: The software described in this paper is available at http://www.daimi.au.dk/~mailund/Blossoc and distributed under the GNU General Public License.
Contact: mailund{at}birc.au.dk
Associate Editor: Prof. Martin Bishop
Received on May 1, 2008; revised on July 25, 2008; accepted on July 29, 2008