Bioinformatics Advance Access published online on October 27, 2009

Bioinformatics, doi:10.1093/bioinformatics/btp584

© The Author (2009). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oxfordjournals.org

Estimating population haplotype frequencies from pooled SNP data using incomplete database information

Matti Pirinen

Department of Mathematics and Statistics, University of Helsinki, PO Box 68, 00014, Finland

To whom correspondence should be addressed. Dr. Matti Pirinen, E-mail: matti.pirinen{at}iki.fi


Abstract

Motivation: Information about haplotype structures gives a more detailed picture of genetic variation between individuals than single-locus analyses. Databases that contain the most frequent haplotypes of certain populations are developing rapidly (e.g. the HapMap database for single-nucleotide polymorphisms in humans). Using such prior information about the prevailing haplotype structures makes it possible to estimate haplotype frequencies from large DNA pools as well. When genetic material from dozens of individuals is pooled and analysed in a single genotyping, the overall number of genotypings and the cost of genetic studies are reduced.

Results: A Bayesian model for estimating haplotypes and their frequencies from pooled allelic observations is introduced. The model combines the idea of using database information for haplotype estimation with a computationally efficient multinormal approximation. In addition, the model treats the number and structures of the unknown haplotypes as random variables whose joint posterior distribution is estimated. Results on real human data from the HapMap database show that the proposed method provides significant improvements over existing methods.
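
As an illustration of what a multinormal approximation for pooled allele counts can look like, the following minimal sketch computes the mean vector and covariance matrix of the allele counts in one pool. It is not taken from Hippo or AEML: the abstract does not state the exact likelihood, so the sketch assumes that the haplotypes in a pool are a multinomial sample from the population frequencies and simply matches the first two moments. The function name, the 0/1 allele coding and the example numbers are all hypothetical.

```python
def pooled_allele_moments(haplotypes, freqs, n_chromosomes):
    """Mean and covariance of pooled allele counts (moment matching).

    Assumption (not from the paper): the n_chromosomes haplotypes in a pool
    are drawn multinomially with probabilities `freqs`.

    haplotypes    -- list of K haplotypes, each a list of L alleles coded 0/1
    freqs         -- list of K haplotype frequencies summing to 1
    n_chromosomes -- chromosomes in the pool (2 per diploid individual)
    """
    K = len(haplotypes)
    L = len(haplotypes[0])

    # Frequency of allele "1" at each locus: p_j = sum_k freqs[k] * h[k][j]
    p = [sum(freqs[k] * haplotypes[k][j] for k in range(K)) for j in range(L)]
    mean = [n_chromosomes * pj for pj in p]

    cov = [[0.0] * L for _ in range(L)]
    for i in range(L):
        for j in range(L):
            # Frequency of haplotypes carrying allele "1" at both loci i and j
            p_ij = sum(freqs[k] * haplotypes[k][i] * haplotypes[k][j]
                       for k in range(K))
            cov[i][j] = n_chromosomes * (p_ij - p[i] * p[j])
    return mean, cov


# Hypothetical example: three haplotypes over two SNPs, pool of 20 individuals.
example_haplotypes = [[0, 0], [0, 1], [1, 1]]
example_freqs = [0.5, 0.3, 0.2]
mean, cov = pooled_allele_moments(example_haplotypes, example_freqs, 40)
```

The off-diagonal covariance terms are what carry the haplotype information: two loci whose "1" alleles tend to sit on the same haplotype have positively correlated pooled counts, which single-locus allele frequencies alone cannot express.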

Availability: A reversible-jump Markov chain Monte Carlo algorithm for analysing the model is implemented in a program called Hippo. For comparison, an approximate EM algorithm that utilises database information about the existing haplotypes is implemented in a program called AEML. The source code, written in C (using the GNU Scientific Library), is available at www.iki.fi/~mpirinen.

Contact: matti.pirinen{at}iki.fi

Associate Editor: Dr. Jonathan Wren


Received on July 29, 2009; revised on September 20, 2009; accepted on September 24, 2009
