Bioinformatics Advance Access published online on July 1, 2004
Bioinformatics, doi:10.1093/bioinformatics/bth388
Bioinformatics © Oxford University Press 2004; all rights reserved
1 Section on Statistical Genetics, Department of Biostatistics, University of Alabama at Birmingham, AL, 35294
* To whom correspondence should be addressed. E-mail: hongyu.zhao@yale.edu.
Motivation: Haplotype reconstruction is an essential step in genetic linkage and association studies. Although many methods have been developed to estimate haplotype frequencies and reconstruct haplotypes for a sample of unrelated individuals, haplotype reconstruction in large pedigrees with a large number of genetic markers remains a challenging problem.

Methods: We have developed an efficient computer program, HAPLORE (HAPLOtype REconstruction), to identify all haplotype sets that are compatible with the observed genotypes in a pedigree for tightly linked genetic markers. HAPLORE consists of three steps that can serve different needs in applications. In the first step, a set of logic rules is used to reduce the number of compatible haplotypes of each individual in the pedigree as much as possible. After this step, the haplotypes of all individuals in the pedigree are completely or partially determined. These logic rules apply to completely linked markers, and they can be used to impute missing data and check for genotyping errors. In the second step, a haplotype elimination algorithm, similar to the genotype elimination algorithms used in linkage analysis, is applied to delete incompatible haplotypes that remain after the first step. All superfluous haplotypes of the pedigree members are excluded after this step. In the third step, the expectation-maximization (EM) algorithm, combined with the partition and ligation technique, is used to estimate haplotype frequencies from the haplotype configurations inferred in the first two steps. Only compatible haplotype configurations whose haplotypes have frequencies greater than a threshold are retained.

Results: We test the effectiveness and efficiency of HAPLORE using both simulated and real data sets. Our results show that the rule-based algorithm is very efficient for completely genotyped pedigrees. In this case, almost all families have a unique haplotype configuration. In the presence of missing data, HAPLORE can substantially reduce the number of compatible haplotypes, and the program reports all possible haplotype configurations of a pedigree when multiple configurations exist. These inferred haplotype configurations, as well as the haplotype frequencies estimated by the EM algorithm, can be used in genetic linkage and association studies.

Availability: The program can be downloaded from http://bioinformatics.med.yale.edu.
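To make the idea of "haplotype configurations compatible with the observed genotypes" concrete, the following is a minimal brute-force sketch for a single father-mother-child trio under the zero-recombination assumption. It is not HAPLORE's implementation: the rule-based and elimination steps described above are designed to avoid exhaustive enumeration. The function names `compatible_pairs` and `trio_configurations`, the biallelic 0/1/2 genotype coding, and the absence of missing data are all assumptions made for illustration.

```python
from itertools import product


def compatible_pairs(genotype):
    """Return every unordered haplotype pair consistent with an unphased genotype.

    Assumed coding: genotype[i] is the count (0, 1, or 2) of allele 1
    at biallelic marker i; a haplotype is a tuple of 0/1 alleles.
    """
    per_marker = []
    for g in genotype:
        if g == 0:
            per_marker.append([(0, 0)])          # homozygous for allele 0
        elif g == 2:
            per_marker.append([(1, 1)])          # homozygous for allele 1
        else:
            per_marker.append([(0, 1), (1, 0)])  # heterozygous: phase unknown
    pairs = set()
    for combo in product(*per_marker):
        h1 = tuple(a for a, _ in combo)
        h2 = tuple(b for _, b in combo)
        pairs.add(tuple(sorted((h1, h2))))       # ignore haplotype order
    return sorted(pairs)


def trio_configurations(father, mother, child):
    """List (father_pair, mother_pair, child_pair) configurations.

    With no recombination, the child must carry one intact haplotype
    from the father and one intact haplotype from the mother.
    """
    configs = []
    for fp in compatible_pairs(father):
        for mp in compatible_pairs(mother):
            for cp in compatible_pairs(child):
                a, b = cp
                if (a in fp and b in mp) or (b in fp and a in mp):
                    configs.append((fp, mp, cp))
    return configs


# Two markers: father and child are doubly heterozygous,
# the mother is homozygous for allele 0 at both markers.
configs = trio_configurations((1, 1), (0, 0), (1, 1))
for father_pair, mother_pair, child_pair in configs:
    print(father_pair, mother_pair, child_pair)
```

In this example the mother can only transmit the haplotype (0, 0), which fixes the phase of the child and, in turn, of the father, so a single configuration survives. This mirrors the observation in the Results that fully genotyped families usually resolve to one configuration; the number of candidate pairs per individual doubles with each additional heterozygous marker, which is why exhaustive enumeration does not scale to large pedigrees.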
Revised December 4, 2003
Accepted December 4, 2003
Article
HAPLORE: a program for haplotype reconstruction in general pedigrees without recombination
2 Molecular and Computational Biology Program, Department of Biological Sciences, University of Southern California, 1042 W. 36th Place, Los Angeles, CA, 90089
3 Department of Epidemiology and Public Health, Yale University School of Medicine, 60 College Street, New Haven, CT, 06520
This article has been cited by other articles:
M. Abney. Identity-by-Descent Estimation and Mapping of Qualitative Traits in Large, Complex Pedigrees. Genetics, July 1, 2008; 179(3): 1577-1590.
G. Lin, Z. Wang, L. Wang, Y.-L. Lau, and W. Yang. Identification of linked regions using high-density SNP genotype data in linkage analysis. Bioinformatics, January 1, 2008; 24(1): 86-93.
E. L. Sherman, J. D. Nkrumah, B. M. Murdoch, C. Li, Z. Wang, A. Fu, and S. S. Moore. Polymorphisms and haplotypes in the bovine neuropeptide Y, growth hormone receptor, ghrelin, insulin-like growth factor 2, and uncoupling proteins 2 and 3 genes and their associations with measures of growth, performance, feed efficiency, and carcass merit in beef cattle. J Anim Sci, January 1, 2008; 86(1): 1-16.
C. A. Albers, T. Heskes, and H. J. Kappen. Haplotype Inference in General Pedigrees Using the Cluster Variation Method. Genetics, October 1, 2007; 177(2): 1101-1116.
Y. J. Yoo, J. Tang, R. A. Kaslow, and K. Zhang. Haplotype inference for present absent genotype data using previously identified haplotypes and haplotype patterns. Bioinformatics, September 15, 2007; 23(18): 2399-2406.
K. Allen-Brady, L. A. Cannon-Albright, S. L. Neuhausen, and N. J. Camp. A Role for XRCC4 in Age at Diagnosis and Breast Cancer Risk. Cancer Epidemiol. Biomarkers Prev., July 1, 2006; 15(7): 1306-1310.



