Bioinformatics Advance Access originally published online on November 15, 2005
Bioinformatics 2006 22(3):371-373; doi:10.1093/bioinformatics/bti785
2SNP: scalable phasing based on 2-SNP haplotypes
Department of Computer Science, Georgia State University, Atlanta, GA 30303, USA
*To whom correspondence should be addressed.
Summary: The 2SNP software package implements a new, very fast, scalable algorithm for haplotype inference based on genotype statistics collected only for pairs of SNPs. This software can be used for comparatively accurate phasing of a large number of long genome sequences, e.g. those obtained from DNA arrays. As input, 2SNP takes a genotype matrix and outputs the corresponding haplotype matrix. On datasets across 79 regions from HapMap, 2SNP is several orders of magnitude faster than GERBIL and PHASE while matching them in quality, as measured by the number of correctly phased genotypes and by single-site and switching errors. For example, 2SNP requires 41 s on a 2 GHz Pentium 4 processor to phase 30 genotypes with 1381 SNPs (ENm010.7p15:2 data from HapMap), whereas GERBIL and PHASE require more than a week and make no fewer errors than 2SNP.
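To make the idea of "genotype statistics collected only for pairs of SNPs" concrete, the following is a minimal illustrative sketch, not the 2SNP implementation. It assumes a genotype matrix encoded with 0 and 1 for the two homozygous states and 2 for a heterozygous site (this encoding, the function names and the simple cis/trans decision rule are all assumptions made for illustration). For one pair of SNPs it counts the 2-SNP haplotypes implied by genotypes that are unambiguous at that pair, then uses those counts to choose a phasing for a genotype that is heterozygous at both sites.

```python
from collections import Counter

HET = 2  # assumed code for a heterozygous site; 0 and 1 are homozygous


def pair_haplotype_counts(genotypes, i, j):
    """Count 2-SNP haplotypes at sites i and j.

    Only genotypes with at most one heterozygous site in the pair are
    used, because their two haplotypes are determined without phasing.
    """
    counts = Counter()
    for g in genotypes:
        a, b = g[i], g[j]
        if a == HET and b == HET:
            continue  # double heterozygote: ambiguous, skip
        alleles_i = (0, 1) if a == HET else (a, a)
        alleles_j = (0, 1) if b == HET else (b, b)
        for hap in zip(alleles_i, alleles_j):
            counts[hap] += 1
    return counts


def phase_double_het(counts):
    """Pick a phasing for a genotype heterozygous at both sites.

    'cis' means haplotypes 00/11, 'trans' means 01/10. The rule used
    here (compare products of counts) is a hypothetical stand-in.
    """
    cis = counts[(0, 0)] * counts[(1, 1)]
    trans = counts[(0, 1)] * counts[(1, 0)]
    return "cis" if cis >= trans else "trans"


# Toy genotype matrix: rows are individuals, columns are SNPs.
genotypes = [
    [0, 0],
    [0, 0],
    [1, 1],
    [2, 0],
    [2, 2],  # the ambiguous genotype to be phased
]
counts = pair_haplotype_counts(genotypes, 0, 1)
decision = phase_double_het(counts)
```

Restricting the statistics to pairs keeps the work per pair linear in the number of genotypes, which is consistent with the scalability the summary emphasizes; how 2SNP combines pairwise decisions into full haplotypes is not described here and is not modeled by this sketch.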
Availability: The 2SNP software package is publicly available at http://alla.cs.gsu.edu/~software/2SNP
Contact: alexz{at}cs.gsu.edu
Received on October 3, 2005; revised on November 10, 2005; accepted on November 14, 2005
This article has been cited by other articles:
M. Niens, R. F. Jarrett, B. Hepkema, I. M. Nolte, A. Diepstra, M. Platteel, N. Kouprie, C. P. Delury, A. Gallagher, L. Visser, et al. HLA-A*02 is associated with a reduced risk and HLA-A*01 with an increased risk of developing EBV+ Hodgkin lymphoma. Blood, November 1, 2007; 110(9): 3310-3315.
