Bioinformatics Advance Access published online on October 24, 2006
Bioinformatics, doi:10.1093/bioinformatics/btl536
1 Computation Biology Division, Translational Genomics Research Institute, Phoenix, 445 N 5th St., Phoenix, AZ
* To whom correspondence should be addressed.
Motivation: The technology to genotype single nucleotide polymorphisms (SNPs) at extremely high densities enables hypothesis-free genome-wide scans for common polymorphisms associated with complex disease. However, we find that some errors introduced by commonly employed genotyping algorithms may inflate the number of false associations between markers and phenotype.

Results: We have developed a novel SNP genotype calling program, SNiPer-High Density (SNiPer-HD), for highly accurate genotype calling across hundreds of thousands of SNPs. The program employs an expectation-maximization (EM) algorithm with parameters based on a training sample set. This choice of algorithm allows highly accurate genotyping for most SNPs. We also introduce a quality control metric for each assayed SNP, correlated with genotype class separation in the calling algorithm, so that poorly behaving SNPs can be filtered out. SNiPer-HD is superior to the standard dynamic modelling algorithm and is complementary and non-redundant to other algorithms, such as BRLMM. Implementing multiple algorithms together may provide highly accurate genotype calls without inflation of false positives due to systematically miscalled SNPs. A reliable and accurate set of SNP genotypes for increasingly dense panels will eliminate some false association signals and false negative signals, allowing for rapid identification of disease susceptibility loci for complex traits.

Availability: SNiPer-HD is available at TGen's website: http://www.tgen.org/neurogenomics/data.
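To make the idea of EM-based calling with training-derived parameters more concrete, the following is a minimal sketch and not the SNiPer-HD implementation. It assumes each sample is reduced to a single allele-contrast value per SNP, models the three genotype classes (AA, AB, BB) as a one-dimensional Gaussian mixture, and starts EM from class means that would come from a training set. The function names (`em_genotype_calls`, `separation_score`), the contrast representation, and the separation formula are all illustrative assumptions; the abstract does not specify the actual model or the form of the quality control metric.

```python
import math
import random


def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def _responsibilities(contrasts, weights, means, sds):
    """E-step: posterior probability of each genotype class for each sample."""
    resp = []
    for x in contrasts:
        dens = [w * normal_pdf(x, m, s) for w, m, s in zip(weights, means, sds)]
        total = sum(dens) or 1e-300  # guard against numerical underflow
        resp.append([d / total for d in dens])
    return resp


def em_genotype_calls(contrasts, init_means, init_sd=0.2, n_iter=50, min_sd=0.02):
    """Fit a Gaussian mixture by EM, starting from training-derived class means.

    Hypothetical sketch: `contrasts` is one allele-contrast value per sample
    for a single SNP; `init_means` are assumed to come from a training set.
    Returns (calls, means, sds), where calls are class indices 0, 1, 2.
    """
    k = len(init_means)
    means = list(init_means)
    sds = [init_sd] * k
    weights = [1.0 / k] * k
    for _ in range(n_iter):
        resp = _responsibilities(contrasts, weights, means, sds)
        # M-step: re-estimate each class from its weighted samples.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj < 1e-8:
                continue  # empty class: keep the starting parameters
            weights[j] = nj / len(contrasts)
            means[j] = sum(r[j] * x for r, x in zip(resp, contrasts)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, contrasts)) / nj
            sds[j] = max(math.sqrt(var), min_sd)  # floor avoids collapsed clusters
    resp = _responsibilities(contrasts, weights, means, sds)
    calls = [max(range(k), key=lambda j: r[j]) for r in resp]
    return calls, means, sds


def separation_score(means, sds):
    """Illustrative per-SNP quality metric (an assumption, not the paper's).

    Smallest gap between adjacent class means, scaled by the sum of their
    standard deviations. Low values suggest overlapping genotype classes.
    """
    order = sorted(range(len(means)), key=lambda j: means[j])
    return min(
        abs(means[b] - means[a]) / (sds[a] + sds[b])
        for a, b in zip(order, order[1:])
    )


if __name__ == "__main__":
    rng = random.Random(0)
    # Simulated contrasts for 40 AA, 40 AB and 40 BB samples.
    data = [rng.gauss(m, 0.1) for m in (-0.8, 0.0, 0.8) for _ in range(40)]
    calls, means, sds = em_genotype_calls(data, init_means=[-0.7, 0.1, 0.7])
    print(calls[:3], [round(m, 2) for m in means])
    print("separation:", round(separation_score(means, sds), 2))
```

In a sketch like this, a SNP whose separation score falls below some chosen threshold would be flagged for filtering, mirroring the abstract's point that poorly behaving SNPs can be removed using a metric tied to genotype class separation. The threshold itself would have to be chosen empirically.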
Received July 28, 2006
Revised September 25, 2006
Accepted October 12, 2006
Article
SNiPer-HD: improved genotype calling accuracy by an expectation-maximization algorithm for high-density SNP arrays
Jianping Hua 1, David W. Craig 2, Marcel Brun 1, Jennifer Webster 2, Victoria Zismann 2, Waibhav Tembe 1, Keta Joshipura 2, Matthew J. Huentelman 2, Edward R. Dougherty 3, and Dietrich A. Stephan 2 *
2 Neurogenomics Division, Translational Genomics Research Institute, Phoenix, 445 N 5th St., Phoenix, AZ
3 Computation Biology Division, Translational Genomics Research Institute, Phoenix, 445 N 5th St., Phoenix, AZ; Department of Electrical & Computer Engineering, Texas A&M University, College Station, TX
Dietrich A. Stephan, E-mail: dstephan{at}tgen.org
Abstract
Associate Editor: Keith A Crandall

