Bioinformatics Advance Access originally published online on January 18, 2005
Bioinformatics 2005 21(9):1958-1963; doi:10.1093/bioinformatics/bti275
Dynamic model based algorithms for screening and genotyping over 100K SNPs on oligonucleotide microarrays
Affymetrix, Inc. 3380 Central Expressway, Santa Clara, CA 95051, USA
*To whom correspondence should be addressed.
Motivation: A high density of single nucleotide polymorphism (SNP) coverage on the genome is desirable and often an essential requirement for population genetics studies. Region-specific or chromosome-specific linkage studies also benefit from the availability of as many high-quality SNPs as possible. The availability of millions of SNPs from both Perlegen and the public domain, together with the development of an efficient microarray-based assay for genotyping SNPs, has raised some interesting analytical challenges. Effective methods for selecting optimal subsets of SNPs spanning the genome, and methods for accurately calling genotypes from probe hybridization patterns, have enabled the development of a new microarray-based system for robustly genotyping over 100 000 SNPs per sample.
Results: We introduce a new dynamic model-based algorithm (DM) for screening over 3 million SNPs and genotyping over 100 000 SNPs. The model is based on four possible underlying states for each probe quartet: Null, A, AB and B. We calculate a probe-level log likelihood for each model and then select between the four competing models with an SNP-level statistical aggregation across multiple probe quartets, which provides a high-quality genotype call along with a quality measure of the call. We assess performance with HapMap reference genotypes, informative Mendelian inheritance relationships in families, and consistency between DM and another genotype classification method, MPAM. At a call rate of 95.91%, the concordance with reference genotypes from the HapMap Project is 99.81% based on over 1.5 million genotypes; the Mendelian error rate is 0.018% based on 10 trios; and the consistency between DM and MPAM is 99.90% at a comparable call rate of 97.18%. We also develop methods for SNP selection and optimal probe selection.
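The four-state model selection described above can be illustrated with a minimal Python sketch. The abstract does not specify the likelihood form or the aggregation statistic, so everything here is an assumption made for illustration: each quartet is taken to be four intensities ordered (PM A, MM A, PM B, MM B), intensities are treated as Gaussian around hypothetical fixed `background` and `signal` levels with a shared `sigma`, log likelihoods are simply summed across quartets, and the quality measure is the margin between the best and second-best model. None of these choices should be read as the DM algorithm itself.

```python
import math

# Assumed expected pattern per model over (PM_A, MM_A, PM_B, MM_B):
# True means the probe is expected at the signal level, False at background.
PATTERNS = {
    "Null": (False, False, False, False),
    "A":    (True,  False, False, False),
    "AB":   (True,  False, True,  False),
    "B":    (False, False, True,  False),
}


def gaussian_logpdf(x, mean, sigma):
    """Log density of a normal distribution with the given mean and sigma."""
    return -0.5 * ((x - mean) / sigma) ** 2 - math.log(sigma * math.sqrt(2.0 * math.pi))


def quartet_log_likelihoods(quartet, background, signal, sigma):
    """Probe-level log likelihood of one quartet under each of the four models."""
    return {
        model: sum(
            gaussian_logpdf(x, signal if is_signal else background, sigma)
            for x, is_signal in zip(quartet, pattern)
        )
        for model, pattern in PATTERNS.items()
    }


def call_genotype(quartets, background=2.0, signal=10.0, sigma=1.0):
    """Sum log likelihoods over quartets (an assumed aggregation) and pick the best model.

    Returns (call, margin), where margin is the gap in total log likelihood
    between the best and second-best model, used here as a stand-in quality score.
    """
    totals = {model: 0.0 for model in PATTERNS}
    for quartet in quartets:
        scores = quartet_log_likelihoods(quartet, background, signal, sigma)
        for model, ll in scores.items():
            totals[model] += ll
    ranked = sorted(totals, key=totals.get, reverse=True)
    return ranked[0], totals[ranked[0]] - totals[ranked[1]]


# Hypothetical intensities for one SNP measured by two probe quartets.
quartets = [(10.1, 2.0, 2.2, 1.9), (9.7, 2.1, 1.8, 2.0)]
call, margin = call_genotype(quartets)
print(call, round(margin, 2))
```

In a setting like this, a call whose margin falls below a chosen threshold could be reported as a no-call, which is one way a call rate below 100% can be traded against accuracy.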
Availability: The DM algorithm is available in Affymetrix's Genotyping Tools software package and in Affymetrix's GDAS software package. See http://www.affymetrix.com for further information. 10K and 100K mapping array data are available on the Affymetrix website.
Contact: xiaojun_di{at}affymetrix.com