Bioinformatics Advance Access published online on November 2, 2005
Bioinformatics, doi:10.1093/bioinformatics/bti741
1 Department of Statistics, University of California, Berkeley, CA, USA
* To whom correspondence should be addressed.
Motivation: A classification algorithm based on a multi-chip, multi-SNP approach is proposed for Affymetrix SNP arrays. Current procedures for calling genotypes on SNP arrays process the features associated with one chip and one SNP at a time. Using a large training sample in which the genotype labels are known, we develop a supervised learning algorithm that yields more accurate classification results on new data. The proposed method, RLMM, is based on a robustly fitted linear model and uses the Mahalanobis distance for classification. Chip-to-chip non-biological variance is reduced through normalization. This model-based algorithm captures the similarities across genotype groups and probes, as well as across thousands of SNPs, for accurate classification. In this paper, we apply RLMM to Affymetrix 100K SNP array data, present classification results, and compare them to genotype calls obtained from the Affymetrix procedure DM, as well as to the publicly available genotype calls from the HapMap project.
Availability: The RLMM software is implemented in R and is available from Bioconductor or from the first author at.
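The classification step named in the abstract, assigning a sample to the genotype group at the smallest Mahalanobis distance, can be illustrated with a minimal sketch. It is not the RLMM implementation: it replaces the robustly fitted linear model with a plain sample mean and covariance per genotype group, omits normalization, and handles a single SNP. The training values are invented (allele A, allele B) log-intensity pairs, and the function names `mean_and_cov`, `mahalanobis_sq` and `call_genotype` are hypothetical.

```python
# Illustrative sketch only: nearest-group classification by Mahalanobis
# distance in two dimensions. Assumes ordinary (non-robust) estimates,
# unlike the robust fit described in the abstract.

def mean_and_cov(points):
    """Return the sample mean and 2x2 sample covariance of (x, y) points."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return (mx, my), (sxx, sxy, syy)


def mahalanobis_sq(point, mean, cov):
    """Squared Mahalanobis distance, using the closed-form 2x2 inverse."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy
    dx = point[0] - mean[0]
    dy = point[1] - mean[1]
    return (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det


def call_genotype(point, groups):
    """Return the label of the group closest to `point`."""
    return min(groups, key=lambda g: mahalanobis_sq(point, *groups[g]))


# Made-up training data: (allele A, allele B) log-intensity summaries
# for samples whose genotype is assumed known.
training = {
    "AA": [(10.0, 6.0), (10.4, 6.3), (9.7, 6.1), (10.2, 5.8)],
    "AB": [(8.5, 8.4), (8.8, 8.9), (8.3, 8.6), (8.6, 8.2)],
    "BB": [(6.1, 10.1), (6.4, 9.8), (5.9, 10.3), (6.2, 10.5)],
}

# One (mean, covariance) pair per genotype group.
groups = {label: mean_and_cov(pts) for label, pts in training.items()}

for sample in [(10.1, 6.0), (8.6, 8.5), (6.0, 10.2)]:
    print(sample, call_genotype(sample, groups))
```

Unlike the Euclidean distance, the Mahalanobis distance scales each direction by the spread of the group, so an elongated genotype cluster does not lose samples that lie along its long axis.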
Received July 26, 2005
Revised September 28, 2005
Accepted October 21, 2005
A genotype calling algorithm for Affymetrix SNP arrays
2 Department of Statistics, University of California, Berkeley, CA, USA; Walter and Eliza Hall Institute of Medical Research, Melbourne, Australia
Nusrat Rabbee, E-mail: nrabbee{at}post.harvard.edu