Bioinformatics Advance Access originally published online on August 12, 2004
Bioinformatics 2005 21(1):124-127; doi:10.1093/bioinformatics/bth470

Bioinformatics vol. 21 issue 1 © Oxford University Press 2005; all rights reserved.

SNP Chart: an integrated platform for visualization and interpretation of microarray genotyping data

Scott J. Tebbutt*, Igor V. Opushnyev, Ben W. Tripp, Ayaz M. Kassamali, Wendy L. Alexander and Marilyn I. Andersen

James Hogg iCAPTURE Centre for Cardiovascular and Pulmonary Research, St Paul's Hospital, University of British Columbia, Vancouver, Canada V6Z 1Y6

*To whom correspondence should be addressed.


    ABSTRACT

Summary: SNP Chart is a Java application for the visualization and interpretation of microarray genotyping data primarily derived from arrayed primer extension-based chemistries. Spot intensity output files from microarray analysis tools are imported into SNP Chart, together with a multi-channel TIFF image of the original array experiment and a list of the actual single nucleotide polymorphisms (SNPs) being tested. Data from different and/or replicate probes that interrogate the same SNP, but that are scattered across the array grid, can be reassembled into a single chart format, specific for the SNP. This allows a quick and very effective ‘visualization’/‘quality control’ of the data from multiple probes for the same SNP that can be easily interpreted and manually scored as a genotype.

Availability: http://www.snpchart.ca

Contact: stebbutt{at}mrl.ubc.ca

Supplementary information: A comprehensive manual describing SNP Chart is available at the above website, together with sample data files.


    INTRODUCTION
 
An important aspect of the Human Genome Project is the massive governmental and industry-sponsored effort to develop a dense set of biallelic markers (single nucleotide polymorphisms, SNPs) throughout the human genome (Wang et al., 1998). This effort has been spurred by the realization that a dense set of SNP markers throughout the genome could yield critical information to determine specific functional SNPs and combinations of SNPs that form the genetic basis of complex diseases (Risch and Merikangas, 1996). For research discovery purposes, there are a number of high-throughput genotyping technologies available [e.g. MALDI-TOF Sequenom (Buetow et al., 2001); TaqMan (Livak et al., 1995); and Pyrosequencing (Ahmadian et al., 2000)] that have been engineered to optimize the genotyping of large numbers of individuals for one SNP at a time.

Genotyping microarrays are devices displaying hundreds, or even thousands, of specific oligonucleotide probes, precisely located on a small-format solid support. These array-based technologies offer both research and potentially clinical (patient-specific) application due to the ability of the multiple probe sets to simultaneously interrogate multiple genetic markers (SNPs) from an individual. There are a number of microarray genotyping protocols, including Affymetrix GeneChips (Kennedy et al., 2003), Tagged/ZipCode Arrays [e.g. SBE-TAGS (Hirschhorn et al., 2000) and Illumina's bead-array system (Oliphant et al., 2002)] and arrayed primer extension [APEX (Kurg et al., 2000; Shumaker et al., 1996)].

APEX is a re-sequencing method, combining the advantages of a highly parallel microarray with the discriminatory power of the Sanger dideoxy terminator sequencing chemistry (Sanger et al., 1977). Research groups have developed APEX-based microarrays for a variety of genotyping and mutation detection assays, including thalassemia gene mutations (Chan et al., 2004; Gemignani et al., 2002), human chromosome 22 SNP markers (Dawson et al., 2002), xenobiotic metabolism- and DNA repair-related gene SNPs (Landi et al., 2003), and genome-wide SNPs (Tebbutt et al., in press).

Microarray image analysis tools are reasonably good at identifying array probe features (spots) and extracting appropriate intensity values from multiple dye channels, but are ultimately designed for gene expression studies, not for genotyping. To our knowledge there is only one software package specifically designed for microarray-based genotyping using arrayed primer extension. Genorama is a proprietary image analysis software package designed by Asper Biotech Ltd (www.asperbio.com) that is capable of detecting all four colours of fluorescence emitted from the dyes used in an APEX experiment, and then automatically calling the base(s) incorporated at a particular probe spot. However, the scoring algorithm treats all probes equally, and can sometimes give an erroneous score that has three bases. This is an obvious problem, as a genotype consists of two bases and not three, and hence considerable inspection of the original array data may be required to make a final genotype call. Thus, Genorama is a base-calling algorithm and not a true genotyping algorithm. Furthermore, Genorama requires any duplicate spots of the same probe to be positioned adjacently in the microarray grid, limiting the robustness of the experimental design in overcoming issues such as random pin blockage during chip printing, localized hybridization failure and high local background problems.


    PROGRAM OVERVIEW
 
SNP Chart is a visualization tool written in the Java language. It is platform independent: it can be run on any operating system (Windows, Linux, Unix, Novell, Mac OS or any other) that can implement the Java run-time environment. The architecture of the application follows the Model-View-Controller paradigm, is based on universal reusable components, and supports open standards. Functionality can be easily extended with plug-ins that implement integration with new data sources or perform new analyses, including statistical algorithms.
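The article does not describe the plug-in interface itself, so the following is only a minimal sketch of how an analysis plug-in contract of this kind might look. The names `AnalysisPlugin`, `MeanIntensityPlugin` and `PluginRegistry` are hypothetical and are not taken from SNP Chart.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical plug-in contract; SNP Chart's real plug-in API is not given in the article.
interface AnalysisPlugin {
    String name();
    double analyze(List<Double> intensities);
}

// Example plug-in: arithmetic mean of the supplied spot intensities.
final class MeanIntensityPlugin implements AnalysisPlugin {
    @Override
    public String name() {
        return "mean-intensity";
    }

    @Override
    public double analyze(List<Double> intensities) {
        if (intensities.isEmpty()) {
            throw new IllegalArgumentException("no intensities supplied");
        }
        double sum = 0.0;
        for (double value : intensities) {
            sum += value;
        }
        return sum / intensities.size();
    }
}

// Minimal registry: the core application looks plug-ins up by name,
// so new analyses can be added without changing the core code.
final class PluginRegistry {
    private final Map<String, AnalysisPlugin> plugins = new LinkedHashMap<>();

    void register(AnalysisPlugin plugin) {
        plugins.put(plugin.name(), plugin);
    }

    double run(String name, List<Double> intensities) {
        AnalysisPlugin plugin = plugins.get(name);
        if (plugin == null) {
            throw new IllegalArgumentException("unknown plug-in: " + name);
        }
        return plugin.analyze(intensities);
    }
}
```

In this sketch a statistical algorithm would be added by writing another `AnalysisPlugin` implementation and registering it, leaving the model and view code untouched.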

Data can be stored in any database supporting the ANSI SQL-92 standard. User authentication can be provided either against an LDAP server or using a built-in component. To change the back-end database or the authentication mechanism, appropriate alterations can be made to the configuration file. The default configuration for the enterprise version uses the enterprise-level IBM DB2 Universal Database and LDAP authentication.
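The format of the configuration file is not given in the article. As an illustration only, a file-driven choice of back end could be read with the standard `java.util.Properties` class as sketched below. The keys `db.url` and `auth.mode`, the host name and the class `BackendConfig` are assumptions made for this example.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical configuration holder; key names are assumptions, not SNP Chart's own.
final class BackendConfig {
    final String jdbcUrl;   // JDBC URL of the back-end database
    final String authMode;  // "ldap" or "builtin"

    private BackendConfig(String jdbcUrl, String authMode) {
        this.jdbcUrl = jdbcUrl;
        this.authMode = authMode;
    }

    // Parses properties-format text; auth.mode falls back to "ldap" when absent.
    static BackendConfig parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        String url = props.getProperty("db.url");
        if (url == null) {
            throw new IllegalArgumentException("db.url is required");
        }
        String auth = props.getProperty("auth.mode", "ldap");
        if (!auth.equals("ldap") && !auth.equals("builtin")) {
            throw new IllegalArgumentException("unsupported auth.mode: " + auth);
        }
        return new BackendConfig(url, auth);
    }

    public static void main(String[] args) {
        // Placeholder host and database name, for illustration only.
        BackendConfig config = parse("db.url=jdbc:db2://dbhost:50000/SNPCHART\nauth.mode=builtin\n");
        System.out.println(config.jdbcUrl + " / " + config.authMode);
    }
}
```

Keeping both choices in one file means that switching database or authentication mechanism requires no code change, which is the behaviour the paragraph above describes.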

For illustrative purposes, SNP Chart has three major functionalities: data import, data visualization with user genotype calling, and data export. Functionalities can be differentially assigned to multiple users. Multichannel spot intensity data from genotyping microarrays are imported, along with colour TIFF images of the actual arrays themselves, SNP-specific oligonucleotide probe information, HTTP links to public database resources, such as NCBI (http://www.ncbi.nih.gov/SNP/) and the SNP Consortium (http://snp.cshl.org/), and any other type of information deemed appropriate.

Experimental data from a single sample array or multiple sample arrays can be viewed by selecting an individual SNP ‘rs’ number (dbSNP). A SNP-specific chart is generated for each array (Fig. 1, 1), displaying all spot features and channel intensity measurements from multiple types of probes [including APEX probes and allele-specific (AS) APEX probes (Gemignani et al., 2002)] that were originally scattered across the microarray grid, but that provide information on a single SNP. The array colour TIFF image can also be accessed, with three different views displayed: the actual spot feature for a selected intensity data point; the sub-grid of the array; and the entire array grid (Fig. 1, 2). ‘Prototype’ charts for the selected SNP, specific to validated genotypes (e.g. CC, CT, TT and NEGative control) can also be displayed (Fig. 1, 3), allowing easy discrimination of the chart under review. A genotype call can be made by way of the Scoring and Genotyping panel (Fig. 1, 4). Automation of the calling of genotypes, based on the information displayed in the chart, is under development. This will allow faster analysis and reduce user-subjectivity issues. Nevertheless, it is unlikely that any automatic scoring algorithm will be perfect for all SNPs, and the data visualization of SNP Chart provides a useful ‘manual-override’ in cases of null calls.
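The reassembly of scattered probes into one per-SNP view can be sketched as a simple grouping step. This is a minimal illustration under stated assumptions, not SNP Chart's own data model: the `Spot` fields, the class `SnpGrouping` and the rs identifiers used in the usage note are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// One spot feature on the array. All field names are illustrative assumptions.
final class Spot {
    final String rsId;        // dbSNP "rs" identifier the probe interrogates
    final String probeType;   // e.g. "APEX" or "AS-APEX"
    final int row;            // grid position on the array
    final int col;
    final double[] channels;  // one intensity value per dye channel

    Spot(String rsId, String probeType, int row, int col, double[] channels) {
        this.rsId = rsId;
        this.probeType = probeType;
        this.row = row;
        this.col = col;
        this.channels = channels.clone();
    }
}

final class SnpGrouping {
    // Collects spots from anywhere on the grid into one list per SNP.
    // Within each list, spots keep the order in which they were read.
    static Map<String, List<Spot>> bySnp(List<Spot> spots) {
        Map<String, List<Spot>> grouped = new TreeMap<>();
        for (Spot spot : spots) {
            grouped.computeIfAbsent(spot.rsId, key -> new ArrayList<>()).add(spot);
        }
        return grouped;
    }
}
```

For example, two spots labelled with the placeholder identifier `rs1001` at grid positions (2, 7) and (14, 3) end up in the same list, which is the data a single SNP-specific chart would draw from. Because grouping is by identifier rather than by position, replicate spots do not need to be adjacent on the array.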

The data export function allows single/multiple SNP data from selected experiments and samples to be downloaded to an Excel file. Three export formats are available: ‘Intensities Export’ retrieves all data, including channel intensities for each spot; ‘Scorer Genotypes Export’ and ‘Final Genotypes Export’ deliver the genotype calls without any associated spot/probe data, and are in a format acceptable for genetic epidemiological analysis programs.
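The exact layout of the exported files is not specified in the article. As a rough sketch of a genotype-only export, the example below writes one row per sample and one column per SNP as comma-separated text, which Excel can open. The class `GenotypeExport`, the column layout and the use of an empty cell for a missing call are assumptions made for this illustration.

```java
import java.util.List;
import java.util.Map;

// Hypothetical genotype-only export: no spot or probe data, just final calls.
final class GenotypeExport {
    // Builds CSV text with header "sample,<rs ids...>" and one row per sample.
    // A missing call for a SNP is written as an empty cell.
    static String toCsv(List<String> snpIds, Map<String, Map<String, String>> callsBySample) {
        StringBuilder out = new StringBuilder("sample");
        for (String rsId : snpIds) {
            out.append(',').append(rsId);
        }
        out.append('\n');
        for (Map.Entry<String, Map<String, String>> entry : callsBySample.entrySet()) {
            out.append(entry.getKey());
            for (String rsId : snpIds) {
                String call = entry.getValue().get(rsId);
                out.append(',').append(call == null ? "" : call);
            }
            out.append('\n');
        }
        return out.toString();
    }
}
```

A wide sample-by-SNP table is one common input shape for downstream genetic analysis programs; an intensities export would instead need one row per spot with its channel values.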

To evaluate the accuracy of SNP Chart, we genotyped 12 Coriell DNA samples (http://coriell.umdnj.edu/) across 123 SNPs (Tebbutt et al., in press). We were able to compare our microarray-based data (scored using SNP Chart) against 1141 genotypes that had been determined by other research groups. Of these 1141, we found 1124 to be identical to our data, with a single null call (0.1%) and 16 miscalls (1.4%), for a combined error rate of 1.5%.
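The percentages above follow directly from the reported counts. The short check below recomputes them; the class name `Concordance` is ours, and the only inputs are the four counts stated in the text.

```java
// Recomputes the reported rates from the counts given in the text.
final class Concordance {
    // Percentage of 'count' relative to 'total'.
    static double percent(int count, int total) {
        if (total <= 0) {
            throw new IllegalArgumentException("total must be positive");
        }
        return 100.0 * count / total;
    }

    public static void main(String[] args) {
        int compared = 1141;   // genotypes available for comparison
        int identical = 1124;  // calls matching the external genotypes
        int nullCalls = 1;     // no call made
        int miscalls = 16;     // call made but discordant

        // The three categories must account for every compared genotype.
        if (identical + nullCalls + miscalls != compared) {
            throw new IllegalStateException("counts do not add up");
        }
        System.out.printf("concordant: %.2f%%%n", percent(identical, compared));
        System.out.printf("null calls: %.2f%%%n", percent(nullCalls, compared));
        System.out.printf("miscalls:   %.2f%%%n", percent(miscalls, compared));
        System.out.printf("combined:   %.2f%%%n", percent(nullCalls + miscalls, compared));
    }
}
```

The three categories sum to 1141, and 17 errors out of 1141 comparisons rounds to the 1.5% combined error rate quoted above.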

In summary, SNP Chart allows users to collect, store, and request data from multiple array genotyping experiments and analyses. The software generates visual patterns of spot intensity values from multiple channels, from a multiple probe set specific for a given SNP, easily interpretable as a specific genotype. The advantage over existing array data display methods is that one can easily look at an entire multiple probe set for a specific SNP, which can be more informative than looking at individual probes separately. The authors recognize that automated calling of the genotypes is required to further enhance SNP Chart. Nevertheless, the current software is a valuable tool for manual quality control of microarray genotyping data. SNP Chart could also be applied to expression array data, where multiple probes interrogate the same gene, or similar genes and/or gene pathways.



 
Fig. 1 Example of microarray data visualization in SNP Chart. See text for details.

 

    Acknowledgments
 
We would like to thank Jian Ruan for laboratory technical assistance, Kelly Burkett, Jian Qing He and Denise Daley for helpful comments and Peter Paré for continued support. This research was supported by the Canadian Institutes of Health Research, CANARIE, and the Michael Smith Foundation for Health Research.

Received on May 27, 2004; revised on July 13, 2004; accepted on August 8, 2004

    REFERENCES
 

    Ahmadian, A., Gharizadeh, B., Gustafsson, A.C., Sterky, F., Nyren, P., Uhlen, M., Lundeberg, J. (2000) Single-nucleotide polymorphism analysis by pyrosequencing. Anal. Biochem., 280, 103–110.

    Buetow, K.H., Edmonson, M., MacDonald, R., Clifford, R., Yip, P., Kelley, J., Little, D.P., Strausberg, R., Koester, H., Cantor, C.R., Braun, A. (2001) High-throughput development and characterization of a genomewide collection of gene-based single nucleotide polymorphism markers by chip-based matrix-assisted laser desorption/ionization time-of-flight mass spectrometry. Proc. Natl Acad. Sci. USA, 98, 581–584.

    Chan, K., Wong, M.S., Chan, T.K., Chan, V. (2004) A thalassaemia array for Southeast Asia. Br. J. Haematol., 124, 232–239.

    Dawson, E., Abecasis, G.R., Bumpstead, S., Chen, Y., Hunt, S., Beare, D.M., Pabial, J., Dibling, T., Tinsley, E., Kirby, S., et al. (2002) A first-generation linkage disequilibrium map of human chromosome 22. Nature, 418, 544–548.

    Gemignani, F., Perra, C., Landi, S., Canzian, F., Kurg, A., Tonisson, N., Galanello, R., Cao, A., Metspalu, A., Romeo, G. (2002) Reliable detection of beta-thalassemia and G6PD mutations by a DNA microarray. Clin. Chem., 48, 2051–2054.

    Hirschhorn, J.N., Sklar, P., Lindblad-Toh, K., Lim, Y.M., Ruiz-Gutierrez, M., Bolk, S., Langhorst, B., Schaffner, S., Winchester, E., Lander, E.S. (2000) SBE-TAGS: an array-based method for efficient single-nucleotide polymorphism genotyping. Proc. Natl Acad. Sci. USA, 97, 12164–12169.

    Kennedy, G.C., Matsuzaki, H., Dong, S., Liu, W.M., Huang, J., Liu, G., Su, X., Cao, M., Chen, W., Zhang, J., et al. (2003) Large-scale genotyping of complex DNA. Nat. Biotechnol., 21, 1233–1237.

    Kurg, A., Tonisson, N., Georgiou, I., Shumaker, J., Tollett, J., Metspalu, A. (2000) Arrayed primer extension: solid-phase four-color DNA resequencing and mutation detection technology. Genet. Test, 4, 1–7.

    Landi, S., Gemignani, F., Gioia-Patricola, L., Chabrier, A., Canzian, F. (2003) Evaluation of a microarray for genotyping polymorphisms related to xenobiotic metabolism and DNA repair. BioTechniques, 35, 816–820, 822, 824–827.

    Livak, K.J., Flood, S.J., Marmaro, J., Giusti, W., Deetz, K. (1995) Oligonucleotides with fluorescent dyes at opposite ends provide a quenched probe system useful for detecting PCR product and nucleic acid hybridization. PCR Methods Appl., 4, 357–362.

    Oliphant, A., Barker, D.L., Stuelpnagel, J.R., Chee, M.S. (2002) BeadArray technology: enabling an accurate, cost-effective approach to high-throughput genotyping. BioTechniques, Suppl., 56–58, 60–61.

    Risch, N. and Merikangas, K. (1996) The future of genetic studies of complex human diseases. Science, 273, 1516–1517.

    Sanger, F., Nicklen, S., Coulson, A.R. (1977) DNA sequencing with chain-terminating inhibitors. Proc. Natl Acad. Sci. USA, 74, 5463–5467.

    Shumaker, J.M., Metspalu, A., Caskey, C.T. (1996) Mutation detection by solid phase primer extension. Hum. Mutat., 7, 346–354.

    Tebbutt, S.J., Burkett, K.M., He, J-Q., Ruan, J., Opushnyev, I.V., Tripp, B.W., Zeznik, J.A., Abara, C.O., Nelson, C.C., Walley, K.R. (2004) A microarray genotyping resource to determine population stratification in genetic association studies of complex disease. BioTechniques, in press.

    Wang, D.G., Fan, J.B., Siao, C.J., Berno, A., Young, P., Sapolsky, R., Ghandour, G., Perkins, N., Winchester, E., Spencer, J., et al. (1998) Large-scale identification, mapping, and genotyping of single-nucleotide polymorphisms in the human genome. Science, 280, 1077–1082.





