Bioinformatics Advance Access originally published online on January 12, 2005
Bioinformatics 2005 21(9):2123-2125; doi:10.1093/bioinformatics/bti264
ALOHOMORA: a tool for linkage analysis using 10K SNP array data
1Bioinformatics Department, Gene Mapping Center, Max Delbrück Center (MDC) for Molecular Medicine Berlin-Buch, Germany
2Gene Mapping Center, Max Delbrück Center (MDC) for Molecular Medicine Berlin-Buch, Germany
3Cologne Center for Genomics, University of Cologne, Cologne, Germany
*To whom correspondence should be addressed at Gene Mapping Center, Max Delbrück Center for Molecular Medicine (MDC), Berlin-Buch, Robert-Roessle-Strasse 10, 13092 Berlin, Germany.
| Abstract |
|---|
Summary: ALOHOMORA is a software tool designed to facilitate genome-wide linkage studies performed with high-density single nucleotide polymorphism (SNP) marker panels such as the Affymetrix GeneChip® Human Mapping 10K Array. Genotype data are converted into appropriate formats for a number of common linkage programs and subjected to standard quality control routines before linkage runs are started. ALOHOMORA is written in Perl and may be used to perform state-of-the-art linkage scans in small and large families with any genetic model. Options for using different genetic maps or ethnicity-specific allele frequencies are implemented. Graphic outputs of whole-genome multipoint LOD score values are provided for the entire dataset as well as for individual families.
Availability: ALOHOMORA is available free of charge for non-commercial research institutions. For more details, see http://gmc.mdc-berlin.de/alohomora/
Contact: fruesch{at}mdc-berlin.de
In the past two decades, positional cloning via genome-wide linkage analysis in families has been a powerful approach to the elucidation not only of numerous Mendelian diseases but also of common diseases, provided that high-risk alleles were involved in some of the families (Botstein and Risch, 2003; Carlson et al., 2004). Until recently, it was common practice to use a panel of about 400 microsatellites at 10 cM average intermarker distance from the well-defined human genetic map for linkage analysis (Murray et al., 1994; Dib et al., 1996; Kong et al., 2002). However, recent progress in SNP discovery and genotyping provides the opportunity to use this marker type for linkage analysis as well (Collins et al., 1997; Matise et al., 2003). If properly selected, only twice as many SNPs as highly polymorphic microsatellites have to be analyzed to extract the same amount of linkage information from the studied families (Evans and Cardon, 2004). Moreover, the traditional 10 cM microsatellite scan is likely to miss linkage signals because of the low information content associated with this sparse map of markers (Evans and Cardon, 2004; Middleton et al., 2004; John et al., 2004). Thus, a high-density SNP panel for linkage analysis should comprise several thousand markers. A convenient tool to genotype >10 000 SNPs approximately equally distributed over the whole genome in a single experiment is the Affymetrix GeneChip® Human Mapping 10K Array (Kennedy et al., 2003; Matsuzaki et al., 2004). The substantial increase in the number of markers raises a problem for data analysis, as some linkage programs were designed for the requirements of conventional marker sets with few markers. For instance, Genehunter 2.1 (Kruglyak et al., 1996) is restricted to 300 markers and Simwalk2 (Sobel and Lange, 1996) to only 31. This may be partially overcome by using recompiled versions of the programs that allow for a higher maximum number of markers. Alternatively, the analysis may be performed with subsets of markers using a sliding window mode.
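The window-based alternative can be illustrated with a short sketch. It is not taken from ALOHOMORA itself: the function name `marker_windows` and its parameters are hypothetical, and the only figure borrowed from the text is the 300-marker limit quoted for Genehunter 2.1.

```python
from typing import List, Optional, Sequence


def marker_windows(markers: Sequence[str], size: int,
                   step: Optional[int] = None) -> List[List[str]]:
    """Split an ordered marker list into windows of at most `size` markers.

    `step` is the offset between the starts of consecutive windows.
    Leaving it unset gives non-overlapping windows (step == size).
    """
    if size <= 0:
        raise ValueError("size must be positive")
    step = size if step is None else step
    if step <= 0:
        raise ValueError("step must be positive")

    windows = []
    start = 0
    while start < len(markers):
        windows.append(list(markers[start:start + size]))
        if start + size >= len(markers):
            break  # the last window already reaches the end of the chromosome
        start += step
    return windows


# Hypothetical chromosome with 1000 ordered SNPs, cut for a 300-marker limit.
snps = [f"SNP_{i}" for i in range(1000)]
chunks = marker_windows(snps, size=300)
```

With `step` left at its default the windows do not overlap; a smaller `step` produces overlapping windows, at the cost of more linkage runs per chromosome.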
When we started using the Mapping 10K SNP Array for linkage analysis, no software was available to import the data into the common linkage programs. Therefore, we developed our own program, ALOHOMORA, which converts Affymetrix genotype data into linkage and haplotype information. The program is written in Perl/Tk and runs under Windows and Linux. The current version accepts genotype data as generated by the GeneChip DNA Analysis Software (GDAS v3.0) from Affymetrix.
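To give a feel for the kind of conversion involved, the following minimal sketch recodes biallelic genotype calls into numbered allele pairs and assembles one row of a LINKAGE-style (pre-makeped) pedigree file. The call labels (`AA`, `AB`, `BB`, anything else treated as missing), the function names and the sample values are assumptions made for illustration; the GDAS export layout and ALOHOMORA's own Perl code are not reproduced here.

```python
from typing import Iterable, Tuple

# Assumed call labels; anything not listed here is treated as a missing genotype.
CALL_TO_ALLELES = {"AA": ("1", "1"), "AB": ("1", "2"), "BB": ("2", "2")}
MISSING = ("0", "0")


def recode_call(call: str) -> Tuple[str, str]:
    """Map one genotype call to a pair of numbered alleles."""
    return CALL_TO_ALLELES.get(call.strip().upper(), MISSING)


def pedigree_row(family: str, individual: str, father: str, mother: str,
                 sex: int, affection: int, calls: Iterable[str]) -> str:
    """Build one pre-makeped row: six pedigree columns, then two alleles per marker.

    sex: 1 = male, 2 = female; affection: 0 = unknown, 1 = unaffected, 2 = affected.
    """
    fields = [family, individual, father, mother, str(sex), str(affection)]
    for call in calls:
        fields.extend(recode_call(call))
    return " ".join(fields)


# Hypothetical affected son (individual 3) of parents 1 and 2 in family F1.
row = pedigree_row("F1", "3", "1", "2", 1, 2, ["AA", "AB", "NoCall", "BB"])
```

Each target program expects its own variant of this layout plus separate map and frequency files, so a real converter needs one writer per program.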
With ALOHOMORA, a comprehensive quality control of the data can be performed by calling other freely available programs. The gender of each sample is checked by counting the heterozygous SNPs on the X chromosome and comparing the result with the pedigree file information. The correct relationships within the families are checked by the program GRR (Abecasis et al., 2001). PedCheck is used for detection of Mendelian errors (O'Connell and Weeks, 1998). SNPs with Mendelian errors and SNPs that are not informative for any individual of a dataset can be selectively removed from the data. Non-Mendelian errors are identified by the Merlin option 'error' (Abecasis et al., 2002), and the unlikely genotypes are deleted in the individuals in which they occur. Further options are the chip version used (SNP content may differ between chip versions), the preferred genetic map and the allele frequencies for the appropriate ethnicity (Fig. 1A). For linkage analysis, data may be converted for Allegro (Gudbjartsson et al., 2000), Genehunter, Merlin and Simwalk2. For the chosen program, the user can define the genetic model when parametric analysis is performed, set the size of a moving window and select linkage program-specific options (Fig. 1B).
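The gender check lends itself to a brief illustration. Males carry a single X chromosome, so outside the pseudoautosomal regions they should show essentially no heterozygous calls. The sketch below is one possible way to apply that idea and is not ALOHOMORA's implementation: the 2% threshold, the call labels and all function names are assumptions.

```python
from typing import Dict, List, Sequence

CALLED = ("AA", "AB", "BB")  # assumed call labels; other values count as no-call


def infer_sex_from_x(x_calls: Sequence[str], max_het_rate: float = 0.02) -> int:
    """Return 1 (male), 2 (female) or 0 (undetermined) from X-chromosome calls.

    max_het_rate is an assumed tolerance for genotyping errors in males.
    """
    called = [c for c in x_calls if c in CALLED]
    if not called:
        return 0
    het_rate = sum(c == "AB" for c in called) / len(called)
    return 1 if het_rate <= max_het_rate else 2


def sex_mismatches(pedigree_sex: Dict[str, int],
                   x_calls_by_sample: Dict[str, Sequence[str]],
                   max_het_rate: float = 0.02) -> List[str]:
    """List samples whose inferred sex disagrees with the pedigree file."""
    flagged = []
    for sample, recorded in pedigree_sex.items():
        inferred = infer_sex_from_x(x_calls_by_sample.get(sample, ()), max_het_rate)
        if inferred != 0 and inferred != recorded:
            flagged.append(sample)
    return flagged


# Hypothetical data: sample "s2" is recorded as male but looks female.
pedigree = {"s1": 1, "s2": 1, "s3": 2}
calls = {
    "s1": ["AA", "BB", "AA", "BB", "NoCall"],
    "s2": ["AB", "AA", "AB", "BB", "AB"],
    "s3": ["AB", "AB", "AA", "BB", "AB"],
}
suspects = sex_mismatches(pedigree, calls)
```

A flagged sample may point to a sample swap or a pedigree file error, so it is worth resolving before the relationship and Mendelian checks are run.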
Non-parametric LOD score calculations are preferably performed with Merlin or Allegro, chromosome by chromosome, using all SNPs on a chromosome simultaneously for a multipoint analysis. No limitation on the number of markers was observed up to 945 SNPs, the number available for chromosome 2. Parametric linkage analysis was performed with Allegro v1.2, Genehunter 2.1v5 and Simwalk2 v2.89. Because Genehunter and Simwalk2 are limited in the number of markers they accept, the analysis was done with subsets of markers using a non-overlapping moving window. For Genehunter, window sizes of 50–300 were used and pedigrees were limited to max bits <20. Simwalk2 was recompiled to use up to 255 markers in one run. For large pedigrees, where Genehunter drops individuals and both Allegro and Merlin skip the pedigree, we split the pedigree into appropriate sizes to run Allegro, Merlin or Genehunter and used Simwalk2 to analyze the pedigree as a whole. NPL and parametric LOD scores may be plotted for all chromosomes.
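The pedigree size limit can be made concrete with a small sketch. It assumes the usual Lander-Green measure of pedigree complexity used by Genehunter, 2n − f bits for n non-founders and f founders, and reads "max bits <20" as a strict upper bound; both points, along with the function names and the example pedigree, are assumptions rather than details given in the text.

```python
from typing import Dict, Optional, Tuple

# individual -> (father, mother); founders have (None, None)
Pedigree = Dict[str, Tuple[Optional[str], Optional[str]]]


def pedigree_bits(pedigree: Pedigree) -> int:
    """Pedigree complexity as 2n - f (n non-founders, f founders)."""
    founders = sum(1 for father, mother in pedigree.values()
                   if father is None and mother is None)
    nonfounders = len(pedigree) - founders
    return 2 * nonfounders - founders


def within_bit_limit(pedigree: Pedigree, max_bits: int = 20) -> bool:
    """True if the pedigree stays below the assumed bit limit."""
    return pedigree_bits(pedigree) < max_bits


# Hypothetical three-generation family: 3 founders, 4 non-founders.
family = {
    "gf": (None, None), "gm": (None, None),
    "p1": ("gf", "gm"), "p2": ("gf", "gm"),
    "sp": (None, None),
    "c1": ("p1", "sp"), "c2": ("p1", "sp"),
}
bits = pedigree_bits(family)  # 2 * 4 - 3
```

A pedigree that fails this check is a candidate for splitting into smaller units, or for analysis as a whole with a program that does not share this limit, as described above.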
All four programs, Allegro, Merlin, Genehunter and Simwalk2, generate haplotypes. We mostly used the haplotypes from Genehunter's haplo.dump file. For visualization purposes, we developed HaploPainter as a user-friendly tool for the handling of haplotype information in extended pedigrees (Thiele and Nürnberg, 2005).
In conclusion, linkage mapping with high-density SNP arrays is expected to greatly expedite positional cloning studies. While genotyping with SNP arrays is straightforward, difficulties in data analysis seem to have prevented a broader application so far. Here, we present ALOHOMORA, a program with a graphical user interface, as open-source software for the scientific community to facilitate linkage analysis with chip data. Using this program, we successfully analyzed several genome scans performed with the Affymetrix GeneChip® Human Mapping 10K SNP array. These projects were not restricted to autozygosity mapping but included both recessive and dominant traits (Kaindl et al., 2004; Uhlenberg et al., 2004; Janecke et al., 2004). Altogether, we suggest that Mapping 10K, ALOHOMORA and HaploPainter form a well-matched toolbox for high-speed gene mapping.
| Acknowledgments |
|---|
We would like to thank our colleagues and guests of the Gene Mapping Center at the MDC for their advice and software testing. This work was funded by the Federal Ministry of Science and Education of Germany through the National Genome Research Network.
Received on December 4, 2004; revised on January 6, 2005; accepted on January 6, 2005
| REFERENCES |
|---|
Abecasis, G.R., Cherny, S.S., Cookson, W.O., Cardon, L.R. (2001) GRR: graphical representation of relationship errors. Bioinformatics, 17, 742–743.
Abecasis, G.R., Cherny, S.S., Cookson, W.O., Cardon, L.R. (2002) Merlin – rapid analysis of dense genetic maps using sparse gene flow trees. Nat. Genet., 30, 97–101.
Botstein, D. and Risch, N. (2003) Discovering genotypes underlying human phenotypes: past successes for Mendelian disease, future approaches for complex disease. Nat. Genet., 33, Suppl., S228–S237.
Carlson, C.S., Eberle, M.A., Kruglyak, L., Nickerson, D.A. (2004) Mapping complex disease loci in whole-genome association studies. Nature, 429, 446–452.
Collins, F.S., Guyer, M.S., Chakravarti, A. (1997) Variations on a theme: cataloguing human DNA sequence variation. Science, 278, 1580–1581.
Dib, C., Faure, S., Fizames, C., Samson, D., Drouot, N., Vignal, A., Millasseau, P., Marc, S., Hazan, J., Seboun, E., et al. (1996) A comprehensive genetic map of the human genome based on 5,264 microsatellites. Nature, 380, 152–154.
Evans, D.M. and Cardon, L.R. (2004) Guidelines for genotyping in genomewide linkage studies: single-nucleotide-polymorphism maps versus microsatellite maps. Am. J. Hum. Genet., 75, 687–692.
Gudbjartsson, D.F., Jonasson, K., Frigge, M.L., Kong, A. (2000) Allegro, a new computer program for multipoint linkage analysis. Nat. Genet., 25, 12–13.
Janecke, A.R., Thompson, D.A., Utermann, G., Becker, C., Hübner, C.A., Schmid, E., McHenry, C.L., Nair, A.R., Rüschendorf, F., Heckenlively, J., et al. (2004) Mutations in RDH12 encoding a photoreceptor cell retinol dehydrogenase cause childhood-onset severe retinal dystrophy. Nat. Genet., 36, 850–854.
John, S., Shephard, N., Liu, G., Zeggini, E., Cao, M., Chen, W., Vasavda, N., Mills, T., Barton, A., Hinks, A., et al. (2004) Whole-genome scan, in a complex disease, using 11,245 single-nucleotide polymorphisms: comparison with microsatellites. Am. J. Hum. Genet., 75, 54–64.
Kaindl, A.M., Rüschendorf, F., Krause, S., Goebel, H.H., Koehler, K., Becker, C., Pongratz, D., Müller-Höcker, J., Nürnberg, P., Stoltenburg-Didinger, G., et al. (2004) Missense mutations of ACTA1 cause dominant congenital myopathy with cores. J. Med. Genet., 41, 842–848.
Kennedy, G.C., Matsuzaki, H., Dong, S., Liu, W.M., Huang, J., Liu, G., Su, X., Cao, M., Chen, W., Zhang, J., et al. (2003) Large-scale genotyping of complex DNA. Nat. Biotechnol., 21, 1233–1237.
Kong, A., Gudbjartsson, D.F., Sainz, J., Jonsdottir, G.M., Gudjonsson, S.A., Richardsson, B., Sigurdardottir, S., Barnard, J., Hallbeck, B., Masson, G., et al. (2002) A high-resolution recombination map of the human genome. Nat. Genet., 31, 241–247.
Kruglyak, L., Daly, M.J., Reeve-Daly, M.P., Lander, E.S. (1996) Parametric and nonparametric linkage analysis: a unified multipoint approach. Am. J. Hum. Genet., 58, 1347–1363.
Matise, T.C., Sachidanandam, R., Clark, A.G., Kruglyak, L., Wijsman, E., Kakol, J., Buyske, S., Chui, B., Cohen, P., de Toma, C., et al. (2003) A 3.9-centimorgan-resolution human single-nucleotide polymorphism linkage map and screening set. Am. J. Hum. Genet., 73, 271–284.
Matsuzaki, H., Loi, H., Dong, S., Tsai, Y.Y., Fang, J., Law, J., Di, X., Liu, W.M., Yang, G., Liu, G., et al. (2004) Parallel genotyping of over 10000 SNPs using a one-primer assay on a high-density oligonucleotide array. Genome Res., 14, 414–425. [Erratum (2004) Genome Res., 14, 786.]
Middleton, F.A., Pato, M.T., Gentile, K.L., Morley, C.P., Zhao, X., Eisener, A.F., Brown, A., Petryshen, T.L., Kirby, A.N., Medeiros, H., et al. (2004) Genomewide linkage analysis of bipolar disorder by use of a high-density single-nucleotide-polymorphism (SNP) genotyping assay: a comparison with microsatellite marker assays and finding of significant linkage to chromosome 6q22. Am. J. Hum. Genet., 74, 886–897.
Murray, J.C., Buetow, K.H., Weber, J.L., Ludwigsen, S., Scherpbier-Heddema, T., Manion, F., Quillen, J., Sheffield, V.C., Sunden, S., Duyk, G.M., et al. (1994) A comprehensive human linkage map with centimorgan density. Cooperative Human Linkage Center (CHLC). Science, 265, 2049–2054.
O'Connell, J.R. and Weeks, D.E. (1998) PedCheck: a program for identification of genotype incompatibilities in linkage analysis. Am. J. Hum. Genet., 63, 259–266.
Sobel, E. and Lange, K. (1996) Descent graphs in pedigree analysis: applications to haplotyping, location scores, and marker sharing statistics. Am. J. Hum. Genet., 58, 1323–1337.
Thiele, H. and Nürnberg, P. (2005) HaploPainter: a tool for drawing pedigrees with complex haplotypes. Bioinformatics, in press.
Uhlenberg, B., Schuelke, M., Rüschendorf, F., Ruf, N., Kaindl, A.M., Henneke, M., Thiele, H., Stoltenburg-Didinger, G., Aksu, F., Topaloğlu, H., et al. (2004) Mutations in the gene encoding gap junction protein α12 (connexin 46.6) cause Pelizaeus-Merzbacher-like disease. Am. J. Hum. Genet., 75, 251–260.