Bioinformatics Advance Access originally published online on July 12, 2005
Bioinformatics 2005 21(17):3565-3567; doi:10.1093/bioinformatics/bti571
easyLINKAGE-Plus: automated linkage analyses using large-scale SNP data
1Institute of Medical Genetics, Charité, Humboldt University Berlin, Augustenburger Platz 1, 13353 Berlin, Germany
2Division of Nephrology, Department of Medicine, Medical University Clinic at the University of Würzburg, Josef-Schneider-Strasse 2, 97080 Würzburg, Germany
3Department of Clinical Biochemistry and Pathobiochemistry, Medical University Clinic at the University of Würzburg, Versbacher Strasse 5, 97078 Würzburg, Germany
*To whom correspondence should be addressed.
| Abstract |
|---|
Summary: We extended the original easyLINKAGE program to enable linkage analyses of large-scale SNP data in addition to microsatellite data. We implemented new modules for Allegro, Merlin, SimWalk, GeneHunter Imprinting, GeneHunter TwoLocus and SuperLink, and extended FastSLink with automatic loop breaking and new outputs. We added conditional linkage analyses as well as multipoint simulation studies, and extended the error test routines to check for Mendelian/non-Mendelian genotyping errors and for deviations from Hardy-Weinberg equilibrium. Data can be analyzed in sets of markers, in defined centimorgan intervals and with different allele frequency algorithms. The outputs consist of genome-wide as well as chromosomal postscript plots of LOD scores, NPL scores, P-values and other parameters.
Availability: http://www.uni-wuerzburg.de/nephrologie/molecular_genetics/molecular_genetics.htm
Contact: tom.lindner{at}mail.uni-wuerzburg.de
Supplementary information: Supplementary information is available on the website. The current version is v4.01beta.
SNP chips are becoming affordable, at costs almost equal to those of an average genome-wide scan of 400 microsatellite markers. The time savings are substantial, and fewer technicians and less infrastructure are required to perform SNP hybridization. Recent studies have shown that the information content obtained from SNP chip data is higher than that obtained from microsatellite scans (Schaid et al., 2004). Where several non-informative microsatellites cover a linked region that the user has not yet recognized as such, analyses could fail to detect linkage. SNP chip based reanalyses of genome-wide linkage studies that used microsatellite markers in affected sibpairs without parents have been proposed (Evans and Cardon, 2004).
We extended the first version of easyLINKAGE (Lindner and Hoffmann, 2005), which could only analyze STRPs, with the ability to analyze thousands of SNPs, such as those from the Affymetrix 10k chip (Fig. 1a). We implemented additional modules for Allegro (Gudbjartsson et al., 2000), Merlin (Abecasis et al., 2002), SimWalk (Weeks et al., 1995), GeneHunter Imprinting/TwoLocus (Strauch et al., 2000) and SuperLink (Fishelson and Geiger, 2002), and integrated conditional multipoint analyses including the generation of the required weight files (Cox et al., 1999).
We applied further extensions to the simulation part of easyLINKAGE-Plus. For FastSLink, loops are broken automatically before the pedigrees are fed into the analysis. Final plots show the ELOD scores and the pedigree structure. We integrated multipoint simulation studies by making use of Allegro's simulation capabilities. The user first provides a pedigree structure file, then sets various simulation options and generates an inheritance model for parametric/nonparametric analyses. easyLINKAGE performs the simulation based on those entries, followed automatically by a linkage analysis with graphical outputs.
Error check routines were also extended in the current version of our software. The program uses PedCheck (O'Connell and Weeks, 1998) to identify Mendelian errors and Merlin to identify non-Mendelian errors (e.g. unlikely genotypes owing to double recombination events) or deviations from Hardy-Weinberg equilibrium prior to running subsequent linkage programs. easyLINKAGE offers the option of disregarding all SNPs/STRPs with genotyping errors before starting the linkage run.
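To make the Hardy-Weinberg check concrete, the following minimal sketch runs a one-degree-of-freedom chi-square test on the genotype counts of a single biallelic SNP. It is an illustration only and is not taken from Merlin or easyLINKAGE; the function name `hwe_chi_square` and its arguments are hypothetical, and it assumes the counts come from unrelated individuals (e.g. founders).

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square test (1 df) for Hardy-Weinberg equilibrium at a biallelic marker.

    Hypothetical helper for illustration; counts are assumed to come from
    unrelated individuals. Returns (chi2, p_value).
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes supplied")
    # Allele frequency of A estimated from the genotype counts.
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1.0 - p
    # Expected genotype counts under Hardy-Weinberg proportions.
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Upper tail of a chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value
```

A marker whose P-value falls below a threshold chosen by the user could then be flagged for exclusion before the linkage run.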
The integration of SNP chip analyses is a great challenge for current multipoint linkage programs. In general, larger pedigrees have to be broken in such a way that as little information as possible is given up. Here, preliminary two-point analyses with SuperLink could generate significant LOD scores even with less informative SNPs. Another important issue is the marker number itself. In contrast to GeneHunter or SimWalk, Allegro is not limited here. Beyond that, using hundreds of markers on a chromosome has two implications. First, programs can run into problems at intermarker spacings <0.001 cM. Results could be off as a result of numerical instability at such small marker spacings. This problem depends only on the spacing, not on the number of markers. Second, at very small distances, markers are likely to no longer be in linkage equilibrium, which current linkage programs assume. Programs will compute the probability of a haplotype under linkage equilibrium and thus take the haplotype to be much rarer than it really is. LOD scores and P-values will no longer be valid owing to violation of the assumptions.
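The effect of the linkage-equilibrium assumption can be written out for two tightly linked SNPs. The symbols here are introduced for illustration and do not appear in the original text: $p_A$ and $p_B$ are the allele frequencies at the two markers and $D$ is the disequilibrium coefficient.

$$P_{\mathrm{LE}}(AB) = p_A\,p_B, \qquad P_{\mathrm{true}}(AB) = p_A\,p_B + D$$

As a hypothetical example, with $p_A = p_B = 0.1$ and $D = 0.08$, a program assuming equilibrium uses $P_{\mathrm{LE}}(AB) = 0.01$, whereas the haplotype actually occurs with frequency $0.09$, nine times more often than assumed.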
We tried to solve these limitations in several ways. If an intermarker distance happens to be <0.001 cM, easyLINKAGE-Plus automatically assumes a fixed distance of 0.001 cM. Furthermore, chromosomes can be analyzed in defined sets of markers (Fig. 1b). We recommend performing at least two analyses with different set sizes so that the end/start points of two sets from an earlier analysis are overlapped. In addition, predefined centimorgan intervals can be analyzed for fine mapping. Optionally, easyLINKAGE-Plus can remove fully uninformative markers (markers that are always homozygous for one allele in all typed samples).
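The two workarounds described above can be sketched as follows. This is not the easyLINKAGE-Plus source code: the function names, the list-based inputs and the decision to shift a marker to the previous position plus 0.001 cM are assumptions made for illustration.

```python
MIN_SPACING_CM = 0.001  # minimum intermarker distance named in the text


def enforce_min_spacing(positions_cm, min_spacing=MIN_SPACING_CM):
    """Return sorted map positions with every gap at least min_spacing.

    Assumption: a marker closer than min_spacing to its predecessor is
    placed exactly min_spacing after it.
    """
    adjusted = []
    for pos in sorted(positions_cm):
        if adjusted and pos - adjusted[-1] < min_spacing:
            pos = adjusted[-1] + min_spacing
        adjusted.append(pos)
    return adjusted


def split_into_sets(markers, set_size):
    """Split an ordered marker list into consecutive sets of set_size markers."""
    if set_size <= 0:
        raise ValueError("set_size must be positive")
    return [markers[i:i + set_size] for i in range(0, len(markers), set_size)]
```

Running `split_into_sets` twice with different set sizes (for example 100 and 75) places the set boundaries at different markers, so a region that falls on a boundary in one analysis lies inside a set in the other.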
The use of correct allele frequencies is another issue of SNP analyses. We used Affymetrix data for building an easyLINKAGE database that contains the reference frequencies for Asians, African Americans and Caucasians. easyLINKAGE assumes equal allele distribution for SNPs without available reference frequencies. If a reference allele frequency is exactly 1.0000 in a given population (but <1 in reality), easyLINKAGE sets this frequency to 0.9999 and the other to 0.0001. The opposite applies for situations where the frequency is exactly 0.0000.
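The frequency rules in this paragraph amount to a small lookup with two boundary corrections. The sketch below is an assumption-labeled illustration rather than the program's actual code; the function name `snp_allele_frequencies` and the use of `None` for a SNP without reference data are hypothetical.

```python
def snp_allele_frequencies(ref_freq_allele1=None):
    """Return (freq_allele1, freq_allele2) for a biallelic SNP.

    Hypothetical helper: ref_freq_allele1 is the reference frequency of the
    first allele in the chosen population, or None if no reference exists.
    """
    if ref_freq_allele1 is None:
        # No reference frequency available: assume equal allele distribution.
        return 0.5, 0.5
    if not 0.0 <= ref_freq_allele1 <= 1.0:
        raise ValueError("frequency must lie between 0 and 1")
    if ref_freq_allele1 == 1.0:
        # A frequency of exactly 1.0000 is replaced by 0.9999 / 0.0001.
        return 0.9999, 0.0001
    if ref_freq_allele1 == 0.0:
        # The opposite case: exactly 0.0000 becomes 0.0001 / 0.9999.
        return 0.0001, 0.9999
    return ref_freq_allele1, 1.0 - ref_freq_allele1
```

Keeping both alleles at a nonzero frequency means that an allele observed in a pedigree never has zero probability, even when the reference population did not contain it.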
easyLINKAGE-Plus generates structured text outputs and chromosomal/genome-wide plots of LOD scores, P-values and many other parameters. Plots display details of the inheritance model used, the marker map, sex-specific or sex-averaged marker positions, the number of known and unknown markers, in SNP projects also the number of uninformative SNPs, a table with the top five markers, the pedigree file used, date, time and elapsed time, working directory, allele frequency algorithm and other parameters. Plots can be generated as TOTAL plots averaging all families, optionally together with individual family plots.
Many users complained about the limited pedigree drawing capabilities of the linkage programs supported by easyLINKAGE. Only GeneHunter provides pedigree plots, and these are limited. Therefore, easyLINKAGE-Plus extends the GeneHunter plots by showing marker names and their genetic positions. In addition, the program provides input files for the software HaploPainter (Thiele and Nurnberg, 2005), which draws pedigrees including a colored presentation of markers, positions and haplotypes plus recombination events.
ALOHOMORA is the only available program that comes close in terms of versatility when compared with easyLINKAGE-Plus (Rüschendorf and Nürnberg, 2005). However, easyLINKAGE-Plus is easier to handle since command line based inputs are not necessary. In addition to SNP data, our program analyzes microsatellite markers as well. A much wider range of linkage programs is covered by easyLINKAGE-Plus when compared with ALOHOMORA.
| Acknowledgments |
|---|
We thank Drs Alejandro Schaffer, Jurg Ott, Michael L. Frigge, Leonid Kruglyak and David Clayton for the permission to recompile the source code of their program for the use in Microsoft Windows and for publishing the binaries on our website. We thank Affymetrix Inc. for kindly providing their SNP databases. We thank Dr Peter Nürnberg (Cologne Center for Genomics) for providing large-scale SNP project data. K.H./T.H.L. were supported by grants from the Deutsche Forschungsgemeinschaft (SFB 577, project A9; LiDFG768/3-1/3-3, LiDFG768/4-1/4-2).
Conflict of Interest: none declared.
Received on March 2, 2005; revised on June 15, 2005; accepted on July 3, 2005
| REFERENCES |
|---|
Abecasis, G.R., et al. (2002) Merlin: rapid analysis of dense genetic maps using sparse gene flow trees. Nat. Genet., 30, 97-101.
Cox, N.J., et al. (1999) Loci on chromosomes 2 (NIDDM1) and 15 interact to increase susceptibility to diabetes in Mexican Americans. Nat. Genet., 21, 213-215.
Evans, D.M. and Cardon, L.R. (2004) Guidelines for genotyping in genomewide linkage studies: single-nucleotide-polymorphism maps versus microsatellite maps. Am. J. Hum. Genet., 75, 687-692.
Fishelson, M. and Geiger, D. (2002) Exact genetic linkage computations for general pedigrees. Bioinformatics, 18 (Suppl. 1), S189-S198.
Gudbjartsson, D.F., et al. (2000) Allegro, a new computer program for multipoint linkage analysis. Nat. Genet., 25, 12-13.
Lindner, T.H. and Hoffmann, K. (2005) easyLINKAGE: a PERL script for easy and automated two-/multi-point linkage analyses. Bioinformatics, 21, 405-407.
O'Connell, J.R. and Weeks, D.E. (1998) PedCheck: a program for identification of genotype incompatibilities in linkage analysis. Am. J. Hum. Genet., 63, 259-266.
Schaid, D.J., et al. (2004) Comparison of microsatellites versus single-nucleotide polymorphisms in a genome linkage screen for prostate cancer-susceptibility loci. Am. J. Hum. Genet., 75, 948-965.
Strauch, K., et al. (2000) Parametric and nonparametric multipoint linkage analysis with imprinting and two-locus-trait models: application to mite sensitization. Am. J. Hum. Genet., 66, 1945-1957.
Thiele, H. and Nurnberg, P. (2005) HaploPainter: a tool for drawing pedigrees with complex haplotypes. Bioinformatics, 21, 1730-1732.
Weeks, D.E., et al. (1995) Computer programs for multilocus haplotyping of general pedigrees. Am. J. Hum. Genet., 56, 1506-1507.