Bioinformatics Advance Access originally published online on October 25, 2005
Bioinformatics 2005 21(24):4430-4431; doi:10.1093/bioinformatics/bti725

© The Author 2005. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions{at}oxfordjournals.org

GSMA: software implementation of the genome search meta-analysis method

Fabio Pardi 1, Douglas F. Levinson 2 and Cathryn M. Lewis 1,*

1 Department of Medical and Molecular Genetics, King's College London, 8th Floor Guy's Tower, Guy's Hospital, London SE1 9RT, United Kingdom
2 Division of Clinical Neurobiology and Behavior, University of Pennsylvania School of Medicine, 3535 Market Street, Rm 4006, Philadelphia, PA 19104-3309, USA

*To whom correspondence should be addressed.


    ABSTRACT

Meta-analysis can be used to pool results of genome-wide linkage scans. This is of great value in complex diseases, where replication of linked regions occurs infrequently. The genome search meta-analysis (GSMA) method is widely used for this analysis, and a computer program is now available to implement the GSMA.

Availability: http://www.kcl.ac.uk/depsta/memoge/gsma/

Contact: Cathryn.lewis{at}genetics.kcl.ac.uk


    INTRODUCTION
Genome-wide linkage searches are widely used to identify regions of the genome that may harbour susceptibility genes for complex diseases. The value of these studies has been confirmed by the localization of a few genes within regions highlighted by linkage studies, e.g. CARD15 for Crohn's disease and CAPN10 for type 2 diabetes. However, linkage studies for many complex diseases have been disappointing, with few regions showing significant evidence for linkage, and little replication between studies of the same disease (Altmuller et al., 2001). Meta-analysis of genome-wide results provides a rapid method to identify linked regions that individual studies may lack the power to detect.

GSMA method
The most widely used meta-analysis method for linkage studies is the genome search meta-analysis (GSMA) (Wise et al., 1999; Levinson et al., 2003). The GSMA is a non-parametric method that is applicable to results from any genome-wide linkage study, regardless of family structure (e.g. extended pedigrees, affected sib pairs), markers or statistical analysis method. The GSMA method has been applied to 13 diseases to date, with many further studies in progress. Studies in schizophrenia, rheumatoid arthritis and type 2 diabetes have shown that the GSMA can identify novel regions that were not highlighted by results from the original studies (Demenais et al., 2003; Fisher et al., 2003; Lewis et al., 2003).

For each scan, the GSMA requires output statistics across the genome. These may be non-parametric LOD scores calculated at 1 cM intervals (e.g. from Genehunter), parametric LOD scores calculated for a series of models and recombination fractions, or a single-point linkage statistic for each marker. Any linkage statistic (NPL score, LOD score, P-value) may be used. Data on linkage test statistics and markers are usually obtained from output files of linkage analysis programs or read from published graphs and tables. Data extraction is a key component of any meta-analysis, and detailed information on this stage of the GSMA is given on the GSMA website.

The genome is divided into n bins of approximately equal cM width (e.g. 120 bins of 30 cM). For each study, the maximum evidence for linkage in each bin is identified, and bins are ranked (n, n − 1, ..., 1) on the basis of their relative evidence for linkage, so that the bin with the strongest evidence receives rank n. These ranks are summed across studies. The summed rank (SR) forms a test statistic for each bin and can be tested for significance using its distribution function (Wise et al., 1999) or by simulation. This bin-wise statistic presents a multiple-testing problem, since with no linkage we expect 5% of the bins to achieve nominal significance (P_SR < 0.05). A genome-wide interpretation of results is obtained through the ordered rank (OR) statistic. Each order statistic, i.e. the observed k-th highest summed rank, is compared with the distribution of k-th highest summed ranks obtained through simulation using re-assignment of ranks in each study. Simulation studies have shown that any bin with a significant summed rank and a significant ordered rank statistic (P_SR < 0.05, P_OR < 0.05) has a high probability of containing a true susceptibility gene (Levinson et al., 2003). A weighted analysis, where study ranks are multiplied by a weight reflecting the informativeness of the study, can also be performed.
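The ranking, summing and simulation steps described above can be illustrated with a minimal Python sketch. It is not the authors' C++ implementation. The function names (rank_bins, summed_ranks, gsma_pvalues) are hypothetical, tied scores are assumed to share the mean of the ranks they span, P_SR is estimated by pooling the simulated summed ranks of all bins, and P-values are plain proportions of simulations with no further correction.

```python
import bisect
import random


def rank_bins(scores):
    """Rank one study's bins: lowest score gets 1, highest gets n.
    Tied scores share the mean of the ranks they span (an assumption)."""
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i])
    ranks = [0.0] * n
    pos = 0
    while pos < n:
        end = pos
        while end + 1 < n and scores[order[end + 1]] == scores[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1  # mean of 1-based ranks pos+1 .. end+1
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks


def summed_ranks(rank_table, weights=None):
    """Sum the (optionally weighted) ranks across studies for every bin."""
    if weights is None:
        weights = [1.0] * len(rank_table)
    n = len(rank_table[0])
    return [sum(w * study[b] for w, study in zip(weights, rank_table))
            for b in range(n)]


def gsma_pvalues(rank_table, weights=None, n_sim=1000, seed=0):
    """Estimate P_SR and P_OR per bin by re-assigning ranks within each study."""
    rng = random.Random(seed)
    observed = summed_ranks(rank_table, weights)
    n = len(observed)
    # by_order[k] is the bin holding the (k+1)-th highest observed summed rank
    by_order = sorted(range(n), key=lambda b: observed[b], reverse=True)
    sr_hits = [0] * n
    or_hits = [0] * n
    for _ in range(n_sim):
        shuffled = [rng.sample(study, len(study)) for study in rank_table]
        sim = sorted(summed_ranks(shuffled, weights))  # ascending order
        for b in range(n):
            # count simulated bins whose summed rank is >= the observed one
            sr_hits[b] += n - bisect.bisect_left(sim, observed[b])
        for k, b in enumerate(by_order):
            # simulated (k+1)-th highest versus observed (k+1)-th highest
            if sim[n - 1 - k] >= observed[b]:
                or_hits[b] += 1
    p_sr = [h / (n * n_sim) for h in sr_hits]
    p_or = [h / n_sim for h in or_hits]
    return observed, p_sr, p_or
```

A typical call would build rank_table as [rank_bins(s) for s in studies], where each element of studies holds one study's maximum linkage statistic per bin, and then pass it to gsma_pvalues, optionally with one weight per study. Bins with equal observed summed ranks are assigned consecutive orders k arbitrarily in this sketch.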

We have recently developed a software package to perform the GSMA. Executables are available for Sun, Windows, Mac and Linux from the GSMA website; source code (C++) is available from the authors. The GSMA website also has program documentation, a Procedure Guide (detailing methods for data extraction), a list of boundary markers for defining bins, and a bibliography of GSMA studies and methodology development.

Statistical software packages can also be used to obtain the SR statistics and P-values, using simulation or the Koziol and Feng method for P-value calculation (Koziol and Feng, 2004). However, the OR statistic is more difficult to calculate, and this useful statistic has therefore been used in only a few GSMA studies. A method for testing heterogeneity across studies in the GSMA was recently developed and is implemented by the HEGESMA software, which also performs a meta-analysis of the study ranks (Zintzaras and Ioannidis, 2005a,b). The GSMA program, together with the HEGESMA software, provides analysis tools which will enable a full range of testing procedures in the GSMA to be performed by any investigator.


    IMPLEMENTATION
The GSMA program allows for an arbitrary number of bins (n) and studies (m), with no maximum values specified in the program. Significance tests for the summed rank and the ordered rank are performed for both weighted and unweighted analyses. The P-values are assessed by simulation, re-assigning the observed ranks within each study.

Two input files are required: a matrix of the maximum linkage statistic (e.g. NPL score, LOD score) for each bin for each study (with bin labels in column 1 and study names in row 1), and a file listing the weighting factor for each study. For studies reporting P-values, the data should be entered as 1 − P-value to ensure the correct ranking of results, with significant results assigned high ranks. Tied observations within studies are permitted. Most genome-wide linkage studies have results available for all bins, but the program deals with any missing values by replacing them with the median linkage statistic for that study (giving a rank of (n + 1)/2). The weight of each study should reflect its informativeness, although the relative weighting of extended pedigrees and affected sib pairs will depend on the genetic effects contributing to disease risk, and is therefore difficult to quantify. One commonly used weighting function is the square root of the number of affected individuals.
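The data preparation rules in this paragraph can be sketched in Python as follows. This is an illustration only and does not reproduce the GSMA program's file parsing. The function names (prepare_study, sqrt_weight) and the use of None to mark a missing bin are assumptions made for the example.

```python
import math
import statistics


def prepare_study(values, are_p_values=False):
    """Return one study's per-bin statistics, ready for ranking.

    values holds one number per bin, with None for a missing bin
    (the None marker is an assumption of this sketch).
    """
    if are_p_values:
        # Enter P-values as 1 - P so that significant results rank highest.
        values = [None if v is None else 1.0 - v for v in values]
    observed = [v for v in values if v is not None]
    median = statistics.median(observed)
    # Missing bins take the study's median linkage statistic.
    return [median if v is None else v for v in values]


def sqrt_weight(n_affected):
    """Weight a study by the square root of its number of affected individuals."""
    return math.sqrt(n_affected)
```

For example, prepare_study([0.01, None, 0.5, 0.2], are_p_values=True) converts the three reported P-values to 1 − P and fills the missing bin with their median, and sqrt_weight(100) gives a weight of 10 for a study with 100 affected individuals.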

The program is run from the command line, with options for the file name, the number of simulations performed (default 10 000), and the P-value threshold for interesting results (default P_SR = 0.1). Three output files are produced. A summary file lists the bins showing the highest evidence for linkage in the weighted and unweighted analyses, together with the standardized weights (Fig. 1). The other two output files contain a full listing of results by chromosome bin and a table of ranks for data checking purposes. The program outputs summed ranks, but these can be converted to average ranks if required (Levinson et al., 2003).



Fig. 1 Output for simulated data of 4 studies and 120 bins, showing the most significant results with summed rank and ordered rank P-values.

 
Conflict of Interest: none declared.

Received on August 11, 2005; revised on October 12, 2005; accepted on October 17, 2005

    REFERENCES

    Altmuller, J., et al. (2001) Genomewide scans of complex human diseases: true linkage is hard to find. Am. J. Hum. Genet., 69, 936–950.

    Demenais, F., et al. (2003) A meta-analysis of four European genome screens (GIFT consortium) shows evidence for a novel region on chromosome 17p11.2–q22 linked to type 2 diabetes. Hum. Mol. Genet., 12, 1865–1873.

    Fisher, S.A., et al. (2003) Meta-analysis of four rheumatoid arthritis genome-wide linkage studies—confirmation of a susceptibility locus on chromosome 16. Arthritis Rheum., 48, 1200–1206.

    Koziol, J.A. and Feng, A.C. (2004) A note on the genome scan meta-analysis statistic. Ann. Hum. Genet., 68, 376–380.

    Levinson, D.F., et al. (2003) Genome scan meta-analysis of schizophrenia and bipolar disorder, part I: methods and power analysis. Am. J. Hum. Genet., 73, 17–33.

    Lewis, C.M., et al. (2003) Genome scan meta-analysis of schizophrenia and bipolar disorder, part II: schizophrenia. Am. J. Hum. Genet., 73, 34–48.

    Wise, L.H., et al. (1999) Meta-analysis of genome searches. Ann. Hum. Genet., 63, 263–272.

    Zintzaras, E. and Ioannidis, J.P.A. (2005a) HEGESMA: genome search meta-analysis and heterogeneity testing. Bioinformatics, 21, 2672–2673.

    Zintzaras, E. and Ioannidis, J.P.A. (2005b) Heterogeneity testing in meta-analysis of genome searches. Genet. Epidemiol., 28, 123–137.


This article has been cited by other articles:


A. Malhotra, S. C. Elbein, M. C.Y. Ng, R. Duggirala, R. Arya, G. Imperatore, A. Adeyemo, T. I. Pollin, W.-C. Hsueh, J. C.N. Chan, et al.
Meta-Analysis of Genome-Wide Linkage Studies of Quantitative Lipid Traits in Families Ascertained for Type 2 Diabetes
Diabetes, March 1, 2007; 56(3): 890 - 896.