Bioinformatics Advance Access originally published online on April 3, 2006
Bioinformatics 2006 22(11):1404-1405; doi:10.1093/bioinformatics/btl124

© The Author 2006. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oxfordjournals.org

PowQ: a user-friendly package for the design of variance component multipoint linkage analysis studies

Mario Falchi 1,2,3,* and Cesare Cappio Borlino 3

1 Twin Research and Genetic Epidemiology Unit, St Thomas' Hospital London, UK
2 Genetica Medica, Dipartimento Materno-Infantile, University of Modena and Reggio Emilia Modena, Italy
3 Shardna Life Sciences Cagliari, Italy

*To whom correspondence should be addressed.


    ABSTRACT

Summary: A user-friendly, graphical package for power evaluation and enhancement planning through variance component linkage analysis in a multipoint framework.

Availability: The package is made available at: http://www.twin-research.ac.uk/WebPowQ/PowQ.htm

Contact: mario.falchi{at}kcl.ac.uk

Power of a linkage study for a quantitative trait is defined as the probability of detecting a true quantitative trait locus (QTL) in a given sample under study. Power evaluation represents a crucial initial step in the design of a linkage mapping experiment to corroborate the feasibility of the study for the specific sample size and family structure, under various genetic models and marker densities. It is particularly important in human linkage studies, because of the costs of genotyping and phenotyping and the difficulties in recruiting families. Moreover, power estimation is commonly requested when submitting grant proposals.

In variance component (VC) linkage analysis, statistical evidence for a QTL at a chromosomal position is evaluated through a likelihood ratio test between the model in which the QTL effect is estimated (alternative hypothesis) and the reduced model in which the QTL effect is constrained to equal 0 (null hypothesis). Even when the phenotypic and genotypic information is correct, the null hypothesis of no linkage can still be incorrectly accepted, giving rise to a Type II error with probability β. The power of the study (1 – β) is the probability of successfully rejecting the null hypothesis of no linkage when it is false. Power can be estimated either by asymptotic approaches or through simulation studies, and it depends on several factors. These include the magnitude of the QTL effect, the sample size, the sampling unit, the required level of significance α and the informativeness of the markers. Analytical expressions have been derived for specific sampling schemes (Williams and Blangero, 1999; Sham et al., 2000), and have been successfully implemented, for instance in the Genetic Power Calculator for sibships of different sizes (Purcell et al., 2003).
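As a hedged illustration of the test described above, the likelihood ratio statistic can be written as twice the difference in natural-log likelihoods between the two models, and the LOD score as that statistic divided by 2 ln 10. The sketch below converts a LOD threshold into an asymptotic pointwise significance level. It assumes the commonly used null distribution for a variance constrained to be non-negative, a 50:50 mixture of a point mass at zero and a chi-square with one degree of freedom; that assumption, and the function names, are illustrative and not taken from PowQ.

```python
import math

LN10 = math.log(10.0)

def lod_from_loglik(loglik_alt, loglik_null):
    """LOD score from the natural-log likelihoods of the two nested models."""
    return (loglik_alt - loglik_null) / LN10

def pointwise_alpha(lod):
    """Asymptotic pointwise p-value for a LOD score.

    Assumes the 50:50 mixture of a point mass at 0 and a chi-square(1)
    as the null distribution of the likelihood ratio statistic.
    """
    if lod <= 0.0:
        return 1.0
    lrt = 2.0 * LN10 * lod  # likelihood ratio statistic
    # For one degree of freedom, P(chi2 > x) = erfc(sqrt(x / 2));
    # the mixture halves this tail probability.
    return 0.5 * math.erfc(math.sqrt(lrt / 2.0))

if __name__ == "__main__":
    for threshold in (1.0, 2.0, 3.0):
        print(threshold, pointwise_alpha(threshold))
```

Under these assumptions a LOD score of 3 corresponds to a pointwise significance level of about 1e-4, the value conventionally quoted for that threshold.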

When the study sample comprises assorted family structures, including extended pedigrees, as might happen in specific studies, the algebraic manipulation required to obtain closed-form equations for power estimation becomes complex. Moreover, the asymptotic distribution of the test statistic frequently assumed by theoretical calculations might not hold under certain conditions, e.g. with the underpowered samples often used in human genetic studies. In these scenarios, power can be evaluated more efficiently through simulations.

We implemented a user-friendly package, named PowQ (Fig. 1), which allows power calculations on assorted family structures, even on large inbred genealogies comprising hundreds of individuals. A graphical interface assists the user in the trait-model specification, which currently comprises a diallelic QTL with additive effects, a residual additive polygenic effect and a random individual-specific effect. Founder individuals of each family are assumed to be unrelated and their genotype frequencies are assumed to be in Hardy–Weinberg equilibrium. Power is computed by randomly sampling from the inheritance space, assuming complete knowledge of the identical-by-descent (IBD) sharing among pedigree members, and counting the number of times the test statistic, calculated on the user-specified subjects, exceeds the required thresholds. PowQ has been extensively validated, and its power estimates are concordant with those expected from the Williams and Blangero (1999) formulas under their fixed sampling schemes.
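The trait model described above can be illustrated with a minimal sketch for a single nuclear family. This is not PowQ's procedure, which samples from the inheritance space of arbitrary pedigrees; it only shows how a diallelic additive QTL, a polygenic effect and an individual-specific effect combine into a phenotype. The function and parameter names are hypothetical, and the sketch assumes two unrelated, non-inbred founders in Hardy–Weinberg equilibrium.

```python
import math
import random

def simulate_nuclear_family(n_sibs, p, a, var_poly, var_env, rng):
    """Simulate two unrelated parents and n_sibs offspring.

    p: frequency of the trait-increasing QTL allele
    a: additive effect per copy of that allele
    var_poly, var_env: polygenic and individual-specific variances
    Returns a list of (qtl_genotype, trait) tuples, parents first.
    """
    def founder():
        # Hardy-Weinberg equilibrium: two independent allele draws
        alleles = (int(rng.random() < p), int(rng.random() < p))
        return alleles, rng.gauss(0.0, math.sqrt(var_poly))

    def trait(alleles, poly):
        # centred additive QTL effect + polygenic value + individual effect
        qtl = a * (sum(alleles) - 2.0 * p)
        return qtl + poly + rng.gauss(0.0, math.sqrt(var_env))

    father, mother = founder(), founder()
    people = [father, mother]
    for _ in range(n_sibs):
        # Mendelian transmission of one allele from each parent
        alleles = (rng.choice(father[0]), rng.choice(mother[0]))
        # mid-parent polygenic value plus Mendelian sampling term,
        # whose variance is half the polygenic variance
        poly = 0.5 * (father[1] + mother[1]) + rng.gauss(
            0.0, math.sqrt(var_poly / 2.0))
        people.append((alleles, poly))
    return [(sum(al), trait(al, po)) for al, po in people]

def qtl_heritability(p, a, var_poly, var_env):
    """Proportion of trait variance due to the additive diallelic QTL."""
    var_qtl = 2.0 * p * (1.0 - p) * a * a
    return var_qtl / (var_qtl + var_poly + var_env)

if __name__ == "__main__":
    rng = random.Random(2006)
    print(qtl_heritability(0.5, 1.0, 0.25, 0.25))
    print(simulate_nuclear_family(3, 0.5, 1.0, 0.25, 0.25, rng))
```

The additive QTL variance used here is the standard 2p(1 − p)a² for a diallelic locus under Hardy–Weinberg equilibrium.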


Fig. 1 Sample screen images. The main picture is the program window (showing the power for the analysed dataset). On the right, from the top to the bottom are shown the genetic model, the average LOD-score and the LOD-score versus QTL-effect estimation in a comparative analysis.

 
PowQ allows power evaluation studies in a multipoint framework, assuming complete knowledge of the inheritance pattern. In this context, the power estimates depend only on the sample size/sampling unit and on the genetic model, but not on marker informativeness. Although these estimates of power are not exact, they will usually be sufficiently accurate for practical purposes. Moreover, this approach allows the analysis of large pedigrees, since the computational burden of IBD estimation from genotypic data is a strong limiting factor when a large number of simulations must be evaluated. As shown by Falchi et al. (2004) on large pedigrees with a few untyped generations, the mean deviations between the estimated and real IBD in a multipoint context were very small, with a negligible effect on the power of VC linkage analysis.

The effects on power estimates of incomplete marker information, as usually observed with real data, cannot be easily quantified, although power attenuation would depend on the informativeness of the map of markers available for the analysed families. The amount of inheritance information that can be extracted from a map of markers can be easily calculated by evaluating an entropy-based measure of information content (Kruglyak, 1997) as implemented in Merlin (Abecasis et al., 2002).
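The entropy-based measure mentioned above can be sketched as follows. The information content at a map position is taken as 1 − E/E0, where E is the entropy of the probability distribution over inheritance patterns given the marker data and E0 is the entropy in the absence of genotype data. The sketch assumes that E0 corresponds to a uniform distribution over the supplied patterns; the function name is hypothetical and this is not Merlin's code.

```python
import math

def information_content(probabilities):
    """Entropy-based information content, 1 - E / E0, at one map position.

    probabilities: probabilities of the possible inheritance patterns given
    the marker data (must sum to 1). E0 is assumed to be the entropy of a
    uniform distribution over the same patterns.
    """
    n = len(probabilities)
    if n < 2:
        raise ValueError("need at least two inheritance patterns")
    if abs(sum(probabilities) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    # terms with zero probability contribute nothing to the entropy
    entropy = -sum(q * math.log2(q) for q in probabilities if q > 0.0)
    return 1.0 - entropy / math.log2(n)

if __name__ == "__main__":
    print(information_content([0.25, 0.25, 0.25, 0.25]))  # uninformative
    print(information_content([0.5, 0.5, 0.0, 0.0]))      # partly informative
    print(information_content([1.0, 0.0, 0.0, 0.0]))      # fully informative
```

A value of 0 means the markers carry no inheritance information at that position and a value of 1 means the inheritance pattern is fully resolved, which is the situation PowQ assumes.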

It can be shown that in many circumstances a single-point framework is likely to underestimate the real power. In linkage analysis studies, it has been previously noted that chance variations in the QTL location estimates are often expected for complex traits (e.g. Hsueh et al., 2001). Even in simulation studies in which the markers' IBD state among relatives is fully known, the highest linkage signal may be observed at a marker other than the one closest to the QTL (e.g. Goring et al., 2001). This variation depends on power and on the degree of correlation between the trait and the IBD patterns among relative pairs. The latter can vary, for instance, according to the recombination fraction between the markers and the QTL. Recombination events might disrupt the correlation between the trait and the inheritance patterns observed in an adjacent chromosomal region, yet the correlation can still be observed between the trait and another flanking chromosomal segment that co-segregates with the QTL. In this context, and even for fully known IBD states, a single-point analysis using only the marker closest to the QTL often results in a loss of power. Through a multipoint-based power analysis, PowQ provides both accurate estimates of the expected power and empirical confidence intervals for the location of the QTL. The latter information could be useful when planning a replication study, or when deciding whether to undertake a study that is sufficiently powered but has a large confidence interval for the QTL localization, and which could therefore show a linkage signal several centimorgans away from the QTL.
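One simple way to obtain an empirical confidence interval for the QTL location from simulated replicates is to record the position of the highest LOD score in each replicate and take percentiles of those positions. The sketch below illustrates this under stated assumptions: the function names, the optional LOD filter and the nearest-rank percentile rule are illustrative choices, not a description of how PowQ computes its intervals.

```python
def peak_positions(lod_profiles, positions, min_lod=None):
    """Map position of the highest LOD score in each simulated replicate.

    lod_profiles: one list of LOD scores per replicate, aligned with positions
    positions: map positions (e.g. in cM) at which LOD scores were evaluated
    min_lod: if given, keep only replicates whose peak reaches this LOD
    """
    peaks = []
    for profile in lod_profiles:
        best = max(range(len(profile)), key=lambda i: profile[i])
        if min_lod is None or profile[best] >= min_lod:
            peaks.append(positions[best])
    return peaks

def empirical_interval(values, coverage=0.95):
    """Nearest-rank percentile interval covering roughly `coverage` of values."""
    if not values:
        raise ValueError("no values supplied")
    ordered = sorted(values)
    n = len(ordered)
    tail = (1.0 - coverage) / 2.0
    lower = ordered[round(tail * (n - 1))]
    upper = ordered[round((1.0 - tail) * (n - 1))]
    return lower, upper

if __name__ == "__main__":
    positions = [0.0, 5.0, 10.0, 15.0]
    profiles = [[0.2, 2.5, 1.0, 0.1], [0.1, 1.2, 3.1, 0.4], [3.3, 0.9, 0.2, 0.0]]
    peaks = peak_positions(profiles, positions, min_lod=2.0)
    print(peaks, empirical_interval(peaks, coverage=0.95))
```

Restricting the calculation to replicates that reach a chosen LOD threshold gives the location interval conditional on detecting linkage, which is the quantity of interest when planning a replication study.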

An additional feature of PowQ is the power comparison of nested samples, which is easily calculated by providing an initial sample and its possible extensions/reductions, both in terms of family units and in terms of family members. Relative power computation allows identification of the most informative and/or economical sample, so that increases or decreases in sample size can be carefully planned. When dealing with extended pedigrees, the effects on power of adding and/or removing a number of individuals from the analysis are difficult to predict analytically, as they depend on the type and number of pairwise relationships that each individual has with the others. In particular, PowQ also allows investigation of the effects on power of using sub-pedigrees extracted from a single large genealogy, as is commonly done to reduce the computational effort of IBD estimation (Falchi et al., 2004).

Results, shown graphically in real time during the analysis, can be exported in JPG format. They summarize, across all the simulations, the observed power, the distribution of the total narrow-sense heritability of the trait, the mean LOD-scores at the markers and the correlation between the observed QTL-effect and the LOD-score. This latter information allows prediction of the estimated QTL effect, independently of the generating value, when significant evidence of linkage is observed (Goring et al., 2001).
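The summaries listed above can be reproduced from per-simulation results along the following lines. This is a minimal sketch: it assumes each replicate has been reduced to a pair (maximum LOD score, estimated QTL effect), and the function name, the input layout and the binomial standard error are illustrative additions rather than PowQ output fields.

```python
import math

def summarise(replicates, lod_threshold):
    """Summarise simulation replicates.

    replicates: list of (max_lod, estimated_qtl_effect) pairs, one per simulation
    lod_threshold: LOD score required to declare linkage
    """
    n = len(replicates)
    lods = [lod for lod, _ in replicates]
    effects = [eff for _, eff in replicates]

    # empirical power: fraction of replicates reaching the threshold
    power = sum(lod >= lod_threshold for lod in lods) / n
    power_se = math.sqrt(power * (1.0 - power) / n)  # binomial standard error

    mean_lod = sum(lods) / n
    mean_eff = sum(effects) / n

    # Pearson correlation between LOD score and estimated QTL effect
    sxy = sum((l - mean_lod) * (e - mean_eff) for l, e in zip(lods, effects))
    sxx = sum((l - mean_lod) ** 2 for l in lods)
    syy = sum((e - mean_eff) ** 2 for e in effects)
    corr = sxy / math.sqrt(sxx * syy) if sxx > 0 and syy > 0 else float("nan")

    # mean estimated effect among replicates with significant linkage
    sig = [e for l, e in replicates if l >= lod_threshold]
    mean_eff_sig = sum(sig) / len(sig) if sig else float("nan")

    return {
        "power": power,
        "power_se": power_se,
        "mean_lod": mean_lod,
        "lod_effect_correlation": corr,
        "mean_effect_if_significant": mean_eff_sig,
    }

if __name__ == "__main__":
    toy = [(0.8, 0.12), (2.1, 0.22), (3.4, 0.31), (4.0, 0.38)]
    print(summarise(toy, lod_threshold=3.0))
```

Comparing the mean estimated effect among significant replicates with the generating value shows how much the effect estimate is expected to be inflated when linkage is detected.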

Detailed results for each simulation are stored in ASCII format for further examination and can be imported into standard statistical packages. The VC engine follows the Merlin implementation (Abecasis et al., 2002). The package is written entirely in Java and therefore runs on every supported operating system. PowQ will be extended to include false positive rate computations (Type I errors), to encompass dominance effects, and to allow for association analysis power evaluations.


    Acknowledgments
 
The authors would like to thank Paola Forabosco and Tim Spector for their comments. The authors would also like to thank the three anonymous reviewers for their comments. The work of M.F. was partially supported by EU funding for the Euroclot and GenomEutwin FP6 (Ref: LSHM-CT-2004-005268, QLK2-CT-2002-01254) programs.

Conflict of Interest: none declared.


    FOOTNOTES
 
Associate Editor: Martin Bishop

Received on January 31, 2006; revised on March 2, 2006; accepted on March 28, 2006

    REFERENCES

    Abecasis, G.R., et al. (2002) Merlin—rapid analysis of dense genetic maps using sparse gene flow trees. Nat. Genet., 30, 97–101.

    Falchi, M., et al. (2004) A genomewide search using an original pairwise sampling approach for large genealogies identifies a new locus for total and low-density lipoprotein cholesterol in two genetically differentiated isolates of Sardinia. Am. J. Hum. Genet., 75, 1015–1031.

    Goring, H.H., et al. (2001) Large upward bias in estimation of locus-specific effects from genomewide scans. Am. J. Hum. Genet., 69, 1357–1369.

    Hsueh, W.C., et al. (2001) Replication of linkage to quantitative trait loci: variation in location and magnitude of the lod score. Genet. Epidemiol., 21 (Suppl. 1), S473–S478.

    Kruglyak, L. (1997) The use of a genetic map of biallelic markers in linkage studies. Nat. Genet., 17, 21–24.

    Purcell, S., et al. (2003) Genetic power calculator: design of linkage and association genetic mapping studies of complex traits. Bioinformatics, 19, 149–150.

    Sham, P.C., et al. (2000) Power of linkage versus association analysis of quantitative traits, by use of variance-components models, for sibship data. Am. J. Hum. Genet., 66, 1616–1630.

    Williams, J.T. and Blangero, J. (1999) Power of variance component linkage analysis to detect quantitative trait loci. Ann. Hum. Genet., 63, 545–563.

