PGA: power calculator for case-control genetic association analyses

Idan Menashe, Philip S Rosenberg, Bingshu E Chen

Abstract

Background: Statistical power calculations inform the design and interpretation of genetic association studies, but few programs are tailored to case-control studies of single nucleotide polymorphisms (SNPs) in unrelated subjects.

Results: We have developed the "Power for Genetic Association analyses" (PGA) package, which comprises algorithms and graphical user interfaces for sample size and minimum detectable risk calculations using SNP or haplotype effects under different genetic models and study constraints. The software accounts for linkage disequilibrium and statistical multiple comparisons. The results are presented in graphs or tables and can be printed or exported in standard file formats.

Conclusion: PGA is user-friendly software that can facilitate decision making for association studies of candidate genes, fine-mapping studies, and whole-genome scans. Stand-alone executable files and a Matlab toolbox are available for download at: http://dceg.cancer.gov/bb/tools/pga.

Figures

Figure 1
Graphical user interfaces for statistical power calculations. (A) PGA1 – statistical power is calculated and plotted for different sample sizes and various genetic and statistical parameters. Input variables (e.g. 'Genetic mode of inheritance', 'disease allele frequency', 'relative risk (RR)', etc.) can be specified using slider controls, or by typing specific values in the corresponding text boxes. Pressing the 'Run' button executes the calculations and plots the relationships between power and sample size according to the specified study parameters. A keyed legend listing the corresponding parameters is shown on the graph. Up to eight different analyses (color-coded) can be displayed simultaneously, allowing the comparison of different scenarios. (B) PGA2 – minimal detectable relative risk (MDRR) is calculated and plotted for various minor allele frequencies (MAFs) of potential genotyped loci. Input and output are similar to those of PGA1.
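To make the relationship between power and sample size concrete, the following is a minimal sketch of a normal-approximation power calculation for an allelic case-control test. It is not PGA's algorithm, which is not described in this record. The function names, the multiplicative (per-allele) risk model, Hardy-Weinberg equilibrium, the rare-disease approximation for control allele frequencies, and the Bonferroni adjustment are all assumptions made for illustration.

```python
from math import sqrt
from statistics import NormalDist


def case_allele_freq(p, rr):
    """Disease-allele frequency among cases under a multiplicative model.

    Assumes Hardy-Weinberg equilibrium and a per-allele relative risk `rr`.
    """
    return p * rr / (p * rr + (1.0 - p))


def allelic_power(n_cases, n_controls, p, rr, alpha=0.05, n_tests=1):
    """Approximate power of a two-sided test comparing allele frequencies.

    Assumptions (illustrative only): controls carry the population allele
    frequency `p` (rare disease), and multiple comparisons are handled by a
    Bonferroni adjustment over `n_tests` tests.
    """
    norm = NormalDist()
    p1 = case_allele_freq(p, rr)          # allele frequency in cases
    p0 = p                                # allele frequency in controls
    a1, a0 = 2 * n_cases, 2 * n_controls  # two alleles per subject
    pbar = (a1 * p1 + a0 * p0) / (a1 + a0)
    se_null = sqrt(pbar * (1 - pbar) * (1 / a1 + 1 / a0))
    se_alt = sqrt(p1 * (1 - p1) / a1 + p0 * (1 - p0) / a0)
    z_crit = norm.inv_cdf(1 - alpha / (2 * n_tests))
    # Probability of rejecting in the direction of the true effect;
    # the negligible opposite tail is ignored.
    return norm.cdf((abs(p1 - p0) - z_crit * se_null) / se_alt)


if __name__ == "__main__":
    # Power across a range of sample sizes, with equal cases and controls.
    for n in (250, 500, 1000, 2000):
        print(n, round(allelic_power(n, n, p=0.2, rr=1.3), 3))
```

A minimal detectable relative risk for a given sample size could be approximated from the same function by searching for the smallest `rr` whose power reaches a target such as 0.8.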
Figure 2
Effective degrees of freedom (EDF) calculator. (A) HapMap SNP genotype data from human chromosome 8q24 (chr8:128100000-128700000) are used as input. The calculated EDF for SNPs with MAF > 0.05 in this dataset is 608. (B) An LD map for the selected SNPs is also displayed in the output.
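As an illustration of how linkage disequilibrium can reduce the number of independent tests, the sketch below implements one published approach, Nyholt's eigenvalue-based correction, M_eff = 1 + (M - 1)(1 - Var(λ)/M), where λ are the eigenvalues of the M x M pairwise SNP correlation matrix. Whether PGA's EDF calculator uses this formula is not stated in this record, so treat it as an assumption. The function name and the input layout (a list of lists holding a symmetric correlation matrix) are hypothetical.

```python
def effective_tests(r):
    """Effective number of independent tests from a SNP correlation matrix.

    `r` is a symmetric M x M correlation matrix given as a list of lists.
    For such a matrix the eigenvalues sum to M (mean 1) and their squares
    sum to the sum of all squared entries, so the sample variance of the
    eigenvalues is available without an eigendecomposition.
    """
    m = len(r)
    if m == 1:
        return 1.0
    sum_sq = sum(x * x for row in r for x in row)  # equals sum of lambda^2
    var_lambda = (sum_sq - m) / (m - 1)            # sample variance, mean = 1
    return 1 + (m - 1) * (1 - var_lambda / m)


if __name__ == "__main__":
    independent = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    perfect_ld = [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]
    print(effective_tests(independent))  # three uncorrelated SNPs
    print(effective_tests(perfect_ld))   # three SNPs in complete LD
```

The resulting value could then replace the raw SNP count in a Bonferroni-style significance threshold.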


Source: PubMed
