A method to decipher pleiotropy by detecting underlying heterogeneity driven by hidden subgroups applied to autoimmune and neuropsychiatric diseases

Buhm Han, Jennie G Pouget, Kamil Slowikowski, Eli Stahl, Cue Hyunkyu Lee, Dorothee Diogo, Xinli Hu, Yu Rang Park, Eunji Kim, Peter K Gregersen, Solbritt Rantapää Dahlqvist, Jane Worthington, Javier Martin, Steve Eyre, Lars Klareskog, Tom Huizinga, Wei-Min Chen, Suna Onengut-Gumuscu, Stephen S Rich, Major Depressive Disorder Working Group of the Psychiatric Genomics Consortium, Naomi R Wray, Soumya Raychaudhuri

Abstract

There is growing evidence of shared risk alleles for complex traits (pleiotropy), including autoimmune and neuropsychiatric diseases. This might be due to sharing among all individuals (whole-group pleiotropy) or a subset of individuals in a genetically heterogeneous cohort (subgroup heterogeneity). Here we describe the use of a well-powered statistic, BUHMBOX, to distinguish between those two situations using genotype data. We observed a shared genetic basis for 11 autoimmune diseases and type 1 diabetes (T1D; P < 1 × 10⁻⁴) and for 11 autoimmune diseases and rheumatoid arthritis (RA; P < 1 × 10⁻³). This sharing was not explained by subgroup heterogeneity (corrected P_BUHMBOX > 0.2; 6,670 T1D cases and 7,279 RA cases). Genetic sharing between seronegative and seropositive RA (P < 1 × 10⁻⁹) had significant evidence of subgroup heterogeneity, suggesting a subgroup of seropositive-like cases within seronegative cases (P_BUHMBOX = 0.008; 2,406 seronegative RA cases). We also observed a shared genetic basis for major depressive disorder (MDD) and schizophrenia (P < 1 × 10⁻⁴) that was not explained by subgroup heterogeneity (P_BUHMBOX = 0.28; 9,238 MDD cases).

Conflict of interest statement

The authors declare no competing financial interests.

Figures

Figure 1. Overview of BUHMBOX
(a) Under the scenario of subgroup heterogeneity, risk alleles of disease B (DB)-associated loci will be enriched in a subgroup of disease A (DA) cases, producing positive correlations between DB risk allele dosages from independent loci. (b) Under the scenario where there is no heterogeneity and DA and DB share alleles due to pleiotropy (i.e. whole-group pleiotropy), DB risk alleles will be uniformly distributed and have no correlations. Red boxes: risk alleles; white boxes: non-risk alleles.
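The contrast between panels (a) and (b) can be illustrated with a small simulation. The following is a minimal sketch, not the BUHMBOX statistic itself: it uses an unweighted mean of pairwise dosage correlations in cases only, and all names and parameter values (`simulate_dosages`, `mean_offdiag_corr`, risk allele frequencies of 0.30 and 0.40, a 30% subgroup) are hypothetical choices made for illustration.

```python
import numpy as np

def simulate_dosages(n_cases, base_raf, subgroup_raf, pi, rng):
    """Draw 0/1/2 risk-allele dosages at independent loci.

    A fraction `pi` of the cases is drawn with `subgroup_raf`,
    the remainder with `base_raf` (both are per-locus arrays).
    """
    n_sub = int(round(pi * n_cases))
    raf = np.tile(base_raf, (n_cases, 1))
    raf[:n_sub] = subgroup_raf
    return rng.binomial(2, raf)

def mean_offdiag_corr(dosages):
    """Average Pearson correlation over all distinct pairs of loci."""
    corr = np.corrcoef(dosages, rowvar=False)
    upper = np.triu_indices_from(corr, k=1)
    return corr[upper].mean()

rng = np.random.default_rng(0)
n_loci = 50
base = np.full(n_loci, 0.30)      # assumed RAF in ordinary DA cases
enriched = np.full(n_loci, 0.40)  # assumed RAF in the hidden DB-like subgroup

# (a) Subgroup heterogeneity: 30% of cases carry enriched DB risk alleles.
het = simulate_dosages(5000, base, enriched, 0.3, rng)

# (b) Whole-group pleiotropy: every case shares the same marginal RAF.
shared = 0.7 * base + 0.3 * enriched
pleio = simulate_dosages(5000, shared, shared, 0.0, rng)

het_mean = mean_offdiag_corr(het)
pleio_mean = mean_offdiag_corr(pleio)
print(f"heterogeneity: {het_mean:.4f}  pleiotropy: {pleio_mean:.4f}")
```

Both simulated cohorts have the same average risk allele frequency at every locus, so a per-locus comparison would not separate them. Only the cohort with a hidden subgroup shows positive correlations between loci.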
Figure 2. Power gain by weighting SNPs by allele frequency and effect size
We compared the statistical power of BUHMBOX with a weighting scheme that optimally weights correlations between SNPs (weighted) to an alternative approach that weights correlations uniformly (unweighted; equation (12) in Supplementary Note). We simulated 1,000 case individuals and assumed 50 risk loci, whose odds ratios (ORs) and risk allele frequencies (RAFs) were sampled from the GWAS catalog. Colored bands denote 95% confidence intervals of power estimates.
Figure 3. BUHMBOX power analysis
Power of BUHMBOX for detecting heterogeneity as a function of the number of risk loci, the number of case samples, and the proportion of samples that actually have a different phenotype (heterogeneity proportion, π). We assume the same number of controls as cases. White lines denote 20, 40, 60, and 80% power. (a) Power as a function of the number of case individuals and heterogeneity proportion, when the number of risk loci is fixed at 50. (b) Power as a function of the number of risk loci and heterogeneity proportion, when the case sample size is fixed at 2,000.
Figure 4. Genetic sharing between autoimmune diseases and psychiatric disorders
In (a) and (b), we show only the diseases that have significantly positive GRS p-values out of the 17 tested. Y-axis denotes the expected heterogeneity proportion (π) to explain observed genetic sharing. Vertical bars indicate 95% confidence intervals. Heterogeneity proportion estimates are based on GRS analysis, assuming no pleiotropy for (a) T1D, (b) RA, (c) seronegative RA, and (d) MDD.
Figure 5. Statistical power of BUHMBOX to detect heterogeneity
We calculated power by performing 1,000 simulations with corresponding sample size, number of risk alleles, risk allele frequencies, and odds ratios. To calculate power for (c) and (d), we used a significance threshold of 0.05. For (a) and (b), the threshold was adjusted using the Bonferroni correction accounting for 11 tests in T1D and RA, respectively.
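The simulation-based power calculation described in this caption can be sketched as follows. This is an illustrative approximation under stated assumptions, not the authors' procedure: the test statistic `toy_z` is an unweighted, cases-only sum of pairwise dosage correlations (treated as approximately standard normal under independence), and the function names, risk allele frequencies, sample size, and number of simulations are hypothetical. The Bonferroni-adjusted threshold for 11 tests is 0.05 / 11.

```python
import math
import numpy as np

def toy_z(dosages):
    """Unweighted z-score from pairwise correlations between loci.

    Under independence each sqrt(n) * r_ij is approximately N(0, 1),
    so the scaled sum over all pairs is approximately N(0, 1) as well.
    """
    n = dosages.shape[0]
    corr = np.corrcoef(dosages, rowvar=False)
    r = corr[np.triu_indices_from(corr, k=1)]
    return math.sqrt(n) * r.sum() / math.sqrt(r.size)

def one_sided_p(z):
    """Upper-tail p-value of a standard normal."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

def estimate_power(n_cases, n_loci, pi, alpha, n_sims, rng,
                   base_raf=0.30, sub_raf=0.40):
    """Fraction of simulated cohorts in which the toy test rejects at alpha."""
    n_sub = int(round(pi * n_cases))
    raf = np.full((n_cases, n_loci), base_raf)
    raf[:n_sub] = sub_raf  # hidden subgroup with enriched risk alleles
    hits = 0
    for _ in range(n_sims):
        z = toy_z(rng.binomial(2, raf))
        hits += one_sided_p(z) < alpha
    return hits / n_sims

rng = np.random.default_rng(1)
alpha_single = 0.05
alpha_bonferroni = 0.05 / 11  # 11 tests, as for T1D and RA in (a) and (b)

power_het = estimate_power(2000, 50, 0.2, alpha_bonferroni, 200, rng)
power_null = estimate_power(2000, 50, 0.0, alpha_bonferroni, 200, rng)
print(f"threshold: {alpha_bonferroni:.5f}")
print(f"power at pi=0.2: {power_het:.2f}  rejection rate at pi=0: {power_null:.2f}")
```

Setting π to 0 gives the false-positive rate of the toy test, which serves as a quick check that the normal approximation is reasonable before interpreting the power estimate.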
Figure 6. BUHMBOX results
We show only diseases with significantly positive GRS p-values (for complete results for all traits tested, see Supplementary Table 4). A significant GRS p-value indicates evidence of shared genetic structure; a significant BUHMBOX p-value indicates evidence of heterogeneity. Point size represents the number of DB-associated SNPs included in the analysis. Dashed vertical lines denote the Bonferroni-adjusted significance threshold for the BUHMBOX test statistic. The arrow indicates a significant BUHMBOX test statistic.

Source: PubMed
