A variant at 9p21.3 functionally implicates CDKN2B in paediatric B-cell precursor acute lymphoblastic leukaemia aetiology

Eric A Hungate, Sapana R Vora, Eric R Gamazon, Takaya Moriyama, Timothy Best, Imge Hulur, Younghee Lee, Tiffany-Jane Evans, Eva Ellinghaus, Martin Stanulla, Jéremie Rudant, Laurent Orsi, Jacqueline Clavel, Elizabeth Milne, Rodney J Scott, Ching-Hon Pui, Nancy J Cox, Mignon L Loh, Jun J Yang, Andrew D Skol, Kenan Onel

Abstract

Paediatric B-cell precursor acute lymphoblastic leukaemia (BCP-ALL) is the most common cancer of childhood, yet little is known about BCP-ALL predisposition. In this study, in 2,187 cases of European ancestry and 5,543 controls, we discover and replicate a locus indexed by rs77728904 at 9p21.3 associated with BCP-ALL susceptibility (Pcombined=3.32 × 10^-15, OR=1.72) and independent from rs3731217, the previously reported ALL-associated variant in this region. Of correlated SNPs tagged by this locus, only rs662463 is significant in African Americans, suggesting it is a plausible causative variant. Functional analysis shows that rs662463 is a cis-eQTL for CDKN2B, with the risk allele associated with lower expression, and suggests that rs662463 influences BCP-ALL risk by regulating CDKN2B expression through CEBPB signalling. Functional analysis of rs3731217 suggests it is associated with BCP-ALL by acting within a splicing regulatory element determining CDKN2A exon 3 usage (P=0.01). These findings provide new insights into the critical role of the CDKN2 locus in BCP-ALL aetiology.
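A combined odds ratio and P value of the kind reported above are typically obtained by pooling per-study estimates. The following is a minimal sketch of a fixed-effect, inverse-variance meta-analysis on the log-OR scale, assuming each study supplies an OR with a symmetric (on the log scale) 95% CI. The function name `fixed_effect_meta` and the example inputs are hypothetical and are not taken from the study; the abstract does not state which software or weighting scheme the authors used.

```python
import math

Z95 = 1.959963984540054  # two-sided 95% normal quantile


def fixed_effect_meta(odds_ratios, ci_lowers, ci_uppers):
    """Pool per-study odds ratios by inverse-variance weighting.

    Each study contributes log(OR) with a standard error recovered from
    its 95% confidence interval (assumed symmetric on the log scale).
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for or_, lo, hi in zip(odds_ratios, ci_lowers, ci_uppers):
        beta = math.log(or_)
        se = (math.log(hi) - math.log(lo)) / (2 * Z95)
        weight = 1.0 / se ** 2
        weighted_sum += weight * beta
        weight_total += weight

    beta = weighted_sum / weight_total
    se = math.sqrt(1.0 / weight_total)
    z = beta / se
    # Two-sided P value under the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return {
        "or": math.exp(beta),
        "ci": (math.exp(beta - Z95 * se), math.exp(beta + Z95 * se)),
        "z": z,
        "p": p,
    }


if __name__ == "__main__":
    # Illustrative (made-up) per-study estimates, not values from the paper.
    result = fixed_effect_meta([1.6, 1.8, 1.7], [1.2, 1.3, 1.1], [2.1, 2.5, 2.6])
    print(result)
```

Pooling on the log scale keeps the estimate symmetric around OR=1, and studies with narrower confidence intervals receive proportionally more weight.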

Figures

Figure 1. Meta-analysis results for paediatric BCP-ALL.
(a) Manhattan plot of associations for the discovery GWAS of 1,210 cases and 4,144 controls from four independent studies. The red line denotes the threshold for genome-wide significance. Peaks surpassing this threshold are found at the following: 7p12.2 (IKZF1), 10q21.2 (ARID5B) and 9p21.3 (CDKN2). (b) The top panel is the regional LocusZoom plot of the 9p21.3 locus extending 500 kb on either side of the index SNP, rs77728904 (shown in purple), and including all genotyped and imputed SNPs with MAF >0.01. The bottom panel is zoomed in (indicated by the solid black lines between plots) to include only the CDKN2 locus and surrounding recombination peaks. The r2 (visualized by colour) demonstrates that rs77728904 and rs3731217 are in independent linkage blocks between the recombination peaks. r2 for all SNPs in both panels is shown relative to rs77728904. (c) Forest plot of OR with 95% confidence intervals (CIs) for rs77728904 in each of the studies comprising the discovery meta-analysis (Disc (Meta)), the full meta-analysis, the replication analysis and the combined analysis. Squares represent the OR; horizontal lines represent the CI; the solid vertical line represents OR=1; the blue dashed vertical line represents the OR in the combined discovery+replication sets.
Figure 2. Haploview LD map of the 9p21.3 locus showing differences in the LD structure by ancestry in 1,000 Genomes Phase 1 populations.
LD is reported as r2 with r2=0.01 white; 0.01<r2<1 shades of grey; r2=1 black. ASW, African American ancestry; CEU, European ancestry; MXL, Hispanic ancestry. rs77728904 is highlighted in orange and rs662463 is highlighted in blue.
Figure 3. Multi-ethnic and functional analysis suggesting rs662463 is the causative BCP-ALL SNP tagged by the rs77728904-defined locus.
(a) eQTL analysis from GTEx in whole blood showing the association of CDKN2B expression with rs77728904 and rs662463 genotypes. For both SNPs, the minor allele is the risk allele. Each grey circle represents an individual. Each box plot shows the median rank normalized gene expression (black horizontal line), the first through third quartiles (purple box) and 1.5 × the interquartile range (whiskers). (b) Motifs derived from ChIP data showing the effect of rs662463 on CEBPB binding. The genomic sequence (Ref) surrounding rs662463 is shown below the CEBPB-binding site motif logo for K562 cells from Factorbook, with the reference protective G-allele boxed and the risk A-allele in red. The CEBPB motif logo represents the position weight matrix (PWM) for each base. The PWM LOD scores calculated by HaploReg from TRANSFAC for two CEBPB-binding motifs (‘Tr1’ or TRANSFAC accession M00912 and ‘Tr2’ or TRANSFAC accession M00109) including either the protective or the risk allele of rs662463 demonstrate that the risk allele disrupts CEBPB binding. (c) RNA-seq analysis in European ancestry LCLs, suggesting that the rs662463 genotype influences the correlation between CDKN2B and CEBPB expression, and that the rs662463 risk allele is associated with lower CDKN2B expression. Shown are the best-fit lines overall (black line), for LCLs homozygous for the protective allele (blue line) and for LCLs with at least one copy (one copy: green circles and two copies: red circles) of the risk allele (green line). (d) RNA-seq analysis in whole blood from an independent set of European ancestry individuals, demonstrating that the rs662463 genotype significantly influences the correlation between CDKN2B and CEBPB expression. The rs662463 risk allele attenuates this correlation and is associated with lower CDKN2B expression. Shown are the best-fit lines overall (black line), for individuals homozygous for the protective allele (blue line) and for individuals with at least one copy (one copy: green circles and two copies: red circles) of the risk allele (green line).
Figure 4. Functional analysis demonstrating rs3731217 is associated with CDKN2A exon 3 usage.
(a) Cartoon showing the major protein isoforms encoded by CDKN2A: p14ARF (blue arrows), p16γ (red arrows) and p16INK4a (black arrow). The four main exons are labelled with exon 3 surrounded by a red box. rs3731217 is located in two overlapping intronic splicing elements between exon 1α and exon 1β. The image is modified from AceView. (b) RNA-seq data from LCLs correlating exon usage with rs3731217 genotype showing that exon 3 usage is significantly associated with the protective G-allele (0, 1 and 2 on the x axis refer to G-allele dosage). Each box plot shows the median rank normalized gene expression (black horizontal line), the first through third quartiles (box) and 1.5 × the interquartile range (whiskers).
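The r2 statistic used in Figures 1b and 2 to describe linkage disequilibrium between SNP pairs can be illustrated with a short calculation. This is a minimal sketch that assumes phased haplotypes are already available, with each allele coded as 0 or 1; tools such as Haploview estimate haplotype frequencies from unphased genotypes, which this sketch does not attempt. The function name `ld_r2` and the toy haplotypes are hypothetical.

```python
def ld_r2(haplotypes):
    """Compute pairwise LD r^2 from phased haplotypes.

    haplotypes: sequence of (a, b) pairs, where a and b are the 0/1 alleles
    carried at SNP A and SNP B on the same chromosome.
    """
    n = len(haplotypes)
    if n == 0:
        raise ValueError("at least one haplotype is required")

    p_a = sum(a for a, _ in haplotypes) / n  # frequency of allele 1 at SNP A
    p_b = sum(b for _, b in haplotypes) / n  # frequency of allele 1 at SNP B
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n

    d = p_ab - p_a * p_b  # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r2 is undefined when either SNP is monomorphic")
    return d * d / denom


if __name__ == "__main__":
    # Toy data: alleles always travel together, so the SNPs are in perfect LD.
    print(ld_r2([(1, 1), (1, 1), (0, 0), (0, 0)]))
    # Toy data: every allele combination is equally common, so there is no LD.
    print(ld_r2([(1, 1), (1, 0), (0, 1), (0, 0)]))
```

An r2 near 1 means one SNP almost perfectly predicts the other, whereas an r2 near 0, as described for rs77728904 and rs3731217 in Figure 1b, indicates that the two SNPs tag independent signals.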

Source: PubMed
