Rare Variants in the DNA Repair Pathway and the Risk of Colorectal Cancer

Marco Matejcic, Hiba A Shaban, Melanie W Quintana, Fredrick R Schumacher, Christopher K Edlund, Leah Naghi, Rish K Pai, Robert W Haile, A Joan Levine, Daniel D Buchanan, Mark A Jenkins, Jane C Figueiredo, Gad Rennert, Stephen B Gruber, Li Li, Graham Casey, David V Conti, Stephanie L Schmit

Abstract

Background: Inherited susceptibility is an important contributor to colorectal cancer risk, and rare variants in key genes or pathways could account in part for the missing proportion of colorectal cancer heritability.

Methods: We conducted an exome-wide association study including 2,327 cases and 2,966 controls of European ancestry from three large epidemiologic studies. Single variant associations were tested using logistic regression models, adjusting for appropriate study-specific covariates. In addition, we examined the aggregate effects of rare coding variation at the gene and pathway levels using Bayesian model uncertainty techniques.
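As a rough illustration of a single-variant test, the sketch below runs a likelihood ratio test for a logistic model with one carrier indicator and no covariates. It is a simplification, not the study's analysis: the study adjusted for study-specific covariates, and both the carrier/non-carrier coding and the function names here are assumptions made for the example. With a single binary predictor the fitted probabilities equal the observed case fractions, so no iterative fitting is needed.

```python
import math


def _xlogy(x, y):
    # x * log(y), using the convention 0 * log(0) = 0
    return 0.0 if x == 0 else x * math.log(y)


def max_loglik(cases, controls):
    """Maximised Bernoulli log-likelihood for one group of subjects.

    The maximum-likelihood case probability is the observed case fraction.
    """
    n = cases + controls
    if n == 0:
        return 0.0
    p = cases / n
    return _xlogy(cases, p) + _xlogy(controls, 1.0 - p)


def carrier_lrt(carrier_cases, carrier_controls,
                noncarrier_cases, noncarrier_controls):
    """Likelihood ratio test of case status on a carrier indicator (1 df)."""
    # Full model: intercept + carrier indicator (saturated for a 2x2 table)
    ll_full = (max_loglik(carrier_cases, carrier_controls)
               + max_loglik(noncarrier_cases, noncarrier_controls))
    # Null model: intercept only
    ll_null = max_loglik(carrier_cases + noncarrier_cases,
                         carrier_controls + noncarrier_controls)
    stat = max(0.0, 2.0 * (ll_full - ll_null))
    # Upper tail of a chi-square distribution with 1 degree of freedom
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Hypothetical carrier counts; only the totals (2,327 cases, 2,966 controls)
# come from the abstract.
stat, p_value = carrier_lrt(30, 5, 2297, 2961)
print(f"LRT statistic = {stat:.2f}, P = {p_value:.2e}")
```

An adjusted analysis would instead fit the logistic model with and without the variant term alongside the covariates and compare the two fitted log-likelihoods in the same way.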

Results: In an exome-wide gene-level analysis, we identified ST6GALNAC2 as the top associated gene based on the Bayesian risk index (BRI) method [summary Bayes factor (BF_BRI) = 2604.23]. A rare coding variant in this gene, rs139401613, was the top associated variant (P = 1.01 × 10^-6) in an exome-wide single variant analysis. Pathway-level association analyses based on the integrative BRI (iBRI) method found extreme evidence of association with the DNA repair pathway (BF_iBRI = 17852.4), specifically with the nonhomologous end joining (BF_iBRI = 437.95) and nucleotide excision repair (BF_iBRI = 36.96) subpathways. The iBRI method also identified RPA2, PRKDC, ERCC5, and ERCC8 as the top associated DNA repair genes (summary BF_iBRI ≥ 10), with rs28988897, rs8178232, rs141369732, and rs201642761 being the most likely associated variants in these genes, respectively.

Conclusions: We identified novel variants and genes associated with colorectal cancer risk and provided additional evidence for a role of DNA repair in colorectal cancer tumorigenesis.

Impact: This study provides new insights into the genetic predisposition to colorectal cancer, which has potential for translation into improved risk prediction.

Conflict of interest statement

The authors declare no potential conflicts of interest.

©2021 American Association for Cancer Research.

Figures

Figure 1. Project flow chart summarizing analyses and key results.
A total of 142,390 polymorphic variants in 5,293 samples (2,327 CRC cases and 2,966 controls) were analyzed. The associations between common and rare variants and the risk of CRC were estimated using likelihood ratio tests. A meta-analysis of the study-specific test statistics revealed strong associations between CRC and rs139401613 in ST6GALNAC2 (meta P=1.01×10^-6), and rs35467001 (meta P=7.72×10^-6; 3.9%) and rs34322745 (meta P=5.16×10^-5) in SDK2. Only rare variants (MAF<1%) were retained for gene- and pathway-level analyses. A summary Bayes Factor (BF) was computed as the product of the three study-specific BFs from the Bayesian Risk Index (BRI) method. The top associated gene was ST6GALNAC2 (summary BF_BRI=2604.23), followed by OSTM1, COL22A1, EPHA7, TTC28, SPTBN5, FSIP1, AKR1D1, NOTCH3, C6orf120, OR11H4, NAT1, EVI2B, CENPQ, SMPDL3A, CEP43 and GPC3, with summary BF_BRI ranging from 104.43 to 948.28. The integrative Bayesian Risk Index (iBRI) method was used to perform pathway-level analyses for DNA repair, TGF-β signaling, vitamin D and folate metabolism. Given evidence of strong association with the DNA repair pathway (summary BF_iBRI=17852.4), we further investigated the likely subpathways, genes and variants driving the association. At the subpathway level, extreme associations were found for NHEJ (summary BF_iBRI=437.85) and NER (summary BF_iBRI=36.96). The top associated gene was RPA2 (summary BF_iBRI=164.28), and the most likely associated variant in this gene was rs28988897 (summary BF_iBRI=657.56). Strong associations were also reported for PRKDC (summary BF_iBRI=109.77) [rs8178232 (summary BF_iBRI=17.68)], ERCC5 (summary BF_iBRI=65.52) [rs141369732 (summary BF_iBRI=6.39)] and ERCC8 (summary BF_iBRI=21.9) [rs201642761 (summary BF_iBRI=0.92)].
Abbreviations: KY = Kentucky Case-Control Study; CCFR = Colon Cancer Family Registry; MECC = Molecular Epidemiology of Colorectal Cancer Study; ca = cases; co = controls; meta P = meta-analysis p-value; MAF = minor allele frequency; BRI = Bayesian Risk Index; iBRI = integrative Bayesian Risk Index; BF = Bayes Factor.
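The caption states that the summary BF is the product of the three study-specific BFs. A minimal sketch of that combination step follows. The function name and the example values are hypothetical and are not taken from the paper. The product is accumulated on the log scale, a common choice to avoid overflow when individual BFs are very large.

```python
import math


def summary_bayes_factor(study_bfs):
    """Combine study-specific Bayes factors by taking their product.

    Computed as exp(sum(log BF)) for numerical stability.
    """
    bfs = list(study_bfs)
    if not bfs:
        raise ValueError("at least one Bayes factor is required")
    if any(bf <= 0 for bf in bfs):
        raise ValueError("Bayes factors must be positive")
    return math.exp(sum(math.log(bf) for bf in bfs))


# Hypothetical study-specific BFs for one gene (illustrative values only)
example_bfs = {"KY": 12.0, "CCFR": 8.5, "MECC": 3.0}
print(round(summary_bayes_factor(example_bfs.values()), 2))
```

Because the combination is multiplicative, a study with BF below 1 (evidence against association) pulls the summary BF down, while a BF of exactly 1 leaves it unchanged.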
Figure 2. Top model inclusions for top DNA repair genes.
Top 10 DNA repair genes included in the top 25 models identified using the iBRI approach in each study: KY (A), CCFR (B) and MECC (C). The top 10 genes ordered by iBRI Bayes Factor (BF) are plotted on the left axis, and the respective iBRI BFs are reported on the right. The top 25 models ordered by posterior probability are plotted on the x-axis. Within the plot, each blue rectangle represents the inclusion of a gene within the respective model, and the width of each column is proportional to the posterior model probability. A gene is defined as being included in a model if at least one variant within the region was included in the model. Abbreviations: BF = Bayes Factor.
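The inclusion rule in the caption (a gene counts as included when at least one of its variants is in the model) can be sketched as follows. This only illustrates how the plotted rectangles and column widths relate to each other and is not the iBRI computation itself. The function name and the posterior probabilities are hypothetical. The variant-to-gene pairs are the ones reported in the abstract.

```python
def gene_inclusion_share(models, variant_to_gene):
    """Share of posterior mass, among the listed models, that includes each gene.

    models: iterable of (posterior_probability, set_of_variant_ids).
    A gene is included in a model if any of its variants is in that model.
    Probabilities are renormalised over the listed models only.
    """
    models = list(models)
    total = sum(prob for prob, _ in models)
    if total <= 0:
        raise ValueError("posterior probabilities must sum to a positive value")
    share = {}
    for prob, variants in models:
        # Collapse variants to genes so a gene is counted once per model
        genes = {variant_to_gene[v] for v in variants}
        for gene in genes:
            share[gene] = share.get(gene, 0.0) + prob / total
    return share


variant_to_gene = {
    "rs28988897": "RPA2",
    "rs8178232": "PRKDC",
    "rs141369732": "ERCC5",
}
# Hypothetical top models with made-up posterior probabilities
top_models = [
    (0.5, {"rs28988897"}),
    (0.3, {"rs28988897", "rs8178232"}),
    (0.2, {"rs141369732"}),
]
for gene, value in sorted(gene_inclusion_share(top_models, variant_to_gene).items()):
    print(gene, round(value, 2))
```

In terms of the figure, each model is one column whose width is its posterior probability, and a gene's share is the combined width of the columns in which its row is filled.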

Source: PubMed
