Discovery of common and rare genetic risk variants for colorectal cancer

Jeroen R Huyghe, Stephanie A Bien, Tabitha A Harrison, Hyun Min Kang, Sai Chen, Stephanie L Schmit, David V Conti, Conghui Qu, Jihyoun Jeon, Christopher K Edlund, Peyton Greenside, Michael Wainberg, Fredrick R Schumacher, Joshua D Smith, David M Levine, Sarah C Nelson, Nasa A Sinnott-Armstrong, Demetrius Albanes, M Henar Alonso, Kristin Anderson, Coral Arnau-Collell, Volker Arndt, Christina Bamia, Barbara L Banbury, John A Baron, Sonja I Berndt, Stéphane Bézieau, D Timothy Bishop, Juergen Boehm, Heiner Boeing, Hermann Brenner, Stefanie Brezina, Stephan Buch, Daniel D Buchanan, Andrea Burnett-Hartman, Katja Butterbach, Bette J Caan, Peter T Campbell, Christopher S Carlson, Sergi Castellví-Bel, Andrew T Chan, Jenny Chang-Claude, Stephen J Chanock, Maria-Dolores Chirlaque, Sang Hee Cho, Charles M Connolly, Amanda J Cross, Katarina Cuk, Keith R Curtis, Albert de la Chapelle, Kimberly F Doheny, David Duggan, Douglas F Easton, Sjoerd G Elias, Faye Elliott, Dallas R English, Edith J M Feskens, Jane C Figueiredo, Rocky Fischer, Liesel M FitzGerald, David Forman, Manish Gala, Steven Gallinger, W James Gauderman, Graham G Giles, Elizabeth Gillanders, Jian Gong, Phyllis J Goodman, William M Grady, John S Grove, Andrea Gsur, Marc J Gunter, Robert W Haile, Jochen Hampe, Heather Hampel, Sophia Harlid, Richard B Hayes, Philipp Hofer, Michael Hoffmeister, John L Hopper, Wan-Ling Hsu, Wen-Yi Huang, Thomas J Hudson, David J Hunter, Gemma Ibañez-Sanz, Gregory E Idos, Roxann Ingersoll, Rebecca D Jackson, Eric J Jacobs, Mark A Jenkins, Amit D Joshi, Corinne E Joshu, Temitope O Keku, Timothy J Key, Hyeong Rok Kim, Emiko Kobayashi, Laurence N Kolonel, Charles Kooperberg, Tilman Kühn, Sébastien Küry, Sun-Seog Kweon, Susanna C Larsson, Cecelia A Laurie, Loic Le Marchand, Suzanne M Leal, Soo Chin Lee, Flavio Lejbkowicz, Mathieu Lemire, Christopher I Li, Li Li, Wolfgang Lieb, Yi Lin, Annika Lindblom, Noralane M Lindor, Hua Ling, Tin L Louie, Satu Männistö, Sanford D Markowitz, Vicente 
Martín, Giovanna Masala, Caroline E McNeil, Marilena Melas, Roger L Milne, Lorena Moreno, Neil Murphy, Robin Myte, Alessio Naccarati, Polly A Newcomb, Kenneth Offit, Shuji Ogino, N Charlotte Onland-Moret, Barbara Pardini, Patrick S Parfrey, Rachel Pearlman, Vittorio Perduca, Paul D P Pharoah, Mila Pinchev, Elizabeth A Platz, Ross L Prentice, Elizabeth Pugh, Leon Raskin, Gad Rennert, Hedy S Rennert, Elio Riboli, Miguel Rodríguez-Barranco, Jane Romm, Lori C Sakoda, Clemens Schafmayer, Robert E Schoen, Daniela Seminara, Mitul Shah, Tameka Shelford, Min-Ho Shin, Katerina Shulman, Sabina Sieri, Martha L Slattery, Melissa C Southey, Zsofia K Stadler, Christa Stegmaier, Yu-Ru Su, Catherine M Tangen, Stephen N Thibodeau, Duncan C Thomas, Sushma S Thomas, Amanda E Toland, Antonia Trichopoulou, Cornelia M Ulrich, David J Van Den Berg, Franzel J B van Duijnhoven, Bethany Van Guelpen, Henk van Kranen, Joseph Vijai, Kala Visvanathan, Pavel Vodicka, Ludmila Vodickova, Veronika Vymetalkova, Korbinian Weigl, Stephanie J Weinstein, Emily White, Aung Ko Win, C Roland Wolf, Alicja Wolk, Michael O Woods, Anna H Wu, Syed H Zaidi, Brent W Zanke, Qing Zhang, Wei Zheng, Peter C Scacheri, John D Potter, Michael C Bassik, Anshul Kundaje, Graham Casey, Victor Moreno, Goncalo R Abecasis, Deborah A Nickerson, Stephen B Gruber, Li Hsu, Ulrike Peters

Abstract

To further dissect the genetic architecture of colorectal cancer (CRC), we performed whole-genome sequencing of 1,439 cases and 720 controls, imputed discovered sequence variants and Haplotype Reference Consortium panel variants into genome-wide association study data, and tested for association in 34,869 cases and 29,051 controls. Findings were followed up in an additional 23,262 cases and 38,296 controls. We discovered a strongly protective 0.3% frequency variant signal at CHD1. In a combined meta-analysis of 125,478 individuals, we identified 40 new independent signals at P < 5 × 10⁻⁸, bringing the number of known independent signals for CRC to ~100. New signals implicate lower-frequency variants, Krüppel-like factors, Hedgehog signaling, Hippo-YAP signaling, long noncoding RNAs and somatic drivers, and support a role for immune function. Heritability analyses suggest that CRC risk is highly polygenic, and larger, more comprehensive studies enabling rare variant analysis will improve understanding of biology underlying this risk and influence personalized screening strategies and drug development.
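The sample sizes quoted in the abstract can be cross-checked with a short tally. The sketch below is only an illustration: the dictionary and function names are hypothetical, and it assumes that the combined meta-analysis total is the sum of the association-testing stage and the follow-up stage, with the sequenced samples not added separately.

```python
# Sample sizes as quoted in the abstract.
discovery = {"cases": 34_869, "controls": 29_051}   # association-testing stage
follow_up = {"cases": 23_262, "controls": 38_296}   # follow-up stage


def total(stage):
    """Number of individuals (cases plus controls) in one stage."""
    return stage["cases"] + stage["controls"]


# Assumption: the combined meta-analysis pools exactly these two stages.
combined = total(discovery) + total(follow_up)
print(combined)  # 125478
```

Under that assumption the tally reproduces the 125,478 individuals reported for the combined meta-analysis.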

Conflict of interest statement

Goncalo R Abecasis has received compensation from 23andMe and Helix. He is currently an employee of Regeneron Pharmaceuticals. Heather Hampel performs collaborative research with Ambry Genetics, InVitae Genetics, and Myriad Genetic Laboratories, Inc., is on the scientific advisory board for InVitae Genetics and Genome Medical, and has stock in Genome Medical. Rachel Pearlman has participated in collaborative funded research with Myriad Genetics Laboratories and Invitae Genetics but has no financial competitive interest.

Figures

Figure 1. Conditionally independent association signals at the BMP2 locus.
Regional association plot showing the unconditional −log10(P-value) for the association with CRC risk in the combined meta-analysis of up to 125,478 individuals, as a function of genomic position (Build 37) for each variant in the region. Lead variants are indicated by diamond symbols and their positions by dashed vertical lines. The color and shape of all other variants indicate the lead variant with which they are in strongest LD. The two new genome-wide significant signals are indicated by asterisks.
Figure 2. Functional genomic annotation of new CRC risk locus overlapping KLF5 super-enhancer.
Top: Regional association plot showing the unconditional −log10(P-value) for the association with CRC risk in the combined meta-analysis of up to 125,478 individuals, as a function of genomic position (Build 37) for each variant in the region. Lead variants are indicated by diamond symbols and their positions by dashed vertical lines. The color and shape of all other variants indicate the lead variant with which they are in strongest LD. Bottom: UCSC genome browser annotations for the region overlapping the super-enhancer flanked by KLF5 and KLF12, spanning variants in LD with rs78341008 and with two conditionally independent association signals indexed by rs45597035 and rs1924816. The region is annotated with the following tracks (from top to bottom): UCSC gene annotations; epigenomic profiles showing MACS2 peak calls as transparent overlays for different samples taken from non-diseased colonic crypt cells or colon tissue (purple) and from different primary CRC cell lines or tumor samples (teal); positions of the lead variants and of variants in LD with them; variants in the 99% credible set; and the union of super-enhancers called using the ROSE package. Gray bars highlight the targeted enhancers (e1, e3, and e4) previously shown by Zhang et al. to have combinatorial effects on KLF5 expression. ATAC-seq data newly generated for this study provide high-resolution annotation of putative binding regions within the active super-enhancer, further fine-mapping putative causal variants at each of the three signals.
Figure 3. Recommended age to start CRC screening based on a polygenic risk score (PRS).
The PRS was constructed using the 95 known and newly discovered variants. The horizontal lines represent the recommended age for the first endoscopy for an average-risk person in the current screening guideline for CRC. The risk threshold for determining the age of first screening was set as the average of the 10-year CRC risks for a 50-year-old man (1.25%) and a 50-year-old woman (0.68%) who have not previously received an endoscopy, i.e. (1.25% + 0.68%)/2 ≈ 0.97%. Details are given in the Online Methods.
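The threshold arithmetic in the Figure 3 caption can be written out as a small sketch. The two risk values come from the caption; everything else is an assumption made for illustration, including the function names, the `risk_by_age` mapping, and the rule of choosing the youngest age whose 10-year risk reaches the threshold. The actual procedure is described in the Online Methods, which are not reproduced here.

```python
# 10-year CRC risks at age 50 quoted in the Figure 3 caption (percent).
RISK_MAN_50 = 1.25
RISK_WOMAN_50 = 0.68


def risk_threshold(risk_man=RISK_MAN_50, risk_woman=RISK_WOMAN_50):
    """Average of the two sex-specific 10-year risks, in percent."""
    return (risk_man + risk_woman) / 2


def first_age_at_or_above(risk_by_age, threshold):
    """Return the youngest age whose 10-year risk reaches the threshold.

    risk_by_age is a hypothetical mapping {age: 10-year risk in percent}
    for one PRS stratum; None is returned if the threshold is never reached.
    """
    for age in sorted(risk_by_age):
        if risk_by_age[age] >= threshold:
            return age
    return None


threshold = risk_threshold()
print(f"{threshold:.3f}%")  # 0.965%, which the caption rounds to 0.97%
```

Given a risk-by-age curve for a particular PRS stratum, `first_age_at_or_above(curve, threshold)` would return the suggested starting age under this simplified rule.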

References

    1. Ferlay J et al. Cancer incidence and mortality worldwide: sources, methods and major patterns in GLOBOCAN 2012. Int J Cancer 136, E359–86 (2015).
    1. Lichtenstein P et al. Environmental and heritable factors in the causation of cancer--analyses of cohorts of twins from Sweden, Denmark, and Finland. N Engl J Med 343, 78–85 (2000).
    1. Czene K, Lichtenstein P & Hemminki K Environmental and heritable causes of cancer among 9.6 million individuals in the Swedish Family-Cancer Database. Int J Cancer 99, 260–266 (2002).
    1. Sud A, Kinnersley B & Houlston RS Genome-wide association studies of cancer: current insights and future perspectives. Nat Rev Cancer 17, 692–704 (2017).
    1. Tomlinson I et al. A genome-wide association scan of tag SNPs identifies a susceptibility variant for colorectal cancer at 8q24.21. Nat Genet 39, 984–988 (2007).
    1. Broderick P et al. A genome-wide association study shows that common alleles of SMAD7 influence colorectal cancer risk. Nat Genet 39, 1315–1317 (2007).
    1. Tomlinson IPM et al. A genome-wide association study identifies colorectal cancer susceptibility loci on chromosomes 10p14 and 8q23.3. Nat Genet 40, 623–630 (2008).
    1. Tenesa A et al. Genome-wide association scan identifies a colorectal cancer susceptibility locus on 11q23 and replicates risk loci at 8q24 and 18q21. Nat Genet 40, 631–637 (2008).
    1. COGENT Study et al. Meta-analysis of genome-wide association data identifies four new susceptibility loci for colorectal cancer. Nat Genet 40, 1426–1435 (2008).
    1. Houlston RS et al. Meta-analysis of three genome-wide association studies identifies susceptibility loci for colorectal cancer at 1q41, 3q26.2, 12q13.13 and 20q13.33. Nat Genet 42, 973–977 (2010).
    1. Tomlinson IPM et al. Multiple common susceptibility variants near BMP pathway loci GREM1, BMP4, and BMP2 explain part of the missing heritability of colorectal cancer. PLoS Genet 7, e1002105 (2011).
    1. Dunlop MG et al. Common variation near CDKN1A, POLD3 and SHROOM2 influences colorectal cancer risk. Nat Genet 44, 770–776 (2012).
    1. Peters U et al. Identification of Genetic Susceptibility Loci for Colorectal Tumors in a Genome-Wide Meta-analysis. Gastroenterology 144, 799–807.e24 (2013).
    1. Jia W-H et al. Genome-wide association analyses in East Asians identify new susceptibility loci for colorectal cancer. Nat Genet 45, 191–196 (2013).
    1. Whiffin N et al. Identification of susceptibility loci for colorectal cancer in a genome-wide meta-analysis. Hum Mol Genet 23, 4729–4737 (2014).
    1. Wang H et al. Trans-ethnic genome-wide association study of colorectal cancer identifies a new susceptibility locus in VTI1A. Nat Commun 5, 4613 (2014).
    1. Zhang B et al. Large-scale genetic study in East Asians identifies six new loci associated with colorectal cancer risk. Nat Genet 46, 533–542 (2014).
    1. Schumacher FR et al. Genome-wide association study of colorectal cancer identifies six new susceptibility loci. Nat Commun 6, 7138 (2015).
    1. Al-Tassan NA et al. A new GWAS and meta-analysis with 1000Genomes imputation identifies novel risk variants for colorectal cancer. Sci Rep 5, 10442 (2015).
    1. Orlando G et al. Variation at 2q35 (PNKD and TMBIM1) influences colorectal cancer risk and identifies a pleiotropic effect with inflammatory bowel disease. Hum Mol Genet 25, 2349–2359 (2016).
    1. Zeng C et al. Identification of susceptibility loci and genes for colorectal cancer risk. Gastroenterology 150, 1633–1645 (2016).
    1. Schmit SL et al. Novel common genetic susceptibility loci for colorectal cancer. J Natl Cancer Inst 1–12 (2018). doi:10.1093/jnci/djy099
    1. Fuchsberger C et al. The genetic architecture of type 2 diabetes. Nature 536, 41–47 (2016).
    1. 1000 Genomes Project Consortium et al. A global reference for human genetic variation. Nature 526, 68–74 (2015).
    1. McCarthy S et al. A reference panel of 64,976 haplotypes for genotype imputation. Nat Genet 48, 1279–1283 (2016).
    1. Amos CI et al. The oncoarray consortium: A network for understanding the genetic architecture of common cancers. Cancer Epidemiol Biomarkers Prev 26, 126–135 (2017).
    1. Zhao D & DePinho RA Synthetic essentiality: Targeting tumor suppressor deficiencies in cancer. Bioessays 39, (2017).
    1. Zhao D et al. Synthetic essentiality of chromatin remodelling factor CHD1 in PTEN-deficient cancer. Nature 542, 484–488 (2017).
    1. Xiao Y et al. RGMb is a novel binding partner for PD-L2 and its engagement with PD-L2 promotes respiratory tolerance. J Exp Med 211, 943–959 (2014).
    1. Topalian SL et al. Safety, activity, and immune correlates of anti-PD-1 antibody in cancer. N Engl J Med 366, 2443–2454 (2012).
    1. Zhang X et al. Somatic superenhancer duplications and hotspot mutations lead to oncogenic activation of the KLF5 transcription factor. Cancer Discov 8, 108–125 (2018).
    1. Giannakis M et al. Genomic Correlates of Immune-Cell Infiltrates in Colorectal Carcinoma. Cell Rep 15, 857–865 (2016).
    1. Dekker RJ et al. KLF2 provokes a gene expression pattern that establishes functional quiescent differentiation of the endothelium. Blood 107, 4354–4363 (2006).
    1. Boon RA et al. KLF2 suppresses TGF-beta signaling in endothelium through induction of Smad7 and inhibition of AP-1. Arterioscler Thromb Vasc Biol 27, 532–539 (2007).
    1. Chakroborty D et al. Dopamine stabilizes tumor blood vessels by up-regulating angiopoietin 1 expression in pericytes and Kruppel-like factor-2 expression in tumor endothelial cells. Proc Natl Acad Sci U S A 108, 20730–20735 (2011).
    1. Lee S-J et al. Regulation of hypoxia-inducible factor 1α (HIF-1α) by lysophosphatidic acid is dependent on interplay between p53 and Krüppel-like factor 5. J Biol Chem 288, 25244–25253 (2013).
    1. Zhang H et al. Lysophosphatidic acid facilitates proliferation of colon cancer cells via induction of Krüppel-like factor 5. J Biol Chem 282, 15541–15549 (2007).
    1. Ma Z et al. Long non-coding RNA SNHG15 inhibits P15 and KLF2 expression to promote pancreatic cancer proliferation through EZH2-mediated H3K27me3. Oncotarget 8, 84153–84167 (2017).
    1. Evangelista M, Tian H & de Sauvage FJ The hedgehog signaling pathway in cancer. Clin Cancer Res 12, 5924–5928 (2006).
    1. Gerling M et al. Stromal Hedgehog signalling is downregulated in colon cancer and its restoration restrains tumour growth. Nat Commun 7, 12321 (2016).
    1. Mille F et al. The Shh receptor Boc promotes progression of early medulloblastoma to advanced tumors. Dev Cell 31, 34–47 (2014).
    1. Mathew E et al. Dosage-dependent regulation of pancreatic cancer growth and angiogenesis by hedgehog signaling. Cell Rep 9, 484–494 (2014).
    1. Zhao B, Li L, Lei Q & Guan K-L The Hippo-YAP pathway in organ size control and tumorigenesis: an updated version. Genes Dev 24, 862–874 (2010).
    1. Camargo FD et al. YAP1 increases organ size and expands undifferentiated progenitor cells. Curr Biol 17, 2054–2060 (2007).
    1. Ma X, Zhang H, Xue X & Shah YM Hypoxia-inducible factor 2α (HIF-2α) promotes colon cancer growth by potentiating Yes-associated protein 1 (YAP1) activity. J Biol Chem 292, 17046–17056 (2017).
    1. MacArthur J et al. The new NHGRI-EBI Catalog of published genome-wide association studies (GWAS Catalog). Nucleic Acids Res 45, D896–D901 (2017).
    1. Seshagiri S et al. Recurrent R-spondin fusions in colon cancer. Nature 488, 660–664 (2012).
    1. Song F et al. Identification of a melanoma susceptibility locus and somatic mutation in TET2. Carcinogenesis 35, 2097–2101 (2014).
    1. Eeles RA et al. Identification of seven new prostate cancer susceptibility loci through a genome-wide association study. Nat Genet 41, 1116–1121 (2009).
    1. Michailidou K et al. Association analysis identifies 65 new breast cancer risk loci. Nature 551, 92–94 (2017).
    1. Schunkert H et al. Large-scale association analysis identifies 13 new susceptibility loci for coronary artery disease. Nat Genet 43, 333–338 (2011).
    1. Scott LJ et al. A genome-wide association study of type 2 diabetes in Finns detects multiple susceptibility variants. Science 316, 1341–1345 (2007).
    1. Al Olama AA et al. A meta-analysis of 87,040 individuals identifies 23 new susceptibility loci for prostate cancer. Nat Genet 46, 1103–1109 (2014).
    1. Timofeeva MN et al. Influence of common genetic variation on lung cancer risk: meta-analysis of 14 900 cases and 29 485 controls. Hum Mol Genet 21, 4980–4995 (2012).
    1. Shete S et al. Genome-wide association study identifies five susceptibility loci for glioma. Nat Genet 41, 899–904 (2009).
    1. Bishop DT et al. Genome-wide association study identifies three loci associated with melanoma risk. Nat Genet 41, 920–925 (2009).
    1. Sapkota Y et al. Meta-analysis identifies five novel loci associated with endometriosis highlighting key genes involved in hormone metabolism. Nat Commun 8, 15539 (2017).
    1. Cannon-Albright LA et al. Assignment of a locus for familial melanoma, MLM, to chromosome 9p13-p22. Science 258, 1148–1152 (1992).
    1. Hussussian CJ et al. Germline p16 mutations in familial melanoma. Nat Genet 8, 15–21 (1994).
    1. Seoane J et al. TGFbeta influences Myc, Miz-1 and Smad to control the CDK inhibitor p15INK4b. Nat Cell Biol 3, 400–408 (2001).
    1. Jung B, Staudacher JJ & Beauchamp D Transforming Growth Factor β Superfamily Signaling in Development of Colorectal Cancer. Gastroenterology 152, 36–52 (2017).
    1. Guda K et al. Inactivating germ-line and somatic mutations in polypeptide N-acetylgalactosaminyltransferase 12 in human colon cancers. Proc Natl Acad Sci U S A 106, 12921–12925 (2009).
    1. Groden J et al. Identification and characterization of the familial adenomatous polyposis coli gene. Cell 66, 589–600 (1991).
    1. Saharia A et al. FEN1 ensures telomere stability by facilitating replication fork re-initiation. J Biol Chem 285, 27057–27066 (2010).
    1. Eeles RA et al. Identification of 23 new prostate cancer susceptibility loci using the iCOGS custom genotyping array. Nat Genet 45, 385–91, 391e1 (2013).
    1. Liu JZ et al. Association analyses identify 38 susceptibility loci for inflammatory bowel disease and highlight shared genetic risk across populations. Nat Genet 47, 979–986 (2015).
    1. Paternoster L et al. Multi-ancestry genome-wide association study of 21,000 cases and 95,000 controls identifies new risk loci for atopic dermatitis. Nat Genet 47, 1449–1456 (2015).
    1. Laken SJ et al. Familial colorectal cancer in Ashkenazim due to a hypermutable tract in APC. Nat Genet 17, 79–83 (1997).
    1. Niell BL, Long JC, Rennert G & Gruber SB Genetic anthropology of the colorectal cancer-susceptibility allele APC I1307K: evidence of genetic drift within the Ashkenazim. Am J Hum Genet 73, 1250–1260 (2003).
    1. Karami S et al. Telomere structure and maintenance gene variants and risk of five cancer types. Int J Cancer 139, 2655–2670 (2016).
    1. Congrains A, Kamide K, Ohishi M & Rakugi H ANRIL: molecular mechanisms and implications in human health. Int J Mol Sci 14, 1278–1292 (2013).
    1. Zhang X et al. Identification of focally amplified lineage-specific super-enhancers in human epithelial cancers. Nat Genet 48, 176–182 (2016).
    1. Rheinbay E et al. Discovery and characterization of coding and non-coding driver mutations in more than 2,500 whole cancer genomes. BioRxiv (2017). doi:10.1101/237313
    1. Iotchkova V et al. GARFIELD - GWAS Analysis of Regulatory or Functional Information Enrichment with LD correction. BioRxiv (2016). doi:10.1101/085738
    1. Segrè AV et al. Common inherited variation in mitochondrial genes is not enriched for associations with type 2 diabetes or related glycemic traits. PLoS Genet 6, (2010).
    1. Yang J et al. Genetic variance estimation with imputed variants finds negligible missing heritability for human height and body mass index. Nat Genet 47, 1114–1120 (2015).
    1. Bhatia G et al. Subtle stratification confounds estimates of heritability from rare variants. BioRxiv (2016). doi:10.1101/048181
    1. Zhong H & Prentice RL Bias-reduced estimators and confidence intervals for odds ratios in genome-wide association studies. Biostatistics 9, 621–634 (2008).
    1. Cheetham SW, Gruhl F, Mattick JS & Dinger ME Long noncoding RNAs and the genetics of cancer. Br J Cancer 108, 2419–2425 (2013).
    1. Popejoy AB & Fullerton SM Genomics is failing on diversity. Nature 538, 161–164 (2016).
    1. Nelson MR et al. The support of human genetic evidence for approved drug indications. Nat Genet 47, 856–860 (2015).

Methods-only references

    1. Li H & Durbin R Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25, 1754–1760 (2009).
    1. Jun G, Wing MK, Abecasis GR & Kang HM An efficient and scalable analysis framework for variant extraction and refinement from population-scale DNA sequence data. Genome Res 25, 918–925 (2015).
    1. Browning BL & Yu Z Simultaneous genotype calling and haplotype phasing improves genotype accuracy and reduces false-positive associations for genome-wide association studies. Am J Hum Genet 85, 847–861 (2009).
    1. Li H A statistical framework for SNP calling, mutation discovery, association mapping and population genetical parameter estimation from sequencing data. Bioinformatics 27, 2987–2993 (2011).
    1. 1000 Genomes Project Consortium et al. A map of human genome variation from population-scale sequencing. Nature 467, 1061–1073 (2010).
    1. Manichaikul A et al. Robust relationship inference in genome-wide association studies. Bioinformatics 26, 2867–2873 (2010).
    1. Laurie CC et al. Quality control and quality assurance in genotypic data for genome-wide association studies. Genet Epidemiol 34, 591–602 (2010).
    1. Bycroft C et al. Genome-wide genetic data on ~500,000 UK Biobank participants. BioRxiv (2017). doi:10.1101/166298
    1. Chang CC et al. Second-generation PLINK: rising to the challenge of larger and richer datasets. Gigascience 4, 7 (2015).
    1. Price AL et al. Long-range LD can confound genome scans in admixed populations. Am J Hum Genet 83, 132–135 (2008).
    1. Weale ME Quality control for genome-wide association studies. Methods Mol Biol 628, 341–372 (2010).
    1. 1000 Genomes Project Consortium et al. An integrated map of genetic variation from 1,092 human genomes. Nature 491, 56–65 (2012).
    1. Delaneau O, Howie B, Cox AJ, Zagury J-F & Marchini J Haplotype estimation using sequencing reads. Am J Hum Genet 93, 687–696 (2013).
    1. Das S et al. Next-generation genotype imputation service and methods. Nat Genet 48, 1284–1287 (2016).
    1. Sun J, Zheng Y & Hsu L A unified mixed-effects model for rare-variant association in sequencing studies. Genet Epidemiol 37, 334–344 (2013).
    1. Moutsianas L et al. The power of gene-based rare variant methods to detect disease-associated variation and test hypotheses about complex disease. PLoS Genet 11, e1005165 (2015).
    1. Kang HM et al. Variance component model to account for sample structure in genome-wide association studies. Nat Genet 42, 348–354 (2010).
    1. Cook JP, Mahajan A & Morris AP Guidance for the utility of linear models in meta-analysis of genetic association studies of binary phenotypes. Eur J Hum Genet 25, 240–245 (2017).
    1. Willer CJ, Li Y & Abecasis GR METAL: fast and efficient meta-analysis of genomewide association scans. Bioinformatics 26, 2190–2191 (2010).
    1. Yang J et al. Genomic inflation factors under polygenic inheritance. Eur J Hum Genet 19, 807–812 (2011).
    1. Bulik-Sullivan BK et al. LD Score regression distinguishes confounding from polygenicity in genome-wide association studies. Nat Genet 47, 291–295 (2015).
    1. Michailidou K et al. Large-scale genotyping identifies 41 new loci associated with breast cancer risk. Nat Genet 45, 353–61, 361e1 (2013).
    1. Wellcome Trust Case Control Consortium et al. Bayesian refinement of association signals for 14 loci in 3 common diseases. Nat Genet 44, 1294–1301 (2012).
    1. Wakefield J A Bayesian measure of the probability of false discovery in genetic epidemiology studies. Am J Hum Genet 81, 208–227 (2007).
    1. Wang K, Li M & Hakonarson H ANNOVAR: functional annotation of genetic variants from high-throughput sequencing data. Nucleic Acids Res 38, e164 (2010).
    1. Adzhubei I, Jordan DM & Sunyaev SR Predicting functional effect of human missense mutations using PolyPhen-2. Curr Protoc Hum Genet Chapter 7, Unit7.20 (2013).
    1. Kircher M et al. A general framework for estimating the relative pathogenicity of human genetic variants. Nat Genet 46, 310–315 (2014).
    1. Quang D, Chen Y & Xie X DANN: a deep learning approach for annotating the pathogenicity of genetic variants. Bioinformatics 31, 761–763 (2015).
    1. Ionita-Laza I, McCallum K, Xu B & Buxbaum JD A spectral approach integrating functional genomic annotations for coding and noncoding variants. Nat Genet 48, 214–220 (2016).
    1. Roadmap Epigenomics Consortium et al. Integrative analysis of 111 reference human epigenomes. Nature 518, 317–330 (2015).
    1. Corradin O et al. Combinatorial effects of multiple enhancer variants in linkage disequilibrium dictate levels of gene expression to confer susceptibility to common traits. Genome Res 24, 1–13 (2014).
    1. Pruitt KD et al. The consensus coding sequence (CCDS) project: Identifying a common protein-coding gene set for the human and mouse genomes. Genome Res 19, 1316–1323 (2009).
    1. Harmston N et al. Topologically associating domains are ancient features that coincide with Metazoan clusters of extreme noncoding conservation. Nat Commun 8, 441 (2017).
    1. Berlivet S et al. Clustering of tissue-specific sub-TADs accompanies the regulation of HoxA genes in developing limbs. PLoS Genet 9, e1004018 (2013).
    1. Hu Z & Tee W-W Enhancers and chromatin structures: regulatory hubs in gene expression and diseases. Biosci Rep 37, (2017).
    1. GTEx Consortium. The Genotype-Tissue Expression (GTEx) project. Nat Genet 45, 580–585 (2013).
    1. Ward LD & Kellis M HaploReg: a resource for exploring chromatin states, conservation, and regulatory motif alterations within sets of genetically linked variants. Nucleic Acids Res 40, D930–4 (2012).
    1. Landt SG et al. ChIP-seq guidelines and practices of the ENCODE and modENCODE consortia. Genome Res 22, 1813–1831 (2012).
    1. Witte JS, Visscher PM & Wray NR The contribution of genetic variants to disease depends on the ruler. Nat Rev Genet 15, 765–776 (2014).
    1. Cox A et al. A common coding variant in CASP8 is associated with breast cancer risk. Nat Genet 39, 352–358 (2007).
    1. Johns LE & Houlston RS A systematic review and meta-analysis of familial colorectal cancer risk. Am J Gastroenterol 96, 2992–3003 (2001).
    1. Hsu L et al. A model to determine colorectal cancer risk using common genetic susceptibility loci. Gastroenterology 148, 1330–9.e14 (2015).
    1. Jeon J et al. Determining risk of colorectal cancer and starting age of screening based on lifestyle, environmental, and genetic factors. Gastroenterology 154, 2152–2164.e19 (2018).

Source: PubMed
