The UK10K project identifies rare variants in health and disease
UK10K Consortium, Klaudia Walter, Josine L Min, Jie Huang, Lucy Crooks, Yasin Memari, Shane McCarthy, John R B Perry, ChangJiang Xu, Marta Futema, Daniel Lawson, Valentina Iotchkova, Stephan Schiffels, Audrey E Hendricks, Petr Danecek, Rui Li, James Floyd, Louise V Wain, Inês Barroso, Steve E Humphries, Matthew E Hurles, Eleftheria Zeggini, Jeffrey C Barrett, Vincent Plagnol, J Brent Richards, Celia M T Greenwood, Nicholas J Timpson, Richard Durbin, Nicole Soranzo
Abstract
The contribution of rare and low-frequency variants to human traits is largely unexplored. Here we describe insights from sequencing whole genomes (low read depth, 7×) or exomes (high read depth, 80×) of nearly 10,000 individuals from population-based and disease collections. In extensively phenotyped cohorts we characterize over 24 million novel sequence variants, generate a highly accurate imputation reference panel and identify novel alleles associated with levels of triglycerides (APOB), adiponectin (ADIPOQ) and low-density lipoprotein cholesterol (LDLR and RGAG1) from single-marker and rare variant aggregation tests. We describe population structure and functional annotation of rare and low-frequency variants, use the data to estimate the benefits of sequencing for association studies, and summarize lessons from disease-specific collections. Finally, we make available an extensive resource, including individual-level genetic and phenotypic data and web-based tools to facilitate the exploration of association results.
Conflict of interest statement
P.F. is a member of the Scientific Advisory Board of Omicia, Inc.
Source: PubMed