Factors influencing success of clinical genome sequencing across a broad spectrum of disorders

Jenny C Taylor, Hilary C Martin, Stefano Lise, John Broxholme, Jean-Baptiste Cazier, Andy Rimmer, Alexander Kanapin, Gerton Lunter, Simon Fiddy, Chris Allan, A Radu Aricescu, Moustafa Attar, Christian Babbs, Jennifer Becq, David Beeson, Celeste Bento, Patricia Bignell, Edward Blair, Veronica J Buckle, Katherine Bull, Ondrej Cais, Holger Cario, Helen Chapel, Richard R Copley, Richard Cornall, Jude Craft, Karin Dahan, Emma E Davenport, Calliope Dendrou, Olivier Devuyst, Aimée L Fenwick, Jonathan Flint, Lars Fugger, Rodney D Gilbert, Anne Goriely, Angie Green, Ingo H Greger, Russell Grocock, Anja V Gruszczyk, Robert Hastings, Edouard Hatton, Doug Higgs, Adrian Hill, Chris Holmes, Malcolm Howard, Linda Hughes, Peter Humburg, David Johnson, Fredrik Karpe, Zoya Kingsbury, Usha Kini, Julian C Knight, Jonathan Krohn, Sarah Lamble, Craig Langman, Lorne Lonie, Joshua Luck, Davis McCarthy, Simon J McGowan, Mary Frances McMullin, Kerry A Miller, Lisa Murray, Andrea H Németh, M Andrew Nesbit, David Nutt, Elizabeth Ormondroyd, Annette Bang Oturai, Alistair Pagnamenta, Smita Y Patel, Melanie Percy, Nayia Petousi, Paolo Piazza, Sian E Piret, Guadalupe Polanco-Echeverry, Niko Popitsch, Fiona Powrie, Chris Pugh, Lynn Quek, Peter A Robbins, Kathryn Robson, Alexandra Russo, Natasha Sahgal, Pauline A van Schouwenburg, Anna Schuh, Earl Silverman, Alison Simmons, Per Soelberg Sørensen, Elizabeth Sweeney, John Taylor, Rajesh V Thakker, Ian Tomlinson, Amy Trebes, Stephen RF Twigg, Holm H Uhlig, Paresh Vyas, Tim Vyse, Steven A Wall, Hugh Watkins, Michael P Whyte, Lorna Witty, Ben Wright, Chris Yau, David Buck, Sean Humphray, Peter J Ratcliffe, John I Bell, Andrew OM Wilkie, David Bentley, Peter Donnelly, Gilean McVean

Abstract

To assess factors influencing the success of whole-genome sequencing for mainstream clinical diagnosis, we sequenced 217 individuals from 156 independent cases or families across a broad spectrum of disorders in whom previous screening had identified no pathogenic variants. We quantified the number of candidate variants identified using different strategies for variant calling, filtering, annotation and prioritization. We found that jointly calling variants across samples, filtering against both local and external databases, deploying multiple annotation tools and prioritizing familial transmission over biological plausibility all contributed to accuracy. Overall, we identified disease-causing variants in 21% of cases, with the proportion increasing to 34% (23/68) for Mendelian disorders and 57% (8/14) in family trios. We also discovered 32 potentially clinically actionable variants in 18 genes unrelated to the referral disorder, although only 4 were ultimately considered reportable. Our results demonstrate the value of genome sequencing for routine clinical diagnosis but also highlight many outstanding challenges.

Figures

Figure 1. Overview of projects and results
For each disorder, the number of independent cases studied (bars) is shown alongside information about the nature of the disorder: familial disorders (category 1, light green triangles), severe single-generation disorders suspected to be caused by de novo or recessive mutations (category 2, dark green), unrelated sporadic disorders (category 3, light blue) and extreme cases of common complex diseases (category 4, dark blue). The proportion of cases in each outcome class A-E is also shown (see Online Methods): pathogenic variant in novel gene for disorder (A, red circles), pathogenic variant in gene for related disorder (B, brown), pathogenic variant in known gene for disorder (C, pink), candidate pathogenic variant with validation studies underway (D, orange) and no single candidate variant, or negative results for validation of top candidate(s) (E, blue). The size of each point is proportional to the outcome fraction. Disorders are ranked by the fraction of cases with confirmed pathogenic variants (classes A to C).
Figure 2. The burden of variants of unknown significance
(a) Histograms of the number of previously unreported coding variants at conserved positions in different sets of candidate genes (Tiers 1, 1+2 and 1+2+3 for columns left to right) for early-onset epilepsy, under different inheritance models, across 216 WGS500 samples. (b) Histogram of the number of previously unreported coding variants at conserved positions in known X-linked mental retardation (XLMR) genes, for the 99 male WGS500 samples. The candidate genes were chosen by high-throughput searches (Online Methods). Sample identifiers indicate individuals with the disorder in question. Sample names in green text indicate that the variant is not likely to be pathogenic (since it does not fit a plausible inheritance model or is less functionally compelling than another candidate); blue text indicates that the variant is thought to be causal (see Supplementary Table 6). OTH: Ohtahara syndrome; EOE: nonsyndromic early-onset epilepsy; MR: mental retardation. See Supplementary Fig. 4 for the analysis of craniosynostosis.
Figure 3. Identification of de novo HUWE1 mutation associated with severe craniosynostosis
(a) Upper panel, the proband (CRS_4659; female, aged 6 months) presented with an abnormal skull shape. Lower panel, three-dimensional CT scan at age 5 months shows multisuture synostosis with multiple craniolacunae. (b) Family pedigree showing dideoxy sequence chromatograms with de novo G>A mutation of the X-linked HUWE1 gene in the proband (red arrow). Schematic X chromosomes are annotated from top to bottom with the HUWE1 alleles, the haplotype of AA/CC polymorphisms located 1.15 kb away from the mutation and used to deduce paternal origin, and the androgen receptor (AR) trinucleotide repeat allele size (allele sizes in CRS_4654 and CRS_5215 are in brackets to emphasize that phase is unknown relative to other parts of the two X chromosomes). Note that the HUWE1 mutation abolishes a HpaII restriction site. (c) Analysis of X-inactivation in whole blood samples at the AR locus. For each individual, AR alleles are indicated by arrows in the upper panel, while the lower panel shows proportions of methylated alleles and percentage representation of the more highly inactivated X chromosome. (d) Exclusive expression of cDNA from the HUWE1 mutant allele in both fibroblast (Fib) and Epstein-Barr virus (EBV)-transformed lymphoblastoid cells from the proband. Arrows highlight absence of expression of the normal allele in either cell type. Product sizes (bp) from different alleles are shown on the right. WT: wild-type; Mut: mutant. (e) X chromosome ideogram showing eight de novo mutations identified. Where known, the parental allele on which the variant arose is indicated.
Figure 4. Candidate pathogenic noncoding variants
(a) Multi-species alignment of a region of the 5′ UTR of EPO in which a variant was identified at a conserved position (red text) in two families with erythrocytosis. (b) Erythrocytosis pedigrees studied, showing affected individuals (shaded grey), those sequenced (red borders), and genotypes of all individuals for whom we had DNA. We had no information about the father of PAR09 (dotted box). (c) Summary of read mapping in an individual with hypoparathyroidism showing evidence for an interstitial insertion-deletion event in which a ~50 kb region of chromosome 2p25.3 (top panel) has been duplicated and inserted into chromosome X, resulting in a 1.4 kb deletion 81.5 kb downstream of SOX3 (bottom panel). Yellow reads: mate maps to chrX; red reads: mate maps to chr2; grey reads: read and mate map to the same chromosome; white reads: read has mapping quality 0. (d) Pedigree showing segregation of the complex variant within the affected pedigree, with PCR validation below. M: mutation; N: normal. Primers 2SPF and XSPR flank the distal breakpoint of the deletion-insertion and are shown in Supplementary Figure 8. Primers XSPF and XSPR detect the normal allele. The mutation was not seen in 150 alleles from 100 unrelated normocalcemic individuals (50 males and 50 females, including N1 and N2, who are shown).


Source: PubMed
