Mutations causing medullary cystic kidney disease type 1 lie in a large VNTR in MUC1 missed by massively parallel sequencing

Andrew Kirby, Andreas Gnirke, David B Jaffe, Veronika Barešová, Nathalie Pochet, Brendan Blumenstiel, Chun Ye, Daniel Aird, Christine Stevens, James T Robinson, Moran N Cabili, Irit Gat-Viks, Edward Kelliher, Riza Daza, Matthew DeFelice, Helena Hůlková, Jana Sovová, Petr Vylet'al, Corinne Antignac, Mitchell Guttman, Robert E Handsaker, Danielle Perrin, Scott Steelman, Snaevar Sigurdsson, Steven J Scheinman, Carrie Sougnez, Kristian Cibulskis, Melissa Parkin, Todd Green, Elizabeth Rossin, Michael C Zody, Ramnik J Xavier, Martin R Pollak, Seth L Alper, Kerstin Lindblad-Toh, Stacey Gabriel, P Suzanne Hart, Aviv Regev, Chad Nusbaum, Stanislav Kmoch, Anthony J Bleyer, Eric S Lander, Mark J Daly

Abstract

Although genetic lesions responsible for some mendelian disorders can be rapidly discovered through massively parallel sequencing of whole genomes or exomes, not all diseases readily yield to such efforts. We describe the illustrative case of the simple mendelian disorder medullary cystic kidney disease type 1 (MCKD1), mapped more than a decade ago to a 2-Mb region on chromosome 1. Ultimately, only by cloning, capillary sequencing and de novo assembly did we find that each of six families with MCKD1 harbors an equivalent but apparently independently arising mutation in sequence markedly under-represented in massively parallel sequencing data: the insertion of a single cytosine in one copy (but a different copy in each family) of the repeat unit comprising the extremely long (∼1.5-5 kb), GC-rich (>80%) coding variable-number tandem repeat (VNTR) sequence in the MUC1 gene encoding mucin 1. These results provide a cautionary tale about the challenges in identifying the genes responsible for mendelian, let alone more complex, disorders through massively parallel sequencing.

Conflict of interest statement

COMPETING FINANCIAL INTERESTS

Andrew Kirby, Andreas Gnirke, Brendan Blumenstiel and Matthew DeFelice are listed as inventors on the C-insertion genotyping assay under patent review. The other authors declare no competing interests.

Figures

Figure 1. Linkage of six MCKD1 families to chromosome 1
The LOD curve shows the combined linkage score of six MCKD1 pedigrees across 12 Mb of chromosome 1, with the peak score well above the threshold of 3.6 for genome-wide significance. Red X’s mark the locations of opposite-allele homozygous genotype calls between affected members within each pedigree. These calls highlight regions where affected individuals share no alleles identical by descent (IBD), thereby delineating genomic segments unlikely to harbor causal variation. The shaded region (hg19:chr1:154,370,020–156,439,000), bounded on each side by recombination breakpoints in two different pedigrees, was considered most likely to contain any causal mutations.
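The exclusion logic behind the red X’s can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the genotype encoding (two-letter strings such as "AA" or "AG"), the function names, and the toy data are all assumptions made for illustration, and missing genotype calls are not handled.

```python
from itertools import combinations

def opposite_homozygous(g1, g2):
    """True when both genotypes are homozygous for different alleles (e.g. AA vs GG)."""
    return len(set(g1)) == 1 and len(set(g2)) == 1 and set(g1) != set(g2)

def exclusion_markers(genotypes_by_person):
    """Indices of markers where some pair of affected relatives is opposite-homozygous.

    Two relatives who are AA and GG at a marker cannot share an allele there,
    so the marker argues against a shared risk haplotype at that position.
    """
    people = list(genotypes_by_person.values())
    n_markers = len(people[0])
    return [
        i for i in range(n_markers)
        if any(opposite_homozygous(a[i], b[i]) for a, b in combinations(people, 2))
    ]

# Hypothetical affected members of one pedigree, three markers each.
affected = {
    "person1": ["AA", "AG", "CC"],
    "person2": ["GG", "AG", "CC"],
    "person3": ["AG", "GG", "CC"],
}
excluded = exclusion_markers(affected)
print(excluded)
```

In this toy pedigree only the first marker is flagged, because it is the only one where two affected members are homozygous for different alleles.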
Figure 2. Discovery of +C insertion within MUC1 coding VNTR
(a) The major domains of the full-length MUC1 precursor protein are shown: N-terminal signal sequence, VNTR, SEA module (where cleavage occurs), transmembrane domain, and C-terminal cytoplasmic domain. Based on fully and unambiguously assembled VNTR alleles, the frameshift caused by insertion of a C in the coding strand (as described in the main text) is expected to introduce a novel stop codon shortly beyond the VNTR domain. (b and c) Where possible, knowledge of segregating phased SNP-marker haplotypes was used to select for de novo VNTR sequencing and assembly of those individuals sharing only a single haplotype across the region, as this aided identification of the VNTR allele segregating with the shared risk haplotype. (d and e) Independent de novo assembly of the shared VNTR allele in two individuals from family 4 shows exactly identical complete sequence, with the seventh 60-base unit (red X) out of 44 containing a +C insertion event. The assembly is oriented relative to the coding strand of MUC1 and covers bases chr1:155,160,963-155,162,030 (hg19). Each unique 60-base repeat segment is represented by a different letter or number (Supplementary Fig. 2). (e) Translational impact of +C frameshift.
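Why a single inserted cytosine alters everything downstream can be seen in a small Python sketch. A 60-base repeat unit spans exactly 20 codons, so intact units stay in frame, whereas one extra base shifts every later codon by one position. The repeat unit below is a made-up toy sequence, not the MUC1 repeat, and the insertion position and function names are likewise assumptions.

```python
def codons(seq):
    """Split a sequence into complete triplets, dropping any trailing partial codon."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]

def insert_base(seq, pos, base="C"):
    """Return seq with one base inserted before index pos."""
    return seq[:pos] + base + seq[pos:]

def first_changed_codon(ref, mut):
    """Index of the first codon that differs between two sequences, or None."""
    for i, (r, m) in enumerate(zip(codons(ref), codons(mut))):
        if r != m:
            return i
    return None

TOY_UNIT = "ACGGTC" * 10          # hypothetical 60-base unit (20 codons)
reference = TOY_UNIT * 3          # three tandem copies, 180 bases
mutant = insert_base(reference, 70)  # +C inside the second copy (assumed position)

shift_start = first_changed_codon(reference, mutant)
print(len(reference) % 3, len(mutant) % 3, shift_start)
```

Every codon before the insertion is unchanged, while each codon after it is read one base out of register, which is the sense in which the +C insertion produces a frameshifted protein.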
Figure 3. Detection of MUC1 +C insertion by probe-extension (PE) assay
(a) Exemplar electropherograms for the MUC1-VNTR +C-insertion PE assay (Online Methods) performed on homozygous reference-allele and heterozygote samples. (b) Allele-intensity scatterplot for large linkage family 2. X-axis values correspond to the detected intensity at the mass of the +C PE product, while Y-axis values reflect that of the reference repeat-unit extension product. Datum coloring reflects MCKD1 diagnosis: blue = unaffected (or HapMap samples), red = affected, white = unknown. Individuals known to carry the linkage-analysis risk haplotype are represented by “+”, while other family members are depicted as dots. (c) Allele-intensity scatterplot for all MCKD1 linkage families. Samples having log-transformed intensities below 0.25 for both alleles were excluded as failed assays. WGA and low DNA-concentration samples were also excluded for underperforming. (d) Allele-intensity scatterplot for HapMap samples together with selected positive controls (MCKD1 individuals known to carry the insertion).
Figure 4. Immunohistochemical and immunofluorescence studies of MUC1-fs protein
In MCKD1 patients, MUC1-fs is expressed and present in renal epithelial cells of Henle’s loop, distal convoluted tubule, and collecting duct. (a) Strong intracellular staining of MUC1-fs protein in an MCKD1 patient, and (b) absence of the specific staining in a control; TALH – thick ascending limb of Henle’s loop; CD – collecting duct; PT – proximal tubule. (c) Immunofluorescence analysis showing diffuse and/or fine granular intracellular and membrane staining of MUC1-fs protein, and its partial colocalization with normal MUC1 in the collecting duct of an MCKD1 patient. MUC1-fs staining is absent in the control, and colocalization with normal MUC1 is therefore not detected. The values of fluorescent signal overlaps are transformed to a pseudo-color scale shown at bottom right in the corresponding lookup table. (d) Immunofluorescence analysis showing different intracellular localizations and partial sub-membrane colocalization of MUC1-fs and normal MUC1 proteins in the collecting duct of an MCKD1 patient. Note specific staining of both forms in distinct membrane microdomains. (e) Absence of MUC1-fs staining and characteristic membrane localization of normal MUC1 in a control.

Source: PubMed
