Multiplex target capture with double-stranded DNA probes

Peidong Shen, Wenyi Wang, Aung-Kyaw Chi, Yu Fan, Ronald W Davis, Curt Scharfe

Abstract

Target enrichment technologies utilize single-stranded oligonucleotide probes to capture candidate genomic regions from a DNA sample before sequencing. We describe target capture using double-stranded probes, which consist of single-stranded, complementary long padlock probes (cLPPs), each selectively capturing one strand of a genomic target through circularization. Using two probes per target increases sensitivity for variant detection, and cLPPs are easily produced by PCR at low cost. Additionally, we introduce an approach for generating capture libraries with uniformly randomized template orientations. This facilitates bidirectional sequencing of both the sense and antisense template strands during one paired-end read, which maximizes target coverage.

Figures

Figure 1
Probe construction, target capture and reciprocal paired-end sequencing. (a) Each cLPP contains a common linker flanked by post-capture amplification sites (red and green) and two target-specific capturing arms (blue and orange). Probe ends are trimmed (BsaI and MlyI) and 5'-phosphorylated to produce functional cLPPs. (b) Multiplex probe-target hybridization followed by gap-filling and ligation triggers probe circularization and target capture. (c) Capture libraries are multiplex-amplified using hybrid primers that anneal to the probes' amplification sites and add Illumina sequencing adaptors (P5 or P7) and sample-specific barcodes. This is done in two separate PCRs during which the adaptors swap positions at the ends of templates. Both PCRs are pooled for reciprocal paired-end (PE) sequencing of both DNA strands.
Figure 2
Coverage distribution across target regions. (a) Cumulative mean percent base coverage across 5,619 targets captured using cLPPs and ssLPPs, respectively, and shown separately for sequence read 1 and read 2. All bases have a minimum of 10× coverage. (b) Log ratio of coverage of read 1 and read 2. Each boxplot shows the coverage distribution for a group of amplicons within a defined size range; the number of amplicons, percent of bases covered (≥10×) and average GC content are shown for each group. The coverage distributions of all groups differ significantly from one another, and the mean of each group differs significantly from 0.
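The two coverage summaries named in the Figure 2 caption (percent of bases covered at ≥10× and the log ratio of read 1 and read 2 coverage) can be illustrated with a minimal sketch. This is not the authors' analysis code: the function names, the example depths, the base-2 logarithm, the read 1 over read 2 direction of the ratio and the pseudocount are all assumptions made for illustration only.

```python
import math


def percent_bases_covered(depths, min_depth=10):
    """Return the percent of positions with depth >= min_depth."""
    if not depths:
        return 0.0
    covered = sum(1 for d in depths if d >= min_depth)
    return 100.0 * covered / len(depths)


def log2_coverage_ratio(read1_mean, read2_mean, pseudocount=1.0):
    """Log2 ratio of mean coverage, read 1 over read 2.

    The pseudocount (an assumption) avoids taking the log of zero
    when one read has no coverage.
    """
    return math.log2((read1_mean + pseudocount) / (read2_mean + pseudocount))


# Hypothetical per-base depths for a single amplicon (not data from the paper).
read1_depths = [12, 15, 9, 30, 22, 10]
read2_depths = [8, 14, 11, 25, 5, 10]

pct1 = percent_bases_covered(read1_depths)
pct2 = percent_bases_covered(read2_depths)

mean1 = sum(read1_depths) / len(read1_depths)
mean2 = sum(read2_depths) / len(read2_depths)
ratio = log2_coverage_ratio(mean1, mean2)

print(f"read 1: {pct1:.1f}% of bases at >=10x")
print(f"read 2: {pct2:.1f}% of bases at >=10x")
print(f"log2(read 1 / read 2) mean coverage: {ratio:.3f}")
```

Under these assumptions, a log ratio of 0 means both reads cover the amplicon equally; positive values mean read 1 has the higher mean coverage.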

Source: PubMed
