Massively parallel sequencing: the next big thing in genetic medicine

Tracy Tucker, Marco Marra, Jan M Friedman

Abstract

Massively parallel sequencing has reduced the cost and increased the throughput of genomic sequencing by more than three orders of magnitude, and it seems likely that costs will fall and throughput improve even more in the next few years. Clinical use of massively parallel sequencing will provide a way to identify the cause of many diseases of unknown etiology through simultaneous screening of thousands of loci for pathogenic mutations and by sequencing biological specimens for the genomic signatures of novel infectious agents. In addition to providing these entirely new diagnostic capabilities, massively parallel sequencing may also replace arrays and Sanger sequencing in clinical applications where they are currently being used. Routine clinical use of massively parallel sequencing will require higher accuracy, better ways to select genomic subsets of interest, and improvements in the functionality, speed, and ease of use of data analysis software. In addition, substantial enhancements in laboratory computer infrastructure, data storage, and data transfer capacity will be needed to handle the extremely large data sets produced. Clinicians and laboratory personnel will require training to use the sequence data effectively, and appropriate methods will need to be developed to deal with the incidental discovery of pathogenic mutations and variants of uncertain clinical significance. Massively parallel sequencing has the potential to transform the practice of medical genetics and related fields, but the vast amount of personal genomic data produced will increase the responsibility of geneticists to ensure that the information obtained is used in a medically and socially responsible manner.

Figures

Figure 1
Sanger Sequencing Workflow. DNA fragments are enriched by PCR and sequenced with a combination of regular deoxynucleotides and terminating labeled dideoxynucleotides (ddNTPs), each with a base-specific color. Fragments of different lengths are generated and separated by size using capillary electrophoresis, and the ddNTP at the end of each fragment is identified by excitation with a laser. Reprinted with permission from Applied Biosystems.
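The read-out logic of the workflow in Figure 1 can be illustrated with a minimal sketch. It assumes a toy model in which every extension step has a fixed chance of incorporating a labeled ddNTP; the function names, the termination probability, and the molecule count are hypothetical choices made for illustration, and the template string stands in for the newly synthesized strand rather than its complement.

```python
import random


def sanger_fragments(synthesized, n_molecules=2000, ddntp_fraction=0.1, seed=0):
    """Simulate chain termination on many copies of one strand.

    Each extension step incorporates a labeled ddNTP with probability
    `ddntp_fraction` (an assumed value), which ends that fragment.
    Returns the set of (fragment_length, terminal_base) pairs observed.
    """
    rng = random.Random(seed)
    fragments = set()
    for _ in range(n_molecules):
        for length, base in enumerate(synthesized, start=1):
            if rng.random() < ddntp_fraction:
                fragments.add((length, base))
                break  # the ddNTP blocks further extension
    return fragments


def read_by_size(fragments):
    """Order fragments by length (as electrophoresis does) and read the label of each."""
    return "".join(base for _, base in sorted(fragments))


if __name__ == "__main__":
    strand = "GATTACAGATTC"
    print(read_by_size(sanger_fragments(strand)))
```

With enough molecules, every fragment length is represented at least once, so reading the terminal labels in order of increasing size recovers the sequence.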
Figure 2
Illumina Genome Analyzer Workflow. Sequencing libraries are generated by fragmentation of genomic DNA, denaturation, and adaptor ligation. Fragments are added to the flow cell chamber, which is coated with oligonucleotides complementary to the adaptors. Hybridization forms a “bridge,” and amplification is primed from the 3′ end and continues until it reaches the 5′ end. After several rounds of amplification, discrete clusters of fragments, all with the same sequence, are formed. The clusters are denatured, and sequencing primers, polymerase, and fluorescently labeled nucleotides, each with its 3′OH chemically inactivated, are added. After each base is incorporated, the surface is imaged, the 3′OH-inactivating residue and label are removed, and the process is repeated. Reprinted with permission from Illumina, Inc.
Figure 3
Applied Biosystems SOLiD Sequencer Workflow. DNA is fragmented and oligonucleotide adaptors are ligated to each end. The fragments are hybridized to complementary oligonucleotides attached to magnetic beads. The beads are contained within an oil emulsion where amplification is performed. When amplification is complete, the emulsion is broken, and the beads are attached to a glass surface and placed within the sequencer. A universal sequencing primer, complementary to the adaptor sequence, is added, followed by successive ligation cycles with fluorescently labeled degenerate octamers. After each cycle, the glass surface is imaged and the octamer is cleaved between bases 5 and 6, removing the fluorescent tag, and a new octamer is added. After several rounds of sequencing, the extended universal primer is removed and a new universal primer is added that is offset by one base. Reprinted with permission from Applied Biosystems.
Figure 4
GS-FLX 454 Sequencer Workflow. DNA is fragmented and adaptors, one of which is biotinylated, are ligated to each end. Fragments are coupled to agarose beads by oligonucleotides complementary to the adaptor sequence and contained within an emulsion droplet for amplification. When amplification is complete, the beads are deposited into individual wells on a fiber optic slide and placed in the sequencer. Nucleotides and polymerase are added sequentially, and the sequence produced is monitored by the generation of light through an enzymatic reaction that is coupled to DNA synthesis. Modified with permission from 454 Sequencing, copyright 2009 Roche Diagnostics.
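The sequential nucleotide addition described for Figure 4 can be sketched as a "flowgram": one light signal per nucleotide flow. This is a simplified illustration, not the instrument's software. It assumes an idealized, noise-free signal equal to the number of bases incorporated in that flow, and the repeating flow order `TACG` and both function names are assumptions made for the example. The input string represents the strand being synthesized.

```python
def flowgram(synthesized, flow_order="TACG"):
    """Return a list of (flowed_base, signal) pairs for an idealized run.

    In each flow one nucleotide species is offered. The signal is the number
    of consecutive positions that incorporate it, so a homopolymer run
    yields a single, proportionally stronger signal.
    """
    unknown = set(synthesized) - set(flow_order)
    if unknown:
        raise ValueError(f"bases not in flow order: {sorted(unknown)}")
    signals = []
    pos = 0
    flow = 0
    while pos < len(synthesized):
        base = flow_order[flow % len(flow_order)]
        run = 0
        while pos < len(synthesized) and synthesized[pos] == base:
            run += 1
            pos += 1
        signals.append((base, run))  # run == 0 means no light in this flow
        flow += 1
    return signals


def decode(signals):
    """Rebuild the sequence from the flow signals."""
    return "".join(base * run for base, run in signals)


if __name__ == "__main__":
    flows = flowgram("TTAG")
    print(flows)
    print(decode(flows))
```

Because real signals are analog, long homopolymer runs are harder to quantify than this idealized model suggests.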
Figure 5
Helicos Heliscope Sequencer Workflow. Fragments are captured by poly-T oligomers tethered to an array. At each sequencing cycle, polymerase and a single fluorescently labeled nucleotide are added and the array is imaged. The fluorescent tag is then removed and the cycle is repeated. Reprinted with permission from Helicos BioSciences Corporation.
Figure 6
Genomic Enrichment Strategies. (A) Megaplex PCR. Surface-bound oligonucleotide primers (F & R) bind to DNA and amplify the sequence in the first and second rounds of PCR. This reaction also incorporates a sequence that is complementary to a universal primer (U1 & U2), which is used for subsequent PCR cycles. Modified from ten Bosch and Grody. (B) Selector Probe Circularization. Genomic DNA (gray) is digested with restriction enzymes and circularized by hybridization of “selector probes” (black) with single-stranded overhangs (white box) to the 3′ and 5′ ends of the digested DNA. DNA ligase fills in the gap, and universal primers (checkered box), complementary to the sequences within the selector probes, are used to amplify the circularized DNA. Modified from ten Bosch and Grody. (C) Nested-Patch PCR. Primer pairs containing uracil instead of thymine (wide white arrow) are constructed for all target regions. The primers amplify target regions for a low number of cycles. The primers are cleaved with uracil DNA glycosylase, nested patch oligonucleotides (gray and white checkered box) are annealed to target amplicons, universal primers (gray box) are ligated to the amplicons, and subsequent PCR cycles are primed with these universal primers. Modified from Varley and Mitra. (D) Microarray Pull-Down Method. Genomic DNA is fragmented, and universal adaptor sequences (white box) are ligated to the ends of each fragment. The fragments of interest are captured by hybridization on the microarray (black line), which has been constructed with probes that are complementary to these sequences. The array is then denatured, and the released fragments are enriched by PCR with the universal adaptor sequences as primers. Modified from ten Bosch and Grody.
Figure 7
Paired-End Reads. DNA is isolated (A), fragmented into pieces of a standard size, e.g., about 3 kb, and ligated to adaptors (blue boxes) on both ends (B). The adaptors permit the 3 kb pieces to be circularized (C). The circles are isolated, then broken into much smaller fragments (e.g., a few hundred base pairs) (D), and the fragments containing adaptors are isolated. In these fragments, the adaptor is flanked by the sequences that were at opposite ends of the original 3 kb piece. The paired ends are sequenced and mapped back to the canonical human genome (E) so that structural variants can be identified (see text).
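One common way to use the mapped pairs from Figure 7 is to compare the distance between the two ends on the reference genome with the known size of the original piece. The sketch below illustrates that idea only; the `ReadPair` type, the `classify_pair` function, the 500 bp tolerance, and the category labels are hypothetical, and read orientation (which would be needed to detect inversions) is ignored.

```python
from dataclasses import dataclass


@dataclass
class ReadPair:
    """Mapped reference positions of the two ends of one original piece."""
    chrom1: str
    pos1: int
    chrom2: str
    pos2: int


def classify_pair(pair, expected_span=3000, tolerance=500):
    """Label a pair by comparing its mapped span with the expected piece size.

    expected_span follows the ~3 kb example in the caption; tolerance is an
    assumed cutoff, not a value from the source.
    """
    if pair.chrom1 != pair.chrom2:
        return "interchromosomal"  # ends map to different chromosomes
    span = abs(pair.pos2 - pair.pos1)
    if span > expected_span + tolerance:
        # Ends lie farther apart on the reference than in the sample:
        # the sample may be missing sequence between them.
        return "possible deletion"
    if span < expected_span - tolerance:
        # Ends lie closer on the reference than in the sample:
        # the sample may carry extra sequence between them.
        return "possible insertion"
    return "concordant"


if __name__ == "__main__":
    pairs = [
        ReadPair("chr1", 10_000, "chr1", 13_050),
        ReadPair("chr1", 50_000, "chr1", 58_000),
        ReadPair("chr2", 7_000, "chr2", 8_200),
        ReadPair("chr3", 1_000, "chr9", 4_000),
    ]
    for p in pairs:
        print(p, classify_pair(p))
```

In practice a call would rest on many independent pairs supporting the same event rather than on a single discordant pair.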

Source: PubMed
