RNA-Seq: a revolutionary tool for transcriptomics

Zhong Wang, Mark Gerstein, Michael Snyder

Abstract

RNA-Seq is a recently developed approach to transcriptome profiling that uses deep-sequencing technologies. Studies using this method have already altered our view of the extent and complexity of eukaryotic transcriptomes. RNA-Seq also provides a far more precise measurement of levels of transcripts and their isoforms than other methods. This article describes the RNA-Seq approach, the challenges associated with its application, and the advances made so far in characterizing several eukaryote transcriptomes.

Figures

Figure 1. A typical RNA-Seq experiment
Briefly, long RNAs are first converted into a library of cDNA fragments through either RNA fragmentation or DNA fragmentation (see main text). Sequencing adaptors (blue) are subsequently added to each cDNA fragment and a short sequence is obtained from each cDNA using high-throughput sequencing technology. The resulting sequence reads are aligned with the reference genome or transcriptome, and classified as three types: exonic reads, junction reads and poly(A) end-reads. These three types are used to generate a base-resolution expression profile for each gene, as illustrated at the bottom; a yeast ORF with one intron is shown.
Figure 2. Quantifying expression levels: RNA-Seq and microarray compared
Expression levels are shown, as measured by RNA-Seq and tiling arrays, for Saccharomyces cerevisiae cells grown in nutrient-rich media. The two methods agree fairly well for genes with medium levels of expression (middle), but correlation is very low for genes with either low or high expression levels. The tiling array data used in this figure is taken from REF. , and the RNA-Seq data is taken from REF. .
Figure 3. DNA library preparation: RNA fragmentation and DNA fragmentation compared
a | Fragmentation of oligo-dT primed cDNA (blue line) is more biased towards the 3′ end of the transcript. RNA fragmentation (red line) provides more even coverage along the gene body, but is relatively depleted for both the 5′ and 3′ ends. Note that the dynamic range (the ratio between the maximum and minimum expression level) is 44 for microarrays and 9,560 for RNA-Seq. The tag count is the average sequencing coverage for 5,000 yeast ORFs. b | A specific yeast gene, SES1 (seryl-tRNA synthetase), is shown.
Figure 4. Poly(A) tags from RNA-Seq
A region of the Saccharomyces cerevisiae genome containing two overlapping transcripts (ACT1, from the actin gene, and YFL040W, an uncharacterized ORF) is shown. Arrows indicate the direction of transcription. The poly(A) tags from RNA-Seq experiments are shown below these transcripts, with arrows indicating the direction of transcription. The precise location of each locus identified by poly(A) tags reveals the heterogeneity in poly(A) sites; for example, ACT1 has two large clusters, both with a few bases of local heterogeneity. The transcription direction revealed by poly(A) tags also helps to resolve transcribed regions that overlap at their 3′ ends.
Figure 5. Coverage versus depth
a | With 4 million uniquely mapped RNA-Seq reads, 80% of yeast genes were detected; coverage then reaches a plateau despite increasing sequencing depth. Expressed genes are defined as having at least four independent reads from a 50-bp window at the 3′ end. Data is taken from REF. . b | In two mouse transcriptomes, the number of unique start sites detected starts to reach a plateau when the depth of sequencing reaches 80 million. ES, embryonic stem cells; EB, embryoid body. Figure is modified, with permission, from REF. © (2008) Macmillan Publishers Ltd. All rights reserved.
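
The expression criterion stated for Figure 5a (at least four independent reads from a 50-bp window at the 3′ end) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function name `is_expressed`, its parameters, the 1-based inclusive coordinates, and the reading of "independent" as "distinct read start positions" are all assumptions made for illustration.

```python
from typing import Iterable


def is_expressed(
    read_starts: Iterable[int],
    three_prime_end: int,
    strand: str = "+",
    window: int = 50,
    min_reads: int = 4,
) -> bool:
    """Hypothetical check of the Figure 5a criterion.

    Assumptions (not from the source): coordinates are 1-based and
    inclusive, and "independent" reads are reads with distinct start
    positions, so duplicates at one position count once.
    """
    if strand == "+":
        # On the plus strand the 3' end is the highest coordinate.
        lo, hi = three_prime_end - window + 1, three_prime_end
    elif strand == "-":
        # On the minus strand the 3' end is the lowest coordinate.
        lo, hi = three_prime_end, three_prime_end + window - 1
    else:
        raise ValueError("strand must be '+' or '-'")

    # Collapse reads that start at the same position inside the window.
    distinct_starts = {pos for pos in read_starts if lo <= pos <= hi}
    return len(distinct_starts) >= min_reads


if __name__ == "__main__":
    # Five distinct start positions fall in the window 951..1000.
    print(is_expressed([951, 960, 960, 975, 990, 1000], three_prime_end=1000))
    # Four reads, but all at the same position: counted once.
    print(is_expressed([951, 951, 951, 951], three_prime_end=1000))
```

Collapsing duplicate start positions is one conservative way to avoid counting amplification duplicates as separate evidence; a different definition of independence would change only the set comprehension.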

Source: PubMed
