Application of an RNA amplification method for reliable single-cell transcriptome analysis

Oleg Suslov, Daniel J Silver, Florian A Siebzehnrubl, Arturo Orjalo, Andrey Ptitsyn, Dennis A Steindler

Abstract

Diverse cell types have unique transcriptional signatures that are best interrogated at single-cell resolution. Here we describe a novel RNA amplification approach that allows for high-fidelity gene profiling of individual cells. This technique significantly diminishes the problem of 3' bias, enabling detection of all regions of transcripts, including the recognition of mRNA with short or completely absent poly(A) tails, identification of noncoding RNAs, and discovery of the full array of splice isoforms from any given gene product. We assess this technique using statistical and bioinformatics analyses of microarray data to establish the limitations of the method. To demonstrate applicability, we profiled individual cells isolated from the mouse subventricular zone (SVZ), a well-characterized, discrete yet highly heterogeneous neural structure involved in persistent neurogenesis. Importantly, this method revealed multiple splice variants of key germinal zone gene products within individual cells, as well as an unexpected coexpression of several mRNAs considered markers of distinct and separate SVZ cell types. These findings were independently confirmed using RNA-fluorescence in situ hybridization (RNA-FISH), contributing to the utility of this new technology that offers genomic and transcriptomic analysis of small numbers of dynamic and clinically relevant cells.

Keywords: RNA amplification; SVZ; single-cell analysis; stem and progenitor cells.

Conflict of interest statement

Competing interests

The authors declare no competing interests.

Figures

Figure 1. RNA amplification scheme
(A) Design of the RNA amplification method. RT step: The template for the RT reaction is polyadenylated RNA. The modified sequences of oligo(dT) and random primers are presented as black boxes. The T7-SWITCH primer contains a modified sequence, a T7 sequence, and an rGrGrG sequence (T7 and rGrGrG sequences are shown in red). Note that one of three cDNA products from the RT reaction does not have a 3′ tagging sequence because of the limited capacity of the switching effect. This cDNA is later primed with extending primers. PCR step: The PCR reaction contains all components initially, including the extending and amplification primers. The extending primers carry a specific sequence for the T7 RNA polymerase site and either random or Kozak sequences, which are presented as green lines. The profile of the first two PCR cycles was tailored to favor annealing of the extending primers to the cDNA. These cycles are shown separately. The rest of the PCR cycles were modified to ensure optimal performance of the amplification primer. The product generated at the end of the PCR step was double-stranded DNA. T7 IVT step: The 5′ end localization of the T7 promoter on double-stranded DNA guarantees the synthesis of sense RNA, which is shown as red lines. Note that some RNA molecules are polyadenylated because they are the product of oligo(dT) priming. Other RNA molecules generated using random priming do not have a poly(A) stretch. (B) Primers utilized in the RNA amplification method.
Figure 2. Significantly enriched pathways for four MetaCore categories
GeneGO Pathway maps (A), GO Processes (B), GeneGO Process Networks (C), and GeneGO Diseases by Biomarkers (D) significantly enriched for differentially expressed genes are shown. Bar histograms (on the left) corresponding to the ontology terms in every section (1 to 10, listed on the right side of each diagram) are sorted in decreasing order of P-value (top to bottom) with no amplification, 20 ng, 1 ng, and 20 pg total RNA. The top terms are represented by histogram sections each having at least one longest strip regardless of which experiment it belongs to. Dimmed (semi-transparent) bars indicate marginal significance (with P-values below the 0.05 cutoff, as indicated on the logarithmic scale on top). Only the top 10 pathways are shown in every section in the form of a bar graph histogram, and the list of 50 pathways is available in Supplementary Figures S4–S7.
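The P-values plotted in Figure 2 come from an over-representation analysis. The legend does not state the exact statistic, so the following is only a minimal sketch that assumes a hypergeometric test, a common choice for this kind of enrichment analysis. The function names and all gene counts are hypothetical and are not taken from the study.

```python
from math import comb, log10


def hypergeom_enrichment_p(total_genes, term_genes, selected_genes, overlap):
    """Upper-tail hypergeometric P-value, P(X >= overlap).

    total_genes    -- size of the background gene set (assumed)
    term_genes     -- background genes annotated to the ontology term
    selected_genes -- number of differentially expressed genes
    overlap        -- differentially expressed genes annotated to the term
    """
    upper = min(term_genes, selected_genes)
    denom = comb(total_genes, selected_genes)
    tail = sum(
        comb(term_genes, k) * comb(total_genes - term_genes, selected_genes - k)
        for k in range(overlap, upper + 1)
    )
    return tail / denom


def neg_log10(p_value):
    """Bar length on a logarithmic axis: larger means more significant."""
    return -log10(p_value)


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    p = hypergeom_enrichment_p(total_genes=20000, term_genes=150,
                               selected_genes=400, overlap=12)
    print(p < 0.05, round(neg_log10(p), 2))
```

Computing one such value per input amount (no amplification, 20 ng, 1 ng, 20 pg) for the same term would give the four bars compared within each histogram section.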
Figure 3. Expression of subventricular zone (SVZ) cell markers
(A) SVZ markers expressed in GFP positive cells. (C) SVZ markers expressed in the GFP negative population. Markers of stem cells are defined as B type; transit-amplifying cells correspond to C type; neuroblasts correspond to A type; astrocytes correspond to As type; and ependymal cells correspond to E type. Each cell type is identified by a combination of these markers (33). Expression of certain cell surface molecules can ensure the isolation of distinct cell types from GFP positive (B) and GFP negative (D) cell populations. Genes presented in italics were recently assigned as B cell markers using microarray data from the stem cell–enriched population (34). Heat maps reflect real-time PCR-derived Cq values for each transcript. Samples without amplification (w/o) were used as positive controls. Total RNA input of unamplified pooled SVZ cells or embryonic cells after the RT reaction corresponds to 3 μg. Cell #14 was excluded for technical reasons (low expression of ACTB and all other genes). Primer sequences are provided in Supplementary Table S4.
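The heat maps are built from Cq values, where a lower Cq means the transcript was detected earlier and is therefore more abundant. As a rough illustration of how a cell with a weak reference-gene signal might be flagged before plotting, the sketch below uses a hypothetical Cq cutoff and hypothetical Cq values. The legend does not give the numerical criterion that was actually applied, and the ΔCq normalization to ACTB is an assumption rather than the authors' stated procedure.

```python
def flag_low_quality_cells(cq_by_cell, reference_gene="ACTB", max_reference_cq=30.0):
    """Return cells whose reference gene is undetected or has a Cq above the cutoff.

    max_reference_cq is a hypothetical threshold chosen for illustration.
    """
    flagged = []
    for cell, cq_values in cq_by_cell.items():
        ref_cq = cq_values.get(reference_gene)
        if ref_cq is None or ref_cq > max_reference_cq:
            flagged.append(cell)
    return flagged


def delta_cq(cq_by_cell, reference_gene="ACTB"):
    """Cq of each gene minus Cq of the reference gene, per cell.

    Undetected genes (Cq of None) stay None; cells with no reference
    signal are skipped because they cannot be normalized.
    """
    result = {}
    for cell, cq_values in cq_by_cell.items():
        ref_cq = cq_values.get(reference_gene)
        if ref_cq is None:
            continue
        result[cell] = {
            gene: (None if cq is None else cq - ref_cq)
            for gene, cq in cq_values.items()
        }
    return result


if __name__ == "__main__":
    # Hypothetical Cq values, not data from the study.
    cells = {
        "cell_a": {"ACTB": 18.0, "GFAP": 24.5},
        "cell_b": {"ACTB": 34.0, "GFAP": None},
    }
    print(flag_low_quality_cells(cells))
    print(delta_cq(cells)["cell_a"])
```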
Figure 4. Expression patterns of Numb, Prominin1, and GFAP isoforms
(A) Numb isoforms 1, 3, and 6, which were not detected, contain insertions into the PRR domain, whereas the detected Numb isoforms 2, 4, and 5 do not have an insertion (36). Because the same pair of primers can detect different isoforms, it is not always possible to assign the expression of a certain isoform to a specific cell. For example, cells 5–8 can express either the combination of isoforms 2, 4, and 5 or only isoform 5. At the same time, Numb2 is expressed in cells 2 and 3, Numb4 in cell 16, and Numb5 in all cells where it was detected. (B) The cells expressed mostly Prominin1 isoforms 2 and 7. No isoform-specific transcripts were detected in cell #6, but primers recognizing all isoforms produced a signal. Our results also show good concordance with previous studies that revealed the expression of only Prominin1 transcript isoforms 1, 2, and 6 in the subventricular zone (SVZ), and the absence of isoforms 3, 4, 5, and 8 (34,37). We also observed that isoform 1 was expressed exclusively in one of the GFP− cells, while other isoforms were present only in the GFP+ population. While the expression of isoform 7 was not previously assessed, we found that this isoform, together with isoform 2, is dominant in GFP+ cells. Prominin1 is strongly predisposed to alternative splicing, and there are 10 known isoforms for the human transcript. It is therefore likely that a new murine Prominin1 isoform exists in addition to the eight known isoforms. (C) Glial fibrillary acidic protein (GFAP) expression was previously detected in a distinct population of astrocytes that have been identified as SVZ neural stem cells (–40). GFAP isoforms were detected only in GFP+ cells. The cells analyzed in our experiments did not contain isoforms GFAP-ΔEx6, GFAP-ΔEx7, and GFAP-Δ164, and the GFAP-Δ135 isoform was expressed at a low level. Isoforms GFAP-ΔEx6, GFAP-Δ164, and GFAP-Δ135 were not detected in samples without amplification (w/o), but the quality of the primers was verified previously (39).
Primers labeled “com” are designed to anneal to the common part of all known isoforms of the transcript. Because some isoforms are homologous, certain primers detect more than one target isoform. Heat maps reflect real-time PCR-derived Cq values for each transcript. A sample without amplification was used as a positive control. Total RNA input of unamplified, pooled SVZ cells or embryonic cells in the RT reaction was 3 μg.
Figure 5. Expression patterns of Id1, EGFR, RbFox3, and PAX6
(A) High expression of the DNA-binding protein inhibitor Id1 is characteristic of neural stem cells (41). It was previously established that Id1 protein expression is associated with the GFP+ cell population (41,42). According to our results, this transcriptional regulator is more likely the product of Id1 isoform a, because only this isoform is expressed exclusively in GFP+ cells. (B) Epidermal growth factor receptor (EGFR) is expressed on stem cells (B cells) and transit-amplifying progenitor cells (C cells) in the subventricular zone (SVZ), but it is absent from neuroblasts (A cells). It was proposed that EGFR regulates the balance of SVZ cell subtypes (43). All GFP+ cells express EGFR isoform 1, with a possible co-expression of isoform 2. The EGFR sequence structure does not allow the design of primer pairs that would conclusively establish the expression pattern of EGFR isoform 2 in cells 1–8, but GFP− cells 9 and 10 definitely have this isoform. (C) It has been suggested that specific neuronal subtypes express different RbFox3 protein variants with varying nuclear/cytoplasmic ratios (44,45). It has been previously proposed that each variant of RbFox3 has its own biological target(s) and may play a key role in the regulation of neural cell differentiation. Although the protein product of RbFox3 is considered a mature neuron-specific marker, we detected only isoform 2, which is exclusively nuclear, and isoform 3, which, as was suggested, shuttles between the nucleus and cytoplasm, in GFP+ as well as GFP− cells. RbFox3 isoform 1 was not detected in any GFP cell population. (D) It was shown that PAX6 containing the canonical form of the paired domain (PD) without the alternative exon 5a influences cell fate and proliferation at the same time, whereas an exon 5a-containing PAX6 isoform inhibits cell proliferation without affecting cell fate (46). We observed that PAX6 mRNA was present in almost every cell regardless of the presence or absence of exon 5a.
We speculate that the protein distribution can be different, taking into account that the PAX6 gene demonstrates translational uncoupling (34). (E) Examples of 5′ end detection of the transcripts. The primers localized at the 5′ ends of mRNAs detect transcripts of different lengths, up to 9 kb. Heat maps reflect real-time PCR–derived Cq values for each transcript. A sample without amplification (w/o) was used as a positive control. Total RNA input of unamplified pooled SVZ cells or embryonic cells in the RT reaction was 3 μg.

Source: PubMed
