RNA transcripts, miRNA-sized fragments and proteins produced from D4Z4 units: new candidates for the pathophysiology of facioscapulohumeral dystrophy

Lauren Snider, Amy Asawachaicharn, Ashlee E Tyler, Linda N Geng, Lisa M Petek, Lisa Maves, Daniel G Miller, Richard J L F Lemmers, Sara T Winokur, Rabi Tawil, Silvère M van der Maarel, Galina N Filippova, Stephen J Tapscott

Abstract

Deletion of a subset of the D4Z4 macrosatellite repeats in the subtelomeric region of chromosome 4q causes facioscapulohumeral muscular dystrophy (FSHD) when occurring on a specific haplotype of 4qter (4qA161). Several genes have been examined as candidates for causing FSHD, including the DUX4 homeobox gene in the D4Z4 repeat, but none have been definitively shown to cause the disease, nor has the full extent of transcripts from the D4Z4 region been carefully characterized. Using strand-specific RT-PCR, we have identified several sense and antisense transcripts originating from the 4q D4Z4 units in wild-type and FSHD muscle cells. Consistent with prior reports, we find that the DUX4 transcript from the last (most telomeric) D4Z4 unit is polyadenylated and has two introns in its 3-prime untranslated region. In addition, we show that this transcript generates (i) small si/miRNA-sized fragments, (ii) uncapped, polyadenylated 3-prime fragments that encode the conserved C-terminal portion of DUX4 and (iii) capped and polyadenylated mRNAs that contain the double-homeobox domain of DUX4 but splice-out the C-terminal portion. Transfection studies demonstrate that translation initiation at an internal methionine can produce the C-terminal polypeptide and developmental studies show that this peptide inhibits myogenesis at a step between MyoD transcription and the activation of MyoD target genes. Together, we have identified new sense and anti-sense RNA transcripts, novel mRNAs and mi/siRNA-sized RNA fragments generated from the D4Z4 units that are new candidates for the pathophysiology of FSHD.

Figures

Figure 1.
Schematic representation of the FSHD locus, ORFs, transcripts and location of miRNA-sized fragments. The top panel shows the FSHD locus with two full D4Z4 units. The region centromeric to the D4Z4 units is designated as p13E-11. The small triangle represents a partial D4Z4 unit centromeric to the First and Last D4Z4 units (large triangles), followed by another partial D4Z4 unit (partial triangle), then the region designated as pLAM, followed by beta-satellite repeat sequence. The sequence in Supplementary Material, Figure S1 extends from the first D4Z4 unit through the pLAM sequence and is numbered 0 to 7760, as shown schematically. First repeat ORFs: schematic of ORFs in the first D4Z4 unit that extends from position 0 to 3300. Each ORF is predicted from the DNA sequence and denoted by the first few amino acids. MKG, MAL and MQG represent internal in-frame methionines that might also be used for translation initiation. The ORF beginning with MAL is the previously identified DUX4 ORF. STOP indicates a stop codon in-frame with the upstream ORF. Numbering is from the KpnI site at the beginning of the D4Z4 repeat to the KpnI site at the end of the repeat. Last repeat ORFs: potential ORFs in the last D4Z4 unit differ from those in the first due to sequence variations. Transcripts mapped by strand-specific RT–PCR and small RNAs: map of transcription units determined by strand-specific RT–PCR in wild-type and FSHD fibroblasts and muscle cells. Solid arrows (A–F) represent regions where strand-specific RT–PCR has identified RNA transcripts and dashed lines represent regions where transcripts were not identified. Location of the endpoints of these transcripts in the first/last repeats is as follows: A, 404/3701-1325/4628; B, 1614/4912-2237/5535; C, 2704/6002-3340/6638; D, 1003-1374/4672; E, 1623/4921-2100/5400; F, 2634/5932-3071/6369. Asterisks represent locations of small miRNA-sized fragments confirmed by northern blot.
Figure 2.
Discontinuous regions of DUX4 transcripts in wild-type and FSHD fibroblasts. (A) RT–PCR of random hexamer primed total RNA from fibroblasts, wild-type or FSHD-derived, amplified with primers for the 5-prime (1797–1906; 1797–2235), the central portion (2270–2398), the 3-prime end (2672–2970) and the full-length DUX4 (1797–2970). The fibroblasts were transduced with a retrovirus expressing MyoD-ER and assayed under differentiation conditions (DMEM with 1 µg/ml insulin and transferrin for 96 h) either without beta-estradiol (−) or with beta-estradiol (+) to induce MyoD activity; however, no consistent differences were noted in RNA obtained from MyoD induced and non-induced cells. The presence of amplification products from DNA controls shows that all primer sets can amplify the sequence from DUX4 cDNA templates. (B) Strand-specific RT–PCR amplification products using total RNA from control and FSHD muscle cells. Primer pairs are designed to amplify the 5-prime (1670-ST1936), middle (2270-ST2568) and 3-prime (2922-ST3072) regions of DUX4. IVT is a dilution of RNA from an in vitro transcribed DUX4 cDNA to demonstrate that the RT reaction and PCR can amplify the central portion of DUX4 from an RNA template. (C) Strand-specific RT–PCR amplification products using total RNA from human embryonic stem cells (ES), mesenchymal stem cells (MSC) and muscle cells from control (N-201) and FSHD (F-183) individuals. The central portion of DUX4 can be amplified from the ES and MSC, but not from wild-type or FSHD muscle cells. Template controls include: IVT, dilutions of RNA from in vitro transcribed DUX4 cDNA; T-DUX4 and T-Control, RNA from mouse C2C12 muscle cells transfected with a DUX4 expression vector and the empty expression vector, respectively. In both (B) and (C), three independent wild-type and FSHD-derived myoblast cultures were grown to confluence and induced to differentiate for 96 h. Similar results were obtained from undifferentiated wild-type and FSHD myoblasts (data not shown). Note that the size of the PCR product can be calculated from the position of the PCR primers.
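The note on product size can be illustrated with a minimal sketch. It assumes that each listed primer position marks an outer end of the amplicon and that coordinates are inclusive, which the caption does not state; the function name `amplicon_size` is hypothetical.

```python
def amplicon_size(forward_pos: int, reverse_pos: int) -> int:
    """Expected product length in bp.

    Assumption: both positions are the outermost bases of the amplicon
    and the numbering is inclusive, so length = reverse - forward + 1.
    """
    if reverse_pos < forward_pos:
        raise ValueError("reverse position must not precede forward position")
    return reverse_pos - forward_pos + 1


# Primer pairs listed for panel (A) of Figure 2
PANEL_A_PRIMERS = {
    "5-prime (short)": (1797, 1906),
    "5-prime (long)": (1797, 2235),
    "central": (2270, 2398),
    "3-prime": (2672, 2970),
    "full-length": (1797, 2970),
}

for region, (fwd, rev) in PANEL_A_PRIMERS.items():
    print(f"{region}: {amplicon_size(fwd, rev)} bp")
```

If the positions instead mark primer 3-prime ends or use a different numbering convention, the true product sizes would differ by roughly the primer lengths.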
Figure 3.
Quantification of amplification products from various DUX4 regions using real-time RT–PCR. RNA sources were cultured myoblasts and myotubes from control and FSHD-affected individuals, and human ES cells. In each grouping, the left panel shows the average of triplicate data from six independent muscle-derived cell cultures (three wild-type and three FSHD) under growth (myoblast) and differentiation (myotube) conditions and the right panel shows the combined average values for wild-type and FSHD with values expressed as the amount relative to the RNA abundance in ES cells. Note that amplification products from the 5-prime region of the DUX4 transcript are slightly more abundant in muscle cells when compared with ES cells; however, amplification products from the middle region of DUX4 are much less abundant in muscle cells. Amplification products from the 3-prime region of DUX4 are less abundant when RNA from muscle is compared with that from ES cells but increased in quantity when compared with the middle region. Error bars indicate standard deviation and asterisks indicate: *P < 0.01; **P < 0.001; ***P < 0.0001; however, the biological significance should be interpreted cautiously due to the small number of independent samples. See Supplementary Material, Table S2 for RT and PCR primer sequences.
Figure 4.
Northern detection of miRNA-sized fragments generated from regions of the DUX4 transcript. (A) Micro-RNA northern of RNA from normal muscle cells (N201), FSHD-derived muscle cells (F-148) and FSHD-derived fibroblasts transduced with the MyoD-ER (F7-MDER) with a 21 mer probe to positions 2254–2273, a 20 mer probe to positions 2284–2303 and a 24 mer probe to positions 1615–1638. Each probe detects fragments in the 20–30 nt range; however, multiple larger RNA species are also detected. (B) Micro-RNA northern of wild-type and FSHD-derived fibroblasts with MyoD-ER probed in the region of the predicted miRNA (center panel) and with probes shifted 10 nt in either direction (flanking panels). Fragments in the 20–30 nt range (asterisks) are restricted to the central probes, whereas all probes identify some larger RNA species. All cells contained MDER and were induced to differentiate for 96 h (96+). (C) Micro-RNA northern of wild-type fibroblasts with MyoD-ER probed for the fragments at 2284 and 1615 showing that the fragment abundance is substantially increased in cells transduced with a viral construct expressing the DUX4 RNA and induced to differentiate into muscle (+, indicates the addition of beta-estradiol to activate the MyoD-ER). Control, no viral vector; pBABE, viral vector without insert; pBabe-DUX4, viral vector expressing a DUX4 transcript with nomenclature as indicated in Figure 4A. Sequences of the probes are in Supplementary Material, Table S3.
Figure 5.
Inhibition of myogenesis in the absence of DUX4 protein. (A) Schematic of the coding regions and stop-codon placements of the expression constructs tested. The methionines in an open reading frame with DUX4 are depicted with the two following amino acids (MKG, MAL, MQG) and the DUX4 translation stop codon is indicated by STOP. S shows the regions where we have introduced a new translation stop codon. The column labeled Differ. indicates whether the expression of the indicated vector inhibited C2C12 differentiation (Inhibits) or did not inhibit differentiation (NI). (B) C2C12 cells transiently transfected with pCkm-luc and CMV-beta-galactosidase together with the indicated DUX4 expression vector. Bar graphs show luciferase activity relative to beta-galactosidase activity 24 h after induction in differentiation medium and the western shows abundance of protein containing the epitope recognized by the 9A12 monoclonal antibody to DUX4. The two major bands in the western represent translation initiation at the MKG and the MAL methionines and they migrate at their predicted size. Note that the monoclonal antibody does not recognize in vitro translated mqgDUX4 (data not shown), indicating that the epitope is not contained within this region of the protein.
Figure 6.
Inhibition of myogenesis in zebrafish embryos by the C-terminal peptide sequence of DUX4. Zebrafish embryos were injected with mRNA encoding the full-length DUX4 (mkgDUX4, see Fig. 4A for nomenclature), the C-terminal fragment of DUX4 (mqgDUX4) or the C-terminal fragment with a stop codon to prevent protein translation (mqgDUX4-mqg*). Full-length DUX4 is highly toxic and broadly interferes with development on the injected side. The mqgDUX4 injected side shows nearly normal development with normal expression of MyoD RNA, but has a very specific inhibition of muscle gene expression with decreased expression of myogenin (myog) and myosin light chain (mylz2). The stop-codon mutant of mqgDUX4 does not inhibit muscle gene expression, demonstrating that the mqg protein is required for inhibitory activity.
Figure 7.
Five-prime and 3-prime polyadenylated transcripts with an IRES-like element upstream of the MQG ORF. (A) RT–PCR on random primed RNA from FSHD muscle cells using primers to three regions of the DUX4 transcript (5-prime, Central and the 3-prime region of the MQG ORF) on total RNA (T), the poly-adenylated fraction that binds oligo-dT (P), or the unbound fraction that does not bind oligo-dT (U). Primers used were: 5-prime, 1707 and 1906; central, 2307 and 2815; 3-prime, 6315 and 7074. In vitro transcribed full-length DUX4 RNA was used as a positive control for the RT reaction (I). 100 bp ladder (L) with 100 bp as lowest band. (B) A schematic of the DUX4 region with representations of the transcripts identified by a combination of 5-prime RACE and 3-prime RACE on the poly-A fraction and location of miRNA-like fragments indicated by asterisks. Top schematic shows locations of potential translation start codons (MKG, MAL and MQG) and stop codon (STOP); asterisks indicate locations of miRNA-like fragments; LR-EX1, cloned sequence matches last repeat Exon 1; LR-EX2, last repeat Exon 2; LR-EX3, last repeat Exon 3; IR-EX1, cloned sequence does not match either first or last repeat (Supplementary Material, Fig. S4); IR-EX2, in this case cloned sequence does match LR-EX2 but the tandem repeat indicates it comes from an internal repeat. (C) Sequence of the DUX4 ORF and pLAM region showing the locations of the capped 5-prime ends (Blue, positions 4941–4944 and 4970–4971) and uncapped 5-prime ends (Red, positions 5715 and 5863), representing sites of transcription initiation and RNA cleavage, respectively. The polyadenylation sites are indicated in Orange (positions 7155–7156 and 7166); introns are underlined; miRNA-sized fragments are shown in green (note that the last partial D4Z4 unit is between the last full D4Z4 unit and the pLAM sequence (Fig. 1), and therefore, the last two miRNA-sized fragments are also present in the beginning of the D4Z4 repeat).
(D) The construction of the dual cistronic pRF backbone is detailed in (33). The locations of SV40 promoter and chimeric intron are indicated, and Poly-A is the SV40 polyadenylation signal. Inserting test sequences between the EcoRI and NcoI sites of pRF created the constructs pRF + 423DUX4, pRF + 275DUX4. pRF + HCVir was created by inserting the previously characterized IRES element from the Hepatitis C virus as a positive control, and its IRES activity has been previously described (34). An empty pRF plasmid without insert was used as a negative control. Each of the constructs was transfected into mouse myoblast C2C12 cells as two sets of triplicates. Twenty-four hours post-transfection, one triplicate set of cells, designated as ‘growth,’ was harvested and their lysates assayed for FLuc and RLuc activities as described in Materials and Methods. The remaining set was switched to ‘differentiation’ media and assayed for luciferase activities 48 h post-transfection. FLuc activity was normalized to RLuc and plotted as the mean ± SD relative to the empty plasmid pRF.
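The normalization described for panel (D) can be sketched as follows. This is an illustrative reading, not the authors' analysis code: it assumes that each replicate's FLuc/RLuc ratio is divided by the mean ratio of the empty pRF control, and that SD is the sample standard deviation. The function name and all input values are hypothetical.

```python
from statistics import mean, stdev


def relative_fluc_activity(fluc, rluc, control_fluc, control_rluc):
    """Return (mean, SD) of FLuc/RLuc ratios relative to the empty-vector control.

    Assumption: the control is summarized by the mean of its own
    per-replicate FLuc/RLuc ratios before it is used as the divisor.
    """
    ratios = [f / r for f, r in zip(fluc, rluc)]
    control_mean = mean(f / r for f, r in zip(control_fluc, control_rluc))
    relative = [ratio / control_mean for ratio in ratios]
    return mean(relative), stdev(relative)


# Hypothetical triplicate readings (arbitrary luminescence units)
avg, sd = relative_fluc_activity(
    fluc=[2.0, 4.0, 6.0],
    rluc=[1.0, 1.0, 1.0],
    control_fluc=[1.0, 1.0, 1.0],
    control_rluc=[1.0, 1.0, 1.0],
)
print(f"relative activity: {avg:.2f} ± {sd:.2f}")
```

Dividing by the control mean places the empty pRF plasmid at 1, so a test insert's value reads directly as fold change over the no-insert background.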

Source: PubMed
