Accurate detection of m6A RNA modifications in native RNA sequences

Huanle Liu, Oguzhan Begik, Morghan C Lucas, Jose Miguel Ramirez, Christopher E Mason, David Wiener, Schraga Schwartz, John S Mattick, Martin A Smith, Eva Maria Novoa

Abstract

The epitranscriptomics field has undergone an enormous expansion in the last few years; however, a major limitation is the lack of generic methods to map RNA modifications transcriptome-wide. Here, we show that using direct RNA sequencing, N6-methyladenosine (m6A) RNA modifications can be detected with high accuracy, in the form of systematic errors and decreased base-calling qualities. Specifically, we find that our algorithm, trained with m6A-modified and unmodified synthetic sequences, can predict m6A RNA modifications with ~90% accuracy. We then extend our findings to yeast data sets, finding that our method can identify m6A RNA modifications in vivo with an accuracy of 87%. Moreover, we further validate our method by showing that these 'errors' are typically not observed in yeast ime4-knockout strains, which lack m6A modifications. Our results open avenues to investigate the biological roles of RNA modifications in their native RNA context.

Conflict of interest statement

M.S. and E.M.N. have received travel and accommodation expenses to speak at Oxford Nanopore Technologies conferences. Otherwise, the authors declare no competing interests.

Figures

Fig. 1
Base-calling “errors” can be used as a proxy to identify RNA modifications in direct RNA sequencing reads. a Schematic overview of the strategy used in this work to train and test an m6A RNA base-calling algorithm. b IGV snapshot of one of the four transcripts used in this work. In the upper panel, in vitro transcribed reads containing m6A have been mapped, whereas in the lower panel the unmodified counterpart is shown. Nucleotides with mismatch frequencies >0.05 have been colored. c Comparison of m6A and A positions, at the level of per-base quality scores (left panel), mismatch frequencies (middle left panel), deletion frequency (middle right panel), and mean current intensity (right panel). All possible k-mers (computed as a sliding window along the transcripts) have been included for these comparisons (n = 9974). d–g Replicability of each individual feature, namely base quality (d), mismatch frequency (e), deletion frequency (f), and current intensity (g), across biological replicates, for both unmodified (“A”) and m6A-modified (“m6A”) data sets. Comparison of unmodified and m6A-modified (“A vs m6A”) is also shown for each feature. Correlation values shown correspond to Spearman’s rho. Error bars indicate s.d. Source data are provided in the Source Data file.
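The per-position features named in this caption (base quality, mismatch frequency, deletion frequency) can be illustrated with a minimal sketch. This is not the authors' code: the input format, the function names, and the choice of total coverage (including deletions) as the denominator are all assumptions made for illustration. Only the 0.05 mismatch-frequency threshold comes from panel b.

```python
from collections import defaultdict

def per_position_features(observations):
    """Summarise aligned-read observations per reference position.

    `observations` is a hypothetical input format: an iterable of
    (position, ref_base, read_base, quality) tuples, where read_base and
    quality are None when the read has a deletion at that position.
    """
    stats = defaultdict(lambda: {"depth": 0, "mismatch": 0, "deletion": 0,
                                 "qual_sum": 0, "qual_n": 0})
    for pos, ref, read, qual in observations:
        s = stats[pos]
        s["depth"] += 1
        if read is None:            # deletion: no base call, no quality score
            s["deletion"] += 1
            continue
        if read != ref:
            s["mismatch"] += 1
        s["qual_sum"] += qual
        s["qual_n"] += 1

    features = {}
    for pos, s in stats.items():
        features[pos] = {
            # Assumption: frequencies are relative to total coverage.
            "mismatch_freq": s["mismatch"] / s["depth"],
            "deletion_freq": s["deletion"] / s["depth"],
            "mean_quality": s["qual_sum"] / s["qual_n"] if s["qual_n"] else None,
        }
    return features

def flag_positions(features, threshold=0.05):
    """Positions whose mismatch frequency exceeds the threshold (0.05 in panel b)."""
    return sorted(p for p, f in features.items() if f["mismatch_freq"] > threshold)

# Toy observations (invented values, not data from the study).
toy = [
    (10, "A", "A", 12), (10, "A", "G", 7), (10, "A", None, None), (10, "A", "A", 11),
    (11, "C", "C", 14), (11, "C", "C", 13),
]
toy_features = per_position_features(toy)
flagged = flag_positions(toy_features)
```

With these toy reads, position 10 has one mismatch and one deletion out of four reads, so it is the only position flagged.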
Fig. 2
Base-calling “errors” alone can accurately identify m6A RNA modifications. a Base-called features (base quality, insertion frequency, and deletion frequency) of m6A motif 5-mers, and for each position of the 5-mer, are shown. The features of the m6A-modified transcripts (“m6A”) are shown in red, whereas the features of the unmodified transcripts (“unm”) are shown in blue. b, c Principal component analysis (PCA) scores plot of the first two principal components, using 15 features (base quality, mismatch frequency, deletion frequency, for each of the five positions of the k-mer) as input. The logos of the k-mers used in the m6A-motif RRACH set (left) and control set (right) are also shown. Each dot represents a specific k-mer in the synthetic sequence, and has been colored depending on whether the k-mer belongs to the m6A-modified transcripts (red) or the unmodified transcripts (black). The contribution of each principal component is shown in each axis. ROC curves of the SVM predictions using: (i) each individual feature separately to train and test each model, at m6A sites (d); (ii) combined features at m6A sites, relative to the individual features (e); (iii) combined features at m6A sites relative to control sites, where the base-called “errors” information of neighboring nucleotides has been included in the model (f); and (iv) different mixtures of methylated and unmethylated reads, using the combined features model (g). Error bars indicate s.d. Source data are provided in the Source Data file.
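The 15-feature input described for panels b and c (three base-called features at each of the five positions of a k-mer) can be sketched as follows. The dictionary layout, the function name, and the position-major ordering of the vector are assumptions for illustration; the caption specifies only which features are used, not how they are arranged.

```python
# Feature names follow the caption; the key spellings are hypothetical.
FEATURES = ("base_quality", "mismatch_freq", "deletion_freq")

def kmer_feature_vector(per_position, center, flank=2):
    """Concatenate per-position features over a window around `center`.

    `per_position` maps a reference position to a dict holding the three
    features. With flank=2 the window spans 5 positions, giving 15 values.
    Assumption: values are ordered position by position (position-major).
    """
    vector = []
    for pos in range(center - flank, center + flank + 1):
        site = per_position[pos]
        vector.extend(site[name] for name in FEATURES)
    return vector

# Toy per-position features for one 5-mer (invented values, not study data).
toy_sites = {
    98:  {"base_quality": 11.0, "mismatch_freq": 0.02, "deletion_freq": 0.01},
    99:  {"base_quality": 9.5,  "mismatch_freq": 0.06, "deletion_freq": 0.03},
    100: {"base_quality": 6.0,  "mismatch_freq": 0.30, "deletion_freq": 0.12},
    101: {"base_quality": 8.0,  "mismatch_freq": 0.10, "deletion_freq": 0.05},
    102: {"base_quality": 10.5, "mismatch_freq": 0.03, "deletion_freq": 0.02},
}
vec = kmer_feature_vector(toy_sites, center=100)
```

A vector built this way could serve as one input row for a PCA or an SVM; including the flanking positions reflects the caption's note that information from neighboring nucleotides is part of the model.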
Fig. 3
Yeast wild-type and ime4∆ strains show distinct base-called features at known m6A-modified RRACH sites. a Overview of the direct RNA sequencing library preparation using in vivo polyA(+) RNA from S. cerevisiae cultures. b Replicability of per-gene counts using direct RNA sequencing across wild-type yeast strains (top) and ime4∆ strains (middle). The correlation between wild-type and ime4∆ strains is also shown (bottom). c Comparison of the observed mismatch frequencies in the 100%-modified in vitro transcribed sequences (blue), unmodified sequences (red), yeast ime4∆ knockout (green), and yeast wild type (cyan). Values for each biological replicate are shown. d Base-called features (base quality, insertion frequency, and deletion frequency) of RRACH 5-mers known to contain m6A modifications. Only features corresponding to the modified nucleotide (position 0) are shown. Features extracted from wild-type yeast reads (m6A-modified) are shown in blue, whereas those from ime4∆ (unmodified) for the same set of k-mers are shown in red. f Genomic tracks of previously reported m6A-modified RRACH sites in yeast, identified using Illumina sequencing. The m6A-modified nucleotide is highlighted with a green asterisk. In these positions, wild-type yeast strains show increased mismatch frequencies, as well as decreased coverage (reflecting increased deletion frequency), in all three biological replicates, whereas these features are not observed in any of the three ime4∆ replicates. g m6A modification scores predicted by the trained SVM at known m6A-modified (n = 363) and unknown (n = 60,794) RRACH sites, both for yeast wild-type and ime4∆ data sets. P-values have been computed using the Kruskal–Wallis test. A site was included in the analysis if there were mapped reads present in all six yeast samples. Sites with more than one “A” in the 5-mer were excluded from the analysis. h ROC curve depicting the performance of EpiNano in yeast data sets (n = 61,363 sites). Error bars indicate s.d. Source data are provided in the Source Data file.
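A ROC curve such as the one in panel h is obtained by sweeping a threshold over per-site prediction scores and recording the true-positive and false-positive rates at each step. The sketch below is a generic illustration, not EpiNano's code; the function names and the toy scores and labels are invented.

```python
def roc_points(scores, labels):
    """Return (false-positive rate, true-positive rate) points.

    `labels` holds 1 for a known modified site and 0 otherwise. The
    threshold is swept from the highest score to the lowest, and tied
    scores are processed together as a single step.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative site")
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        j = i
        while j < len(ranked) and ranked[j][0] == ranked[i][0]:
            if ranked[j][1]:
                tp += 1
            else:
                fp += 1
            j += 1
        points.append((fp / n_neg, tp / n_pos))
        i = j
    return points

def area_under_curve(points):
    """Trapezoidal area under a list of (x, y) points sorted by x."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

# Toy scores and labels (invented, not from the yeast data sets).
toy_scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
toy_labels = [1, 1, 0, 1, 0, 0]
toy_points = roc_points(toy_scores, toy_labels)
toy_auc = area_under_curve(toy_points)
```

For the toy input, eight of the nine positive-negative pairs are ranked correctly, so the area under the curve is 8/9.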

Source: PubMed
