Confronting false discoveries in single-cell differential expression
Jordan W Squair, Matthieu Gautier, Claudia Kathe, Mark A Anderson, Nicholas D James, Thomas H Hutson, Rémi Hudelle, Taha Qaiser, Kaya J E Matson, Quentin Barraud, Ariel J Levine, Gioele La Manno, Michael A Skinnider, Grégoire Courtine
Abstract
Differential expression analysis in single-cell transcriptomics enables the dissection of cell-type-specific responses to perturbations such as disease, trauma, or experimental manipulations. While many statistical methods are available to identify differentially expressed genes, the principles that distinguish these methods and their performance remain unclear. Here, we show that the relative performance of these methods is contingent on their ability to account for variation between biological replicates. Methods that ignore this inevitable variation are biased and prone to false discoveries. Indeed, the most widely used methods can discover hundreds of differentially expressed genes in the absence of biological differences. To exemplify these principles, we exposed true and false discoveries of differentially expressed genes in the injured mouse spinal cord.
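The claim that ignoring variation between biological replicates leads to false discoveries can be illustrated with a toy simulation. The sketch below is not the authors' benchmark or any method named in the paper. It assumes Gaussian expression values, four replicates per condition, and arbitrary variance parameters, and every function name is hypothetical. Each simulated gene has no true difference between conditions. One test treats every cell as an independent observation, and the other first averages cells within each replicate and then runs an exact permutation test on the replicate means.

```python
import itertools
import random
import statistics
from statistics import NormalDist


def simulate_gene(rng, n_reps=4, n_cells=50, rep_sd=0.5, cell_sd=1.0):
    """Simulate one gene in two conditions with NO true condition effect.

    Each replicate gets its own random offset (assumed between-replicate
    variation); cells scatter around that offset.
    """
    groups = []
    for _ in range(2):
        reps = []
        for _ in range(n_reps):
            offset = rng.gauss(0.0, rep_sd)
            reps.append([offset + rng.gauss(0.0, cell_sd) for _ in range(n_cells)])
        groups.append(reps)
    return groups


def cell_level_p(group_a, group_b):
    """Two-sided z-test that pools all cells and ignores replicate structure."""
    a = [x for rep in group_a for x in rep]
    b = [x for rep in group_b for x in rep]
    se = (statistics.variance(a) / len(a) + statistics.variance(b) / len(b)) ** 0.5
    z = (statistics.fmean(a) - statistics.fmean(b)) / se
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def replicate_level_p(group_a, group_b):
    """Exact two-sided permutation test on per-replicate mean expression."""
    means = [statistics.fmean(rep) for rep in group_a + group_b]
    n_a, n_b = len(group_a), len(group_b)
    total = sum(means)

    def abs_diff(indices):
        s = sum(means[i] for i in indices)
        return abs(s / n_a - (total - s) / n_b)

    observed = abs_diff(range(n_a))
    diffs = [abs_diff(idx) for idx in itertools.combinations(range(n_a + n_b), n_a)]
    # Small tolerance so relabelings that tie with the observed split are counted.
    return sum(d >= observed - 1e-12 for d in diffs) / len(diffs)


def false_positive_rates(n_genes=500, alpha=0.05, seed=0):
    """Fraction of null genes called significant by each approach."""
    rng = random.Random(seed)
    cell_hits = 0
    rep_hits = 0
    for _ in range(n_genes):
        group_a, group_b = simulate_gene(rng)
        cell_hits += cell_level_p(group_a, group_b) < alpha
        rep_hits += replicate_level_p(group_a, group_b) < alpha
    return cell_hits / n_genes, rep_hits / n_genes


if __name__ == "__main__":
    cell_fpr, rep_fpr = false_positive_rates()
    print(f"cell-level false positive rate:      {cell_fpr:.3f}")
    print(f"replicate-level false positive rate: {rep_fpr:.3f}")
```

Under these assumed parameters, the cell-level test should call a large share of null genes significant, while the replicate-level test should stay at or below the nominal 5% level. The permutation test was chosen only because it needs no external libraries; it is a stand-in for, not a reproduction of, the replicate-aware methods the abstract refers to.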
Conflict of interest statement
G.C. is a founder and shareholder of Onward Medical, a company with no direct relationships with the present work. The remaining authors declare no competing interests.
© 2021. The Author(s).
Source: PubMed