Subtype and cell type specific expression of lncRNAs provide insight into breast cancer

Sunniva Stordal Bjørklund, Miriam Ragle Aure, Jari Häkkinen, Johan Vallon-Christersson, Surendra Kumar, Katrine Bull Evensen, Thomas Fleischer, Jörg Tost, OSBREAC, Kristine K Sahlberg, Anthony Mathelier, Gyan Bhanot, Shridar Ganesan, Xavier Tekpli, Vessela N Kristensen, Tone F Bathen, Elin Borgen, Anne-Lise Børresen-Dale, Olav Engebråten, Britt Fritzman, Olaf Johan Hartmann-Johnsen, Øystein Garred, Jürgen Geisler, Gry Aarum Geitvik, Solveig Hofvind, Rolf Kåresen, Anita Langerød, Ole Christian Lingjærde, Gunhild Mari Mælandsmo, Bjørn Naume, Hege G Russnes, Torill Sauer, Helle Kristine Skjerven, Ellen Schlichting, Therese Sørlie

Abstract

Long non-coding RNAs (lncRNAs) are involved in breast cancer pathogenesis through chromatin remodeling and through transcriptional and post-transcriptional gene regulation. We report robust associations between lncRNA expression and breast cancer clinicopathological features in two population-based cohorts: SCAN-B and TCGA. Using co-expression analysis of lncRNAs with protein-coding genes, we discovered three distinct clusters of lncRNAs. In silico cell type deconvolution coupled with single-cell RNA-seq analyses revealed that these three clusters were driven by cell type-specific expression of lncRNAs. In one cluster, lncRNAs were expressed by cancer cells and were mostly associated with the estrogen signaling pathways. In the two other clusters, lncRNAs were expressed either by immune cells or by fibroblasts of the tumor microenvironment. To further investigate the cis-regulatory regions driving lncRNA expression in breast cancer, we identified subtype-specific transcription factor (TF) occupancy at lncRNA promoters. We also integrated lncRNA expression with DNA methylation data to identify long-range regulatory regions for lncRNAs, which were validated using ChIA-PET Pol2 loops. lncRNAs play an important role in shaping the gene regulatory landscape in breast cancer. We provide a detailed description of subtype- and cell type-specific lncRNA expression, which improves the understanding of the underlying transcriptional regulation in breast cancer.

Conflict of interest statement

The authors declare no competing interests.

© 2022. The Author(s).

Figures

Fig. 1. lncRNA expression in breast cancer subtypes.
Hierarchical clustering of log2(TPM + 1) of 4108 lncRNAs expressed above filtering thresholds (see Methods) in the SCAN-B (a) and TCGA-BRCA (b) cohorts. Estrogen receptor (ER) and Her2 status, as well as PAM50 subtypes, are annotated at the top of the heatmap. The expression gradient (blue to red) represents scaled and centered log2(TPM + 1). c–e Dot plot of the log fold change (FC) from the differential expression analysis using a fitted limma model (lmFit) and moderated t-statistic (eBayes) between patients of different subtypes in SCAN-B (x-axis) and TCGA-BRCA (y-axis). Each dot represents a lncRNA, while the colour indicates the subtype with the highest expression: c ER positive (blue) and ER negative (red), d Her2 negative (dark blue) and Her2 positive (pink), e Luminal A (dark blue) and Luminal B (light blue). Gray dots are lncRNAs that are not significantly differentially expressed, while black dots represent lncRNAs with opposite FC in the two cohorts. The numbers of patients in each clinical group were as follows: ER positive (n = 2409 and n = 807), ER negative (n = 504 and n = 237), Her2 positive (n = 458 and n = 114), Her2 negative (n = 2845 and n = 650), Luminal A (n = 1769 and n = 562), and Luminal B (n = 766 and n = 209) in SCAN-B and TCGA-BRCA, respectively.
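
The heatmap values in Fig. 1 are described as scaled and centered log2(TPM + 1). As a minimal sketch of that transformation, assuming that scaling means dividing by the per-lncRNA sample standard deviation (the caption does not state this), the Python below uses hypothetical TPM values and helper names that are not taken from the paper:

```python
import math
import statistics


def log2_tpm(tpm_values):
    """Transform raw TPM values to log2(TPM + 1)."""
    return [math.log2(v + 1.0) for v in tpm_values]


def scale_and_center(values):
    """Center values to mean 0 and scale to unit sample standard deviation.

    Assumption: a plain z-score per lncRNA across samples.
    """
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    if sd == 0:
        # A constant profile carries no contrast; return zeros.
        return [0.0 for _ in values]
    return [(v - mean) / sd for v in values]


# Hypothetical TPM values for one lncRNA across five tumours.
example_tpm = [0.0, 1.0, 3.0, 7.0, 15.0]
logged = log2_tpm(example_tpm)
scaled = scale_and_center(logged)
print(logged)
print([round(v, 3) for v in scaled])
```

Applied row by row to an expression matrix, this gives every lncRNA the same colour scale regardless of its absolute expression level.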
Fig. 2. Clustering of lncRNA into relevant pathways for breast cancer.
a Hierarchical clustering of lncRNA-mRNA Spearman correlation values (positive correlation in red, negative correlation in blue) following co-expression analysis between lncRNAs (n = 4108) and protein-coding mRNAs (n = 17060). Only lncRNAs and mRNAs with significant correlation (Bonferroni p-value < 0.05) and Spearman’s rho < −0.4 or > 0.4 in the TCGA (n = 1095) and SCAN-B (n = 3455) cohorts are used in the unsupervised clustering. In addition, we plot only lncRNAs and mRNAs with a number of associations higher than the mean number of associations (Supplementary Fig. 4). Clusters are defined using cutree_rows = 3 and cutree_cols = 3. lncRNAs (x-axis) are annotated according to the differential expression analysis (Fig. 1). b–d Bar plot showing −log(FDR q-value) from a hypergeometric test (y-axis) of gene set enrichment analysis using Hallmark pathways of the MSigDB database. Input genes for GSEA are genes from mRNA-cluster A (n = 2890) (b), mRNA-cluster B (n = 1480) (c), and mRNA-cluster C (n = 667) (d). e–g Boxplot of the coefficients from the generalized linear modeling of lncRNA expression in the SCAN-B cohort using three variables in the same model: ESR1 mRNA (to reflect estrogen signaling, e), fibroblast score (to infer fibroblast tumor content, f), and lymphocyte score (to infer lymphocyte infiltration, g). Each dot represents the coefficient for a variable and each lncRNA in cluster 1 (n = 610), cluster 2 (n = 199), and cluster 3 (n = 110). Kruskal-Wallis test p-values are shown. The line within each box represents the median. Upper and lower edges of each box represent the 75th and 25th percentile, respectively. The whiskers represent the lowest datum still within [1.5 × (75th − 25th percentile)] of the lower quartile, and the highest datum still within [1.5 × (75th − 25th percentile)] of the upper quartile.
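
The pathway bars in Fig. 2 report a hypergeometric test of overlap between an mRNA cluster and a Hallmark gene set. The Python below is a sketch of a one-sided over-representation test under that reading. The function name and all counts are hypothetical, and the universe size used by the authors is not given in the caption.

```python
from math import comb


def hypergeom_enrichment_p(universe, set_size, drawn, overlap):
    """Return P(X >= overlap) for a hypergeometric draw.

    universe: number of genes in the background
    set_size: number of background genes in the pathway
    drawn:    number of genes in the cluster being tested
    overlap:  number of cluster genes that are in the pathway
    """
    upper = min(set_size, drawn)
    total = comb(universe, drawn)
    tail = sum(
        comb(set_size, k) * comb(universe - set_size, drawn - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical toy numbers: 20 background genes, 5 in the pathway,
# 4 genes in the cluster, 2 of which fall in the pathway.
p_value = hypergeom_enrichment_p(universe=20, set_size=5, drawn=4, overlap=2)
print(round(p_value, 4))
```

Exact integer binomials keep the tail probability free of rounding error until the final division, which is convenient for small illustrative inputs; for genome-scale universes a statistics library would normally be used instead.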
Fig. 3. lncRNA expression in single cell RNA-seq data.
a UMAP of 94357 single cells from breast tumours colour-coded according to cell types. b–d Dot plot of lncRNAs (found in the scRNA-seq data set) with the highest glm coefficient associated with the characteristics of each cluster, i.e., ESR1 mRNA (cluster 1), fibroblast score (cluster 2), and lymphocyte score (cluster 3). The size of the dot represents the percentage of cells expressing the lncRNA, while the colour of the dot reflects the average expression in each of the UMAP cell type clusters identified. Cluster 1 lncRNAs (b), cluster 2 lncRNAs (c), and cluster 3 lncRNAs (d). e–g Expression of one high-ranking lncRNA from each lncRNA cluster plotted on the scRNA-seq UMAP. Cluster 1 lncRNA: GATA3-AS1 (e), cluster 2 lncRNA: NR2F1-AS1 (f), and cluster 3 lncRNA: LINC00861 (g). Colour gradient (purple) represents log-normalized counts using scale.factor = 10000.
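
The colour scale in Fig. 3 e–g is given as log-normalized counts with scale.factor = 10000. A minimal sketch of that normalization follows, assuming the common single-cell convention of dividing each count by the cell's total, multiplying by the scale factor, and taking log(1 + x). The caption does not name the software, and the counts and the function name below are hypothetical.

```python
import math


def log_normalize(cell_counts, scale_factor=10000):
    """Log-normalize raw counts for a single cell.

    cell_counts: mapping of gene name -> raw count in this cell
    Returns a mapping of gene name -> log1p(count / total * scale_factor).
    """
    total = sum(cell_counts.values())
    if total == 0:
        # An empty cell has no signal to normalize.
        return {gene: 0.0 for gene in cell_counts}
    return {
        gene: math.log1p(count / total * scale_factor)
        for gene, count in cell_counts.items()
    }


# Hypothetical raw counts for one cell (gene names used only as labels).
raw = {"GATA3-AS1": 2, "ESR1": 8, "ACTB": 190, "NR2F1-AS1": 0}
normalized = log_normalize(raw)
print({gene: round(value, 3) for gene, value in normalized.items()})
```

Dividing by the per-cell total makes cells with different sequencing depth comparable, and genes with zero counts stay at exactly zero after the transformation.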
Fig. 4. Functional annotation of lncRNA promoters.
a Schematic overview of the definition of lncRNA promoters not overlapping with a protein-coding gene locus. bp: base pair; PC: protein-coding; TSS: transcription start site. b, c Average normalized counts for ATAC-seq peaks mapped to lncRNA promoters in estrogen receptor (ER) positive (+) (blue dots) (n = 58) and ER negative (−) (red dots) (n = 12) breast tumor samples from the TCGA-BRCA cohort. Wilcoxon test p-values are denoted. The line within each box represents the median. Upper and lower edges of each box represent the 75th and 25th percentile, respectively. The whiskers represent the lowest datum still within [1.5 × (75th − 25th percentile)] of the lower quartile, and the highest datum still within [1.5 × (75th − 25th percentile)] of the upper quartile. b Promoters of independent lncRNAs overexpressed in ER positive cases and c promoters of independent lncRNAs overexpressed in ER negative cases. d, e Enrichment of independent lncRNA promoters across ChromHMM genome segmentation from breast cancer cell lines. Enrichment is calculated as the ratio of the frequency of the tested lncRNA promoters found within a specific segment type to the frequency of all lncRNA promoters within the same segment type. The length of the bars (x-axis) shows the log-transformed BH-corrected p-value from the hypergeometric test. d Promoters of independent lncRNAs overexpressed in ER positive cases and e promoters of independent lncRNAs overexpressed in ER negative cases. Active Enhancer = EhAct, Active Promoter = PrAct, Repeat Zinc Finger = RpZNF, Flanking Promoter region = PrFlk. f, g Swarm plots showing enrichment of TF binding sites (−log10(p-value) from Fisher’s exact tests) on the y-axis for specific sets of promoters according to UniBind. TF names of the top 10 enriched TF binding site data sets are annotated by colours. f Promoters of independent lncRNAs overexpressed in ER positive cases and g promoters of independent lncRNAs overexpressed in ER negative cases.
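
The bar lengths in Fig. 4 d, e are log-transformed BH-corrected p-values. As an illustration only, the sketch below applies the standard Benjamini-Hochberg step-up adjustment to a list of hypothetical p-values and converts them with −log10. The base of the logarithm is an assumption, since the caption says only "log transformed".

```python
import math


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values, one per chromatin segment type.
raw_p = [0.01, 0.04, 0.03, 0.005]
adj_p = benjamini_hochberg(raw_p)
bar_lengths = [-math.log10(p) for p in adj_p]
print([round(p, 4) for p in adj_p])
print([round(b, 3) for b in bar_lengths])
```

Each adjusted value is the smallest false discovery rate at which that test would still be called significant, so longer bars correspond to stronger enrichment after multiple-testing correction.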
Fig. 5. Distal regulatory element in the LINC01488 locus.
a Enrichment of CpGs with DNA methylation significantly inversely correlated with lncRNA expression across ChromHMM genome segmentation from breast cancer cell lines. Enrichment is calculated by comparing the genomic location of the inversely correlated CpGs to all the CpGs on the Illumina 450k array as background. Active Enhancer = EhAct, Enhancer Genic = EhGen, Transcription flanking = TxFlk. b, c Average normalized counts for ATAC-seq peaks mapped to CpG locations for which DNA methylation is significantly inversely correlated with lncRNAs with higher expression in ER positive cases (b) and higher expression in ER negative cases (c). ATAC-seq data from ER+ (blue dots) (n = 58) and ER− (red dots) (n = 12) breast tumor samples from the TCGA-BRCA cohort. Wilcoxon test p-values are denoted. The line within each box represents the median. Upper and lower edges of each box represent the 75th and 25th percentile, respectively. The whiskers represent the lowest datum still within [1.5 × (75th − 25th percentile)] of the lower quartile, and the highest datum still within [1.5 × (75th − 25th percentile)] of the upper quartile. d Swarm plot showing enrichment of TF binding sites (−log10(p-value) from Fisher’s exact tests) on the y-axis for CpGs with DNA methylation inversely correlated with lncRNA expression. Names of the top 10 enriched TF binding site data sets are annotated by colours. e Graphical illustration of the LINC01488 locus annotated for different epigenomic tracks. CpGs measured by the Illumina 450k array are shown together with the significant negative correlations between levels of DNA methylation and LINC01488 expression in the OSLO2 and TCGA cohorts (blue arcs, negative expression-methylation correlation). ChromHMM enhancer regions (active and genic) in the MCF7 cell line (green) with ChIA-PET Pol II loop connecting the TSS of LINC01488 to the CpG in the enhancer region (pink arcs). TF binding of ESR1 (dark blue), FOXA1 (blue), and GATA3 (light blue) from ChIP-seq experiments (ReMap). f, g Correlation plot of levels of LINC01488 expression (x-axis) and levels of DNA methylation of the CpG (y-axis) in the long-range interaction in e. Rho and p-value from Spearman correlation are indicated. f OSLO2 (ER positive, n = 214; ER negative, n = 52), g TCGA (ER positive, n = 807; ER negative, n = 237). h Graphical illustration of the LINC01488 locus annotated with ChromHMM enhancer regions (active and genic) in the MCF7 cell line (green) and ChIA-PET Pol II loop connecting LINC01488 to CCND1 (pink arcs). i Correlation plot of log2(TPM + 1) LINC01488 expression (x-axis) and log2(TPM + 1) CCND1 expression (y-axis) in ER positive (n = 2409) and ER negative (n = 504) patients in the SCAN-B cohort. Rho and p-value from Spearman correlation are indicated.
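
The correlation panels in Fig. 5 report Spearman's rho between expression and DNA methylation. The following sketch computes rho as the Pearson correlation of average ranks, which handles ties. The values are hypothetical rather than taken from OSLO2, TCGA, or SCAN-B, and the p-value is omitted because it would require a distribution function beyond the standard library.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current group of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho as the Pearson correlation of the rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    mx = sum(rx) / len(rx)
    my = sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rho is undefined for a constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical expression (log2(TPM + 1)) and methylation (beta) values.
expression = [1.0, 2.5, 3.1, 4.8, 6.0]
methylation = [0.9, 0.7, 0.75, 0.3, 0.1]
rho = spearman_rho(expression, methylation)
print(round(rho, 3))
```

A negative rho, as in this toy example, is the pattern the figure describes for an enhancer CpG whose methylation falls as lncRNA expression rises.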

Source: PubMed