Integrative analysis of public ChIP-seq experiments reveals a complex multi-cell regulatory landscape
Aurélien Griffon, Quentin Barbier, Jordi Dalino, Jacques van Helden, Salvatore Spicuglia, Benoit Ballester, Aurélien Griffon, Quentin Barbier, Jordi Dalino, Jacques van Helden, Salvatore Spicuglia, Benoit Ballester
Abstract
The large collections of ChIP-seq data rapidly accumulating in public data warehouses provide genome-wide binding site maps for hundreds of transcription factors (TFs). However, the extent of the regulatory occupancy space in the human genome has not yet been fully apprehended by integrating public ChIP-seq data sets and combining it with ENCODE TFs map. To enable genome-wide identification of regulatory elements we have collected, analysed and retained 395 available ChIP-seq data sets merged with ENCODE peaks covering a total of 237 TFs. This enhanced repertoire complements and refines current genome-wide occupancy maps by increasing the human genome regulatory search space by 14% compared to ENCODE alone, and also increases the complexity of the regulatory dictionary. As a direct application we used this unified binding repertoire to annotate variant enhancer loci (VELs) from H3K4me1 mark in two cancer cell lines (MCF-7, CRC) and observed enrichments of specific TFs involved in biological key functions to cancer development and proliferation. Those enrichments of TFs within VELs provide a direct annotation of non-coding regions detected in cancer genomes. Finally, full access to this catalogue is available online together with the TFs enrichment analysis tool (http://tagc.univ-mrs.fr/remap/).
© The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
Figures
References
- Hoffman M.M., Ernst J., Wilder S.P., Kundaje A., Harris R.S., Libbrecht M., Giardine B., Ellenbogen P.M., Bilmes J.A., Birney E., et al. Integrative annotation of chromatin elements from ENCODE data. Nucleic Acids Res. 2013;41:827–841.
- Negre N., Brown C.D., Ma L., Bristow C.A., Miller S.W., Wagner U., Kheradpour P., Eaton M.L., Loriaux P., Sealfon R., et al. A cis-regulatory map of the Drosophila genome. Nature. 2011;471:527–531.
- Shen Y., Yue F., McCleary D.F., Ye Z., Edsall L., Kuan S., Wagner U., Dixon J., Lee L., Lobanenkov V.V., et al. A map of the cis-regulatory sequences in the mouse genome. Nature. 2012:116–120. 488.
- ENCODE Project Consortium. An integrated encyclopedia of DNA elements in the human genome. Nature. 2012;489:57–74.
- Neph S., Vierstra J., Stergachis A.B., Reynolds A.P., Haugen E., Vernot B., Thurman R.E., John S., Sandstrom R., Johnson A.K., et al. An expansive human regulatory lexicon encoded in transcription factor footprints. Nature. 2012;489:83–90.
- Hnisz D., Abraham B.J., Lee T.I., Lau A., Saint-André V., Sigova A.A., Hoke H.A., Young R.A. Super-enhancers in the control of cell identity and disease. Cell. 2013;155:934–947.
- Chapuy B., McKeown M.R., Lin C.Y., Monti S., Roemer M.G.M., Qi J., Rahl P.B., Sun H.H., Yeda K.T., Doench J.G., et al. Discovery and characterization of super-enhancer-associated dependencies in diffuse large B cell lymphoma. Cancer Cell. 2013;24:777–790.
- Andersson R., Gebhard C., Miguel-Escalada I., Hoof I., Bornholdt J., Boyd M., Chen Y., Zhao X., Schmidl C., Suzuki T., et al. An atlas of active enhancers across human cell types and tissues. Nature. 2014;507:455–461.
- Liu T., Ortiz J.A., Taing L., Meyer C.A., Lee B., Zhang Y., Shin H., Wong S.S., Ma J., Lei Y., et al. Cistrome: an integrative platform for transcriptional regulation studies. Genome Biol. 2011;12:R83.
- Marinov G.K., Kundaje A., Park P.J., Wold B.J. Large-scale quality analysis of published ChIP-seq data. G3 (Bethesda) 2014;4:209–223.
- Langmead B., Salzberg S.L. Fast gapped-read alignment with Bowtie 2. Nat. Methods. 2012;9:357–359.
- Zhang Y., Liu T., Meyer C.A., Eeckhoute J., Johnson D.S., Bernstein B.E., Nusbaum C., Myers R.M., Brown M., Li W., et al. Model-based analysis of ChIP-Seq (MACS) Genome Biol. 2008;9:R137.
- Salmon-Divon M., Dvinge H., Tammoja K., Bertone P. PeakAnalyzer: genome-wide annotation of chromatin binding and modification loci. BMC Bioinformatics. 2010;11:415.
- Landt S.G., Marinov G.K., Kundaje A., Kheradpour P., Pauli F., Batzoglou S., Bernstein B.E., Bickel P., Brown J.B., Cayting P., et al. ChIP-seq guidelines and practices of the ENCODE and modENCODE consortia. Genome Res. 2012;22:1813–1831.
- Kharchenko P.V., Tolstorukov M.Y., Park P.J. Design and analysis of ChIP-seq experiments for DNA-binding proteins. Nat. Biotechnol. 2008;26:1351–1359.
- Quinlan A.R., Hall I.M. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics. 2010;26:841–842.
- Shin H., Liu T., Manrai A.K., Liu X.S. CEAS: cis-regulatory element annotation system. Bioinformatics. 2009;25:2605–2606.
- Thomas-Chollier M., Herrmann C., Defrance M., Sand O., Thieffry D., van Helden J. RSAT peak-motifs: motif analysis in full-size ChIP-seq datasets. Nucleic Acids Res. 2012;40:e31.
- Wang J., Zhuang J., Iyer S., Lin X.-Y., Greven M.C., Kim B.-H., Moore J., Pierce B.G., Dong X., Virgil D., et al. : a Wiki-based database for transcription factor-binding data generated by the ENCODE consortium. Nucleic Acids Res. 2013;41:D171–D176.
- Mathelier A., Zhao X., Zhang A.W., Parcy F., Worsley-Hunt R., Arenillas D.J., Buchman S., Chen C.-Y., Chou A., Ienasescu H., et al. JASPAR 2014: an extensively expanded and updated open-access database of transcription factor binding profiles. Nucleic Acids Res. 2014;42:D142–D147.
- Flicek P., Aken B.L., Ballester B., Beal K., Bragin E., Brent S., Chen Y., Clapham P., Coates G., Fairley S., et al. Ensembl's 10th year. Nucleic Acids Res. 2010;38:D557–D562.
- Cooper G.M., Stone E.A., Asimenos G., Green E.D., Batzoglou S., Sidow A., NISC Comparative Sequencing Program Distribution and intensity of constraint in mammalian genomic sequence. Genome Res. 2005;15:901–913.
- Chikina M.D., Troyanskaya O.G. An effective statistical evaluation of ChIPseq dataset similarity. Bioinformatics. 2012;28:607–613.
- Bastian M., Heymann S., Jacomy M. Gephi: an open source software for exploring and manipulating networks. Intl. AAAI Conf. Weblogs Social Media. 2009
- Blondel V.D., Guillaume J.-L., Lambiotte R., Lefebvre E. Fast unfolding of communities in large networks. J. Stat. Mech. 2008;2008:P10008.
- Akhtar-Zaidi B., Cowper-Sal Lari R., Corradin O., Saiakhova A., Bartels C.F., Balasubramanian D., Myeroff L., Lutterbaugh J., Jarrar A., Kalady M.F. Epigenomic enhancer profiling defines a signature of colon cancer. Science. 2012;336:736–739.
- Choe M.K., Hong C.-P., Park J., Seo S.H., Roh T.-Y. Functional elements demarcated by histone modifications in breast cancer cells. Biochem. Biophys. Res. Commun. 2012;418:475–482.
- Barrett T., Troup D.B., Wilhite S.E., Ledoux P., Rudnev D., Evangelista C., Kim I.F., Soboleva A., Tomashevsky M., Edgar R. NCBI GEO: mining tens of millions of expression profiles–database and tools update. Nucleic Acids Res. 2007;35:D760–D765.
- Parkinson H., Kapushesky M., Shojatalab M., Abeygunawardena N., Coulson R., Farne A., Holloway E., Kolesnykov N., Lilja P., Lukk M., et al. ArrayExpress–a public database of microarray experiments and gene expression profiles. Nucleic Acids Res. 2007;35:D747–D750.
- Cuddapah S., Jothi R., Schones D.E., Roh T.-Y., Cui K., Zhao K. Global analysis of the insulator binding protein CTCF in chromatin barrier regions reveals demarcation of active and repressive domains. Genome Res. 2009;19:24–32.
- Heintzman N.D., Hon G.C., Hawkins R.D., Kheradpour P., Stark A., Harp L.F., Ye Z., Lee L.K., Stuart R.K., Ching C.W., et al. Histone modifications at human enhancers reflect global cell-type-specific gene expression. Nature. 2009;459:108–112.
- Moorman C., Sun L.V., Wang J., de Wit E., Talhout W., Ward L.D., Greil F., Lu X.-J., White K.P., Bussemaker H.J., et al. Hotspots of transcription factor colocalization in the genome of Drosophila melanogaster. Proc. Natl. Acad. Sci. U.S.A. 2006;103:12027–12032.
- Foley J.W., Sidow A. Transcription-factor occupancy at HOT regions quantitatively predicts RNA polymerase recruitment in five human cell lines. BMC Genom. 2013;14:720.
- Lee B.-K., Bhinge A.A., Battenhouse A., Liu Z., McDaniell R.M., Song L., Ni Y., Birney E., Lieb J.D., Furey T.S. Cell-type specific and combinatorial usage of diverse transcription factors revealed by genome-wide binding studies in multiple human cells. Genome Res. 2011;22:9–24.
- Xie D., Boyle A.P., Wu L., Zhai J., Kawli T., Snyder M. Dynamic trans-acting factor colocalization in human cells. Cell. 2013;155:713–724.
- Vlieghe D., Sandelin A., Bleser P.J., Vleminckx K., Wasserman W.W., van Roy F., Lenhard B. A new generation of JASPAR, the open-access repository for transcription factor binding site profiles. Nucleic Acids Res. 2006;34:D95–D97.
- Schmidt D., Wilson M.D., Ballester B., Schwalie P.C., Brown G.D., Marshall A., Kutter C., Watt S., Martinez-Jimenez C.P., Mackay S., et al. Five-vertebrate ChIP-seq reveals the evolutionary dynamics of transcription factor binding. Science (New York, NY) 2010;328:1036–1040.
- Consortium E.P. A user's guide to the encyclopedia of DNA elements (ENCODE) PLoS Biol. 2011;9:e1001046.
- Kazemian M., Pham H., Wolfe S.A., Brodsky M.H., Sinha S. Widespread evidence of cooperative DNA binding by transcription factors in Drosophila development. Nucleic Acids Res. 2013;41:8237–8252.
- Cheng Q., Kazemian M., Pham H., Blatti C., Celniker S.E., Wolfe S.A., Brodsky M.H., Sinha S. Computational identification of diverse mechanisms underlying transcription factor-DNA occupancy. PLoS Genet. 2013;9:e1003571.
- Gerstein M.B., Kundaje A., Hariharan M., Landt S.G., Yan K.-K., Cheng C., Mu X.J., Khurana E., Rozowsky J., Alexander R., et al. Architecture of the human regulatory network derived from ENCODE data. Nature. 2012;489:91–100.
- Wang J., Zhuang J., Iyer S., Lin X., Whitfield T.W., Greven M.C., Pierce B.G., Dong X., Kundaje A., Cheng Y., et al. Sequence features and chromatin structure around the genomic regions bound by 119 human transcription factors. Genome Res. 2012;22:1798–1812.
- Vaquerizas J.M., Kummerfeld S.K., Teichmann S.A., Luscombe N.M. A census of human transcription factors: function, expression and evolution. Nat. Rev. Genet. 2009;10:252–263.
- Hawkins S.M., Loomans H.A., Wan Y.-W., Ghosh-Choudhury T., Coffey D., Xiao W., Liu Z., Sangi-Haghpeykar H., Anderson M.L. Expression and functional pathway analysis of nuclear receptor NR2F2 in ovarian cancer. J. Clin. Endocrinol. Metab. 2013;98:E1152–E1162.
- Su X., Chakravarti D., Cho M.S., Liu L., Gi Y.J., Lin Y.-L., Leung M.L., El-Naggar A., Creighton C.J., Suraokar M.B., et al. TAp63 suppresses metastasis through coordinate regulation of Dicer and miRNAs. Nature. 2010;467:986–990.
- Schaab C., Geiger T., Stoehr G., Cox J., Mann M. Analysis of high accuracy, quantitative proteomics data in the MaxQB database. Mol. Cell Proteom. 2012;11 M111.014068.
- Tang W., Dodge M., Gundapaneni D., Michnoff C., Roth M., Lum L. A genome-wide RNAi screen for Wnt/beta-catenin pathway components identifies unexpected roles for TCF transcription factors in cancer. Proc. Natl. Acad. Sci. U.S.A. 2008;105:9697–9702.
- Slattery M.L., Folsom A.R., Wolff R., Herrick J., Caan B.J., Potter J.D. Transcription factor 7-like 2 polymorphism and colon cancer. Cancer Epidemiol. Biomarkers Prev. 2008;17:978–982.
- Shaulian E., Karin M. AP-1 as a regulator of cell life and death. Nat. Cell Biol. 2002;4:E131–E136.
- Teng L., He B., Gao P., Gao L., Tan K. Discover context-specific combinatorial transcription factor interactions by integrating diverse ChIP-Seq data sets. Nucleic Acids Res. 2013;42 gkt1105–e24.
- Ernst J., Kellis M. Interplay between chromatin state, regulator binding, and regulatory motifs in six human cell types. Genome Res. 2013;23:1142–1154.
- Mendoza-Parra M.-A., Van Gool W., Mohamed Saleem M.A., Ceschin D.G., Gronemeyer H. A quality control system for profiles obtained by ChIP sequencing. Nucleic Acids Res. 2013;41:e196.
- Khurana E., Fu Y., Colonna V., Mu X.J., Kang H.M., Lappalainen T., Sboner A., Lochovsky L., Chen J., Harmanci A., et al. Integrative annotation of variants from 1092 humans: application to cancer genomics. Science (New York, NY) 2013;342:1235587.
- Ritchie G.R.S., Dunham I., Zeggini E., Flicek P. Functional annotation of noncoding sequence variants. Nat. Methods. 2014;11:294–296.
- Whyte W.A., Orlando D.A., Hnisz D., Abraham B.J., Lin C.Y., Kagey M.H., Rahl P.B., Lee T.I., Young R.A. Master transcription factors and mediator establish super-enhancers at key cell identity genes. Cell. 2013;153:307–319.
- Lovén J., Hoke H.A., Lin C.Y., Lau A., Orlando D.A., Vakoc C.R., Bradner J.E., Lee T.I., Young R.A. Selective inhibition of tumor oncogenes by disruption of super-enhancers. Cell. 2013;153:320–334.
- Zhao Z., Tavoosidana G., Sjölinder M., Göndör A., Mariano P., Wang S., Kanduri C., Lezcano M., Sandhu K.S., Singh U., et al. Circular chromosome conformation capture (4C) uncovers extensive networks of epigenetically regulated intra- and interchromosomal interactions. Nat. Genet. 2006;38:1341–1347.
- Jin F., Li Y., Dixon J.R., Selvaraj S., Ye Z., Lee A.Y., Yen C.-A., Schmitt A.D., Espinoza C.A., Ren B. A high-resolution map of the three-dimensional chromatin interactome in human cells. Nature. 2013;503:290–294.
- Li G., Ruan X., Auerbach R.K., Sandhu K.S., Zheng M., Wang P., Poh H.M., Goh Y., Lim J., Zhang J., et al. Extensive promoter-centered chromatin interactions provide a topological basis for transcription regulation. Cell. 2012;148:84–98.
- Zhang Y., Wong C.-H., Birnbaum R.Y., Li G., Favaro R., Ngan C.Y., Lim J., Tai E., Poh H.M., Wong E., et al. Chromatin connectivity maps reveal dynamic promoter-enhancer long-range associations. Nature. 2013;504:306–310.
Source: PubMed