Integrative analysis of 111 reference human epigenomes

Roadmap Epigenomics Consortium, Anshul Kundaje, Wouter Meuleman, Jason Ernst, Misha Bilenky, Angela Yen, Alireza Heravi-Moussavi, Pouya Kheradpour, Zhizhuo Zhang, Jianrong Wang, Michael J Ziller, Viren Amin, John W Whitaker, Matthew D Schultz, Lucas D Ward, Abhishek Sarkar, Gerald Quon, Richard S Sandstrom, Matthew L Eaton, Yi-Chieh Wu, Andreas R Pfenning, Xinchen Wang, Melina Claussnitzer, Yaping Liu, Cristian Coarfa, R Alan Harris, Noam Shoresh, Charles B Epstein, Elizabeta Gjoneska, Danny Leung, Wei Xie, R David Hawkins, Ryan Lister, Chibo Hong, Philippe Gascard, Andrew J Mungall, Richard Moore, Eric Chuah, Angela Tam, Theresa K Canfield, R Scott Hansen, Rajinder Kaul, Peter J Sabo, Mukul S Bansal, Annaick Carles, Jesse R Dixon, Kai-How Farh, Soheil Feizi, Rosa Karlic, Ah-Ram Kim, Ashwinikumar Kulkarni, Daofeng Li, Rebecca Lowdon, GiNell Elliott, Tim R Mercer, Shane J Neph, Vitor Onuchic, Paz Polak, Nisha Rajagopal, Pradipta Ray, Richard C Sallari, Kyle T Siebenthall, Nicholas A Sinnott-Armstrong, Michael Stevens, Robert E Thurman, Jie Wu, Bo Zhang, Xin Zhou, Arthur E Beaudet, Laurie A Boyer, Philip L De Jager, Peggy J Farnham, Susan J Fisher, David Haussler, Steven J M Jones, Wei Li, Marco A Marra, Michael T McManus, Shamil Sunyaev, James A Thomson, Thea D Tlsty, Li-Huei Tsai, Wei Wang, Robert A Waterland, Michael Q Zhang, Lisa H Chadwick, Bradley E Bernstein, Joseph F Costello, Joseph R Ecker, Martin Hirst, Alexander Meissner, Aleksandar Milosavljevic, Bing Ren, John A Stamatoyannopoulos, Ting Wang, Manolis Kellis, Anshul Kundaje, Wouter Meuleman, Jason Ernst, Misha Bilenky, Angela Yen, Alireza Heravi-Moussavi, Pouya Kheradpour, Zhizhuo Zhang, Jianrong Wang, Michael J Ziller, Viren Amin, John W Whitaker, Matthew D Schultz, Lucas D Ward, Abhishek Sarkar, Gerald Quon, Richard S Sandstrom, Matthew L Eaton, Yi-Chieh Wu, Andreas Pfenning, Xinchen Wang, Melina Claussnitzer, Yaping Liu, Cristian Coarfa, R Alan Harris, Noam Shoresh, Charles B Epstein, Elizabeta 
Gjoneska, Danny Leung, Wei Xie, R David Hawkins, Ryan Lister, Chibo Hong, Philippe Gascard, Andrew J Mungall, Richard Moore, Eric Chuah, Angela Tam, Theresa K Canfield, R Scott Hansen, Rajinder Kaul, Peter J Sabo, Mukul S Bansal, Annaick Carles, Jesse R Dixon, Kai-How Farh, Soheil Feizi, Rosa Karlic, Ah-Ram Kim, Ashwinikumar Kulkarni, Daofeng Li, Rebecca Lowdon, GiNell Elliott, Tim R Mercer, Shane J Neph, Vitor Onuchic, Paz Polak, Nisha Rajagopal, Pradipta Ray, Richard C Sallari, Kyle T Siebenthall, Nicholas A Sinnott-Armstrong, Michael Stevens, Robert E Thurman, Jie Wu, Bo Zhang, Xin Zhou, Nezar Abdennur, Mazhar Adli, Martin Akerman, Luis Barrera, Jessica Antosiewicz-Bourget, Tracy Ballinger, Michael J Barnes, Daniel Bates, Robert J A Bell, David A Bennett, Katherine Bianco, Christoph Bock, Patrick Boyle, Jan Brinchmann, Pedro Caballero-Campo, Raymond Camahort, Marlene J Carrasco-Alfonso, Timothy Charnecki, Huaming Chen, Zhao Chen, Jeffrey B Cheng, Stephanie Cho, Andy Chu, Wen-Yu Chung, Chad Cowan, Qixia Athena Deng, Vikram Deshpande, Morgan Diegel, Bo Ding, Timothy Durham, Lorigail Echipare, Lee Edsall, David Flowers, Olga Genbacev-Krtolica, Casey Gifford, Shawn Gillespie, Erika Giste, Ian A Glass, Andreas Gnirke, Matthew Gormley, Hongcang Gu, Junchen Gu, David A Hafler, Matthew J Hangauer, Manoj Hariharan, Meital Hatan, Eric Haugen, Yupeng He, Shelly Heimfeld, Sarah Herlofsen, Zhonggang Hou, Richard Humbert, Robbyn Issner, Andrew R Jackson, Haiyang Jia, Peng Jiang, Audra K Johnson, Theresa Kadlecek, Baljit Kamoh, Mirhan Kapidzic, Jim Kent, Audrey Kim, Markus Kleinewietfeld, Sarit Klugman, Jayanth Krishnan, Samantha Kuan, Tanya Kutyavin, Ah-Young Lee, Kristen Lee, Jian Li, Nan Li, Yan Li, Keith L Ligon, Shin Lin, Yiing Lin, Jie Liu, Yuxuan Liu, C John Luckey, Yussanne P Ma, Cecile Maire, Alexander Marson, John S Mattick, Michael Mayo, Michael McMaster, Hayden Metsky, Tarjei Mikkelsen, Diane Miller, Mohammad Miri, Eran Mukamel, Raman P Nagarajan, Fidencio Neri, 
Joseph Nery, Tung Nguyen, Henriette O'Geen, Sameer Paithankar, Thalia Papayannopoulou, Mattia Pelizzola, Patrick Plettner, Nicholas E Propson, Sriram Raghuraman, Brian J Raney, Anthony Raubitschek, Alex P Reynolds, Hunter Richards, Kevin Riehle, Paolo Rinaudo, Joshua F Robinson, Nicole B Rockweiler, Evan Rosen, Eric Rynes, Jacqueline Schein, Renee Sears, Terrence Sejnowski, Anthony Shafer, Li Shen, Robert Shoemaker, Mahvash Sigaroudinia, Igor Slukvin, Sandra Stehling-Sun, Ron Stewart, Sai Lakshmi Subramanian, Kran Suknuntha, Scott Swanson, Shulan Tian, Hannah Tilden, Linus Tsai, Mark Urich, Ian Vaughn, Jeff Vierstra, Shinny Vong, Ulrich Wagner, Hao Wang, Tao Wang, Yunfei Wang, Arthur Weiss, Holly Whitton, Andre Wildberg, Heather Witt, Kyoung-Jae Won, Mingchao Xie, Xiaoyun Xing, Iris Xu, Zhenyu Xuan, Zhen Ye, Chia-an Yen, Pengzhi Yu, Xian Zhang, Xiaolan Zhang, Jianxin Zhao, Yan Zhou, Jiang Zhu, Yun Zhu, Steven Ziegler, Arthur E Beaudet, Laurie A Boyer, Philip L De Jager, Peggy J Farnham, Susan J Fisher, David Haussler, Steven J M Jones, Wei Li, Marco A Marra, Michael T McManus, Shamil Sunyaev, James A Thomson, Thea D Tlsty, Li-Huei Tsai, Wei Wang, Robert A Waterland, Michael Q Zhang, Lisa H Chadwick, Bradley E Bernstein, Joseph F Costello, Joseph R Ecker, Martin Hirst, Alexander Meissner, Aleksandar Milosavljevic, Bing Ren, John A Stamatoyannopoulos, Ting Wang, Manolis Kellis

Abstract

The reference human genome sequence set the stage for studies of genetic variation and its association with human disease, but epigenomic studies lack a similar reference. To address this need, the NIH Roadmap Epigenomics Consortium generated the largest collection so far of human epigenomes for primary cells and tissues. Here we describe the integrative analysis of 111 reference human epigenomes generated as part of the programme, profiled for histone modification patterns, DNA accessibility, DNA methylation and RNA expression. We establish global maps of regulatory elements, define regulatory modules of coordinated activity, and their likely activators and repressors. We show that disease- and trait-associated genetic variants are enriched in tissue-specific epigenomic marks, revealing biologically relevant cell types for diverse human traits, and providing a resource for interpreting the molecular basis of human disease. Our results demonstrate the central role of epigenomic information for understanding gene regulation, cellular differentiation and human disease.

Figures

Extended Data 1
a-d. Tissues and Cell Types of Reference Epigenomes. Comprehensive listing of all 111 reference epigenomes generated by the consortium, along with epigenome identifiers (EIDs), including: (a) adult samples; (b) fetal samples; (c) ESC, iPSC, and ESC-derived cells; and (d) primary cultures. Colors indicate the groupings of tissues and cell types (as in Fig. 2b, and throughout the manuscript). For five samples (adult osteoblasts, and fetal liver, spleen, gonad, and spinal cord), no color is present, indicating that these are not part of the 111 reference epigenomes (ENCODE 2012 samples, or not all five marks in the core set were present), but datasets from these samples are high quality and were sometimes used in companion paper analyses, and are available to the public. e. Assay correlations. Heatmap of the pairwise experiment correlations for the core set of five histone modification marks (H3K4me1, H3K4me3, H3K36me3, H3K27me3, H3K9me3) across all 127 reference epigenomes, the two common acetylation marks (H3K27ac and H3K9ac), and DNA accessibility (DNase) across the reference epigenomes where they are available. Yellow indicates relatively higher correlation and blue lower correlation. Rows and columns were ordered computationally to maximize similarity of neighboring rows and columns (see Methods). All experiments for H3K9me3, H3K27me3, H3K36me3, DNase, and H3K4me1 are consistently ordered into distinct and contiguous groups. For H3K4me3, H3K9ac, and H3K27ac, experiments group primarily based on the mark, but in some cases, the correlations and ordering appear more cell type driven.
Extended Data 2. Chromatin state model robustness and enrichments
a. Chromatin state model robustness. Clustering of 15-state ‘core’ chromatin state model learned jointly across reference epigenomes (Fig. 4a) with chromatin state models learned independently in 111 reference epigenomes. We applied ChromHMM to learn a 15-state ChromHMM model using the five core marks in each of the 111 reference epigenomes generated by the Roadmap Epigenomics program, and clustered the resulting 1680 state emission probability vectors (leaves of the tree) with the 15 states from the joint model (indicated by arrows). We found that the vast majority of states learned across cell types clustered into 15 clusters, corresponding to the joint model states, validating the robustness of chromatin states across cell types. This analysis revealed two new clusters (red crosses) which are not represented in the 15 states of the jointly-learned model: ‘HetWk’, a cluster showing weak enrichment for H3K9me3; and ‘Rpts’, a cluster showing H3K9me3 along with a diversity of other marks, and enriched in specific types of repetitive elements (satellite repeats) in each cell type, which may be due to mapping artifacts. This joint clustering also revealed subtle variations in the relative intensity of H3K4me1 in states TxFlnk, Enh, and TssBiv, and H3K27me3 in state TssBiv. Overall, this analysis confirms that the 15-state chromatin state model based on the core set of five marks provides a robust framework for interpreting epigenomic complexity across tissues and cell types. b. Enrichments for 15-state model based on five histone modification marks. Top Left: TF binding site overlap enrichments of 15 states in H1-ESC from the ‘core’ model for transcription factor binding sites (TFBS) based on ChIP-seq data in H1-ESC. TF binding coverage for other cell-types based on matched TF ChIP-seq data is shown in Fig. S2. Top Right: Enrichments for expressed and non-expressed genes in H1-ESC and GM12878. 
Bottom: Positional enrichments at the transcription start site (TSS) and transcription end site (TES) of expressed (expr.) and repressed (repr.) genes in H1-ESC. Transition probabilities show frequency of co-occurrence of each pair of chromatin states in neighboring 200-bp bins. c. Definition and enrichments for 18-state ‘expanded’ model that also includes H3K27ac associated with active enhancer and active promoter regions, but which was only available for 98 of the 127 reference epigenomes. Inclusion of H3K27ac distinguishes active enhancers and active promoters. Top: TFBS enrichments in H1-ESC (E003) chromatin states using ENCODE TF ChIP-seq data in H1-ESC. Bottom: Positional enrichments in H1-ESC for genomic annotations, expressed and repressed genes, TSS and TES, and state transitions as in Extended Data 2b and Fig. 4a-c. Right: Average fold enrichment (colored bars) and standard deviation (black line) across 98 reference epigenomes (Fig. S3d) for non-coding evolutionarily conserved (GERP) elements in each chromatin state (rows) in the 18-state model. Even after excluding protein-coding exons (see Fig. S3b vs. Fig. S3d), the TSS-proximal states show the highest levels of conservation, followed by EnhBiv and the three non-transcribed enhancer states. In contrast, Tx and TxWk elements are weakly depleted for conserved regions, and Znf/Rpts and Het are strongly depleted for conserved elements.
Extended Data 3. Relationship between histone marks, DNA methylation, DNA accessibility, and gene expression
a. H3K27ac-marked ‘active’ enhancers show higher levels of DNA accessibility, based on enrichment of DNase-seq signal confidence scores (-log10(Poisson p-value)) for elements in each chromatin state in our extended 18-state model that includes the core five histone modification marks and H3K27ac, similar to Fig. 4e. b. Level of whole-genome bisulfite methylation for all chromatin states in the 18-state model shows that ‘active’ enhancers marked by H3K27ac in addition to H3K4me1 show lower methylation levels, consistent with higher regulatory activity. The whiskers in a. and b. show 1.5 x IQR (interquartile range) and the filled circles are individual outliers. c. DNA methylation levels for genes showing different expression levels. The depletion of DNA methylation in promoter regions, and the enrichment of DNA methylation in transcribed regions, are both more pronounced for highly expressed genes. The enrichment for high DNA methylation is more pronounced in the 3’ ends of the most highly expressed genes. d. Genes associated with active enhancer states have consistently and significantly higher expression. ‘Active enhancer’ associated genes have at least one EnhA1 and/or EnhA2 element within ±20 kb of the TSS (18-state model). ‘Weak-enhancer’ genes are associated with EnhG1, EnhG2, EnhWk, EnhBiv. Genes not associated with any enhancer have the lowest expression. Plots with red markers show median expression of genes associated with ‘active’ enhancers, yellow markers ‘weak’ enhancers, and white markers no association with any enhancer state. e. Higher-expression genes show greater association with H3K27ac-marked ‘active’ enhancers. Highly expressed genes are consistently more frequently associated with H3K27ac-marked active enhancers (EnhA1 and EnhA2) across all cell types. 
Fraction of genes associated with H3K27ac-marked ‘active’ enhancers (red), H3K27ac-lacking ‘weak’ enhancers only (yellow), or no enhancers (white) for genes of varying expression levels in each cell type with RNA-seq data.
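Panel a above reports DNase-seq signal as -log10 of a Poisson p-value. The following is a minimal sketch of that transformation, not the consortium's pipeline. It assumes the p-value is the upper tail probability P(X >= k) of seeing k or more reads when a background model expects lam reads; the function names and the example counts are hypothetical.

```python
import math

def poisson_upper_tail(k, lam):
    """Return P(X >= k) for X ~ Poisson(lam), with lam > 0.

    The tail is summed directly (rather than as 1 - CDF) so that very
    small p-values keep their precision.
    """
    if k <= 0:
        return 1.0
    # P(X = k), computed in log space to avoid overflow for large k
    term = math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))
    total = 0.0
    i = k
    while True:
        total += term
        i += 1
        term *= lam / i          # P(X = i) from P(X = i - 1)
        if i > lam and term <= total * 1e-16:
            break                # remaining terms are negligible
    return min(total, 1.0)

def signal_confidence(k, lam):
    """-log10 of the Poisson upper-tail p-value (assumed score definition)."""
    p = poisson_upper_tail(k, lam)
    return -math.log10(p) if p > 0 else float("inf")

# Hypothetical example: 20 observed reads against an expected background of 2
score = signal_confidence(20, 2.0)
```

Under this definition, larger scores mean the observed read count is less likely under the background model, so higher values indicate more confident accessibility signal.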
Extended Data 4. Methylation relationship with chromatin state
a-c. DNA methylation levels in 15-state model across technologies. We observed significant differences in average methylation levels that correlated with the DNA methylation platform used, but the relative relationships in average chromatin state methylation were conserved. Relative to WGBS (panel a, repeated from Fig. 4d for comparison purposes), RRBS (panel b) showed the lowest overall methylation levels (as expected given its CpG island enrichment), while mCRF showed the highest (panel c). This highlights the importance of recognizing and potentially correcting for DNA methylation platform-specific biases prior to performing integrative analyses. d,e. Distribution of DNA methylation levels measured using RRBS and mCRF in 18-state model (defined in Extended Data 2c). WGBS is shown in Extended Data 3b. The whiskers in a., b., c., d., and e. show 1.5 x IQR (interquartile range) and the filled circles are individual outliers. f. DNA methylation variation across cell types. Density plots denote distribution of DNA methylation levels from 0% to 100% for each chromatin state across the 95 reference epigenomes profiled for whole-genome bisulfite (WGBS, red), reduced representation bisulfite (RRBS, blue), or MeDIP/MRE (mCRF, green). The respective color (red, blue, or green) was set to the maximum ln(density+1) value for each chromatin state and respective platform, with intermediate values colored on a natural log scale. For each panel, epigenomes are listed in the same order, shown on the right, with abbreviations of samples in the order of Fig. 2 for each technology.
Extended Data 5. Chromatin state variability, switching, and genomic coverage
a. Variability level for 18-state model. Chromatin state variability (similar to Fig. 5a), quantified based on the fraction of the genomic coverage (y-axis) of each state (color) that is consistently labeled with that state in at most N (ranging from 1 to 98) reference epigenomes, using the 18-state model learned based on 6 chromatin marks, including H3K27ac. b. Chromatin state over- and under-representation for 18-state expanded model. c. Log-ratio (log10) of chromatin state switching probabilities for the 18-state expanded model across 34 high-quality, non-redundant epigenomes that have H3K27ac data, relative to intra-tissue switching probabilities across replicates or samples from multiple individuals. d. Chromatin state coverage grouped by epigenomic domains. Top: Chromosome ‘painting’ of 11 clusters shown in Fig. 5d and discovered based on chromatin state co-occurrence at the 2Mb scale across reference epigenomes. Bottom: Enrichment of CpG islands in each cluster, showing higher CpG density in ‘active’ clusters 3 and 6 compared with passive clusters 9-11. Each box plot shows the distribution of total CpG occupancy in 2Mb bins in each cluster. Box boundaries indicate the 25th and 75th percentiles, and the whiskers extend to the most extreme data points that are not considered outliers. Points are drawn as outliers if they are larger than Q3+W*(Q3-Q1) or smaller than Q1-W*(Q3-Q1), where Q1 and Q3 are the 25th and 75th percentiles, respectively.
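The box plot outlier rule stated in panel d can be sketched in a few lines. The caption does not give the value of W, so this sketch assumes W = 1.5, in line with the 1.5 x IQR whiskers used in other captions. The quartile interpolation method and the function name `boxplot_summary` are also illustrative choices, since plotting tools compute percentiles in slightly different ways.

```python
import statistics

def boxplot_summary(values, w=1.5):
    """Apply the Q1 - W*(Q3-Q1) / Q3 + W*(Q3-Q1) outlier rule.

    w=1.5 is an assumption; the caption leaves W unspecified.
    """
    # Quartiles by linear interpolation over the data (one of several conventions)
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - w * iqr
    upper_fence = q3 + w * iqr
    inliers = [v for v in values if lower_fence <= v <= upper_fence]
    outliers = [v for v in values if v < lower_fence or v > upper_fence]
    return {
        "q1": q1,
        "q3": q3,
        # Whiskers reach the most extreme points that are not outliers
        "whiskers": (min(inliers), max(inliers)),
        "outliers": outliers,
    }

# Hypothetical per-bin CpG occupancy values for one cluster
summary = boxplot_summary([1, 2, 3, 4, 5, 6, 7, 8, 9, 100])
```

In this toy input the value 100 falls above the upper fence and is drawn as an outlier, while the whiskers stop at 1 and 9.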
Extended Data 6. Hierarchical clustering of epigenomes using diverse marks
a-e. Clustering of all 127 reference epigenomes, including ENCODE samples, using H3K4me1, H3K4me3, H3K27me3, H3K36me3 and H3K9me3 signal in Enh, TssA, ReprPC, Tx and Het chromatin states, respectively. All panels show hierarchical clustering with optimal leaf ordering. Colors indicate sample groups, as defined in Fig. 2. Numbers on internal nodes represent bootstrap support scores over 1,000 bootstrap samples.
Extended Data 7
a-i. Multidimensional scaling (MDS) plots showing tissue/cell type similarity using different epigenomic marks. Multi-Dimensional Scaling (MDS) analysis results, showing reference epigenomes using their group coloring defined in Fig. 2. Thin lines connect same-group reference epigenomes. The first 4 axes of variation are shown in pairs. Marks are assessed in regions with relevant chromatin states (see Methods). j. Variance explained by each MDS dimension. The first 5 dimensions shown in Fig. S10 (Fig. 6b,c) explain between 45% and 80% of the total epigenome-to-epigenome variance for all histone modification mark correlations, and additional dimensions explain less than 10%. For H3K4me3 in TssA chromatin states, only a few components explain a much larger fraction of the variance than for other marks, possibly due to its stability across cell types.
Extended Data 8
a. Regulatory motifs enriched in clusters. Enrichment (red) or depletion (blue) of regulatory motifs (rows) in the enhancer modules (columns) relative to shuffled control motifs. For each motif is shown the motif name, consensus logo, and correlation between regulator expression and module activity: positive correlation (orange) is indicative of activators, and negative correlation (purple) indicates a repressive role for the factor. Only clusters with enrichment or depletion of at least 2^1.5-fold for one motif are shown. b. Average activity level of enhancers of each module in each reference epigenome (black=high, white=low). Bottom: Total size of each enhancer module showing enrichment (in kb).
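The display cutoff in panel a ("at least 2^1.5-fold" enrichment or depletion) is symmetric on a log2 scale. A minimal sketch of that filter follows, assuming enrichment is available as a plain fold value relative to shuffled control motifs. The data layout, module IDs, motif names and the function name `modules_to_show` are hypothetical.

```python
import math

LOG2_THRESHOLD = 1.5  # "at least 2^1.5-fold" (about 2.83-fold) from the caption

def modules_to_show(fold_by_module):
    """Keep modules where at least one motif passes the cutoff.

    fold_by_module: {module_id: {motif_name: fold enrichment vs. shuffled controls}}
    Fold values > 1 mean enrichment, values < 1 mean depletion.
    """
    shown = []
    for module_id, folds in fold_by_module.items():
        passes = any(
            abs(math.log2(fold)) >= LOG2_THRESHOLD
            for fold in folds.values()
            if fold > 0  # log2 is undefined for non-positive fold values
        )
        if passes:
            shown.append(module_id)
    return shown

# Hypothetical fold values for three modules
example = {
    "m1": {"motifA": 3.0, "motifB": 1.1},  # enriched: log2(3.0) is about 1.58
    "m2": {"motifA": 1.5},                 # below cutoff: log2(1.5) is about 0.58
    "m3": {"motifA": 0.3},                 # depleted: log2(0.3) is about -1.74
}
kept = modules_to_show(example)
```

Taking the absolute value of log2(fold) treats a 2.83-fold enrichment and a 2.83-fold depletion identically, which matches the caption's wording.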
Extended Data 9
a-c. Regulatory motif enrichment, DGF enrichment, and positional bias for predicted driver motifs, based on strong (positive or negative) correlations between TF expression and enhancer module activity. a. Regulatory motif enrichments for the 40 regulators showing the strongest absolute correlation between TF expression and module activity. Of these, 36 were also recovered solely based on their motif enrichment scores (Extended Data 8), but six were only discovered based on their correlations (Esrra_4, Max_4, Mga_3, Nfatc1_3, Rest_2, and Tead3_1), illustrating the importance of studying motif enrichments in the context of TF expression and enhancer activity patterns. b. Predicted driver regulatory motifs are enriched in high-resolution DNase footprints. Enrichment of predicted driver motif instances (Fig. 8 and Extended Data 9a) in 42 high-resolution (6bp-40bp) Digital Genomic Footprinting (DGF) libraries from deeply sequenced DNase datasets shows consistent tissue preferences in matching cell types, for example POU5F1 in iPS cells, HNF1B and HNF4A1 in digestive tissues, RFX4 in mesendoderm and neural lineages, and MEF2B in muscle. c. Matrix of significant positional bias across factors and cell types. For each Digital Genomic Footprinting (DGF) dataset (columns), positional bias score (heatmap) of predicted driver regulatory motifs (rows) found to be significantly enriched (Fig. 8, Extended Data 9a) in enhancer modules (Fig. 7a).
Extended Data 10. Positional biases of predicted driver motifs relative to high-resolution DNase footprint centers and boundaries
a. Driver TF motif instance logo, as in Fig. 8 and Extended Data 9a. b. Distribution of motif instances relative to the center of the high-resolution DNase sites (digital genome footprints, DGF, lengths range from 6bp to 40bp), each curve colored according to the cell/tissue type (from Fig. 2, Table S5b). c. Distribution of shuffled motifs that match composition and number of conserved occurrences in the genome. d. Positional bias relative to boundary of DGF region for true motifs, similar to b. e. Positional bias relative to boundary of DGF region for shuffled motifs, similar to c. f. Cell types showing significant positional bias after multiple testing correction, colored according to Fig. 2 and Table S5b.
Extended Data 11. Epigenomic enrichments of genetic variants associated with diverse traits
Tissue-specific enrichments for peaks of diverse epigenomic marks for genetic variants associated with complex disease, expanding Fig. 9. Enrichments are shown for: a. H3K4me1 peaks (enhancers). This panel includes all the data shown in Fig. 9, but expands the enrichments shown to all reference epigenomes (columns), and additional traits (rows) that did not meet the FDR=0.02 threshold. b. H3K27ac peaks (active enhancers). a-b. Studies were defined by a set of SNPs annotated in the GWAS catalog with the same combination of a trait (far left column) and publication shown by the Pubmed ID (far right column), uncorrected p-value (in -log10), and estimated FDR.
Extended Data 12. Epigenomic enrichments of genetic variants associated with diverse traits
Tissue-specific enrichments for peaks of diverse epigenomic marks for genetic variants associated with complex disease, expanding Fig. 9. Enrichments are shown for: a. H3K4me3 peaks (promoters). b. H3K9ac peaks (active promoters and active enhancers). c. DNase peaks (accessible regions). d. H3K36me3 peaks (transcribed regions). e. H3K27me3 peaks (Polycomb-repressed regions). f. H3K9me3 peaks (heterochromatin regions). a-f. Studies were defined by a set of SNPs annotated in the GWAS catalog with the same combination of a trait (far left column) and publication shown by the Pubmed ID (far right column), uncorrected p-value (in -log10), and estimated FDR.
Figure 1. Tissues and cell types profiled in the Roadmap Epigenomics Consortium
Primary tissues and cell types representative of all major lineages in the human body were profiled, including multiple brain, heart, muscle, GI-tract, adipose, skin, and reproductive samples, as well as immune lineages, ESCs and induced Pluripotent Stem (iPS) cells, and differentiated lineages derived from ESCs. Box colors match groups shown in Fig. 2b. Epigenome identifiers (EIDs, Fig. 2c) for each sample shown in Extended Data 1.
Figure 2. Datasets available for each reference epigenome
List of 127 epigenomes including 111 by the Roadmap Epigenomics program (E001-E113) and 16 by ENCODE (E114-E129). Full list of names and quality scores in Table S1. a-d: Tissue and cell types grouped by type of biological material (a), anatomical location (b), showing reference epigenome identifier (EID, c), and abbreviated name (d). PB=Peripheral Blood. ENCODE 2012 reference epigenomes shown separately. e-g. Normalized strand cross-correlation quality scores (NSC) for the core set of five histone marks (e), additional acetylation marks (f) and DNase-seq (g). h. Methylation data by WGBS (red), RRBS (blue), and mCRF (green). 104 methylation datasets available in 95 distinct reference epigenomes. i. Gene expression data using RNA-seq (Brown) and microarray expression (Yellow). j. 26 epigenomes contain a total of 184 additional histone modification marks. k. 60 highest-quality epigenomes (purple) were used for training the core chromatin state model, which was then applied to the full set of epigenomes (purple and orange).
Figure 3. Epigenomic information across tissues and marks
a. Chromatin state annotations across 127 reference epigenomes (rows, Fig. 2) in a ~3.5Mb region on chromosome 9. Promoters are primarily constitutive (red vertical lines), while enhancers are highly dynamic (dispersed yellow regions). b. Signal tracks for IMR90 showing RNA-seq, a total of 28 histone modification marks, whole-genome bisulfite DNA methylation, DNA accessibility, Digital Genomic Footprints (DGF), input DNA, and chromatin conformation information. c. Individual epigenomic marks across all epigenomes in which they are available. d. Relationship of figure panels highlights dataset dimensions.
Figure 4. Chromatin states and DNA methylation dynamics
a. Chromatin state definitions, abbreviations, and histone mark probabilities. b. Average genome coverage. Genomic annotation enrichments in H1-ESC. c. Active and inactive gene enrichments in H1-ESC (see Extended Data 2b for GM12878). d. DNA methylation. e. DNA accessibility. d-e. Whiskers show 1.5 * interquartile range. Circles are individual outliers. f. Average overlap fold enrichment for GERP evolutionarily conserved non-coding regions. Bars denote standard deviation. g. DNA methylation (WGBS) density (color, ln scale) across cell types. red=max ln(density+1). Left column indicates tissue groupings, full list shown in Extended Data 4f. h. DNA methylation levels (left) and TF enrichment (right) during ESC differentiation. i. Chromatin mark changes during cardiac muscle differentiation. Heatmap=average normalized mark signal in Enh. C5 cluster enrichment.
Figure 5. Cell type differences in chromatin states
a. Chromatin state variability, based on genome coverage fraction consistently labeled with each state. b. Relative chromatin state frequency for each reference epigenome. c. Chromatin state switching log10 relative frequency (inter-cell-type vs. inter-replicate). d. Clustering of 2Mb intervals (columns) based on relative chromatin state frequency (fold enrichment), averaged across reference epigenomes. LaminB1 occupancy profiled in ESCs. Red lines show cluster average.
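The quantity in panel c, a log10 ratio of switching frequencies between cell types versus between replicates, can be sketched as follows. This is a minimal illustration under stated assumptions: the function name, the nested-dictionary input layout, and the pseudocount are hypothetical, and the exact normalization used for the figure is not given in this legend.

```python
import math

def switching_log_ratio(between_counts, replicate_counts, pseudocount=1.0):
    """log10(inter-cell-type switch frequency / inter-replicate switch frequency).

    Inputs are hypothetical nested dicts: counts[state_a][state_b] is the
    number of genomic bins labeled state_a in one sample and state_b in the other.
    """
    states = sorted(between_counts)

    def freqs(counts):
        # Convert raw counts to frequencies; the pseudocount avoids log(0).
        total = sum(counts[a][b] + pseudocount for a in states for b in states)
        return {(a, b): (counts[a][b] + pseudocount) / total
                for a in states for b in states}

    fb, fr = freqs(between_counts), freqs(replicate_counts)
    return {pair: math.log10(fb[pair] / fr[pair]) for pair in fb}

# Toy counts (invented): enhancers switch to quiescent more often across
# cell types than across replicates.
between = {"Enh": {"Enh": 50, "Quies": 50}, "Quies": {"Enh": 50, "Quies": 850}}
replicate = {"Enh": {"Enh": 95, "Quies": 5}, "Quies": {"Enh": 5, "Quies": 895}}
ratios = switching_log_ratio(between, replicate, pseudocount=0.0)
```

Positive values mark state pairs that switch more often between cell types than expected from replicate noise; negative values mark pairs that are more stable.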
Figure 6. Epigenome relationships
a. Hierarchical epigenome clustering using H3K4me1 signal in Enh states. Numbers indicate bootstrap support scores over 1,000 samplings. b-c. Multidimensional scaling (MDS) plot of cell type relationships based on similarity in H3K4me1 signal in Enh states (b) and H3K27me3 signal in ReprPC states (c). First four dimensions shown as dim1 vs. dim2 and dim3 vs. dim4.
Figure 7. Regulatory modules from epigenome dynamics
a. Enhancer modules by activity-based clustering of 2.3 million DNase-accessible regions classified as Enh, EnhG or EnhBiv (color) across 111 reference epigenomes. Vertical lines separate 226 modules. Broadly-active enhancers shown first. Module IDs shown in Fig. S11c. b-c. Proximal gene enrichments for each module using gene ontology (GO) biological process (b) and human phenotypes (c). Rectangles pinpoint enrichments for selected modules. Representative gene set names (left) selected using bag-of-words enrichment.
Figure 8. Linking regulators to their target enhancers
Module-level regulatory motif enrichment (Fig. S11) and correlation between regulator expression and module activity patterns (Extended Data 8a) are used to link regulators (boxes) to their likely target tissue and cell types (circles). Edge weight represents motif enrichment in the reference epigenomes of highest module activity.
Figure 9. Epigenomic enrichments of genetic variants associated with diverse traits
Tissue-specific H3K4me1 peak enrichment for genetic variants associated with diverse traits. Circles denote the reference epigenome (column) of highest enrichment for SNPs reported by a given study (row), defined by trait and publication (PubMed identifier, PMID). Tissue (Abbrev) and p-value (-log10) of highest enrichment are shown. Only rows and columns containing a value meeting an FDR of 2% are shown (full matrix for all studies showing at least 2% FDR in Extended Data 11-12).
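The row and column filtering described in this legend can be sketched as below. This is only an illustration under loud assumptions: the legend does not say how the FDR was estimated, so the Benjamini-Hochberg procedure is swapped in here as a generic stand-in, and the function names and the list-of-lists p-value matrix are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted q-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so q-values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        qvalues[i] = running_min
    return qvalues

def filter_matrix(pmatrix, fdr=0.02):
    """Indices of rows (studies) and columns (epigenomes) that contain
    at least one cell passing the FDR threshold."""
    nrow, ncol = len(pmatrix), len(pmatrix[0])
    q = benjamini_hochberg([p for row in pmatrix for p in row])
    passing = [[q[r * ncol + c] <= fdr for c in range(ncol)] for r in range(nrow)]
    rows = [r for r in range(nrow) if any(passing[r])]
    cols = [c for c in range(ncol) if any(passing[r][c] for r in range(nrow))]
    return rows, cols

# Toy enrichment p-values (invented): 3 studies x 3 epigenomes.
toy = [[0.0001, 0.5, 0.8],
       [0.3, 0.6, 0.9],
       [0.7, 0.002, 0.4]]
kept_rows, kept_cols = filter_matrix(toy)
```

In the toy matrix only the first and third studies and the first two epigenomes retain a cell at the 2% threshold, so the second row and third column would be hidden.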

Source: PubMed
