Main

We used chromatin immunoprecipitation followed by DNA sequencing (ChIP-seq) or microarray hybridization (ChIP-chip) to generate profiles of core histones, histone variants, histone modifications and chromatin-associated proteins (Fig. 1, Supplementary Fig. 1 and Supplementary Tables 1 and 2). Additional data include DNase I hypersensitivity sites in fly and human cells, and nucleosome occupancy maps in all three organisms. Compared to our initial publications1,2,3, this represents a tripling of available fly and worm data sets and a substantial increase in human data sets (Fig. 1b, c). Uniform quality standards for experimental protocols, antibody validation and data processing were used throughout the projects6. Detailed analyses of related transcriptome and transcription factor data are presented in accompanying papers7,8.

Figure 1: Overview of the data set.

a, Histone modifications, chromosomal proteins and other profiles mapped in at least two species (see Supplementary Fig. 1 for the full data set and Supplementary Table 1 for detailed descriptions). Different protein names for orthologues are separated by ‘/’ (see Supplementary Table 2). b, The number of all data sets generated by this and previous consortia publications1,2,3 (new, 815; old, 638). Each data set corresponds to a replicate-merged normalized profile of a histone, histone variant, histone modification, non-histone chromosomal protein, nucleosome or salt-fractionated nucleosome. c, The number of unique histone marks or non-histone chromosomal proteins profiled.

We performed systematic cross-species comparisons of chromatin composition and organization, focusing on targets profiled in at least two organisms (Fig. 1). The sample types used most extensively in our analyses are human cell lines H1-hESC, GM12878 and K562; fly late embryos, third instar larvae and cell lines S2, Kc and BG3; and worm early embryos and stage 3 larvae. Our conclusions are summarized in Extended Data Table 1.

Not surprisingly, the three species show many common chromatin features. Most of the genome in each species is marked by at least one histone modification (Supplementary Fig. 2), and modification patterns are similar around promoters, gene bodies, enhancers and other chromosomal elements (Supplementary Figs 3–12). Nucleosome occupancy patterns around protein-coding genes and enhancers are also largely similar across species, although we observed subtle differences in H3K4me3 enrichment patterns around transcription start sites (TSSs) (Extended Data Fig. 1a and Supplementary Figs 12–14). The configuration and composition of large-scale features such as lamina-associated domains (LADs) are similar (Supplementary Figs 15–17). LADs in human and fly are associated with late replication and H3K27me3 enrichment, suggesting a repressive chromatin environment (Supplementary Fig. 18). Finally, DNA structural features associated with nucleosome positioning are strongly conserved (Supplementary Figs 19 and 20).

Although patterns of histone modifications across active and silent genes are largely similar in all three species, there are some notable differences (Extended Data Fig. 1b). For example, H3K23ac is enriched at promoters of expressed genes in worm, but is enriched across gene bodies of both expressed and silent genes in fly. H4K20me1 is enriched on both expressed and silent genes in human but only on expressed genes in fly and worm (Extended Data Fig. 1b). Enrichment of H3K36me3 in genes expressed with stage or tissue specificity is lower than in genes expressed broadly, possibly because profiling was carried out on mixed tissues (Supplementary Figs 21–23; see Supplementary Methods). Although the co-occurrence of pairs of histone modifications is largely similar across the three species, there are clearly some species-specific patterns (Extended Data Fig. 1c and Supplementary Figs 24 and 25).

Previous studies showed that in human9,10 and fly1,11 prevalent combinations of marks or ‘chromatin states’ correlate with functional features such as promoters, enhancers, transcribed regions, Polycomb-associated domains, and heterochromatin. ‘Chromatin state maps’ provide a concise and systematic annotation of the genome. To compare chromatin states across the three organisms, we developed and applied a novel hierarchical non-parametric machine-learning method called hiHMM (see Supplementary Methods) to generate chromatin state maps from eight histone marks mapped in common, and compared the results with published methods (Fig. 2 and Supplementary Figs 26–28). We find that combinatorial patterns of histone modifications are largely conserved. Based on correlations with functional elements (Supplementary Figs 29–32), we categorized the 16 states into six groups: promoter (state 1), enhancer (states 2 and 3), gene body (states 4–9), Polycomb-repressed (states 10 and 11), heterochromatin (states 12 and 13), and weak or low signal (states 14–16).
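
The grouping of the 16 states into six categories can be written down as a simple lookup table, which is convenient when summarizing state calls by category. The sketch below is illustrative only: the state-to-group assignments come from the text above, whereas the function name `group_coverage` and its input (a sequence of per-bin state numbers) are assumptions introduced here and are not part of hiHMM.

```python
from collections import Counter

# State-to-group assignments as listed in the text (states 1-16).
STATE_GROUPS = {
    "promoter": [1],
    "enhancer": [2, 3],
    "gene body": [4, 5, 6, 7, 8, 9],
    "Polycomb-repressed": [10, 11],
    "heterochromatin": [12, 13],
    "weak or low signal": [14, 15, 16],
}

# Reverse lookup: state number -> group name.
STATE_TO_GROUP = {
    state: group for group, states in STATE_GROUPS.items() for state in states
}


def group_coverage(state_calls):
    """Return the fraction of bins falling in each of the six groups.

    `state_calls` is a hypothetical input: one state number (1-16) per
    genomic bin, in any order.
    """
    if len(state_calls) == 0:
        raise ValueError("state_calls must contain at least one bin")
    counts = Counter(STATE_TO_GROUP[s] for s in state_calls)
    total = len(state_calls)
    return {group: counts.get(group, 0) / total for group in STATE_GROUPS}
```

For example, `group_coverage([1, 2, 3, 10])` assigns one quarter of the bins to the promoter group, one half to the enhancer group and one quarter to the Polycomb-repressed group.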

Figure 2: Shared and organism-specific chromatin states.

Sixteen chromatin states derived by joint segmentation using hiHMM (see Supplementary Methods) based on enrichment patterns of eight histone marks. The genomic coverage of each state in each cell type or developmental stage is also shown (see Supplementary Figs 26–32 for detailed analysis of the states). States are named for putative functional characteristics.

Heterochromatin is a classically defined and distinct chromosomal domain with important roles in genome organization, genome stability, chromosome inheritance and gene regulation. It is typically enriched for H3K9me3 (ref. 12), which we used as a proxy for identifying heterochromatic domains (Fig. 3a and Supplementary Figs 33 and 34). As expected, the majority of the H3K9me3-enriched domains in human and fly are concentrated in the pericentromeric regions (as well as other specific domains, such as the Y chromosome and fly 4th chromosome), whereas in worm they are distributed throughout the distal chromosomal ‘arms’11,13,14 (Fig. 3a). In all three organisms, we find that more of the genome is associated with H3K9me3 in differentiated cells and tissues compared to embryonic cells and tissues (Extended Data Fig. 2a). We also observe large cell-type-specific blocks of H3K9me3 in human and fly11,14,15 (Supplementary Fig. 35). These results suggest a molecular basis for the classical concept of ‘facultative heterochromatin’ formation to silence blocks of genes as cells specialize.

Figure 3: Genome-wide organization of heterochromatin.

a, Enrichment profiles of H3K9me1, H3K9me2, H3K9me3 and H3K27me3, and identification of heterochromatin domains based on H3K9me3 (illustrated for human H1-hESC, fly L3 and worm L3). For fly chr 2, 2L, 2LHet, 2RHet and 2R are concatenated (dashed lines). C, centromere; Het, heterochromatin. b, Genome-wide correlation among H3K9me1, H3K9me2, H3K9me3, H3K27me3 and H3K36me3 in human K562 cells, fly L3 and worm L3; no H3K9me2 profile is available for human. c, Comparison of Hi-C-based and chromatin-based topological domains in fly LE. Heat maps of similarity matrices for histone modification and Hi-C interaction frequencies are juxtaposed (see Supplementary Fig. 40).

Two distinct types of transcriptionally repressed chromatin have been described. As discussed above, classical ‘heterochromatin’ is generally concentrated in specific chromosomal regions and enriched for H3K9me3 and also H3K9me2 (ref. 12). In contrast, Polycomb-associated silenced domains, involved in cell-type-specific silencing of developmentally regulated genes11,14, are scattered across the genome and enriched for H3K27me3. We found that the organization and composition of these two types of transcriptionally silent domains differ across species. First, human, fly and worm display significant differences in H3K9 methylation patterns. H3K9me2 shows a stronger correlation with H3K9me3 in fly than in worm (r = 0.89 versus r = 0.40, respectively), whereas H3K9me2 is well correlated with H3K9me1 in worm but not in fly (r = 0.44 versus r = −0.32, respectively) (Fig. 3b). These findings suggest potential differences in heterochromatin in the three organisms (see below). Second, the chromatin state maps reveal two distinct types of Polycomb-associated repressed regions: strong H3K27me3 accompanied by marks for active genes or enhancers (Fig. 2, state 10; perhaps due to mixed tissues in whole embryos or larvae for fly and worm), and strong H3K27me3 without active marks (state 11) (see also Supplementary Fig. 31). Third, we observe a worm-specific association of H3K9me3 and H3K27me3. These two marks are enriched together in states 12 and 13 in worm but not in human and fly. This unexpectedly strong association between H3K9me3 and H3K27me3 in worm (observed with several validated antibodies; Extended Data Fig. 2b) suggests a species-specific difference in the organization of silent chromatin.
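
A pairwise comparison of marks of the kind quoted above can be sketched as follows. This is a minimal illustration, assuming that r denotes a Pearson coefficient computed over enrichment values in fixed-size genomic bins; the binning, normalization and any filtering actually used are described in the Supplementary Methods and are not reproduced here. The function name `mark_correlations` and the toy profiles are hypothetical.

```python
import numpy as np


def mark_correlations(profiles):
    """Pearson correlation matrix between binned enrichment profiles.

    `profiles` maps a mark name to a 1-D sequence of per-bin enrichment
    values; all sequences must cover the same bins in the same order.
    Returns the mark names and a square matrix of r values.
    """
    names = list(profiles)
    matrix = np.vstack([np.asarray(profiles[n], dtype=float) for n in names])
    return names, np.corrcoef(matrix)


# Toy values, made up for illustration only (not data from this study).
toy = {
    "markA": [0.1, 0.5, 0.9, 1.4, 2.0],
    "markB": [1.2, 2.0, 2.8, 3.8, 5.0],   # exactly 2 * markA + 1
    "markC": [-0.1, -0.5, -0.9, -1.4, -2.0],  # exactly -markA
}
names, r = mark_correlations(toy)
```

In this toy case markA and markB are perfectly correlated and markA and markC are perfectly anticorrelated; real profiles would give intermediate values such as those reported in Fig. 3b.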

We also compared the patterns of histone modifications on expressed and silent genes in euchromatin and heterochromatin (Extended Data Fig. 2c and Supplementary Fig. 36). We previously reported prominent depletion of H3K9me3 at TSSs and high levels of H3K9me3 in the gene bodies of expressed genes located in fly heterochromatin14, and now find a similar pattern in human (Extended Data Fig. 2c and Supplementary Fig. 36). In these two species, H3K9me3 is highly enriched in the body of both expressed and silent genes in heterochromatic regions. In contrast, expressed genes in worm heterochromatin have lower H3K9me3 enrichment across gene bodies compared to silent genes (Extended Data Fig. 2c and Supplementary Figs 36 and 37). There are also conspicuous differences in the patterns of H3K27me3 in the three organisms. In human and fly, H3K27me3 is highly associated with silent genes in euchromatic regions, but not with silent genes in heterochromatic regions. In contrast, consistent with the worm-specific association between H3K27me3 and H3K9me3, we observe high levels of H3K27me3 on silent genes in worm heterochromatin, whereas silent euchromatic genes show modest enrichment of H3K27me3 (Extended Data Fig. 2c and Supplementary Fig. 36).

Our results suggest three distinct types of repressed chromatin (Extended Data Fig. 3). The first contains H3K27me3 with little or no H3K9me3 (human and fly states 10 and 11, and worm state 11), corresponding to developmentally regulated Polycomb-silenced domains in human and fly, and probably in worm as well. The second is enriched for H3K9me3 and lacks H3K27me3 (human and fly states 12 and 13), corresponding to constitutive, predominantly pericentric heterochromatin in human and fly, which is essentially absent from the worm genome. The third contains both H3K9me3 and H3K27me3 and occurs predominantly in worm (worm states 10, 12 and 13). Co-occurrence of these marks is consistent with the observation that H3K9me3 and H3K27me3 are both required for silencing of heterochromatic transgenes in worms16. H3K9me3 and H3K27me3 may reside on the same or adjacent nucleosomes in individual cells17,18; alternatively, the two marks may occur in different cell types in the embryos and larvae analysed here. Further studies are needed to resolve this and to determine the functional consequences of the overlapping distributions of H3K9me3 and H3K27me3 observed in worm.

Genome-wide chromatin conformation capture (Hi-C) assays have revealed prominent topological domains in human19 and fly20,21. Although their boundaries are enriched for insulator elements and active genes19,20 (Supplementary Fig. 38), the interiors generally contain a relatively uniform chromatin state: active, Polycomb-repressed, heterochromatin, or low signal22 (Supplementary Fig. 39). We found that chromatin state similarity between neighbouring regions correlates with chromatin interaction domains determined by Hi-C (Fig. 3c, Supplementary Fig. 40 and Supplementary Methods). This suggests that topological domains can be largely predicted by chromatin marks when Hi-C data are not available (Supplementary Figs 41 and 42).
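
The idea of comparing chromatin similarity between neighbouring regions with Hi-C domains can be illustrated with a small sketch. It is not the procedure used in this study (see Supplementary Methods for that); it assumes a matrix of per-bin enrichment values for several marks, uses Pearson correlation between bins as the similarity measure, and scores each candidate boundary by the mean similarity between the bins on either side. The function names, the `window` parameter and the toy signal are all hypothetical.

```python
import numpy as np


def similarity_matrix(signal):
    """Bin-by-bin Pearson correlation of mark vectors.

    `signal` is a (bins x marks) array; each row must not be constant,
    otherwise its correlation is undefined.
    """
    return np.corrcoef(np.asarray(signal, dtype=float))


def cross_boundary_similarity(sim, window):
    """Mean similarity between the `window` bins left and right of each
    candidate boundary (the boundary at index i lies between bins i-1
    and i). Low values suggest a change in chromatin composition.
    Positions too close to either end are left as NaN.
    """
    n = sim.shape[0]
    scores = np.full(n, np.nan)
    for i in range(window, n - window + 1):
        scores[i] = sim[i - window:i, i:i + window].mean()
    return scores


# Toy signal, made up for illustration: five bins with one mark pattern
# followed by five bins with a different pattern (three marks per bin).
toy_signal = np.array([[3.0, 1.0, 0.0]] * 5 + [[0.0, 1.0, 3.0]] * 5)
sim = similarity_matrix(toy_signal)
scores = cross_boundary_similarity(sim, window=2)
boundary = int(np.nanargmin(scores))
```

In the toy example the lowest cross-boundary similarity falls at index 5, the point where the mark pattern changes; with real data, such minima could then be compared against Hi-C domain boundaries.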

C. elegans and D. melanogaster have been used extensively for understanding human gene function, development and disease. Our analyses of chromatin architecture and the large public resource we have generated provide a blueprint for interpreting experimental results in these model systems, extending their relevance to human biology. They also provide a foundation for researchers to investigate how diverse genome functions are regulated in the context of chromatin structure.