Specific mammalian genes functionally and dynamically associate together within the nucleus. Yet, how an array of many genes along the chromosome sequence can be spatially organized and folded together is unknown. We investigated the 3D structure of a well-annotated, highly conserved 4.3-Mb region on mouse chromosome 14 that contains four clusters of genes separated by gene “deserts.” In nuclei, this region forms multiple, nonrandom “higher order” structures. These structures are based on the gene distribution pattern in primary sequence and are marked by preferential associations among multiple gene clusters. Associating gene clusters represent expressed chromatin, but their aggregation is not simply dependent on ongoing transcription. In chromosomes with aggregated gene clusters, gene deserts preferentially align with the nuclear periphery, providing evidence for chromosomal region architecture by specific associations with functional nuclear domains. Together, these data suggest dynamic, probabilistic 3D folding states for a contiguous megabase-scale chromosomal region, supporting the diverse activities of multiple genes and their conserved primary sequence organization.
The information encoded in the primary sequence of eukaryotic genomes is three-dimensionally organized within the cell nucleus. Although the underlying principles of this 3D genome organization are just beginning to be explored, emerging evidence indicates that it plays an important role in genomic functions. For example, gene expression is related to the 3D position of a locus within the overall nuclear volume, to associations of individual genes with specific nuclear compartments, and to spatial interactions between pairs of genes (Kosak and Groudine, 2004; Misteli, 2004; Osborne et al., 2004; Spilianakis et al., 2005). Given these functional interactions of individual genes, a key unanswered question is how a chromosome manages and coordinates the structural demands of multiple genes and their associated activities. Indeed, little is known about “higher order” DNA folding in the nucleus (Muller et al., 2004), much less how this folding is related to the diverse information encoded in the underlying primary sequence.
In mammals, the most widely known feature of genome 3D organization is the differential enrichment of euchromatin and heterochromatin in the nuclear interior and periphery, respectively. These patterns tend to be recapitulated by large-scale (∼5 Mb) gene-rich and -poor chromosome regions, which correspond to different cytogenetic chromosome bands (Ferreira et al., 1997; Zink et al., 1999). The partitioning of chromosome regions according to gene density provides an overall framework for genome organization in the nucleus. However, these regions represent crude divisions of sequence. As such, they provide limited insights into the nuclear organization of specific sets of genes and how the chromatin polymer folds to accommodate different sequences.
Completely sequenced mammalian genomes now allow for more precise and comprehensive studies of 3D genome organization within these large chromosomal regions. However, only a few pairs of well-annotated, closely linked genes have been localized relative to each other in nuclei (Chambeyron and Bickmore, 2004; Zink et al., 2004). Likewise, the handful of studies of chromosome regions typically have involved homogenous labeling across the region or probing a single locus in the region relative to the whole chromosome “territory” (Volpi et al., 2000; Mahy et al., 2002; Williams et al., 2002; Gilbert et al., 2004; Muller et al., 2004). These studies have clearly demonstrated that chromosome architecture is related to gene density and gene activity and that this architecture is dynamic. However, the spatial relationships among the many different sequences within a large chromosome region remain poorly understood.
We address how a series of multiple genes within a megabase-scale chromosomal region are organized relative to each other in the nucleus. We focused on a well-annotated gene-poor region on mouse (Mus musculus) chromosome 14 (Mmu14). This 4.3-Mb region is enriched with genes that affect the development of multiple embryonic tissues, including the heart, skeleton, and various structures in the nervous system (Peterson et al., 2002). In the Mmu14 primary sequence, these genes are organized into small clusters separated by >400-kb stretches of gene-poor sequence called “gene deserts” (Peterson et al., 2002; Nobrega et al., 2003). We probed the nuclear structure of this entire region and found evidence for a dynamic, probabilistic framework for the 3D organization of multiple genes within a chromosome region.
Predominant arrangements of gene clusters and deserts in nuclei
We selected a well-annotated, 4.3-Mb gene-poor region of distal Mmu14 to study 3D genome organization and chromatin folding across a contiguous chromosomal region (Fig. 1 A). Similar to other gene-poor regions, the selected Mmu14 region is later replicating and enriched with developmental genes (Somssich et al., 1981; Ferreira et al., 1997; Peterson et al., 2002; Nelson et al., 2004). The region's 19 genes are grouped into four distinct clusters separated by four gene deserts, all of which are 0.2–1.0 Mb long (Fig. 1 A; Peterson et al., 2002). The content, order, and relative spacing of the genes in this region are conserved from primates to chickens, suggesting functional and structural constraints on gene organization in the primary sequence (Bourque et al., 2005).
We first examined the 3D structure of this Mmu14 region in NIH-3T3 fibroblasts. Quantitative RT-PCR showed that loci in each of the Mmu14 gene clusters are expressed in this cell type, indicating that although this region is gene-poor, it is not silenced chromatin (Fig. 1 B). NIH-3T3 cells were probed by FISH according to the pattern of gene clusters and deserts in the primary sequence (Fig. 2 A). Because of their genomic size (0.2–1.0 Mb), the gene clusters and deserts are readily resolvable in interphase nuclei (Lawrence et al., 1990). To directly compare cluster and desert positions, 23 bacterial artificial chromosomes (BACs) spanning the region were used as FISH probes (Fig. S1), differentiating all gene clusters from all deserts with a two-color labeling scheme (Fig. 2 A, red and green, respectively).
Our initial survey of the Mmu14 clusters and deserts in NIH-3T3 nuclei revealed a striking partitioning of genic and nongenic sequences (Fig. 2 B). Moreover, we observed three basic patterns of 3D cluster–desert arrangement, plus combinations of those patterns (Fig. 2 C). One cluster–desert pattern was marked by alternating signals along a “striped” DNA fiber. This 3D arrangement reflects the linear organization of gene clusters and deserts in the primary sequence, albeit in a compacted state. End–end measurements of these structures in deconvolved epifluorescence images indicated a packing ratio of 300:1 relative to naked DNA (4.1 ± 0.5 μm/4.3 Mb), which is greater than the ∼40:1 packing ratio for 30-nm chromatin fibers (see Materials and methods). In the second conformation, all of the gene clusters were displaced to one side of all the deserts, compressing the striped fiber by a “zigzag” arrangement (∼800:1 packing ratio; 1.5 ± 0.3 μm end–end). The third structure was marked by close grouping of all gene clusters into a “hub” with peripherally arranged deserts (∼900:1 packing ratio; 1.4 ± 0.2 μm end–end). The latter two conformations indicate additional 3D levels of gene cluster organization, beyond their arrangement in the primary sequence. These conformations further reveal local domains of genic DNA within the nuclear chromosome territory, which are formed by the spatial aggregation of multiple gene clusters.
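The packing ratios above compare the length of extended, naked DNA with the length measured in the nucleus. The following is a minimal sketch of that arithmetic, assuming the standard B-form rise of 0.34 nm per base pair; the function name and this simple end-to-end estimate are illustrative assumptions, and the procedure in Materials and methods may differ. The estimate gives ratios of the same order as those quoted in the text, not necessarily identical values.

```python
# Rise per base pair of B-form DNA in nanometres (standard textbook value;
# its use here is an assumption about how such ratios can be estimated).
NM_PER_BP = 0.34

def packing_ratio(genomic_bp: float, measured_um: float) -> float:
    """Length of naked DNA divided by the measured nuclear length."""
    naked_um = genomic_bp * NM_PER_BP / 1000.0  # convert nm to um
    return naked_um / measured_um

REGION_BP = 4.3e6  # size of the Mmu14 region from the text

# End-to-end lengths (um) quoted in the text for each conformation.
for name, length_um in [("striped", 4.1), ("zigzag", 1.5), ("hub", 1.4)]:
    print(f"{name}: ~{packing_ratio(REGION_BP, length_um):.0f}:1")
```

The shorter the measured end-to-end length for the same 4.3 Mb of sequence, the higher the ratio, which is why the zigzag and hub conformations count as more compact than the striped fiber.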
To more rigorously classify these structures, we generated 3D deconvolved images and analyzed 132 chromosomes at multiple viewing angles. G1 phase cells were selected for this and all subsequent analyses to rule out the effects of cell cycle on architecture. To accomplish this, we included an additional probe for the earlier-replicating agrin locus on Mmu4 (Somssich et al., 1981), and selected cells with a prereplication, singlet agrin signal, which is indicative of G1 cells rather than those with a postreplication doublet signal. Scoring of cluster–desert patterns in these G1 cells indicated that the aforementioned three morphological classes were indeed predominant, representing 67% of chromosomes (Fig. 2 C). Unlike gene cluster hubs, desert hubs were rare (3%), indicating sequence-specific chromosome architecture. The remainder of chromosomes (20%) exhibited combinations of the three predominant conformations (e.g., half striped and half zigzag; Fig. 2 C, combo), suggesting transitional structures. The cluster–desert conformations were rarely the same for homologous chromosomes within a given cell (4% of cells). In addition, all combinations of different cluster–desert patterns within a cell were present at similar frequencies, which is consistent with independent folding of homologues in the same nucleus. Together, these findings indicate that Mmu14 regions form multiple, defined, and likely dynamic structures in nuclei.
We confirmed the Mmu14 gene cluster–desert arrangements in additional cell populations and under different fixation conditions (Fig. S2). More than 500 NIH-3T3 cells were evaluated by higher throughput 2D analyses, which showed similar relative frequencies of the three predominant conformations. These frequencies were not affected by the fixation protocol. Consistent with this, several studies have shown that FISH does not significantly affect chromatin organization at the size scale of these Mmu14 gene clusters and deserts (Robinett et al., 1996; Solovei et al., 2002; Muller et al., 2004). In addition, the three predominant conformations were present at similar frequencies in both NIH-3T3 and primary mouse embryo fibroblasts (P ≥ 0.5, χ² test; Fig. S2). Thus, they do not result from aneuploidy of immortalized NIH-3T3 fibroblasts. Collectively, the data strongly indicate specific 3D organizational states for the Mmu14 region gene clusters and deserts in fibroblast nuclei.
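Comparisons of conformation frequencies between cell populations, such as the χ² test above, can be sketched as a Pearson test on a table of counts. The counts below are hypothetical placeholders chosen only to make the example run; they are not data from this study, and the function name is an assumption.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if conformation is independent of cell population.
            expected = row_totals[i] * col_totals[j] / grand_total
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof

# HYPOTHETICAL counts of (striped, zigzag, hub) chromosomes, for illustration only.
counts = [
    [40, 35, 25],  # cell population A
    [38, 36, 26],  # cell population B
]
stat, dof = chi_square_statistic(counts)

# Critical value of the chi-square distribution for 2 degrees of freedom at 0.05.
CRITICAL_0_05_DF2 = 5.991
print(f"chi2 = {stat:.3f}, dof = {dof}, differs at 0.05: {stat > CRITICAL_0_05_DF2}")
```

A statistic below the critical value means the two frequency distributions cannot be distinguished at that significance level, which is the sense in which conformations are described as occurring at similar frequencies.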
Gene distribution pattern in primary sequence directly corresponds to higher order chromatin structures
We tested whether Mmu14 stripes, zigzags, and hubs truly reflect 3D organization based on the gene distribution pattern in the primary sequence or whether they are sequence-independent and simply a consequence of the alternating labeling scheme. The pattern of probe labels was shifted ∼250 kb down the chromosome so that each label no longer matched exclusively with gene clusters or deserts. This resulted in the increased overlap of the two labels in nuclei (Fig. 3 A) and significant differences in the distributions of nuclear label patterns, as assessed by 2D scoring (P < 1 × 10⁻⁶, χ² test; Fig. 3 C). These differences largely resulted from fewer of the more highly folded conformations, zigzags and hubs of “red” label, which corresponded to gene clusters in the cluster–desert–matched labeling scheme. We found little change in the frequencies of striped fibers and hubs of “green” label (P = 0.8, χ² test; Fig. 3 C). This is the expected outcome: the shifted labels should not affect these probe patterns if striped fibers represent a linear arrangement of chromosome sequence and if gene deserts form hubs at random frequencies. Thus, these findings provide strong evidence for nonrandom chromatin folding that is specifically matched to the gene cluster pattern in the primary sequence.
If gene clusters and deserts establish Mmu14 region structure, then a chromosomal region with different primary sequence organization should also appear different when probed by the Mmu14 labeling scheme. To confirm this, we applied the Mmu14 labeling scheme to a homogeneously gene-dense region on Mmu15 that lacks gene deserts (Fig. 3 B). Hybridization to NIH-3T3 cells revealed an increase in striped fibers (P < 1 × 10⁻⁷, χ² test; Fig. 3 C). These were more decondensed (≤300-nm diam by FISH) than Mmu14 fibers (∼400-nm diam), resembling chromatin structures seen at other gene-rich sequences (Volpi et al., 2000; Mahy et al., 2002; Muller et al., 2004). Importantly, the Mmu15 probes revealed fewer zigzag and red hub structures compared with Mmu14 cluster–desert probes (P < 1 × 10⁻²). Furthermore, the differentially red- or green-labeled Mmu15 regions, which mark sequences with similar gene densities, were found in hubs at similar frequencies (P = 0.2). This contrast with Mmu14 morphological patterns is consistent with higher order structures that are based on the pattern of gene distribution in the Mmu14 primary sequence.
Nonrandom organization of Mmu14 gene clusters and deserts in nuclei
The dependence of Mmu14 region 3D conformation on genomic clusters of expressed genes suggests nonrandom organization in nuclei. We next investigated the arrangement of specific Mmu14 gene clusters and deserts relative to each other and compared their organization with theoretical models of chromosome organization that do not include functional information. We localized pairs of gene clusters or deserts that were separated by similar genomic distances in NIH-3T3 cells (Fig. 4 A). Two-color FISH of the proximal gene clusters, C1 and C2, produced signals that frequently overlap or abut each other (72%; Fig. 4 B). In contrast, the flanking deserts (D1 and D2) contact each other in only 39% of chromosomes (Fig. 4 B). This differential organization was confirmed by 3D measurements of center–center distances (P < 1 × 10⁻³, Kolmogorov-Smirnov (KS) test; Fig. 4 C). We also measured the distance between C1 and its flanking desert, D1, and found it similar to the C1–C2 distance, even though C1 and C2 are farther apart in the primary sequence (Fig. 4, A and C).
Because a randomly folded 5-Mb region of chromatin has yet to be identified empirically, we compared our in situ data to a computational model of a randomly folded chromosomal region. This model includes all chromosomes in a mouse nucleus, with each chromosome represented as a polymer of connected 1-Mb spherical “domains,” which are similar in size to the gene clusters and deserts being studied (Fig. 4 D). The modeled domains are consistent with empirically observed ∼1-Mb chromatin foci or rosettes (Ma et al., 1998; Munkel et al., 1999; Sadoni et al., 1999), and they are depicted as elastic spheres to allow partial overlap (Fig. 4 B). The domains are connected by short DNA linkers defined by an entropic spring potential. Each chromosome territory is further bounded by a weak barrier potential to maintain a volume similar to an empirically determined average (Kreth et al., 2004; Bolzer et al., 2005). With no further constraints on domain positions, ∼400,000 Monte Carlo steps were calculated to independently move all chromosomal domains.
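The kind of model described above can be illustrated with a toy Metropolis Monte Carlo simulation of a single chain of spherical domains. This is a minimal sketch under stated assumptions, not the published model: it uses one short chain rather than all chromosomes, a zero-rest-length harmonic spring as a stand-in for the entropic spring potential, a quadratic soft repulsion for the elastic spheres, and a quadratic wall for the weak barrier potential. All parameter values and function names are illustrative.

```python
import math
import random

# All parameter values below are illustrative assumptions, not the published ones.
DIAMETER = 0.5          # domain diameter (um)
TERRITORY_RADIUS = 1.5  # radius of the confining territory (um)
K_SPRING, K_OVERLAP, K_WALL = 1.0, 1.0, 0.1  # energy scales (arbitrary units)

def energy(chain):
    """Total energy of one chain of spherical domains."""
    total = 0.0
    for i, p in enumerate(chain):
        # Weak barrier potential keeps domains inside the territory.
        r = math.dist(p, (0.0, 0.0, 0.0))
        if r > TERRITORY_RADIUS:
            total += K_WALL * (r - TERRITORY_RADIUS) ** 2
        for j in range(i + 1, len(chain)):
            d = math.dist(p, chain[j])
            if j == i + 1:
                # Zero-rest-length harmonic linker (a simple entropic spring).
                total += K_SPRING * d ** 2
            if d < DIAMETER:
                # Soft repulsion: elastic spheres may partially overlap.
                total += K_OVERLAP * (DIAMETER - d) ** 2
    return total

def metropolis(chain, steps, step_size=0.2, kT=1.0, rng=None):
    """Move one randomly chosen domain per step; accept by the Metropolis rule."""
    rng = rng or random.Random(0)
    chain = [tuple(p) for p in chain]
    e_old = energy(chain)
    for _ in range(steps):
        i = rng.randrange(len(chain))
        trial = list(chain)
        trial[i] = tuple(c + rng.uniform(-step_size, step_size) for c in chain[i])
        e_new = energy(trial)
        # Always accept downhill moves; accept uphill moves with Boltzmann probability.
        if e_new <= e_old or rng.random() < math.exp(-(e_new - e_old) / kT):
            chain, e_old = trial, e_new
    return chain

# Start from a straight chain of five domains and let it relax.
start = [(0.5 * i - 1.0, 0.0, 0.0) for i in range(5)]
final = metropolis(start, steps=5000)
print("distance between domains 0 and 1:", round(math.dist(final[0], final[1]), 3))
```

Because no term in the energy depends on whether a domain is a gene cluster or a desert, distances sampled from such a model serve as a sequence-blind baseline against which measured distances can be compared.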
Simulations of 50 different nuclei showed similar separations between C1 and C2 and between D1 and D2 in the virtual Mmu14 (P > 0.1, KS test; Fig. 4 E). This contrasts markedly with our empirical measurements (Fig. 4 C). As expected, the model generated closer positions for C1 and D1 (P < 0.01, KS test), which are also closer in the primary sequence. These modeling and empirical data strongly support nonrandom, sequence-specific folding of the Mmu14 region in nuclei.
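The KS comparisons used here ask whether two sets of distance measurements come from the same distribution. Below is a minimal sketch of the two-sample KS statistic, the largest gap between the two empirical cumulative distributions. The distance values are hypothetical placeholders, not measurements from this study, and the function name is an assumption. A library routine such as `scipy.stats.ks_2samp` would additionally return a P value.

```python
import bisect

def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic: maximum gap between the empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    d_max = 0.0
    # The gap can only change at observed values, so check each one.
    for v in sorted(set(a) | set(b)):
        cdf_a = bisect.bisect_right(a, v) / len(a)
        cdf_b = bisect.bisect_right(b, v) / len(b)
        d_max = max(d_max, abs(cdf_a - cdf_b))
    return d_max

# HYPOTHETICAL center-center distances (um), for illustration only.
cluster_pair = [0.3, 0.4, 0.5, 0.6, 0.4]
desert_pair = [0.7, 0.9, 0.8, 1.1, 0.6]
print("KS statistic:", ks_statistic(cluster_pair, desert_pair))
```

A statistic near 0 indicates overlapping distributions, as reported for the simulated C1–C2 and D1–D2 separations, whereas a value near 1 indicates well-separated distributions.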
To additionally verify nonrandom chromatin folding, we compared the Mmu14 region to a random-walk polymer model. For a random-walk polymer, the mean-squared distance between two points in 2D or 3D (e.g., nuclear distance) is proportional to their distance along the polymer (e.g., genomic distance; Fig. 5, dashed line; Yokota et al., 1995). We measured the nuclear distance between C1 and successively more distal points in the Mmu14 region, ending 4.0 Mb away at D4. A simple linear relationship between the mean-squared nuclear distance and the genomic distance was not observed (Fig. 5, black lines). Rather, this relationship was multiphasic, with at least two transitions in slope. The initial 1.5 Mb (C1–C2) showed a consistent nuclear separation, producing a line with no slope and indicating highly nonrandom substructure. The central 2 Mb of the region produced a line with a positive slope, which is consistent with a short segment of random walk. The distal end of the Mmu14 region marked a transition to a negative slope, additionally suggesting that the distal end loops back toward the proximal end, which is similar to previously reported 2-Mb giant chromatin loops (Yokota et al., 1995). These multiple relationships indicate that the Mmu14 region is not folded by a simple random walk of the chromatin fiber, but contains specific subdomains with different folding properties.
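The random-walk expectation used as the reference line can be checked numerically: for a freely jointed chain of N unit steps, the mean-squared end-to-end distance is expected to equal N, so it grows linearly with contour (genomic) length. The sketch below simulates this; the function name, step counts, and number of walks are illustrative assumptions.

```python
import math
import random

def mean_squared_distance(n_steps, n_walks=2000, rng=None):
    """Mean-squared end-to-end distance of 3D random walks with unit steps."""
    rng = rng or random.Random(0)
    total = 0.0
    for _walk in range(n_walks):
        x = y = z = 0.0
        for _step in range(n_steps):
            # Draw a uniformly distributed direction on the unit sphere.
            cos_t = rng.uniform(-1.0, 1.0)
            sin_t = math.sqrt(1.0 - cos_t * cos_t)
            phi = rng.uniform(0.0, 2.0 * math.pi)
            x += sin_t * math.cos(phi)
            y += sin_t * math.sin(phi)
            z += cos_t
        total += x * x + y * y + z * z
    return total / n_walks

# The ratio of mean-squared distance to step count should stay close to 1.
for n in (5, 10, 20, 40):
    msd = mean_squared_distance(n)
    print(f"N = {n:2d}  <R^2> = {msd:6.2f}  <R^2>/N = {msd / n:.2f}")
```

A flat or negative slope over part of the region, as described above for the proximal and distal ends, is a departure from this linear behavior.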
Expressed gene cluster organization is not strictly dependent on ongoing transcription
Chromosome folding that is based on the genomic distribution of genes suggests a relationship to gene activity. To explore this potential relationship, we examined the expression status of Mmu14 region genes in several different cell types and then compared expression states to the region's nuclear organization. First, quantitative RT-PCR of transcripts from chondrocytes, embryonic stem (ES), and T cells indicated at least twofold variations in expression levels of 7/19 genes across the region (Fig. 6 A). In no cell type was an entire gene cluster completely inactive, though a few individual genes were undetectable above background levels. The expression of at least one gene per cluster in multiple cell types is consistent with this region's diverse mixture of genes.
Second, we examined Mmu14 region structure in the nuclei of these three diverse cell types. Cluster–desert arrangements similar to those detected in NIH-3T3 cells were found in all three cases, and these occurred at similar frequencies (P > 0.1, χ² test; Fig. 6 B). These data indicate that the predominant cluster–desert structures correlate with the activated expression status of the whole cluster rather than the expression levels of individual genes within the cluster.
We next determined whether transcriptional activity at these active gene clusters affects cluster and desert nuclear organization. NIH-3T3 cells were treated with 5,6-dichlorobenzimidazole riboside (DRB) for 60 min to arrest transcription. This period of inhibition is sufficient for other regions of the genome to reorganize (Muller et al., 2004). However, DRB treatment did not significantly change the frequencies of the predominant Mmu14 cluster–desert conformations in nuclei (P ≥ 0.5, χ² test; Fig. 6 B). Even after prolonged (5 h) DRB treatment, the predominant conformations remained, though overall nuclear morphology changed significantly (unpublished data). Thus, transcriptional elongation by itself does not maintain the predominant arrangements of Mmu14 gene clusters and deserts.
Gene clusters are not preferentially organized around a common transcription site or splicing factor–rich domain
The mammalian nucleus is organized into several functional compartments that are marked by accumulations of specific proteins. These compartments include the nucleolus, splicing factor–rich domains or “speckles,” and the nuclear periphery. Specific genes preferentially associate with these distinct compartments (e.g., ribosome DNA at nucleoli), and these associations are related to gene expression (Kosak and Groudine, 2004; Misteli, 2004). Given the tendency for the expressed Mmu14 gene clusters to aggregate in nuclei, we examined whether the clusters organize around nuclear compartments related to mRNA gene expression.
Previous studies suggest that transcribing mRNA genes may converge at common nuclear sites, so-called transcription factories (Iborra et al., 1996; Osborne et al., 2004). In cultured cell lines, including the NIH-3T3 cells used here, transcription sites are marked by thousands of small (∼70 nm) accumulations of nascent transcripts and RNA polymerase II (pol II; Fig. S3; Martin et al., 2004; Osborne et al., 2004). Gene cluster associations with transcription factories might not be perturbed by DRB (Fig. 6 B), which halts pol II elongation rather than destabilizing pol II DNA binding (Mok et al., 2001). Therefore, we determined whether Mmu14 gene clusters tend to associate with a common transcription site in uninhibited cells. Given the highly dispersed, complex pol II distribution in NIH-3T3 cells, we detected specific transcription sites for the Mmu14 gene clusters using RNA FISH.
We examined the relative nuclear positions of transcribing clusters C1 and C2, and C2 and C4, each separated by ∼1.7 Mb (Fig. 7 A). For a given experiment in NIH-3T3 cells, transcripts from both of the probed clusters were detected in ∼60% of chromosomes (Fig. 7, A and B), which is consistent with variable expression levels reported for many homologous loci in the same nucleus (Levsky et al., 2002; Osborne et al., 2004). In transcribing chromosomes, both separated and contacting transcript signals were detected (Fig. 7, A and B). Separated and contacting classes of signals were present at similar frequencies, indicating that transcribing gene clusters neither favor nor disfavor close nuclear aggregation. These findings argue against gene cluster associations in the Mmu14 region that are based solely on the clustering of genes at a common, small transcription factory. We note, however, that ∼50% of transcribing clusters are close enough that at times they could share a common transcription site.
In addition to sites of transcription, genes can be functionally organized in the nucleus via association with larger nuclear domains. By fluorescence microscopy, such coassociating loci frequently appear to localize near each other rather than to directly overlap, similar to the gene clusters in the Mmu14 region. Though NIH-3T3 cells do not contain the large accumulations of pol II reported in primary cells, they do contain splicing factor–enriched domains or speckles (Fig. S3), which associate with multiple genes (Shopland et al., 2003; Osborne et al., 2004). We examined whether the aggregated Mmu14 gene clusters organize around splicing factor speckles with triple-label experiments (Fig. S3). Multiple gene clusters did not align with or surround any splicing factor domains. Rather, they were typically localized to a different focal plane. Only 8% of Mmu14 signals contacted splicing factor domains (Table I), which is similar to other gene-poor chromosome regions and loci that associate with splicing factor domains at random frequency (Xing et al., 1995; Shopland et al., 2003).
Mmu14 gene deserts preferentially associate and align with the nuclear periphery
In addition to nuclear domains associated with active genes, other nuclear regions are enriched with inactivated and gene-poor chromatin (Kosak and Groudine, 2004; Zink et al., 2004). These include the heterochromatic centromeres and nuclear periphery. Quantitative 3D image analysis indicated that the Mmu14 region is most concentrated near the nuclear periphery (Fig. 8 A). Approximately half (51%) of Mmu14 regions localize within the nuclear zone defined by the outermost 10% of the nuclear radius, which represents only 27% of the nuclear volume (Table I). Interestingly, this analysis also suggested that Mmu14 deserts localize more peripherally than gene clusters (Fig. 8 A).
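The volume figure quoted for the peripheral zone follows from simple geometry. As a minimal sketch, assume the nucleus is a sphere, or any shape whose zones are defined by uniform scaling of the radius. A shell spanning the outermost fraction f of the radius then occupies 1 − (1 − f)³ of the volume. The function name is illustrative.

```python
def shell_volume_fraction(shell_fraction_of_radius: float) -> float:
    """Fraction of total volume in the outermost shell of a uniformly scaled shape."""
    inner = 1.0 - shell_fraction_of_radius  # relative radius of the inner boundary
    return 1.0 - inner ** 3                 # volume scales with the cube of radius

print(f"{shell_volume_fraction(0.10):.1%}")  # prints 27.1%
```

With 51% of Mmu14 regions found in a zone that holds about 27% of the nuclear volume, the region is roughly twofold enriched at the periphery relative to a uniform distribution.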
To further examine the organization of gene deserts with the nuclear periphery, we scored cluster and desert signals for association with the outermost edge of the nucleus, which was defined by lamin B receptor immunostain (Fig. 8 B). We found that deserts more frequently contact and align with the nuclear edge than gene clusters. Moreover, this enrichment was detected predominantly in chromosomes with zigzag and gene cluster hub conformations (P < 1 × 10⁻⁴, χ² tests), in contrast with striped fibers (P = 0.2). These data indicate a preferential alignment of gene deserts with the edge of the nucleus, and suggest that this alignment plays a role in Mmu14 region folding.
Though gene deserts preferentially align with the nuclear edge in zigzags and gene cluster hubs, we did not detect significant enrichment of any one folding pattern in the peripheral or more internal nuclear zones (P = 0.3, χ² test; Fig. S4). 60% of zigzags, 54% of gene cluster hubs, and 46% of striped fibers contact the nuclear periphery. Of the chromosomes contacting the periphery, a similar proportion (75%) of each morphological class has multiple deserts contacting the periphery. In contrast, gene cluster contacts with the periphery do vary according to conformation. Fewer zigzags and gene cluster hubs than striped fibers have gene clusters aligned with the nuclear edge (Fig. 8 B). These findings suggest that clusters shift away from the nuclear periphery to form zigzags and gene cluster hubs.
Half of the Mmu14 signals localize to the nuclear interior, where the most prominent heterochromatic domains are chromocenters. These centromere aggregates appear as bright spots in DAPI-stained mouse nuclei (Fig. S4; Moen et al., 2004). We found that 43% of the Mmu14 regions in the internal nuclear zone contact the edges of chromocenters (Table I). However, the frequency of these contacts was distributed equally between gene clusters and deserts (Fig. S4). Thus, neither clusters nor deserts specifically align with chromocenters. These data suggest that Mmu14 conformation does not simply reflect a general association of gene deserts with heterochromatin and that other interior nuclear compartments may be linked to Mmu14 region folding instead.
We show defined nuclear organization for the collective set of genes across a 4.3-Mb chromosome region. Arrayed gene clusters and intervening gene deserts form multiple, but predominant, 3D arrangements in nuclei, typically marked by hubs of multiple associated gene clusters. Though gene clusters are activated and expressed, their nuclear aggregation is not simply correlated with ongoing transcription, consistent with the diverse functions of multiple genes. Gene-depleted deserts preferentially align with the nuclear periphery, suggesting that this functional nuclear compartment, as well as gene deserts, plays a role in chromosome region architecture. Collectively, our findings suggest a sequence-dependent, dynamic 3D framework for the organization of multiple genes within a chromosome region.
A sequence-based, dynamic model of higher order chromatin structure
The structural features of the Mmu14 region studied here suggest a model of chromatin folding beyond the 30-nm fiber that is based on patterns of encoded primary sequence information (Fig. 9). We found chromosome region structures that matched the distribution of gene clusters and deserts in primary sequence, indicating that each cluster and desert may be a building block for assembling higher order structures. A recent study showed different levels of chromatin compaction for gene-dense and -poor ∼100 kb–1 Mb genomic segments, which is similar to the Mmu14 clusters and deserts studied herein (Gilbert et al., 2004), further suggesting distinct structural domains. Our data also indicate that cluster and desert domains fold together in different combinations and at different frequencies (Fig. 4). These probabilistic interactions (Fig. 9, dashed arrows) give rise to multiple, predominant structures. Additional structures correspond to combinations of the predominant patterns, suggesting intermediate states that arise from dynamic transitions between the predominant states (Fig. 9, bold arrows). This dynamic organization is consistent with the reported movements of chromosomal loci in living mammalian cells, where loci move about 0.5–1 μm/min (Chubb et al., 2002), similar to the distances that separate the Mmu14 gene clusters and deserts (Fig. 4). Though dynamic and variable, we present several lines of evidence indicating that the folding of gene clusters and deserts is not completely random. Rather, Mmu14 subchromosomal structure is defined by preferential, probabilistic folding states.
Several different models of higher order chromatin folding have been proposed previously, but have not taken annotated sequence information into account. These models include the following: (a) ∼1-Mb spherical domains or rosettes (Ma et al., 1998; Munkel et al., 1999; Sadoni et al., 2004), (b) chromatin fibers that coil or kink into thicker fibers (Manuelidis, 1990; Belmont and Bruce, 1994), and (c) randomly organized, ∼2-Mb giant loops (Yokota et al., 1995). Interestingly, the model we have assembled from Mmu14 data reflects different aspects of each of the seemingly disparate existing models. For example, the ∼1-Mb domain model is based on discrete, persistent DNA domains of ∼0.5-μm diam that are labeled by nucleotide-analogue incorporation during S phase (Ma et al., 1998; Cremer and Cremer, 2001). The gene cluster and desert domains we uncovered resemble replication foci in size and shape, suggesting that they are analogous structures, though they have yet to be compared directly.
The striped fibers collectively formed by cluster and desert domains also are reminiscent of 200–400 nm chromatin fibers identified by electron microscopy (Belmont and Bruce, 1994). Because FISH detection of gene clusters and deserts can swell their appearance, direct size-based comparison to chromatin structures detected by other methods is not possible (Robinett et al., 1996). FISH and the resolution limits of light microscopy also make it difficult to discriminate chromatin structures smaller than the ∼400-nm megabase-scale clusters and deserts studied herein (Muller et al., 2004). However, the positions of these domains relative to each other are likely unaffected by FISH (Robinett et al., 1996; Solovei et al., 2002; Muller et al., 2004), strongly supporting a large-scale, fiber-like structure that is formed in a substantial fraction of chromosomes and that is compacted significantly more than a 30-nm chromatin fiber (300:1 vs. 40:1 packing ratios).
Our measurements of larger scale organization across the distal 1.5-Mb portion of the Mmu14 region also fit with the previously proposed random walk-giant loop model (Yokota et al., 1995). In contrast, measurements across the 2-Mb proximal half of the Mmu14 region indicate a nonrandom walk without a loop, where desert 1 and part of cluster 2 on average are wrapped around or aligned with cluster 1. Thus, not all chromatin conforms to the giant loop model. Additional “averaged” structures form in nuclei, and these are sequence dependent.
An important feature of our analysis is that it goes beyond the averaged measurements of a random-walk model and, in doing so, reveals that chromatin folds in multiple defined patterns. For example, our averaged measurements did not indicate the presence of four small loops of extended deserts anchored by gene clusters in a hub configuration because other predominant conformations can form as well. By classifying hundreds of chromosomes individually, we found that the Mmu14 region, at times, forms a fiberlike structure and at other times forms structures with apparent loops (e.g., gene-cluster hubs).
Complex genomic structures accommodate multiple functions
A large genomic region serves as the substrate for multiple biochemical activities for the replication, maintenance, and expression of genetic information. These activities are complex, multistep processes, and they are mechanistically connected. For example, the surveillance of DNA damage is coupled to replication and transcription. The structure of a chromosome region must accommodate these interwoven activities. Our finding of variable chromosome organization is not surprising in this light, as it would enable transient interactions between different parts of the sequence to coordinate many activities, with differential expression of a diverse gene set being just one example.
In this study, we specifically focused on the functional relationships between region-wide organization and transcription, which in some cases can affect higher order structure (Mahy et al., 2002; Muller et al., 2004). We found that inhibition of transcriptional elongation by DRB did not change nuclear cluster–desert patterns (Fig. 6 C), indicating that gene cluster association does not depend solely on ongoing transcriptional elongation. Consistent with this, RNA FISH showed that transcribing gene clusters do not preferentially associate with a common, small (∼70 nm) transcription factory typical of NIH-3T3 cells (Fig. 7 and Fig. S3; Martin et al., 2004). Finally, we did not find a spatial relationship between the Mmu14 region and splicing factor–enriched domains in the nucleus (Table I), which associate with multiple active genes in gene-rich chromosome regions (Shopland et al., 2003). Nevertheless, the Mmu14 gene clusters are expressed in a variety of different cell types (Fig. 6 A) and, thus, are typically permissive to transcription.
Similar to our findings for the Mmu14 region, nuclear associations between other genomic sequences are not just related to gene expression at the level of ongoing transcription. The formation of transcriptionally poised, but not yet transcribing, chromatin has also been correlated with locus–locus nuclear interactions. Spatial associations between poised loci can be mediated by regulatory sequences (Spilianakis et al., 2005). Notably, gene deserts contain highly conserved regulatory sequences as well (Nobrega et al., 2003; Bourque et al., 2005; Ovcharenko et al., 2005), raising the possibility that these sequences play a role in region-wide chromatin organization.
In contrast to nuclear domains that are rich in gene expression, we did find that the Mmu14 region was organized relative to the nuclear periphery. In particular, gene deserts tend to specifically align with the nuclear edge in the more polarized zigzag and gene cluster hub conformations. These findings suggest that desert associations with the nuclear periphery play a role in higher order chromatin folding. The nuclear periphery is enriched with heterochromatin and components of the nuclear lamina, indicating potential mediators of Mmu14 region structure. Although heterochromatin tends to cluster together in the nucleus and Mmu14 deserts may be heterochromatic because of low gene density, these deserts do not show a preferential orientation to centromeric heterochromatin in the nuclear interior. Thus, desert organization does not simply reflect a general aggregation of heterochromatin. Gene desert organization instead might be linked to the nuclear lamina, which associates with chromatin and affects overall patterns of chromatin distribution in the nucleus (Gruenbaum et al., 2005). Lamins are present not only at the nuclear periphery but also throughout the nuclear interior at reduced levels, which is consistent with both internally and peripherally localized Mmu14 regions (Gruenbaum et al., 2005).
Conserved primary sequence organization and nuclear architecture
The cluster–desert organization of the Mmu14 region's primary sequence has been conserved between humans, mice, dogs, and chickens (Bourque et al., 2005). This conserved sequence architecture is consistent with an important role for gene spacing and an interdependency of the encoded sequences. Our data extend the role of 1D sequence organization to 3D nuclear architecture, where gene clusters are able to form dynamic, associative structures. These probabilistic associations may reflect the interwoven structural demands of the region's multiple genes. Chromosome regions containing gene deserts comprise ∼25% of mouse and human genomes (Nobrega et al., 2003), and many of these are conserved across multiple vertebrate species (Bourque et al., 2005; Ovcharenko et al., 2005). Thus, the nuclear organization of Mmu14 sequences described in this study is likely to reflect megabase-scale regions throughout the mammalian genome.
Materials and methods
NIH-3T3 fibroblasts (a gift from L. Lau, University of Illinois, Chicago, IL), primary mouse embryo fibroblasts, and SV-40–transformed chondrocytes (a gift from T. Barak, The Jackson Laboratory, Bar Harbor, ME) were grown at 37°C in DME (Invitrogen) supplemented with 10% heat-inactivated fetal bovine serum. Feeder-independent ES cells (XC749; a gift from G. Cox, The Jackson Laboratory) were cultured in Glasgow's minimal Essential Medium, 10% fetal bovine serum, and leukemia inhibitory factor. CD8+ T cells (B/nx3; a gift from D. Roopenian, The Jackson Laboratory, Bar Harbor, ME) were grown in DME, 10% fetal bovine serum, and 50 U/ml interleukin 2. Where indicated, cells were treated with 30 μg/ml DRB (Calbiochem).
RNA was isolated with TRIzol (Invitrogen) and treated with RNase-free DNaseI (Ambion), and 1 μg RNA per cell type was reverse transcribed with random hexamers and SuperScript II (Invitrogen). Gene expression levels for biological replicates, as well as technical triplicates, were quantified using an ABI7500 and SYBR-green (Applied Biosystems). Primer pairs are listed in Table SI. Expression levels for each target gene were determined relative to the constitutive metabolic enzyme glucose phosphate isomerase (Gpi) using 2^(−ΔCT), where ΔCT = CT(target) − CT(Gpi).
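The relative quantification described above can be sketched in a few lines of Python. This is an illustrative sketch only: averaging the technical triplicates before taking the difference is an assumption about how replicates were combined, the function name is hypothetical, and the CT values are invented.

```python
from statistics import mean


def relative_expression(ct_target, ct_reference):
    """Return 2^(-dCT), where dCT = mean CT(target) - mean CT(reference).

    Averaging technical replicates first is an assumption of this sketch.
    """
    delta_ct = mean(ct_target) - mean(ct_reference)
    return 2.0 ** (-delta_ct)


# Hypothetical technical triplicates (CT values invented for illustration)
target_cts = [24.1, 24.0, 23.9]
gpi_cts = [20.0, 20.1, 19.9]
print(relative_expression(target_cts, gpi_cts))
```

A target that crosses threshold two cycles after Gpi is thus scored at one quarter of the Gpi level, assuming perfect doubling per cycle.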
Cells grown on coverslips were fixed with 4% formaldehyde according to two previously established protocols (Solovei et al., 2002; Tam et al., 2002). BAC pools were nick-translated with biotin-11-dUTP (Roche) or digoxigenin-16-dUTP (Roche; Tam et al., 2002). Fixed cells were base-hydrolyzed, heat denatured, and hybridized with 200 ng of each probe and 40 μg mouse CoT1 DNA (Invitrogen; Tam et al., 2002). Probes were detected with anti-digoxigenin antibody or avidin labeled with TRITC or fluorescein (Roche). Cells were counterstained with 1 μg/ml DAPI (Sigma-Aldrich) and mounted in Vectashield (Vector Laboratories).
Cells fixed as in the previous section were immunostained with antibody CTD 4H8 (Upstate Biotechnology), rat anti-BrdU (Harlan SeraLab), guinea pig anti–lamin B receptor (a gift from L. Shultz, The Jackson Laboratory; Hoffmann et al., 2002), or rabbit anti-SRm300 (a gift from B. Blencowe, University of Toronto, Toronto, Canada; Blencowe et al., 1998) as previously described (Moen et al., 2004). Primary antibodies were then detected with Alexa Fluor 488–goat anti–mouse IgG (Invitrogen), FITC–goat anti–rat, Cy5–goat anti–guinea pig, or Cy5–goat anti–rabbit IgG (Jackson ImmunoResearch Laboratories).
Cells were examined with an Axioplan2 (Carl Zeiss MicroImaging, Inc.) or a DMRE (Leica) microscope equipped with a filter wheel, triple-bandpass epifluorescence filter set (model 83000; Chroma Technology), and a 100×, 1.4 N.A., oil PlanApo objective, at 21 ± 1°C. Images were acquired with a Micromax (Princeton Instruments) camera and Metamorph imaging software (Universal Imaging), or a Zeiss Axiocam MRm and Axiovision 4.1. Image stacks were acquired at 0.1 μm intervals, deconvolved, and rendered in 3D with AutoDeblur (Media Cybernetics, Inc.). In some cases (Fig. 8 A and Fig. S3), cells were also imaged with a confocal microscope (SP2; Leica) with a 100×, 1.4 N.A., oil PlanApo objective and with the pinhole set at 1 Airy disc.
Morphological and image analyses
The following definitions were applied for scoring morphological patterns: striped, at least six clearly alternating cluster–desert signals along a linear fiber; zigzag, >50% of cluster signals shifted in one direction from desert signals; gene cluster/red hub, central core of at least three clusters (or red foci with shifted labeling scheme), with peripheral deserts (or green foci) >180° around the cluster hub. Statistical differences in morphological pattern were determined by χ² tests (Miller and Freund, 1965). Packing ratios relative to naked DNA were determined from 3D images by measuring along the length of the largest overall structure. Measurements of distances between specific gene clusters and deserts were compared by two-sided KS tests (Miller and Freund, 1965). Image stacks were segmented both manually (MetaMorph; Molecular Devices) and automatically with custom-written software (Khoros; AccuSoft; Edelmann et al., 2001), with similar results. For analysis of Mmu14 region radial distribution, gene cluster and desert signals in confocal image stacks were mapped relative to a series of 25 shells that expand from the center of the nucleus (0) to the periphery (100; Cremer et al., 2003).
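To make the distance comparison concrete, the two-sample KS statistic can be computed as the largest gap between the empirical cumulative distributions of two sets of measurements. The following Python sketch computes only the statistic D and no P value. The function name and the distance values are hypothetical, and the code is not the software used for the analysis.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic D (maximum ECDF gap)."""
    if not sample_a or not sample_b:
        raise ValueError("both samples must be non-empty")
    a, b = sorted(sample_a), sorted(sample_b)
    d = 0.0
    # The largest gap between two step functions occurs at an observed value.
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d


# Hypothetical inter-signal distances in um (invented for illustration)
cluster_to_cluster = [0.31, 0.42, 0.38, 0.55, 0.47]
cluster_to_desert = [0.62, 0.71, 0.58, 0.80, 0.66]
print(ks_statistic(cluster_to_cluster, cluster_to_desert))
```

Because D depends only on the rank order of the pooled measurements, the comparison makes no assumption about the shape of the distance distributions.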
The spherical 1-Mb chromatin domain model (Kreth et al., 2004; Bolzer et al., 2005) was adapted to accommodate a diploid mouse genome in a spherical 8-μm-diam nucleus. In this model, each chromosome is described as a linear chain of 1-Mb, 500-nm-diam domains, similar to empirically observed replication foci (Ma et al., 1998; Sadoni et al., 2004). The 1-Mb domains are connected by 100-kb DNA linkers modeled with an entropic spring potential, which enforces a mean distance of 600 nm between consecutive domains. Different domains interact with each other through a slightly increasing exclusive potential that allows a certain amount of overlap. In addition, to maintain a mean chromosome territory volume similar to that measured in nuclei, each chain of domains representing a chromosome was surrounded by a weak barrier potential. Approximately 200,000 Metropolis Monte Carlo steps were calculated to independently move the positions of all chromosome domains and “relax” all chromosomes from a condensed, mitotic-like state. To ensure equilibrated interphase configurations, 200,000 additional steps were sampled (Kreth et al., 2004; Bolzer et al., 2005). To measure separation distances between each Mmu14 gene cluster and desert, the gravity centers of the 1-Mb domains corresponding to these regions in the virtual Mmu14 were identified for both homologues in 50 simulated nuclei.
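The relaxation procedure can be illustrated with a heavily simplified Metropolis Monte Carlo sketch in Python. It is a toy version and not the published model: a single short chain is simulated, the spring is given an assumed Gaussian form in kT units, the nucleus is a hard wall, and the exclusive and territory barrier potentials are omitted. All function names, the move size, and the step counts are hypothetical.

```python
import math
import random

LINK_NM = 600.0             # linker distance scale taken from the text
NUCLEUS_RADIUS_NM = 4000.0  # 8-um-diam spherical nucleus
ORIGIN = (0.0, 0.0, 0.0)


def spring_energy(p, q):
    """Assumed Gaussian spring energy in kT units: 1.5 * r^2 / LINK_NM^2."""
    return 1.5 * math.dist(p, q) ** 2 / LINK_NM ** 2


def local_energy(chain, i, pos):
    """Energy of the springs attached to domain i if it sat at pos."""
    energy = 0.0
    if i > 0:
        energy += spring_energy(pos, chain[i - 1])
    if i < len(chain) - 1:
        energy += spring_energy(pos, chain[i + 1])
    return energy


def metropolis_step(chain, rng, max_move_nm=100.0):
    """Try to displace one randomly chosen domain; return True if accepted."""
    i = rng.randrange(len(chain))
    old = chain[i]
    new = tuple(c + rng.uniform(-max_move_nm, max_move_nm) for c in old)
    if math.dist(new, ORIGIN) > NUCLEUS_RADIUS_NM:
        return False  # hard wall: reject moves that leave the nucleus
    delta = local_energy(chain, i, new) - local_energy(chain, i, old)
    # Metropolis criterion: always accept downhill, sometimes accept uphill
    if delta <= 0 or rng.random() < math.exp(-delta):
        chain[i] = new
        return True
    return False


def relax(n_domains=5, n_steps=20000, seed=0):
    """Relax a chain from a condensed start (domains spaced 10 nm apart)."""
    rng = random.Random(seed)
    chain = [(0.0, 0.0, 10.0 * k) for k in range(n_domains)]
    for _ in range(n_steps):
        metropolis_step(chain, rng)
    return chain


chain = relax()
print([round(math.dist(chain[k], chain[k + 1])) for k in range(len(chain) - 1)])
```

In this toy version, distances between consecutive domains drift from the condensed 10-nm spacing toward the scale set by `LINK_NM`. Separation distances between specific domains would then be collected over many independently relaxed chains.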
Online supplemental material
Fig. S1 shows BAC contigs used for probing Mmu14 and Mmu15 regions. Fig. S2 shows predominant cluster–desert conformations are independent of fixation conditions, sample size, and immortalization of fibroblasts. Fig. S3 shows gene cluster organization relative to pol II sites and splicing factor–rich domains. Fig. S4 shows cluster–desert organization relative to chromocenters. Table SI shows primer sets used for real time RT-PCR.
We thank Greg Cox, Derry Roopenian, Tali Barak, Rob Burgess, Ben Blencowe, Leonard Shultz, and Lester Lau for reagents, Megan McOsker and Joel Graber for statistical analysis, and Sue Ackerman, Simon John, Barbara Tennet, and Barbara Knowles for critical comments on this manuscript.
This work was supported by National Institutes of Health (NIH) grants (HD41066 and CA034196 to T.P. O'Brien and to The Jackson Laboratory [TJL], respectively) and a National Science Foundation (NSF) grant (EPSCoR 0132384) to the University of Maine.
Any opinions, findings, conclusions, or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the NIH or NSF.
K.A. Peterson and T.P. O'Brien's present address is the Dept. of Biomedical Sciences, Cornell University, Ithaca, NY 14853.
Abbreviations used in this paper: BAC, bacterial artificial chromosome; DRB, 5,6-dichlorobenzimidazole riboside; ES, embryonic stem; Mmu, Mus musculus; KS, Kolmogorov-Smirnov; pol II, RNA polymerase II.