Ionizing radiations (IR) alter hematopoietic stem cell (HSC) function on the long term, but the mechanisms underlying these effects are still poorly understood. We recently showed that IR induces the derepression of L1Md, the mouse young subfamilies of LINE-1/L1 retroelements. L1 contributes to gene regulatory networks. However, how L1Md are derepressed and impact HSC gene expression are not known. Here, we show that IR triggers genome-wide H3K9me3 decrease that occurs mainly at L1Md. Loss of H3K9me3 at intronic L1Md harboring NF-κB binding sites motifs but not at promoters is associated with the repression of HSC-specific genes. This is correlated with reduced NFKB1 repressor expression. TNF-α treatment rescued all these effects and prevented IR-induced HSC loss of function in vivo. This TNF-α/NF-κB/H3K9me3/L1Md axis might be important to maintain HSCs while allowing expression of immune genes during myeloid regeneration or damage-induced bone marrow ablation.
Introduction
Exposure to ionizing radiations (IR), in the context of medical use such as radiotherapies, is an independent risk factor for many disorders characteristic of an accelerated aging. The hematopoietic tissue is particularly sensitive to IR. We and others have shown that total body irradiation (TBI) in mice leads to long-term defects in hematopoiesis due to loss of hematopoietic stem cell (HSC) reserves and functions (Fleenor et al., 2015; de Laval et al., 2013; Mohrin et al., 2010). In HSCs, IR induces DNA damage accumulation, loss of self-renewal, and a biased differentiation toward the myeloid lineage leading to increased myeloid cell counts and decline of the adaptive immune response. These changes are likely contributing to many IR-induced premature aging disorders and to the high risk of developing myeloid leukemia. Understanding the molecular mechanisms leading to HSC loss of function upon IR is necessary to modulate its adverse effects. It may also help identifying the first events leading to hematologic malignancies.
IR induces DNA double strand breaks (DSBs). DNA damage is thought to be one of the main driving forces of aging. However, delaying the effects of age in mice by decreasing the levels of DNA damage has never been achieved, and a direct link between DSB formation and physiological aging is still lacking (White and Vijg, 2016). Although IR has been shown to induce chromosomal abnormalities in progenitors (de Laval et al., 2013; de Laval et al., 2014; Mohrin et al., 2010), in fact HSCs are quite resilient toward accumulating DNA mutations in response to DNA damage (Moehrle et al., 2015).
In addition to DSBs, IR has been shown to induce changes in chromatin state, mainly at the level of DNA methylation, in different tissues and cell lines (Miousse et al., 2017b). Epigenetic alterations have been observed in aged HSCs in the absence of mutations in epigenetic factors (Djeghloul et al., 2016; Sun et al., 2014). Reorganization of heterochromatin is among the most commonly reported changes in aging and senescence, supporting its essential role in maintaining proper cellular function (Tsurumi and Li, 2012). Maintenance of HSC identity is dependent on the heterochromatin mark H3K9me3 (Koide et al., 2016). Decreased H3K9me3 in HSCs due to loss of Suv39h2 and/or Suv39h1 methyltransferases occurs with age and is associated with the loss of B cell differentiation (Djeghloul et al., 2016) and hematopoietic changes archetypal of aging (Keenan et al., 2020). However, whether the long-term effects of IR on HSCs are linked to IR-induced changes in heterochromatin remains to be addressed.
Heterochromatin plays also a major role in the maintenance of the genome stability by repressing transposable elements (TEs), including DNA transposons and retro-TEs (RTEs), further classified as LTR sequences that characterize endogenous retroviruses (ERVs), and non-LTR elements such as long or short interspersed elements (LINE-1/L1; SINE). Propagation of RTEs in the genome has been recognized as a great source of genomic instability. Even without propagating, RTEs have also been recently recognized as major contributors of gene regulatory networks (Chuong et al., 2017). Indeed, they qualitatively and quantitatively control gene expression, providing alternative enhancers, promoters, splicing, or polyadenylation signals, and also serving as cis-regulatory elements in a cell-specific fashion. Basal L1 expression in early mouse embryo is necessary for its proper development (Jachowicz et al., 2017). RTEs are involved in T cell differentiation by regulating genes involved in immune processes (Adoue et al., 2019). Abnormal RTE expression has been observed in cancers, including acute myeloid leukemia (AML), and may be involved in the pathogenesis through the alteration of host gene expression (Chuong et al., 2017) and the expression of oncogenes (Deniz et al., 2020; Jang et al., 2019).
We recently showed that evolutionary recent mouse L1s (L1Md) are highly expressed in HSCs and that their expression is further increased and maintained at high levels up to 1 mo after TBI (Barbieri et al., 2018). TE expression is also increased in HSCs after chemotherapies (Clapes et al., 2021) and the decreased H3K9me3 in aged HSCs is associated with increased L1Md expression (Djeghloul et al., 2016). However, the impact of L1Md derepression on the HSC transcriptome remains to be addressed. In addition, the mechanisms and signaling pathways by which IR specifically triggers L1Md expression in HSCs are currently unknown. We and others previously showed that the NF-κB signaling pathway is required to prevent IR-induced HSC injury (Hu et al., 2021; de Laval et al., 2014). In addition, TNF-α–induced NF-κB supports HSC survival during inflammation and chemotherapeutic stress induced by 5-fluorouracil (Yamashita and Passegué, 2019). On the other hand, basal activation of NF-κB is required for HSC homeostasis and self-renewal potential and the expression of key HSC maintenance genes are severely impaired in mice deficient for NF-κB pathway factors (Fang et al., 2018; Hu et al., 2021; Stein and Baldwin, 2013). We show here that IR induces a major loss of H3K9me3 in HSCs, which mainly affects L1Md subfamilies, and more specifically intronic L1Md displaying NF-κB binding sites. By controlling the levels of H3K9me3 at selected intronic L1Md located in genes belonging to the HSC signature, this pathway plays a crucial role to preserve HSC-specific gene expression and HSC function during IR stress.
Results
H3K9me3 is mainly enriched at recent L1Md subfamilies in HSCs
To assess the effect of IR on H3K9me3 in HSCs and more particularly at TE sequences, we performed H3K9me3 chromatin immunoprecipitation sequencing (ChIP-seq) experiments in HSCs (Lin−Sca+-Kit+-CD34−Flk2−) sorted from mice 1 mo after TBI (2 Gy) or not, as previously described (Barbieri et al., 2018; Fig. 1, A and B). Deep characterization of H3K9me3 enrichment in HSCs, notably at TE sequences, has never been performed. We thus first characterized H3K9me3 genomic coverage in HSCs at steady state.
Analysis of genomic repeats is still a bioinformatics challenge. The assignment of reads complementary to sequences that are repeated in the genome and that present low sequence variation is largely compromised. These multiple mapping reads are often discarded in ChIP-seq studies. However, when using only unique mapping reads, one can induce a bias of representation of the TE subfamilies toward the oldest ones as the youngest subfamilies, such as L1Md, present a very low mappability due to quasi-identical copies (Teissandier et al., 2019). Therefore, while young families of TEs are the most epigenetically regulated in the genome (Barau et al., 2016; Pezic et al., 2014), they are severely underestimated in unique read analysis.
To maximize the output information on these sequences, we considered all reads that mapped to the genome without mismatches and randomly assigned them at one of their best possible positions in the genome (Fig. 1 A). We combined both unique and multiple mapping reads analyses (U-MRA and M-MRA, respectively) to finally get a compromise between precise assignment of the unique reads and global information at youngest TE subfamily level.
Quality control of the resulting reads indicated high genomic coverage in both U- and M-MRA and in non-irradiated (NIR) and IR conditions (Table S1). Peak calling followed by reproducibility measurement between replicates (irreproducible discovery rate [IDR]) identified 456 peaks on average in the NIR conditions in the U-MRA. As expected, with 7,769 peaks, M-MRA gave a substantial increased number of peaks (Fig. 1 C and Table S1). Our peak calling retrieved the SUV39h-1– and -2–dependent H3K9me3 peaks that were previously described at young TEs such as intracisternal A particle (IAP) ERVs (Fig. S1 A) using this strategy in mouse embryonic stem cells (mESCs; Bulut-Karslioglu et al., 2014). Of note, these elements were not covered by U-MRA.
After annotation of the peaks using annotateR, both U-MRA and M-MRA showed that the majority of the H3K9me3-enriched peaks occur at TEs (41.9 and 52%, respectively; Fig. 1 D). Interestingly, M-MRA shows a gain in LINE representation, with 32.7% of the total H3K9me3 enriched peaks vs. 16.5% in U-MRA. By contrast, M-MRA shows reduced SINE and DNA representation while LTR representation was not affected. This suggests that H3K9me3 enrichment occurs mainly at LINEs in HSCs. Heatmap representation of the average concentration (i.e., H3K9me3 enrichment normalized to input) of H3K9me3 at peaks retrieved from M-MRA, followed by RTE genomic coverage, further confirmed that peaks enriched in H3K9me3 at basal are mostly covering LINEs compared to LTR, SINEs, or DNA, and more especially the young subfamilies of LINEs (L1Md) compared to older ones (Lx5; Fig. 1 E).
Altogether, these data demonstrate that H3K9me3 is mainly enriched at repetitive sequences, and more particularly at young L1Md elements in HSCs.
Irradiation induces a loss of H3K9me3 that mainly affects the recent L1Md subfamilies
We next investigated changes in H3K9me3 that occur upon IR. We first compared the number of H3K9me3 peaks identified in IR vs. NIR conditions. Only 115 peaks in average were found in IR in the U-MRA, compared to 456 peaks in NIR condition, and 283 compared to 7,769 in the M-MRA (Fig. 1 C). Heatmap representation further showed a loss of H3K9me3 enrichment at peaks retrieved from M-MRA upon IR (Fig. 1 E).
Differential H3K9me3 enrichment analysis performed at H3K9me3 peaks identified in both IR and NIR conditions further revealed a strong decrease in H3K9me3 enrichment in IR (Fig. 2 A). We found 13 and 253 peaks showing significant (P < 0.05) differential H3K9me3 enrichment upon IR in U- and M-MRA, respectively. All of them in U-MRA and 252/253 peaks in M-MRA showed decreased H3K9me3 upon IR (Fig. 2 A). These data reveal a major loss of H3K9me3 genomic coverage upon IR, regarding both the number of peaks and the concentration of H3K9me3 at conserved peaks.
We performed the same analysis at TE genomic loci instead of peaks and found 2,559 loci with significant H3K9me3 differential enrichment, 1,667 and 892 loci showing decreased or increased H3K9me3, respectively, upon IR (Fig. 2 B). Annotation of these loci showed that they are enriched in LINEs (66.7%) and LTR (29.23%) compared to the distribution of these TEs in the mouse genome (24.71 and 22.07%, respectively; Fig. 2 C). The distribution of the H3K9me3 concentration showed a significant (P < 0.0001) decreased in H3K9me3 at both LINEs and LTR upon IR (Fig. 2 D). Comparing the distribution of the log2 fold change between IR and NIR conditions in H3K9me3 concentration showed that this decrease is significantly (P < 0.0001) more pronounced at LINEs than at LTRs (Fig. 2 E).
Since young TE subfamilies, notably L1Md, were previously reported to be the most epigenetically regulated in the genome compared to old LINEs in ESCs and testis (Barau et al., 2016; Pezic et al., 2014), we further dissected H3K9me3 enrichment at TEs depending on their age, as calculated in Sookdeo et al. (2013). We observed a significant negative correlation between the age of the LINE and H3K9me3 enrichment, with the youngest, typically L1Md_A and L1Md_Tf, showing the highest enrichment (Fig. 2 F), as previously observed for DNA methylation (Miousse et al., 2017a). IR induced a significantly (P < 0.0001) more pronounced loss of H3K9me3 at young L1Md compared to other LINEs (Fig. 2 G).
Altogether these data reveal that 1 mo after IR, HSCs display a major loss of H3K9me3, which mainly occurs at L1Md, the subfamilies of TEs the most enriched in H3K9me3 at steady state. This prompted us to further focus our analysis on L1Md.
Plot profile of H3K9me3 enrichment along L1Md sequences showed asymmetric distribution of H3K9me3 along the L1 body, with the highest enrichment and the highest loss upon IR at the 5′ end of these elements (Fig. 2 H and Fig. S1 B). Finally, we confirmed the global decrease of H3K9me3 at L1Md, and more particularly at L1Md_A, using ChIP–quantitative PCR (qPCR) experiments with primers recognizing either all L1Md or specifically L1Md_A promoter (Fig. 2 I), as described previously (Barbieri et al., 2018).
Irradiation induces a strong deregulation of the HSC transcriptome
In order to unravel the consequences of IR-induced H3K9me3 loss on the HSC transcriptome, we performed RNA-seq of HSCs 1 mo after TBI. Comparison of IR vs. NIR revealed 1,067 differentially expressed genes (DEGs; P < 0.05), with 602 (56.4%) genes downregulated and 465 (43.6%) genes upregulated upon IR (Fig. 3, A and B; and Table S2). Differences in gene expression are very strong, as almost 80% of the DEGs between IR and NIR present a fold change above 10 and more than 50% present a fold change above 50 (Fig. 3 B). Gene set enrichment analysis (GSEA) on Hallmark gene sets indicated significant enrichments in DNA repair, G2/M checkpoint, and oxidative phosphorylation pathways, as expected upon IR (Fig. 3 C), whereas the main pathways lost in IR are related to cell signaling (Fig. 3 D). Among these, the most significant decrease concerns genes regulated by NF-κB in response to TNF-α (Fig. 3, D and E). A recent report showed that TNF-α induces a specific prosurvival gene signature in HSCs (Yamashita and Passegué, 2019). Interrogating this gene signature, composed of 62 genes representing both core regulators of the NF-κB pathway and TNF-α–induced HSC-specific survival genes, we observed its significant loss upon IR (Fig. 3 F). IR also induced the loss of the different HSC TNF-α signatures taken individually: the two in vitro signatures obtained 3 and 12 h after TNF-α treatment and the in vivo signature obtained 3 h after TNF-α treatment in mice (Fig. S2 A). By contrast, IR had no effect on the granulocyte/monocyte progenitor (GMP) TNF-α gene signature (Fig. 3 G; Yamashita and Passegué, 2019).
Given the importance of the NF-κB pathway at maintaining HSC survival and self-renewal or during chemotherapeutic and IR stress (Hu et al., 2021; de Laval et al., 2014; Yamashita and Passegué, 2019), we next interrogated different published HSC signatures that are enriched in low-output/self-renewing and functional long-term regeneration HSCs compared to differentiating HSCs, or in dormant vs. activated HSCs (Cabezas-Wallscheid et al., 2017; Chambers et al., 2007; Pietras et al., 2015; Rodriguez-Fraticelli et al., 2020). We found a significant loss of all these signatures upon IR (Fig. 3 H and Fig. S2 B). Likewise, the megakaryocyte (MK)-biased output HSC signature (Rodriguez-Fraticelli et al., 2020), representing platelet-primed HSCs that were also previously described to be at the apex of the HSC hierarchy (Sanjuan-Pla et al., 2013), was also enriched in NIR vs. IR HSCs (Fig. S2 B). Conversely, the high-output and multilineage signatures (Rodriguez-Fraticelli et al., 2020), which mark differentiating HSCs, showed a significant enrichment upon IR (Fig. S2 C). Altogether, these data show that IR induces a loss of transcriptional signatures involved in HSC quiescence, long-term potency, and self-renewal capacity, and a gain in gene signatures involved in HSC differentiation, thus recapitulating the HSC loss of self-renewal previously reported upon IR (de Laval et al., 2013).
IR-induced downregulation of gene expression is associated with the presence of L1Md in their introns
In order to check if gene deregulation upon IR might be associated with H3K9me3 enrichment changes at gene promoters, we quantified H3K9me3 enrichment over promoter regions (−2 kb; +1 kb around transcription start site [TSS]; U-MRA). We found 239 promoter sequences showing a significant (P < 0.05) H3K9me3 differential enrichment upon IR, 211 and 28 showing decreased and increased H3K9me3, respectively (Fig. S2 D). However, these variations are not correlated with gene upregulation, and only very poorly correlated with gene downregulation (Fig. 4 A and Fig. S2 E).
Since the loss of H3K9me3 upon IR mainly occurs at L1Md, we sought to determine if L1Md derepression might be involved in the gene deregulation observed after IR. Most of the information concerning H3K9me3 enrichment at young RTE such as L1Md is obtained through M-MRA. However, reads from the M-MRA are arbitrary assigned and cannot be precisely localized. Thus, it is impossible to determine if L1Md derepression is associated with gene deregulation by crossing our ChIP-seq and RNA-seq data. To overcome this issue, we crossed the list of the 1,067 DEGs (P < 0.05) in IR vs. NIR from our RNA-seq data with the list of genes hosting one or several L1Md (reconstructed repeatMasker database; Fig. 4 B and Table S3). The vast majority of these L1Md are located in introns (99%; Fig. S2 F). We found that 377 DEGs in IR host one or several L1Md, in majority located in introns. This is significantly (P < 0.0001) more than one would expect by chance, as revealed by a permutation test using 10,000 lists of 1,067 genes randomly extracted from the refseq database (Fig. 4, B and C). These results reveal a strong and significant association between gene deregulation upon IR and the presence of an intronic L1Md in these genes. This is specific for L1Md as no significant association was observed between DEGs and the presence of Lx5, an older LINE subfamily (Fig. 4 D). Surprisingly, this association is specific for genes that are downregulated upon IR (Fig. 4 E) and was not found for upregulated genes (Fig. 4 F and Table S3).
Interestingly, 50% of the genes from long-term (LT)-HSC signatures (Chambers et al., 2007; Pietras et al., 2015) whose expression is significantly decreased upon IR (Fig. 4, G and H) contain one or several L1Md in their introns. Similarly, we found that 31 and 41% of the genes from the low-output and MK-biased signatures (Rodriguez-Fraticelli et al., 2020) host one or several L1Md (Fig. S2, G and H). Altogether, these data highlight a significant association between genes whose expression is repressed upon IR, notably those belonging to the HSC signature, and the presence of intronic L1Md.
Gene repression upon IR is associated with loss of H3K9me3 at selected intronic L1Md loci harboring NF-κB binding sites
We next investigated if the loss of H3K9me3 induced by IR at intronic L1Md in a given gene could be associated with its decreased expression in cis. For this purpose, we chose six candidate genes whose expression is significantly reduced by IR in our RNA-seq data: Mecom, Pkia, Ttc8, Rbms3, Rmdn2, and Akt3 (Table S2), five of them (Mecom, Pkia, Ttc8, Rbms3, Rmdn2) being part of the different HSC signatures, as well as four negative controls, also harboring at least one intronic L1Md but whose expression remains unchanged upon IR (0.7 < fold change < 1.3, P > 0.05): Snx27, Mapre2, Celf2, and Pdcd1lg2. 1 mo after TBI, a significant downregulation of Mecom, Pkia, and Rmdn2, but not of Snx27, Mapre2, Celf2, was observed (Fig. 5, A and C). To assess H3K9me3 enrichment specifically at intronic L1Md of these genes, we performed H3K9me3 ChIP-qPCR experiments using primer pairs located both in the intron and in the 5′ end of the L1Md, thus allowing unique and specific amplicon production (Fig. 5, A and B). Apart for Rmdn2, we chose the longest L1Md (>5 kb), displaying higher enrichment at the 5′ end (Fig. 2 H), to detect significant basal H3K9me3. We first confirmed that H3K9me3 is indeed present at intronic L1Md, as compared to Spi1 and 5S negative controls (Fig. S3, A and B). Of note, H3K9me3 levels at tested L1Md are similar to those found globally at L1Md_A promoters (Fig. S3 B). IR induced a significant decrease in H3K9me3 specifically at all the chosen candidate intronic L1Md of genes downregulated upon IR (Fig. 5 D and Fig. S3 C). Of note, we randomly chose four negative candidates that fit our criteria, and for the four candidates selected, no loss of H3K9me3 was detected. This indicates that IR-induced H3K9me3 loss at intronic L1Md is not a general event but rather that it seems to occur only in specific genes whose expression is reduced upon IR. This suggests that the presence of the H3K9me3 mark at this location may play a role in the regulation of gene expression in cis upon IR.
In order to test this hypothesis, we chose to delete specifically the intronic L1Md of Mecom through CRISPR/Cas9 and guide RNAs targeting each side of the L1Md (5′gRNA and 3′gRNA; Fig. 5 E) and tested Mecom expression upon IR. We electroporated LSK cells with the Cas9/guide RNA (gRNA) ribonucleoprotein particles (RNP) complex (Gundry et al., 2016) together with a siglo green electroporation indicator and we sorted electroporated (FITC+) HSCs for analysis (Fig. 5 F and Fig. S3 D). We first tested different combinations of 5′ and 3′ gRNAs (Table S4). Only one of these combinations was efficient in deleting the L1Md (Fig. S3 E). Using this combination, we then validated that L1Md deletion occurred in Mecom, but not in Pkia, through qRT-PCR using the ChIP-qPCR primers (Fig. S3 F). We also validated that in mock electroporated cells IR could induce the specific downregulation of Mecom and Pkia but not of Snx27 expression in vitro, as observed in vivo (Fig. 5 G). Deletion of its intronic L1Md leads to a reduction in Mecom gene expression, without affecting Pkia or Snx27 expression, suggesting that the presence of the L1Md is important for the proper regulation of Mecom expression. While Pkia expression remained significantly reduced upon IR after Mecom intronic L1Md deletion, Mecom expression was not affected anymore. This suggests that Mecom intronic L1Md acts in cis, and is responsible for the specific gene downregulation upon IR (Fig. 5 G).
To unravel what makes the specificity of H3K9me3 loss at L1Md upon IR, we interrogated the potential enrichment for binding motifs for transcription factors in the L1Md located in genes downregulated in IR (P < 0.05, 1069 L1Md = input sequences) vs. nonderegulated genes (P > 0.05, 0.7 < fold change < 1.3, 1310 L1Md = background sequences) through de novo motif search using BAMMmotif (https://bammmotif.soedinglab.org/; Kiesel et al., 2018). Of the four motifs found significantly enriched in L1Md located in downregulated genes, two were not retrieved in the mouse Hocomoco motif database, one motif corresponds to the binding site of Hoxa1, and one to the binding site of Rel, a member of the NF-κB transcription factor family (Fig. 5 H). De novo motif search in the L1Md located in downregulated genes vs. upregulated genes also retrieved Rel motif (Fig. S3 G). Furthermore, de novo search for motifs specifically enriched in L1Md located in genes participating (284 L1Md) vs. genes not participating (793 L1Md) to the loss of the HSC signature (Fig. 3 H) also revealed significant enrichment for NF-κB transcription factors Rel, RelA, and Nfkb1 (Fig. 5 I). Similar results were found for the low-ouput/self-renewing LT-HSC signatures (Fig. S3, H and I). Of note, this motif is present in the intronic L1Md of IR-regulated genes Mecom, Pkia, Ttc8, Rbms3, Akt3, and Rmdn2 which loose H3K9me3 upon IR (Fig. 5 D, Fig. S2 C, and Table S3), supporting the possibility that it may regulate the presence of H3K9me3 at these loci.
Performing de novo search on promoter sequences of genes downregulated by IR (P < 0.05, 3,893 promoter sequences) vs. nonderegulated genes (0.7 < fold change < 1.3, P > 0.05, 14,208 promoter sequences from which we randomly extracted 3,893 sequences) did not show specific enrichment in motifs for NF-κB members. Instead, motifs for different transcription factors such as Egr, Znf, Foxj, and Foxq were found (Fig. S3 J).
Altogether, these data suggest that the NF-κB pathway may control HSC gene expression by regulating the presence of H3K9me3 mark at L1Md located in introns, and not by affecting promoters.
TNF-α treatment prevents loss of H3H9me3 at intronic L1Md and HSC gene repression during IR stress
In mammals, the NF-κB family is composed of five members: RELA (p65), RELB, c-REL, and the precursor proteins NFKB1 (p105) and NFKB2 (p100), which are processed into their active forms, p50 and p52, respectively, and are active as homo- or heterodimers (Cartwright et al., 2016). The canonical NF-κB pathway involves p50, p65, and c-Rel. P50 lacks transactivation domain. Thus, while p50/p65 or c-Rel heterodimers act as transcriptional activators, p50/p50 (NFKB1) homodimers are generally described as transcriptional repressors. NFKB1 notably represses the expression of proinflammatory genes through the recruitment of chromatin modifiers and H3K9 methylation (Ea et al., 2012; Elsharkawy et al., 2010). In addition, p50 has been shown to shuttle between the nucleus and cytoplasm and to bind to a large number of genes in unstimulated cells (Schreiber et al., 2006). This makes of this factor a good candidate to regulate H3K9me3 levels at L1Md presenting NF-κB sites and IR-induced HSC gene expression changes. Supporting this possibility, HSCs sorted from mice 1 mo after TBI showed a significant decrease in both NFKB1 mRNA, as well as protein expression tested with two different antibodies (Fig. 6, A–C; and Fig. S4 A). Processing of p105 to p50 is regulated both independently of the NF-κB activation pathway and during activation of the canonical pathway induced by proinflammatory cytokines such as TNF-α (Cartwright et al., 2016). As shown above, IR induces a loss of the TNFA_signaling_Via_NFKB signature (Fig. 3, D and E). Thus, we asked whether TNF-α stimulation could prevent IR effects by rescuing the levels of p50 homodimers. NFKB1 protein expression decreased after 48 h of culture of purified HSCs that have been irradiated in vitro (Fig. 6, D and E; and Fig. S4 B). Addition of TNF-α to the cell medium prior to IR prevented this effect. As after TBI, the loss of NFKB1 induced by IR in vitro was associated with a specific decrease of Mecom, Pkia, and Ttc8 but not of Celf2, Snx27, and Mapre2 mRNAs (Fig. 6 F and Fig. S4 C). Importantly, H3K9me3 ChIP-qPCR at intronic L1Md of the selected genes correlated with gene expression with a significant and specific reduction at Mecom and Pkia intronic L1Md but not at Snx27 and Celf2 (Fig. 6 G). This shows that IR-induced H3K9me3 loss at intronic L1Md and gene expression changes in HSCs are direct and short-term and that TNF-α stimulation can rescue both effects. Supporting a role of NFKB1 in these effects, Nfkb1−/− HSCs showed reduced expression of Mecom and Pkia but not of Snx27 mRNA when compared to WT HSCs (Fig. 6 H). In addition, TNF-α was unable to rescue HSC gene expression or H3K9me3 levels at their intronic L1Md in Nfkb1−/− HSCs (Fig. 6, I and J).
We then investigated if TNF-α stimulation could prevent IR effects in vivo. WT mice received two injections with 2 µg TNF-α at 12-h intervals and were irradiated 1 h after the last injection. We then performed H3K9me3 CUT&Tag and RNA-seq in HSCs sorted 1 mo after IR (Fig. 7 A). CUT&Tag gave an efficient profiling of H3K9me3 with exceptionally low background as observed at the SUV39h-1– and -2–dependent H3K9me3 peaks (Bulut-Karslioglu et al., 2014; Fig. S5 A) and as previously described (Kaya-Okur et al., 2019). Plot profile of H3K9me3 enrichment along L1Md sequences confirmed the loss of H3K9me3 at L1Md upon IR and showed it could be prevented by TNF-α (Fig. 7 B). Notably, TNF-α also inhibited the specific loss of H3K9me3 enrichment at the intronic L1Md of Mecom and Pkia but not Mapre2 (Fig. 7 C) and restored the corresponding gene expression (Fig. 7 D). RNA-seq analysis further showed that TNF-α injection in vivo prevented IR-induced loss of both TNF-α via NF-κB (Fig. 7 E) and long-term HSC signatures (Fig. 7 F and Fig. S5 B).
TNF-α treatment prevents HSC loss of function during IR stress independently of their level of DNA damage
To confirm the importance of this pathway in HSC maintenance, we analyzed the effect of TNF-α treatment on HSC function upon IR. Total bone marrow (BM) cells isolated from mice treated with TNF-α before TBI or not were transplanted in competition with total BM cells from mice ubiquitously expressing GFP (ubi-GFP mice) into lethally irradiated ubi-GFP mice (Schaefer et al., 2001; Fig. 7 A). 7 wk and 3 mo after reconstitution, as expected, the percentage of GFP negative IR donor cells in the blood was greatly decreased. TNF-α treatment before TBI significantly prevented this effect (Fig. 8, A and B; and Fig. S5 C). It also significantly prevented IR-induced Lin-Sca+Kit+ (LSK) and LT-HSC loss (Fig. 8, C and D; and Fig. S5 D). Secondary transplants showed that the self-renewal function of HSCs after IR could be restored by TNF-α treatment (Fig. 8 E). TNF-α treatment could similarly rescue blood reconstitution and LT-HSCs in the BM when injected after TBI, either in one dose at 6 h or in two doses at 1 and 13 h (Fig. 8, F–J).
Contrary to what we previously observed with thrombopoietin (de Laval et al., 2013), TNF-α treatment was not able to prevent γH2AX foci formation upon IR or to enhance the DSB repair as we observed no effect of TNF-α on the number of γH2AX foci at short (30 min) or long (24 h) term after IR in vitro (Fig. 8 K). In addition, TNF-α treatment had no effect on the number of γH2AX foci in HSCs 1 mo after IR (Fig. 8 L). Altogether, these data suggest that TNF-α treatment rescues HSC reconstitution ability upon IR independently of their level of DNA damage by preventing IR-induced decrease in NFKB1 repressor expression, specific derepression of L1Md harboring NF-κB binding sites in the introns of HSC genes, and thereby their repression (Fig. 9).
Discussion
H3K9me3 alterations are a hallmark of aging and cellular senescence in model organisms (Criscione et al., 2016; Ocampo et al., 2016). Although H3K9me3 has been shown to be crucial for HSC identity (Koide et al., 2016), H3K9me3 changes in HSCs have, to our knowledge, never been studied in the context of stress such as IR. We show here that IR stress profoundly affects HSC heterochromatin by significantly reducing H3K9me3 enrichment, without affecting the expression of factors controlling H3K9 tri-methylation (SETDB1, KAP1, or MPP8 from the human silencing hub complex). This loss mainly occurs at evolutionary recent L1Md elements. This effect was observed both after short time in vitro and long time after TBI in vivo, suggesting that heterochromatin alterations may explain the long-term effect of IR on HSC function.
Heterochromatin alterations are associated with a strong deregulation of the HSC transcriptome. H3K9me3 enrichment at promoters has recently emerged as a key player in the repression of lineage-inappropriate genes (Koide et al., 2016). Surprisingly, we found here that gene deregulation is not associated with H3K9me3 changes at gene promoters, but is rather associated with the loss of H3K9me3 at intronic L1Md. H3K9me3 enriched at intronic L1Md was previously shown to be involved in the tight regulation of gene transcription in ESCs (Liu et al., 2018). Some ERVs also play the role of AML enhancers with a driving role in leukemia cell phenotype (Deniz et al., 2020). However, our study is the first showing the involvement of L1Md on the regulation of HSC gene expression.
With CRISPR/Cas9 targeted deletion, we show that the presence of intronic L1Md in Mecom is required for Mecom gene-specific downregulation upon IR and suggests that intronic L1Md can act in cis to regulate gene expression. Although surprising at first glance, repression of genes following derepression of intragenic L1 was previously reported in cancers (Aporntewan et al., 2011). This may be due to transcriptional interference (Han et al., 2004; Kaer et al., 2011; Ninova et al., 2020). We cannot test directly this hypothesis due to the repetitive nature of these sequences. However, we could observe downregulation of Rmdn2 gene whereas the L1Md located in its intron lacks 5′UTR promoter sequences (Fig. 5 B), suggesting that gene repression may not be exclusively due to intronic L1Md transcription. Instead of blocking transcription, DNA methylation or H3K9me3 within gene bodies is a feature of transcribed gene (Jones, 2012; Ninova et al., 2020; Vakoc et al., 2005). H3K9me3 loss in gene bodies was previously shown to be associated with gene repression (Ninova et al., 2020). The presence of H3K9me3 islands in the body of the genes has been proposed to slow-down RNA Polymerase II (RNAP II) elongation rate (Saint-André et al., 2011; Vakoc et al., 2005). This was shown to help the recognition of true vs. cryptic RNA processing sites, controlling alternative splicing (de la Mata et al., 2003), polyadenylation, and finally transcript stability. This is particularly relevant in the case of genes bearing long introns. Interestingly, the genes downregulated upon IR in HSCs are significantly longer than by chance (data not shown). Slowing down RNAP II elongation rate in these long intron genes might also prevent R-loop formation and genomic instability (Aguilera and Gaillard, 2014). Thus, even in the absence of intronic L1Md transcription, loss of H3K9me3 islands enriched at intronic L1Md in the body of the genes might lead to gene repression. In accordance with this hypothesis, we showed that deletion of the L1Md in the intron of Mecom resulted in decreased Mecom gene expression (Fig. 5 G). Mecom repression in this case may be due to the absence of H3K9me3 islands in its gene body, as observed upon IR-mediated loss of H3K9me3. However, the precise mechanisms involved in HSC gene regulation through their intronic L1Md will require further investigations.
Our RNA-seq analysis indicates a strong downregulation of the TNF-α–NF-κB gene signature 1 mo after TBI. IR specifically reduced the recently described HSC-prosurvival TNF-α–NF-κB signature required to maintain HSCs during inflammation or cytotoxic BM ablation (Yamashita and Passegué, 2019). This suggests that loss of long-term regenerating HSCs and TNF-α–NF-κB gene expression upon IR may be linked. Supporting this possibility, treatment of HSCs with TNF-α before IR in vitro, or its injection to mice before or after TBI, restored HSC gene expression and their reconstitution ability. These results strongly support previous data showing that TNF-α–NF-κB signaling is required to regulate HSC function under stress (Hu et al., 2021; de Laval et al., 2014; Yamashita and Passegué, 2019). TNF-α promotes HSC survival through p65/RelA NF-κB subunit (Yamashita and Passegué, 2019). This factor has also been found to control the expression of genes involved in HSC maintenance (Stein and Baldwin, 2013). However, the promoters of HSC genes downregulated upon IR are not enriched in NF-κB binding sites. We show that L1Md associated with gene repressed upon IR are specifically and significantly enriched in NF-κB binding sites, and that this pathway regulates gene expression by controlling the level of H3K9me3 at these sequences. Interestingly, H3K9me3 enriched genomic regions specific to human ESCs as compared to more differentiated cells are enriched in NF-κB binding sites, suggesting their importance in establishing and maintaining the pluripotent state (Whitaker et al., 2015).
Whereas most of the NF-κB members can form active transcription factors, NFKB1 p50 subunit lacks transactivation domain and p50:p50 homodimers have been shown to act as stimulus-specific repressors, notably during the resolution phase of inflammation, by recruiting H3K9 methyltransferases and histone deacetylases at both NF-κB and type-I IFN response genes (Cartwright et al., 2018; Ea et al., 2012; Elsharkawy et al., 2010). p50:p65 heterodimers are the most abundant form of NF-κB generated upon inflammatory stimuli. By contrast, p50 homodimers predominate in unstimulated cells where they can be prebound to the chromatin (Cartwright et al., 2016; Schreiber et al., 2006), suggesting that this factor may also play a role under noninflammatory conditions. Indeed, we found that NFKB1 is present in both the cytoplasm and the nucleus of resting HSCs. Deletion of Nfkb1, or downregulation of p50/NFKB1 gene and protein as induced by IR, correlates with the decrease of both gene expression and H3K9me3 levels at their intronic L1Md harboring an NF-κB binding sites. Conversely, increasing p50 production upon TNF-α stimulation rescued H3K9me3 levels at the specific intronic L1Md and gene expression in WT but not in Nfkb1−/− HSCs. This strongly supports the possibility that p50:p50 homodimers are the active repressor promoting the enrichment of H3K9me3 at L1Md located in HSC genes and the cis regulation of the host gene.
A growing body of evidence indicates that TEs have been coopted for transcriptional regulation in different cell and tissue types (Chuong et al., 2017). TEs are reservoirs of functional transcription factor binding sites. Since these sequences are widespread in the genome, they are largely contributing to the innovation of regulatory networks in a tissue-specific fashion (Chuong et al., 2017; Hermant and Torres-Padilla, 2021; Sundaram and Wang, 2018). Although LTRs dominate this relationship, a search for binding motifs in young L1 in human and mouse has revealed the presence of various TF motifs, including CTCF, YY1, and MYC (Sun et al., 2018; Sundaram and Wang, 2018). Our results show that NF-κB motifs are specifically enriched in most of the intronic L1Md sequences of genes downregulated during IR stress, and involve as much as 96% of these genes (Table S3). The presence of NF-κB binding sites in TEs is reminiscent of a study reporting that, in the human genome, 11% of NF-κB–binding sites reside in specific Alu SINEs, and that the vast majority of sites bound by NF-κB do not correlate with changes in gene expression (Antonaki et al., 2011). Although it is not known how many of these NF-κB motifs present in intronic L1Md have a functional role, the ability of TNF-α to restore both H3K9me3 levels at the L1Md and the expression specifically of genes including NF-κB motif-enriched intronic L1Mds strongly suggests that at least some of these NF-κB–TE associations can influence gene expression.
TEs have rewired the antiviral gene regulatory network and they have been shown to play a key role in the regulatory evolution of immune response. Strong but opposing forces are driving the coevolution of TEs and antiviral defense (Chuong et al., 2016; Moelling and Broecker, 2019). Many IFN/NF-κB–target genes are viral restriction factors and contribute to the immune control of both endogenous (i.e., TEs) and exogenous genomic parasites (Gázquez-Gutiérrez et al., 2021;,Schneider et al., 2014). We and others have previously shown that IFN-I signaling controls young L1Md expression and L1 retrotransposition in HSCs and various tissues (Barbieri et al., 2018; Goodier et al., 2015; Yu et al., 2015). However, through the formation of double-strand (ds)RNA or cytoplasmic cDNA resembling viral nucleic acids, TEs are sensed by the cells as invading viruses and promote the activation of IRF3 and NF-κB transcription factors and the major antiviral immune pathways (Gázquez-Gutiérrez et al., 2021; Volkman and Stetson, 2014). Notably, TE-derived dsRNAs have been shown to provide the inflammatory signal necessary for HSC generation during embryonic development (Lefkopoulos et al., 2020). Intriguingly, beside HSC maintenance genes, many genes involved in IFN and NF-κB immune response pathway are found among genes downregulated in IR presenting an intronic L1Md with an NF-κB binding site (Table S3). These include target genes of IFN and NF-κB, such as EiF2ak2 and Oas1g, that are known to control L1 retrotransposition and/or levels, and whose activity are triggered by virus- or TE-derived dsRNAs; Jak2, Tyk2, Tnfrsf9, and Birc2 involved in IFN and TNF responses, respectively, as well as T-cell suppressing activity genes, CD274 (PD-L1) and CD86. This further reinforces the causal relationships between TEs and immune genes and their coevolution. Interestingly, a higher TE occurrence has been found in immune gene-associated genomic regions and young TEs are specifically enriched in blood cells, as compared to other tissues (Trizzino et al., 2018; Ye et al., 2020). Using BAMMmotif for de novo motif search, we have found that the NF-κB motif is specifically enriched in L1Md that are present in genes of the HSC signature, and in the myeloid-leucocyte-mediated-immunity signature (GO:0002444) as compared to genes enriched in pancreas, testis, kidney, liver, placenta, and salivary gland (Su et al., 2002; Fig. S3 K), or in genes from the immune system process (GO:0002376) as compared to genes from the reproductive process (GO:0022414; Fig. S3 L). This suggests that NF-κB binding sites in L1Md might have been actively selected in introns of key HSC genes because of the immune-linked maintenance. This regulation might be important to expand the NF-κB and TNF-α activity by engaging more genes, including HSC maintenance genes into the NF-κB regulatory networks. Such activity could be important to maintain HSCs while allowing expression of immune gene during TNF-α–NF-κB–induced myeloid regeneration or damage-induced bone marrow ablation, and further highlights the complex role of inflammation-induced pathways in HSCs. TNF-α levels are increased in patients with hematopoietic malignancies and the HSC-specific TNF-α signature is upregulated in myelodysplastic syndrome/AML malignant HSCs (Yamashita and Passegué, 2019). Exploring the mechanisms controlling TE expression and how inflammatory signals and aging impact them in normal and malignant HSC could lead to the identification of new selective dependencies of AML and new treatment strategies.
Materials and methods
Mice strains and treatments
WT C57BL/6J mice (6–8 wk old) were from the Envigo Laboratories. Nfkb1−/− mice were from The Jackson Laboratory (B6.Cg-Nfkb1tm1Bal/J; Stock No:006097). All the mice were housed in a specific pathogen–free environment. All procedures were reviewed and approved by the Animal Care Committee no. 26 approved by the French Ministry for Research (#2019_078_23286; CE). Mice were injected retro-orbitally with 2 µg TNF-α (Biolegend-Ozyme) before or after sublethal TBI (2 Gy; RX irradiator X- RAD 320).
Cell harvest and culture
Bone marrow was harvested from femur, tibia, and hip bones in mice. Total bone marrow was depleted of differentiated hematopoietic cells (lineage-positive cells) using Mouse Hematopoietic Progenitor (Stem) Cell Enrichment Set (BD). Magnetically sorted Lineage-negative (lin−) cells were kept overnight (O/N) at 4°C in IMDM medium supplemented with 10% FBS (HyClone) and 1% penicillin-streptomycin (Thermo Fisher Scientific). Staining was performed for 20 min at room temperature (RT) using CD3ε (Lin)–APC clone 145-2C11 (553066; BD), TER-119 (Lin)–APC clone Ter-119 (557909; BD), CD45R/B220 (Lin)–APC clone RA3-6B2 (553092; BD), Ly6G-6C (Lin)–APC clone RB6-8C5 (553129; BD), Ly-6A/E (Sca-1)–PeCy7 clone d7 (558162; BD), CD117 (c-Kit)–PE or PerCP-Cy5.5, clone 2B8 (553355 or 560557, respectively; BD), CD34–FITC or AF700 clone RAM34 (560238 or 560518; BD), CD135 (Flk2)–BV421 or PE clone A2F10.1 (562898 or 553842, respectively; BD). HSCs (Lin−Sca+c-Kit+CD34lowFlk2−) were sorted using ARIA3, ARIA Fusion, or Influx cell sorters (BD) and collected in Stem Span (StemCell).
When the cells were irradiated in vitro, HSCs were cultured in medium containing Flt3-Ligand, IL-3, IL-6, SCF as described (de Laval et al., 2013) in the presence or absence of TNF-α. TNF-α was added to the medium at 1 µg/ml 1 h before IR.
CRISPR-Cas9 deletion
gRNAs were designed to generate specific deletion of the intronic L1Md of Mecom using CRISPOR (http://crispor.tefor.net/; Table S4). 1 µg total gRNAs (0.5 µg 5′-gRNA + 0.5 µg 3′-gRNA; Dharmacon) were incubated with 1 µg Cas9 (CAS12205; Dharmacon) during 15 min at RT and the Cas9-gRNA RNP was then co-electroporated with an equimolar siglo-green transfection indicator (D-001630-01-05) in 100 000 LSK after an O/N culture in medium containing Flt3-Ligand, IL-3, IL-6, SCF, as described (de Laval et al., 2013), and using a Neon transfection system (Thermo Fisher Scientific) with the optimized electroporation condition 1700 V, 20 ms, 1 pulse as previously described (Gundry et al., 2016). Just after electroporation, FITC + HSC were sorted and collected in Stem Span (StemCell) containing Flt3-Ligand, IL-3, IL-6, SCF, and left O/N in culture before irradiation. Cells were finally collected 48 h after irradiation for further experiments.
DNA extraction and genomic deletion verification
DNA from electroporated HSC was extracted using the tissue XS kit (Macherey-Nagel) according to the manufacturer’s instructions, and the specific deletion of Mecom intronic L1Md was checked through qRT-PCR using the ChIP-qPCR primers (Table S4).
qRT-PCR
HSCs were lysed in Tri-Reagent (Zymo Research) and stored at −80°C until used. Total RNA was extracted using the Direct-Zol RNA microprep kit (Zymo research) and reverse-transcribed with EZ Dnase VILO (Invitrogen). Real-time PCR was performed using the SYBR pPCR premix Ex Taq (Takara) or LUNA Universal qPCR Master Mix (NEB) on a 7500 real-time PCR machine (Applied Biosystems). Samples were tested for qPCR before reverse transcription to rule out detection of contaminating DNA. qPCR primers used were designed in different exons so as to minimize possible gDNA amplification. All data were normalized to the mean expression of RPL32 and hypoxanthine phosphoribosyltransferase (HPRT). Primer sequences are shown in Table S4.
When necessary, 1.25 µl of cDNA was preamplified for 14 PCR cycles in a multiplex reaction using Preamp Master-Mix (100-5580; Fluidigm) and primer mix (200 µM of each primer). To rule out primer dimerization or hairpin formation in the preamplification mix, primer sequences were previously analyzed using MFE3.0 PCR Primer Quality Control Software (Wang et al., 2019).
ChIP-qPCR
10,000 HSCs were harvested in 1 ml IMDM medium supplemented with 10% FBS and cross-linked using 1% formaldehyde (Invitrogen) for 10 min at RT. ChIP-qPCR experiments were performed using the True Micro-ChIP Kit (Diagenode) according to the manufacturer’s instructions. Cells were sonicated using the Bioruptor Pico (Diagenode) sonication device for 10 cycles (20 s ON/40 s OFF). Chromatin was incubated O/N at 4°C using 0.25 µg of H3K9me3 (C15410193; Diagenode) per IP. ChIP DNA was eluted and purified using the MicroChIP Diapure Columns (Diagenode). Subsequent qPCR was performed as above. ChIP-qPCR primers for intronic L1Md were designed such that one primer is located in the 5′ region of the L1Md, and the other primer is located in the intron of the host gene to allow the amplification of unique and specific product (Table S4).
Immunofluorescence
3,000–5,000 HSCs were cytospun on glass slides and immunofluorescence was performed as previously described (de Laval et al., 2013). Two different monoclonal anti-NFKB1 (p50) antibodies were used at 1/200: clone E10 was purchased from Santa Cruz Biotechnology, and clone D4P4D from Cell Signaling. γH2AX antibody was purchased from Millipore (05-636-I) and used at 1/2000. Detection was performed using Alexa Fluor 488–coupled antimouse secondary antibody (1/600). All slides were visualized using SPE confocal microscope (Leica). Pictures were analyzed using CellProfiler.
Statistical analysis
Results were statistically evaluated using either the one-way ANOVA or unpaired t test using GraphPad Prism version 6.0 software (GraphPad Software Inc.). The results are displayed as the means and SEM. The value of *, P < 0.05 was considered as significant, and **, P < 0.01 or ***, P < 0.001 as highly significant.
ChIP-seq
The ChIP-seq experiment has been conducted by Diagenode ChIP-seq (Cat# G02010000; Diagenode). The chromatin was prepared using the True MicroChIP Kit (Cat# C01010130; Diagenode). Chromatin was sheared using Bioruptor Pico sonication device (Cat# B01060001; Diagenode) combined with the Bioruptor Water cooler for seven cycles using 30″ (ON) 30″ (OFF) settings. Shearing was performed in 0.65 ml Bioruptor Pico Microtubes (Cat# C30010011; Diagenode) with the following cell numbers: 10,000 cells in 100 µl for batch #1 and 20,000 in 100 μl for batch #2. 30 μl of this chromatin was used to assess the size of the DNA fragments obtained by High Sensitivity NGS Fragment Analysis Kit (DNF-474) on a Fragment Analyzer (Advanced Analytical Technologies, Inc.). ChIP was performed using IP-Star Compact Automated System (Cat# B03000002; Diagenode) following the protocol of the aforementioned kit. Chromatin corresponding to 7,000 or 18,000 cells was immunoprecipitated using 0.5 µg of H3K9me3 antibody (C15410193; Diagenode antibody). Chromatin corresponding to 10% was set apart as Input.
For each sample, a library preparation was performed on 500 pg of DNA using the MicroPLEX v2 protocol. The ChIP samples were processed together and a control library was processed in parallel of the samples using the same amount of a control Diagenode ChIP’d DNA. According to the protocol, 12 cycles of amplification were performed to amplify the libraries. After amplification, 1 μl of each library was loaded on Fragment Analyzer to check if enough material was generated. If not, additional cycles were performed until having enough material. The libraries were amplified for two to seven additional cycles, and then 1 μl of the libraries was analyzed on the Fragment Analyzer. Using the quantification values from the Qubit and the size measurement generated by the Fragment Analyzer, the molar concentration of each library was calculated. Then, the different libraries were diluted to reach the final concentration each and pooled together. Batch #1 was sequenced into two lanes of a Hiseq 4000 (75 bp, paired end), and batch #2 was sequenced into one lane of a NovaSeq (150 bp, paired-end).
RNA-seq
HSCs from individual mice were lysed in Tri-Reagent (Zymo Research) and stored at −80°C until used. Total RNA was extracted using the Direct-Zol RNA microprep kit (Zymo research). For the IR vs. NIR analysis, the RNA integrity (RNA Integrity Score ≥7.0) was checked on the Agilent Fragment Analyzer (Agilent) and quantified. All samples were subjected to SMARTer cDNA synthesis using SMARTer Ultra Low Input RNA Kit for Sequencing - v3. Double-stranded cDNA (ds-cDNA) was sheared using Covaris to obtain ds-cDNA in the 200–500 bp size range. ds-cDNA fragments were end-repaired, extended with an “A” base on the 3′ end, ligated with indexed paired-end adaptors (NEXTflex; Bioo Scientific) using the Bravo Platform (Agilent), amplified by PCR for 6 cycles and purified with AMPure XP beads (Beckman Coulter). For TNF-α analysis, all samples were subjected to SMARTer cDNA synthesis using SMARTer stranded total RNA-seq kit v3 following the manufacturer’s instructions. Fragmentation time is adjusted depending on the quality of the RNA input.
The final libraries were pooled and sequenced using the onboard cluster method, as paired-end sequencing (2 × 100 bp reads) on Illumina NovaSeq-6000 sequencer at Gustave Roussy (Illumina).
CUT&Tag
CUT&Tag-IT assay kit (Active Motif) was used on 3,000 HSCs according to the manufacturer’s instructions. Cells were incubated O/N with 0.5 µg of H3K9me3 (C15410193; Diagenode).
Genomic analysis
RNA-seq
Reads quality
Quality of RNA-seq reads was assessed with Fastqc v0.11.8, Fastq-screen (Wingett and Andrews, 2018) v0.13.0, and MultiQC (Ewels et al., 2016) v1.7.
RNA quantification
Salmon (Patro et al., 2017) tool v0.14.1 was used to quantify mm10 NCBI RNA reference sequences (O’Leary et al., 2016; RefSeq Curated, last updated 2017-11-16) downloaded from the UCSC Table Browser (Karolchik et al., 2004). Salmon was launched with the following parameters: --numBootstraps 60 --libType A --validateMappings.
For the second RNA-seq performed in NIR, IR, and IR + TNF-a conditions, we used nf-core/rnaseq (version 3.3) pipeline for RNAseq analysis (https://doi.org/10.5281/zenodo.1400710), with the following additional parameters: --genome mm10 --clip_r2 14, and performed on the Core Cluster of the Institut Français de Bioinformatique (ANR-11-INBS-0013).
Differential gene expression analysis
Statistical analysis was performed using R v3.5.1. Transcript expression levels were aggregated in gene expression levels using tximport Bioconductor package (Soneson et al., 2015) v1.13.16. Deseq2 (Love et al., 2014) v1.22.2 method was used to identify differentially expressed genes between groups with a P value threshold of 0.05.
Permutation test
To create the list of genes hosting an L1Md, browser extensible data (BED) files containing L1Md genomic localizations (reconstructed Repbase from Walter et al. [2016]) were intersected with the refseq_curated database from UCSC. Permutation test (n = 10,000) between lists of genes hosting an L1Md and DEG in IR vs. NIR, or the same number of random genes (randomly extracted from Refseq without DEG) was performed using R studio and considered significant if P < 0.01.
Motif enrichment analysis
Motif enrichment analysis was performed using BaMM! Web interface (Kiesel et al., 2018; Siebert and Söding, 2016) and de novo motif discovery module (pattern = 10, P < 0.001). Query motif was matched to known motifs using the Hocomoco mouse database.
GSEA analysis
GSEA analysis was performed using Hallmark Gene Sets V7. To plot graphs, −log10 P value is set to 4 when P < 0.0001.
ChIP-seq
Alignment
Human sequences were found in mouse ChIP-seq reads. The contamination was removed with Xenome (Conway et al., 2012) v1.0.0. After contamination removal, ChIP-seq sequence reads were mapped to the Mouse genome build mm10 by using Burrows-Wheeler Aligner MEM algorithm (Li and Durbin, 2009; BWA v0.7.17). The read group ID was attached to every read in the resulting alignment file (bam file) with the -R parameter, and shorter split hits were marked as secondary with -M. Samtools (Li et al., 2009) fixmate v1.9 was used to check mate-pair information between mates and fixed if needed on a name-sorted bam file. The duplicate reads were tagged by samtools markduplicates using a position sorted bam file. Secondary alignments and unmapped reads have been filtered out and only properly paired reads have been kept. Two types of downstream analysis have been performed, with multimapped reads (mapping quality score ≥ 0) and one with uniquely mapped reads (mapping quality score ≥ 1). Cross-correlation scores (normalized or relative strand cross-correlation coefficient) have been calculated by phantompeakqualtools package (Kharchenko et al., 2008; Landt et al., 2012) v1.2. DeepTools (Ramírez et al., 2016) bamCoverage v3.3.0 has been used to generate normalized bigwig files with the following parameters: --binSize 1 --normalizeUsing BPM --extendReads –ignoreDuplicates. Then, deepTools bigwigCompare was used to subtract input signal from chip signal.
Peak calling
Areas in the genome enriched with aligned reads (also called peaks) were identified with MACS2 (Zhang et al., 2008) callpeak v2.1.2 with the following parameters: -f BAMPE -g mm10 -q 0.05 --broad --broad-cutoff 0.05 for H3K9me3 broad mark.
Irreproducible discovery rate (IDR) analysis
To measure the reproducibility between replicate experiments, we used the IDR method (Li et al., 2011) v2.0.4.2 with the following parameters: --rank q.value --random-seed 12345 --plot. Peaks with a global IDR score <0.05 were selected and used for downstream analysis.
Peak annotation
Annotatr 1.8.0 (R3.5.1) was used for peak annotation.
H3K9me3 quantification and differential binding
To quantify H3K9me3 concentration at TE or promoters (−2 kb; +1 kb TSS), the Bioconductor package Diffbind (Ross-Innes et al., 2012) v2.10 was used in R v3.5.1. Paired-end mode was activated for read counting step with SummarizeOverlaps method. The default mapping quality threshold (mapQCth) was modified in 0 for multimapping analysis or 1 for unique mapping analysis. DBA_DESEQ2_BLOCK method was used to consider unwanted variable during normalization. Normalized H3K9me3 concentration at all TE loci from a same family/subfamily was summed to get a total H3K9me3 concentration per TE family. The age of a TE was calculated as in Sookdeo et al. (2013): divergences were converted to time assuming a neutral rodent genomic substitution rate of 1.1%/million yr.
Differential binding at peaks was identified with a P value threshold of 0.05.
Heatmaps
To plot heatmaps of H3K9me3 enrichment at peaks, deeptools package v3.2.0 was used in R v3.5.1. The peaks (IDR < 0.05) files obtained for NIR and IR conditions were first fused using bedops. A matrix was then built using ComputeMatrix tool in the scale-regions mode between the generated fused bed file and the corresponding normalized bigwig files after input subtraction. A body length of 2.5 kb (mean size of the peaks) was selected, as well as a 4-kb distance upstream and downstream of the start and the end of the peak. We asked for a “–outFileSortedRegions” that gives the sorted bed file used for the heatmap. This sorted bed file was then used for genome coverage analysis, i.e., identification of the presence of a given TE in each row after computing a matrix with the TE genome coverage bigwig.
TE genome coverage
To generate TE genome coverage, bedtools package v2.27.1 was used. –bga option on the genomeCoverageBed tool was used. The bedGraphs generated were then converted to bigwig files using the bedGraphToBigWig tool.
CUT&Tag
CUT&Tag was analyzed as described as in https://yezhengstat.github.io/CUTTag_tutorial/index.html with the following parameters: Quality Control was performed using FastQC (0.11.9) and MutiQC (1.10.1); Bowtie2 (2.4.1) alignment to mm10 (UCSC genome) was performed with the following parameters: --end-to-end --very-sensitive --no-mixed --no-discordant --phred33 -I 10 -X 700; duplicate reads were removed using Picar (2.26.9) with the following parameters: --REMOVE_DUPLICATES true --VALIDATION_STRINGENCY LENIENT; aligned read quality score was set to 0 to keep all reads by using samtools (1.13) with the following parameters: -q 0; aligned reads were sorted and indexed using samtools (1.13); a coverage track (bigWig) was generated using deeptools (3.5.0) with the following parameters: -bs 5 --normalizeUsing BPM; peak calling was performed using macs2 (2.2.7.1) with the following parameters: -B --broad --broad-cutoff 0.1 -f BAMPE -g mm --max-gap 2000 --min-length 200; profile plot for scores over genomic regions (mm10.rmsk.mod.L1Md.bed) were performed using deeptools (3.5.0) with the following parameters: --beforeRegionStartLength 1000 --regionBodyLength 5000 --afterRegionStartLength 1000 \; statistics of the CUT and TAG signal were performed using the R package Rseb 0.2.0: using as input a score matrix computed by deeptools’s computeMatrix, we plotted the mean density profile of all condition with the SEM.
Online supplemental material
Fig. S1 relates to Figs. 1 and 2 showing additional information on ChIP-seq data. Fig. S2 relates to Figs. 3 and 4 showing additional information on RNA-seq data and the comparison between the ChIP-seq and the RNA-seq data. Fig. S3 relates to Fig. 5 showing ChIP-qPCR at intronic L1Md of additional target genes, the gating strategy for sorting HSC electroporated with the Cas9/gRNA RNP complex for intronic L1Md deletion, the results of the tests for gRNAs efficiency, and additional de novo motif search. Fig. S4 relates to Fig. 6 showing NFKB1 immunofluorescence data using another antibody (sc-8414, clone E10) and the effect of TNF-α on gene expression upon IR at additional target genes. Fig. S5 relates to Figs. 7 and 8 showing comparison of ChIP-seq and CUT&Tag profiles, additional gene signatures affected by IR and TNF-α treatment, and the gating strategies for analyzing HSC reconstitution capacity in blood and BM after IR and with or without TNF-α treatment. Table S1 shows quality control of the reads and peak calling data for the ChIP-seq analysis, Table S2 shows information on differentially expressed genes upon IR for RNA-seq analysis, Table S3 gives information on intragenic L1Md, and Table S4 lists primers and gRNAs used for the study.
Data availability
The dataset generated from the ChIP-seq for Figs. 1 and 2 are available in ArrayExpress accession no. E-MTAB-11865, from the RNA-seq for Figs. 3 and 4 in ArrayExpress accession no. E-MTAB-11866, and from RNA-seq and CUT&Tag for Fig. 7 in ArrayExpress accession nos. E-MTAB-11867 and E-MTAB-11864, respectively.
Acknowledgments
We thank the animal facility, the Genomic and the Imaging and Cytometry Platforms of Gustave Roussy for RNA-seq, confocal, and cell sorting analysis, respectively; S. Gregoricchio and Drs. C. Lobry and C. Guillouf (INSERM U1170, Gustave Roussy) for their advice for ChIP-seq analysis; D. Dubray for statistical analysis; and Drs. M. Goodhardt and D. Garrick (INSERM UMRS-1126, Paris) for helpful discussions. We also thank A. Teissandier and D. Bourc’his for providing us with the reconstructed repeatMasker database.
This work was supported by INSERM and grants from Ligue National Contre le Cancer (LNCC; Equipe labellisée EL2020) and Institut National du Cancer (PLBIO No. 2020-095) to F. Porteu, ARC Foundation (No. 20161204988), GEFLUC Paris-Ile de France (2017), LNNC (2018–2019), and Agence National de la Recherche (ANRJCJC20-CE14-0018-01) to E. Elvira-Matelot. Y. Pelinski and D. Hidaoui are recipients of fellowship from the Ministère de l’Enseignement Supérieur de la Recherche et de l’Innovation. A. Stolz is recipient of a fellowship from LNCC. F. Hermetet is supported by Institut National du Cancer (PLBIO no. 2020-095 to F. Porteu).
Author contributions: Y. Pelinski, A. Stolz, and E. Elvira-Matelot performed the RNA-seq, ChIP-seq, CUT&Tag, and ChIP-qPCR experiments and analyzed the results. M.K. Diop, R. Chelbi, A.M. Chioukh, and E. Elvira-Matelot performed bioinformatic analyses. D. Hidaoui, F. Hermetet, and A. Stolz performed IF, qPCR, and reconstitution experiments and analyzed the results. F. Porteu and E. Elvira-Matelot designed and supervised the study, analyzed the results and wrote the manuscript.
References
Author notes
D. Hidaoui and A. Stolz contributed equally as second author.
F. Porteu and E. Elvira-Matelot contributed equally as last author and are co-corresponding authors.
Disclosures: The authors declare no competing interests exist.