Stem and progenitor cell fate transitions constitute key decision points in organismal development that enable access to a developmental path or actively preclude others. Using the hematopoietic system, we analyzed the relative importance of cell fate–promoting mechanisms versus negating fate-suppressing mechanisms to engineer progenitor cells with multilineage differentiation potential. Deletion of the murine Gata2−77 enhancer, with a human equivalent that causes leukemia, downregulates the transcription factor GATA2 and blocks progenitor differentiation into erythrocytes, megakaryocytes, basophils, and granulocytes, but not macrophages. Using multiomics and single-cell analyses, we demonstrated that the enhancer orchestrates a balance between pro- and anti-fate circuitry in single cells. By increasing GATA2 expression, the enhancer instigates a fate-promoting mechanism while abrogating an innate immunity–linked, fate-suppressing mechanism. During embryogenesis, the suppressing mechanism dominated in enhancer mutant progenitors, thus yielding progenitors with a predominant monocytic differentiation potential. Coordinating fate-promoting and -suppressing circuits therefore averts deconstruction of a multifate system into a monopotent system and maintains critical progenitor heterogeneity and functionality.
Stem and progenitor cell fate transitions are critical determinants of development, physiological homeostasis, and adaptive responses to life-threatening stresses. In principle, engineering a particular fate potential into a cell with developmental plasticity can be accomplished by instigating fate-promoting or negating fate-suppressing mechanisms. Whether these operationally distinct mechanisms are commonly mutually exclusive or interlinked is unclear. It is instructive to consider this problem in the context of hematopoietic stem and progenitor cells (HSPCs) that generate the diverse cells comprising blood. Major progress has been made in identifying the heterogeneous HSPC populations and the intrinsic transcriptional networks and microenvironment-based mechanisms that control their functional transitions (Haas et al., 2018; Weissman, 2016). These studies have forged principles that guide a wide swath of basic and translational research and extend well beyond the hematopoietic system.
A plethora of macromolecules (proteins, RNAs, and metabolites) control hematopoiesis and generate HSPC heterogeneity (Haas et al., 2018; Orkin and Zon, 2008; Rossi et al., 2012). A major challenge has been to establish a global perspective of how factors, signals, and pathways functionally intersect or operate independently to sustain HSPC pools and accommodate massive demands to generate diverse progeny. Conventional strategies involve loss-of-function or gain-of-function analyses while measuring mechanistic (e.g., impact on a transcriptome) and biological (e.g., impact on a cell state transition) consequences. Transcriptomes of tens of thousands of mRNAs, which might or might not be highly concordant with proteomes, often weigh heavily in functional assessments. Proteomic advances permit quantification of several thousand proteins in a mammalian cell, far short of comprehensive coverage (Richards et al., 2015). Although single-cell transcriptomic (Watcham et al., 2019) and proteomic (Palii et al., 2019) analyses have utility for deconvoluting mechanisms emerging from cell population data, when deployed alone, these approaches may have intrinsic limitations. Amalgamating transcriptomic and proteomic data can surmount limitations to mechanistic discovery, including those involving master regulators that instigate complex networks to establish and/or maintain cellular states.
Transcription factors, such as the master hematopoietic regulator GATA2 (Tsai and Orkin, 1997; Tsai et al., 1994), establish genetic networks that promote HSPC proliferation, survival, and differentiation (Katsumura et al., 2017). Mutations of murine Gata2 (de Pater et al., 2013; Ling et al., 2004; Rodrigues et al., 2005; Tsai et al., 1994) or +9.5 and −77 enhancers (+9.5 kb downstream and −77 kb upstream of the Gata2 transcription start site; Gao et al., 2013; Grass et al., 2006; Johnson et al., 2012; Johnson et al., 2015; Mehta et al., 2017; Soukup et al., 2019) abrogate HSPC genesis and function. Human GATA2 coding (Dickinson et al., 2011; Hahn et al., 2011; Hsu et al., 2011; Ostergaard et al., 2011) or +9.5 enhancer (Hsu et al., 2013; Johnson et al., 2012) mutations yield immunodeficiency and predisposition to develop myelodysplastic syndrome (MDS) and acute myeloid leukemia (AML; Churpek and Bresnick, 2019; Dickinson et al., 2014; McReynolds et al., 2018; Spinner et al., 2014). In poor-prognosis 3q21;q26 AML, an inversion translocates −77 next to MECOM (encoding the transcription factor EVI1) to generate a super-enhancer, increasing EVI1 and reducing GATA2 to cause leukemia (Gröschel et al., 2014; Katayama et al., 2017; Yamazaki et al., 2014).
As an erythroid and megakaryocytic fate-promoting mechanism, GATA2 increases GATA1 expression (Mehta et al., 2017), which upregulates its own coregulator friend of GATA1 (FOG1; Crispino et al., 1999). As a fate-suppressing mechanism, GATA1 antagonizes PU.1, thereby blocking myelopoiesis (Nerlov et al., 2000; Zhang et al., 2000); however, whether this occurs in single cells is unclear (Hoppe et al., 2016). Solitary and integrated fate-promoting and -suppressing mechanisms provide the architectural framework for building complex developmental and biological processes.
Prior to GATA1 expression, GATA2 must meet unique challenges to determine when and where to induce HSPC genesis and function (Churpek and Bresnick, 2019; Katsumura et al., 2017). Although GATA2 and PU.1 cooperatively stimulate progenitor differentiation into mast cells (Walsh et al., 2002), the converging mechanisms that endow GATA2 (and other master regulatory transcription factors) with fate-promoting versus -suppressing activities are undefined. We describe a mechanism in which an enhancer (−77) promotes GATA2 expression, inducing a fate-promoting mechanism and concomitantly abrogating a fate-suppressing mechanism. −77−/− progenitors mounted a response to sustain the fate-suppressing mechanism involving innate immune machinery. The consequence of this genetic aberration is deconstruction of an integrated multipotent differentiation system into a predominantly unipotent system. Furthermore, the innovative multiomics resource of WT and enhancer mutant primary progenitor cells, involving quantitative proteomics, single-cell transcriptomics, and population transcriptomics, linked to GATA2 rescue, will catalyze many additional discoveries beyond those described herein.
Discovering progenitor cell fate mechanisms using multiomics
Hematopoietic progenitors from murine fetal liver, WT for the Gata2 −77 enhancer (−77+/+), can undergo erythroid, megakaryocytic, granulocytic, and monocytic differentiation ex vivo. By contrast, progenitors with a −77 homozygous deletion (−77−/−) exhibit a predominant monocytic cell fate and generate abundant macrophages (Fig. 1 A; Johnson et al., 2015). Human GATA2 deficiency syndrome, resulting from GATA2 coding (R398W) or intron 5 (human equivalent of murine +9.5 enhancer) mutations, is characterized by bone marrow hypocellularity, dysplastic megakaryocytes, and monocytopenia (Fig. 1 B; Calvo et al., 2011). Flow cytometry confirmed monocyte depletion from marrow aspirates (Fig. 1, C and D). Despite the monocytopenia, bone marrow macrophages were abundant in heterozygous R398W or intron 5 mutant patients (Fig. 1 B), consistent with the capacity of GATA2-deficient murine progenitors to generate macrophages ex vivo.
To address the contribution of cell fate–promoting versus –suppressing mechanisms to the activity of a cell fate–regulatory enhancer, we isolated a lineage-negative (Lin−)Sca1−c-Kit+CD34+ myeloid progenitor population (common myeloid progenitor [CMP] and granulocyte-monocyte progenitor [GMP]) from fetal liver of embryonic day 14.5 (E14.5) −77+/+ and −77−/− mouse embryos (Fig. 2, A and B). The CMP/GMP pool is a complex mixture of progenitors with diverse transcriptional profiles (Olsson et al., 2016; Paul et al., 2015). Previously, we showed that −77+/+ fetal liver CMPs possessed erythroid and myeloid colony-forming capacity ex vivo, whereas GMPs only generated myeloid cells. −77−/− CMPs and GMPs were largely restricted toward macrophage production (Johnson et al., 2015). Herein, we analyzed the molecular changes resulting from loss of the −77 enhancer, which underlie the restricted cell fate potential. Quantitative proteomics was conducted to discover the −77-regulated protein ensemble endowing progenitors with multifate potential (Fig. 2, C and D). GATA2 was 4.7-fold lower in −77−/− progenitors. GATA2 directly activates Gata1 transcription in progenitors (Mehta et al., 2017), and although essential GATA1 functions are manifested principally in committed megakaryocytes, erythrocyte and mast cell precursors, and their developing progeny (Fujiwara et al., 1996; Pevny et al., 1995; Pevny et al., 1991; Tsang et al., 1998), GATA1 protein was detected in progenitors. GATA1 was 51-fold lower in −77−/− progenitors (Fig. 2, C and D), and the GATA1-induced gene/protein FOG1 (Crispino et al., 1999) was 2.4-fold lower (Fig. 2 D). GATA2 directly activates Hdc and Gfi1b transcription (Gao et al., 2013; Katsumura et al., 2016; Katsumura et al., 2014; Mehta et al., 2017), and respective proteins were 52- and 2.7-fold lower in −77−/− progenitors.
Although GATA2 is not known to be a key regulator of genes/proteins mediating innate or adaptive immune processes, innate immune machinery was upregulated, including IFN signaling pathway components (Schneider et al., 2014; Table S1). IFN-inducible transcription factors, termed IFN regulatory factors (IRFs), were upregulated in −77−/− progenitors, including IRF5 (2.1-fold), IRF8 (2.7-fold), and IRF9 (2.3-fold). Deletion of murine Irf8 and biallelic human IRF8 mutations that cause severe monocytopenia and primary immunodeficiency disease (Bigley et al., 2018; Hambleton et al., 2011; Kurotaki et al., 2013; Yáñez et al., 2015) revealed IRF8 to be an essential monocytic differentiation determinant. PU.1 upregulates IRF8 expression (Schönheit et al., 2013), and IRF8 and PU.1 function collectively to control myeloid and inflammatory genes through composite binding sites (Marecki et al., 2001; Meraro et al., 2002). However, expression of Spi1 encoding PU.1 is not altered in −77−/− CMPs or GMPs (Johnson et al., 2015), and PU.1 levels were unaltered in the CMP/GMP pool (Fig. 2 D).
Other upregulated IFN-inducible innate immune components included the pattern recognition receptors TLR9 (9.7-fold) and TLR2 (7.2-fold; Fig. 2 D); the negative regulator of TLR signaling IRAK3 (IRAK-M; Kobayashi et al., 2002) was downregulated 2.8-fold. Other IFN-inducible proteins upregulated in −77−/− progenitors were oligoadenylate synthase-like protein 2 (OASL2; 5.6-fold), nucleotide-binding domain, leucine-rich repeat protein 1A (NLRP1A; 8.0-fold), the non-TLR pattern recognition receptor dectin-1 (3.4-fold), and the IFNγ target ISG15 (3.4-fold).
We used gene ontology and STRING network prediction tools to infer global consequences of proteome alterations. The 202 upregulated proteins highlighted a spectrum of immune and inflammatory mechanisms, whereas the 232 downregulated proteins were linked to functional processes in megakaryocyte and granulocyte biology (Fig. 2 E). This illustrated the overt loss of megakaryocyte and granulocyte biology–linked proteins, indicative of abrogated cell fate programs, with acquisition of IFN response proteins (Fig. S1).
To test if the aberrant −77−/− progenitor proteome reflects GATA2 downregulation and not genes topologically associated with the −77 enhancer, we used our previously described genetic rescue system in primary −77−/− progenitor cells (Katsumura et al., 2018). Lin− progenitors from −77+/+ or −77−/− E14.5 fetal liver (Fig. 2 A; McIver et al., 2018) were infected with GATA2-expressing or control retroviruses and cultured for 3 d. Under conditions in which GATA2 is expressed at near-physiological levels (Fig. S2 A), transcriptomes (four replicates of each condition; Fig. S2 B) were elucidated with RNA sequencing (RNA-seq). Of the 3,161 differentially expressed genes between −77+/+ and −77−/− progenitors, GATA2 expression rescued or partially rescued the majority (86%; 2,714) of the alterations (Fig. 2 F). Rescue was defined as genes that were up- or downregulated in the GATA2 rescue system and in −77+/+ versus −77−/− progenitors.
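The rescue criterion can be read as a direction-concordance test across two contrasts. The following is a minimal sketch under stated assumptions, not the pipeline used in this study: it assumes per-gene log2 fold changes are already available for −77+/+ versus −77−/− progenitors and for GATA2-expressing versus control −77−/− progenitors, and that significance filtering was applied upstream. The function name, gene labels, and toy values are hypothetical; only the counts 3,161 and 2,714 come from the text.

```python
def classify_rescue(lfc_wt_vs_mut, lfc_rescue_vs_ctrl):
    """Return genes whose change upon GATA2 re-expression in mutant cells
    has the same direction as the WT-versus-mutant difference."""
    rescued = []
    for gene, wt_lfc in lfc_wt_vs_mut.items():
        rescue_lfc = lfc_rescue_vs_ctrl.get(gene)
        if rescue_lfc is None:
            continue  # gene not measured in the rescue contrast
        if wt_lfc * rescue_lfc > 0:  # same sign: up in both or down in both
            rescued.append(gene)
    return rescued


# Hypothetical log2 fold changes for three placeholder genes.
wt_vs_mut = {"GeneA": 2.0, "GeneB": -1.5, "GeneC": 1.2}
rescue_vs_ctrl = {"GeneA": 1.1, "GeneB": -0.8, "GeneC": -0.3}

rescued = classify_rescue(wt_vs_mut, rescue_vs_ctrl)
toy_fraction = len(rescued) / len(wt_vs_mut)

# Arithmetic check of the reported proportion (values taken from the text).
reported_percent = round(100 * 2714 / 3161)
```

A sign-based test of this kind counts partial rescue as rescue, which matches the "rescued or partially rescued" wording; a stricter definition would also require a minimum effect size in the rescue contrast.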
Many −77-repressed and -induced proteins (Fig. 2 D) were also regulated at the mRNA level (Fig. 2, F and G). Using criteria for protein and mRNA changes of twofold or greater and false discovery rate <0.05, we analyzed the 92 and 57 proteins downregulated and upregulated, respectively, in −77−/− progenitors. This analysis revealed 58 targets, 67% of which were regulated in a qualitatively indistinguishable manner at the mRNA level (Table S2). The rescue analysis demonstrated that GATA2 loss caused the IFN response and many of the gene expression alterations in −77−/− progenitors. Thus, the ectopic innate immune machinery induction in −77−/− progenitors was also detected in cultured −77−/− progenitors.
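The protein/mRNA comparison can be illustrated with a short filter. This is a sketch under assumptions rather than the analysis code used here: each measurement is represented as a (log2 fold change, false discovery rate) pair, the twofold threshold is treated as inclusive, and concordance is taken to mean that both layers pass the thresholds with the same direction of change. All names and values are hypothetical.

```python
import math


def passes(log2_fc, fdr, min_fold=2.0, max_fdr=0.05):
    """True if the change is at least min_fold in either direction and FDR < max_fdr."""
    return abs(log2_fc) >= math.log2(min_fold) and fdr < max_fdr


def concordant_targets(protein, mrna):
    """Names that pass both thresholds at the protein and mRNA level
    and change in the same direction in both data sets."""
    hits = []
    for name, (p_lfc, p_fdr) in protein.items():
        if name not in mrna:
            continue
        m_lfc, m_fdr = mrna[name]
        if passes(p_lfc, p_fdr) and passes(m_lfc, m_fdr) and p_lfc * m_lfc > 0:
            hits.append(name)
    return hits


# Hypothetical (log2 fold change, FDR) pairs for four placeholder targets.
protein = {"P1": (-2.2, 0.001), "P2": (1.4, 0.01), "P3": (1.1, 0.2), "P4": (-1.0, 0.01)}
mrna = {"P1": (-1.8, 0.002), "P2": (-1.2, 0.03), "P3": (1.5, 0.01), "P4": (-1.0, 0.04)}

hits = concordant_targets(protein, mrna)
```

In this toy set, P2 is excluded because the two layers change in opposite directions, and P3 is excluded because the protein measurement fails the false discovery rate cutoff.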
To further test if innate immune machinery induction is sustained when progenitors are segregated from niche components, progenitors were cultured for up to 3 d. At all times tested, Irf8 and Tlr9 mRNA levels in −77−/− exceeded those of −77+/+ progenitors (Fig. 2 H). Because the GATA2-reversible, aberrant −77−/− progenitor transcriptome is maintained upon progenitor propagation without a heterocellular microenvironment, these results support a cell-intrinsic transcriptome/proteome dysregulation in GATA2-downregulated progenitors.
Enhancer-instigated developmental circuits in single progenitor cells
On the basis of proteomic and transcriptomic analyses of progenitor populations (Fig. 2), we considered the relationship between −77-regulated GATA2 expression and the GATA2-dependent transcriptome and proteome in single progenitors. Single-cell RNA sequencing (scRNA-seq) of Lin−Sca1−c-Kit+CD34+ cells was conducted with the 10x Genomics platform and analyzed using multiple dimensional reduction approaches and tools (principal component analysis [PCA], t-distributed stochastic neighbor embedding [t-SNE]; van der Maaten and Hinton, 2008; and Uniform Manifold Approximation and Projection; Becht et al., 2019), which yielded qualitatively similar conclusions. scRNA-seq data derived from PCA was subjected to k-means clustering. Maximizing average silhouette width, calculated as a function of the number of clusters, revealed three clusters to be optimal (Fig. 3, A and B). t-SNE analyses, which leverage nonlinear dimensional reduction, also revealed three or four clusters to be optimal (Fig. 3 B and Fig. S3 A). On the basis of cohorts of genes enriched in each cluster, the transcriptome of cluster 2 cells is characterized by an innate immune response, whereas enriched genes of cluster 3 cells highlight transcriptional and translational processes linked to red cell production (Fig. S3 B). A comparison of differential gene expression between −77+/+ and −77−/− progenitors included prominent myeloid gene expression in cluster 2, which was reduced in −77−/− progenitors (Fig. 3 C). A subset of cells in all clusters expressed Gata2, and Gata2 expression was uniformly lower in −77−/− progenitors (Fig. 3 D). Irf8 was expressed in a subset of cells in all clusters, highly upregulated in cluster 2 in mutants (3.04-fold increase in expression per cell; P = 1.0e−205), and upregulated significantly but to a lesser extent in clusters 1 (1.26-fold increase; P = 3.3e−13) and 3 (1.29-fold increase; P = 1.7e−14; Fig. 3 E).
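Selecting the cluster number by maximizing average silhouette width can be sketched as follows. This is a standard-library illustration, not the software used for the analysis: the two-dimensional toy points stand in for principal component coordinates, the farthest-point initialization is an assumption chosen for reproducibility, and all function names are hypothetical. For each cell, the silhouette width is (b − a)/max(a, b), where a is the mean distance to cells in its own cluster and b is the mean distance to cells in the nearest other cluster.

```python
import math


def dist(a, b):
    """Euclidean distance between two points given as tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans(points, k, n_iter=50):
    """Plain k-means with deterministic farthest-point initialization (an assumption)."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(dist(p, c) for c in centers)))
    labels = [0] * len(points)
    for _ in range(n_iter):
        labels = [min(range(k), key=lambda j: dist(p, centers[j])) for p in points]
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centers.append(centers[j])  # keep an empty cluster's center unchanged
        if new_centers == centers:
            break
        centers = new_centers
    return labels


def average_silhouette_width(points, labels):
    """Mean over all points of (b - a) / max(a, b)."""
    widths = []
    for i, p in enumerate(points):
        by_cluster = {}
        for j, q in enumerate(points):
            if i != j:
                by_cluster.setdefault(labels[j], []).append(dist(p, q))
        own = by_cluster.pop(labels[i], None)
        if not own or not by_cluster:
            widths.append(0.0)  # singleton cluster or single cluster: width 0 by convention
            continue
        a = sum(own) / len(own)
        b = min(sum(d) / len(d) for d in by_cluster.values())
        widths.append((b - a) / max(a, b))
    return sum(widths) / len(widths)


def best_k(points, candidates):
    """Return the candidate k with the largest average silhouette width, plus all scores."""
    scores = {k: average_silhouette_width(points, kmeans(points, k)) for k in candidates}
    return max(scores, key=scores.get), scores


# Toy data: three well-separated groups of four points each.
toy_points = [
    (0, 0), (0, 1), (1, 0), (1, 1),
    (10, 0), (10, 1), (11, 0), (11, 1),
    (5, 9), (5, 10), (6, 9), (6, 10),
]
chosen_k, silhouette_scores = best_k(toy_points, [2, 3, 4, 5])
```

With real scRNA-seq data, the same selection would be applied to the leading principal components rather than to raw expression values, and an established implementation would normally be preferred over this sketch.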
Because Gata2 and Irf8 were expressed in a subset of cells in all clusters, and because −77 deletion downregulated Gata2 and upregulated Irf8 transcription, we tested whether this opposing expression pattern occurs in distinct, common, or both cohorts of cells. In the heterogeneous population, most cells lacked coexpression of Gata2 and Irf8 (Fig. 3 F, hex plot). When single cells were parsed on the basis of Gata2 expression levels, and when expression of Irf8, relative to Gata2, was compared by cluster (Fig. 3 F), cluster 1 cells exhibited a broad range of Gata2 expression, and Irf8 expression was detected only in cells with a low level of or no Gata2. −77 deletion greatly reduced Gata2 expression, and only cells with the lowest Gata2 levels expressed Irf8. Cluster 1 and 2 cells shared a common range of Gata2 expression, and Irf8 expression was detected in cells with little to no Gata2. −77 deletion nearly abrogated Gata2 expression, and Irf8 expression emerged prominently, with the highest level detected in cluster 2 cells. Cluster 3 cells also exhibited a broad range of Gata2 expression, with Irf8 detectable only in cells expressing little to no Gata2. The −77 deletion abrogated Gata2 and upregulated Irf8 identically to the other clusters. This analysis identified cells expressing both Gata2 and Irf8, with one high and the other low. Taken together with IRF8 being a vital monocytic differentiation determinant and our discovery that GATA2 restoration in −77−/− progenitors rescued Irf8 expression, these results support a model in which −77 loss downregulates GATA2, corrupting the transcriptome and proteome. Under these conditions, Irf8 expression increases and IRF8 enables or drives the predominant monocytic differentiation.
Genetic construction and deconstruction of progenitor heterogeneity and developmental trajectories
To further dissect how an enhancer deletion deconstructs a multifate system to yield a predominant solitary fate, the progenitor scRNA-seq data were analyzed with the pseudotime trajectory tool SPRING (Weinreb et al., 2018). By segregating cells with disparate transcriptomes, pseudotime trajectory analysis identifies potential developmental paths. It was unclear, however, if this analysis would unveil trajectories when applied to a more restricted progenitor population. SPRING trajectory plots were color coded to illustrate the relationship of the trajectories (Fig. 4 A) with the previously established PCA-defined clusters (Fig. 3 B). One prominent trajectory (Fig. 4 A, trajectory a) and additional, less pronounced trajectories (Fig. 4 A, boxed inset, trajectories b–d) extrude from cluster 3 in the −77+/+ plot. Cells expressing the highest Gata2 levels composed the smaller trajectories (Fig. 4, B and C). Gata1 expression overlapped extensively with that of Gata2 in the short leftward extrusion (Fig. 4, B and C) but was also detected throughout trajectory a that lacked Gata2. Restricted expression of multiple erythroid-specific genes (e.g., Klf1, Hba-a1, Alas2, Slc4a1), as with Gata1, which is expressed in erythroid cells and additional cell types, provides evidence that this extended trajectory reflects progressive erythroid differentiation. These cells compose just 2% of the total −77+/+ population and may represent erythroid-primed cells within the CMP population. Within the inset, trajectory b cells of cluster 3 express erythroid markers Klf1 and Car1 (Fig. 4 C), whereas the megakaryocytic gene Pf4 is restricted to trajectory c. A distinct subset of cells, trajectory d, expressed basophil markers Lmo4, Ifitm1, Ly6e, and Srgn (Tusi et al., 2018). Because all of these trajectories are absent or greatly diminished in −77−/− progenitors, the enhancer deletion abrogated erythroid, megakaryocytic, and basophil developmental trajectories.
Cluster 1 cells comprise the central mass of the SPRING plot and express myeloid transcripts, including Flt3, Spi1, Cebpa, and Irf8 (Fig. 5 A). Expression of the monocyte progenitor markers Csf1r (CD115) and Cx3cr1 in the central mass was restricted to a cell cohort that was expanded in the −77−/− samples (Fig. 5 B). A rightward extrusion comprising cluster 2 cells was enriched in neutrophil transcripts (e.g., Elane and Fcnb; Fig. 5 B), with the left-to-right directionality characterized by attributes associated with progressive neutrophil development. Although the −77 deletion had little to no impact on Spi1- and Cebpa-expressing cells in the central mass, it abrogated the neutrophil developmental trajectory. Expansion of GMP-derived monocyte progenitors and loss of granulocyte progenitors was confirmed by flow cytometry. Bipotential Ly6C− GMPs were unaffected by the −77 deletion (Fig. 5 C).
In the −77−/− progenitors, Irf8 expression was a common attribute of many cells within the central mass, consistent with Irf8 upregulation as an early step in acquisition of the predominant monocytic fate program. Thus, −77 and GATA2 endow myeloid progenitors with erythroid-, megakaryocyte-, basophil-, and neutrophil-primed transcriptomes, and this multilineage transcriptomic heterogeneity suggests a mechanism underlying the diverse differentiation potentials of the heterotypic progenitor population. Accordingly, −77 deletion attenuates transcriptomic heterogeneity, thus deconstructing progenitor multipotentiality to yield a predominant monocytic fate. This deconstruction of the multifate system occurs without gross changes in progenitor cell cycle status, though 2.0- and 2.3-fold increases in G2/M- and S-phase cells were detected in a small percentage of cluster 2 cells (Fig. S4 A). These alterations were not associated with changes in genes expressed specifically in proliferating cells (Pcna and Mki67). Pcna and Mki67 were expressed broadly in cells within all clusters, and −77 deletion had little to no impact on their expression (Fig. S4, B–D). By coordinating fate-promoting and -suppressing circuits, this enhancer mechanism generates progenitor functional heterogeneity to accommodate physiological requirements.
Coordinating fate-promoting and -suppressing circuits to generate multipotency: Mechanistic considerations
How does an enhancer deletion trigger progenitors to mount an ectopic innate immune response? IFN signaling is implicated in diverse HSPC functions in physiology and pathology (Baldridge et al., 2010; de Bruin et al., 2014; Essers et al., 2009; Li et al., 2014; Zoumbos et al., 1985), and its regulation is controlled at multiple levels (Schneider et al., 2014). During fetal liver hematopoiesis, GATA2-mediated suppression of IFN signaling may balance IFN-activated and -suppressed processes, thus minimizing the emergence of dysregulated signaling and deleterious consequences. GATA2 downregulation would disrupt this defensive mechanism, causing ectopically high IFN signaling. Alternatively, GATA2 downregulation might desensitize IFN signaling components, resulting in subphysiological signaling insufficient to support progenitor functions. Elevated IFN signaling may constitute an attempt to restore physiological IFN-dependent signaling outputs. Finally, GATA2 downregulation might upregulate IFN signaling components independent of IFN receptor signaling.
To address these potential mechanisms, we asked whether −77−/− progenitors with upregulated IFN signaling components are responsive to exogenous IFN or if the preinduced state reflects maximal pathway activity and lack of competence to respond further. Because gene ontology analysis, gene set enrichment analysis, and individual gene attributes did not reveal that upregulated components conform to a strict type I (α and β) or type II (γ) IFN signature, we analyzed the responsiveness of −77+/+ and −77−/− cells to α, β, or γ IFN. IFNγ increased Irf8 expression 2.8-fold (P = 0.007) and 2.9-fold (P = 0.005) in −77+/+ and −77−/− cells, respectively (Fig. 6 A). Similarly, IFNγ increased Tlr9 expression 2.9-fold (P = 0.012) and 4.4-fold (P = 0.049) in −77+/+ and −77−/− cells, respectively (Fig. S5 A). IFNα and IFNβ were less effective inducers than IFNγ in −77+/+ cells and elicited similar responses in −77+/+ and −77−/− cells (Fig. 6 A). Normalization of the data revealed that −77+/+ and −77−/− progenitors had a comparable sensitivity to IFNγ-mediated induction of Irf8 and Tlr9 expression (Fig. S5, A and B). Thus, the preinduced Irf8 and Tlr9 state of −77−/− progenitors did not preclude or impact the IFN dose-dependent transcriptional response. IFNγ did not affect Gata2 expression in −77+/+ or −77−/− progenitors (Fig. 6 A, right). Because IFN induced a hyperphysiological expression response in −77−/− cells, in which multifate potential was deconstructed into a singular fate, these results support a model in which −77 and GATA2 suppress IFN signaling, thus averting the emergence of dysregulated signaling with consequences deleterious for cellular differentiation.
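The comparison of IFNγ sensitivity between genotypes can be illustrated by expressing each response relative to its own untreated baseline. This sketch assumes that the normalization referred to above is of this form, which the text does not specify. The baseline and treated expression values are hypothetical and were chosen only so that the resulting ratios match the reported 2.8- and 2.9-fold Irf8 inductions.

```python
def fold_induction(treated, untreated):
    """Expression after treatment relative to the untreated baseline of the same genotype."""
    return treated / untreated


# Hypothetical relative Irf8 mRNA levels; the mutant starts from a higher baseline.
untreated = {"-77+/+": 1.0, "-77-/-": 3.0}
ifn_gamma = {"-77+/+": 2.8, "-77-/-": 8.7}

folds = {g: fold_induction(ifn_gamma[g], untreated[g]) for g in untreated}
```

Under this reading, a preinduced mutant baseline raises the absolute expression level after treatment while leaving the fold induction essentially unchanged.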
Because IFN signaling components are upregulated in −77−/− progenitors with skewed, predominant monocytic differentiation, and since IFNγ induced Irf8 and Tlr9 expression in −77+/+ progenitors, we asked if IFNγ suffices to skew differentiation. Cells were treated with 20 ng/ml IFNγ when plated for colony formation. 8 d later, cells were analyzed by Wright-Giemsa staining (Fig. 6 B). Consistent with the prior report that IFNγ favors monocytic over granulocytic differentiation (de Bruin et al., 2012), IFNγ increased monocytic and decreased granulocytic progeny by 3.1-fold (P = 0.0004) and 1.5-fold (P = 0.0004), respectively (Fig. 6 C).
IFN induces Irf8 and Tlr9 expression (Schneider et al., 2014), both being ectopically high in −77−/− progenitors. In dendritic cells, TLR9 signaling requires IRF8 for NF-κB activation (Tsujimura et al., 2004). Thus, in certain contexts, IRF8 and TLR9 mechanisms are functionally intertwined. In other contexts, IFNγ and TLR signaling synergistically control cell function (Hu and Ivashkiv, 2009).
Is IFN signaling responsible for ectopic innate immune machinery upregulation in −77−/− progenitors? IFNγ dimerizes and binds a heterodimeric receptor of IFNGR1 and IFNGR2 subunits, which recruits Janus kinases (JAK1 and JAK2), leading to phosphorylation and STAT1 transcription factor activation (Stark and Darnell, 2012). If increased IFNγ signaling causes elevated Irf8 and Tlr9 expression, blocking JAK1/2 should attenuate or abrogate aberrant gene expression. If Irf8 and Tlr9 upregulation does not involve canonical IFNγ signaling, JAK1/2 inhibition should not impact expression. After treatment with the JAK1/2 inhibitor ruxolitinib for 48 h, the elevated expression of Irf8, Tlr9, and other IFN-inducible genes in −77−/− progenitors was attenuated to a level resembling that of −77+/+ progenitors (Fig. 6 D). Ruxolitinib did not significantly affect the low Gata2 expression in −77−/− versus −77+/+ progenitors.
Because the Irf8 expression level dictates monocytic versus granulocytic fate (Bigley et al., 2018; Giladi et al., 2018; Hambleton et al., 2011; Kurotaki et al., 2013; Yáñez and Goodridge, 2016; Yáñez et al., 2015), and because ruxolitinib downregulated Irf8 expression, we reasoned that ruxolitinib would attenuate the preferential monocytic fate potential. Ruxolitinib treatment reduced expression of monocytic genes Fcgr1, Siglec1, and Cx3cr1 to levels comparable with those of WT cells, whereas expression of the granulocytic gene Elane was unaffected (Fig. 6 D). Thus, the GATA2 deficiency–instigated fate-deconstructing mechanism requires JAK1/2 signaling to elevate expression of IFN response and monocytic genes.
The histone deacetylase HDAC11 is a suppressor of type I and type II IFNs. T cells from Hdac11-knockout mice display elevated IFNγ levels, which increase IFN signaling (Woods et al., 2017), and HDAC11 suppresses type I IFN signaling via limiting receptor deposition in the plasma membrane (Cao et al., 2019). Fatty acylation of SHMT2 promotes its localization to endosomes/lysosomes, where association with BRISC (BRCC36 isopeptidase complex) induces deubiquitination and stabilization of IFNαR1. HDAC11-mediated defatty-acylation of SHMT2 impairs endosome/lysosome localization, leading to reduced receptor recycling (Cao et al., 2019). Consistent with this mechanism, upregulated IFN signaling was associated with a 5.6-fold reduction in Hdac11 mRNA in primary −77−/− progenitors (Fig. S5 C). Chromatin immunoprecipitation sequencing revealed GATA2 occupancy at mouse and human Hdac11 loci, and GATA2 motifs (WGATAR) reside at the occupancy site (Fig. S5 D), suggesting direct regulation.
To further test whether the gene expression differences in −77−/− versus −77+/+ primary fetal liver progenitors are stable when progenitors are removed from an in vivo environment, we used estrogen-regulated HoxB8 (Wang et al., 2006) to immortalize the progenitors (Fig. 7 A). HoxB8-immortalized (hi) −77−/− cells retained lower expression of Gata2 and its target genes Gata1, Hdc, and Hdac11 versus hi−77+/+. However, innate immune (Irf8, Tlr9; Fig. 7 B) and monocytic (Cx3cr1 and Siglec1; Fig. 7 C) gene expression were higher in hi−77−/− progenitors. Expressing exogenous IRF8 in hi−77+/+ cells increased levels of both monocytic (Cx3cr1) and innate immunity (Oas3) mRNAs (Fig. 7 D), analogous to the high Irf8, Cx3cr1, and Oas3 expression in hi−77−/− cells. Thus, the −77−/− phenotype is stable ex vivo, and, on the basis of known monocytic differentiation activity of IRF8 in mouse (Kurotaki et al., 2013; Yáñez et al., 2015) and humans (Bigley et al., 2018; Hambleton et al., 2011) and our gain-of-function analysis, elevated IRF8 contributes to the phenotype.
Stem and progenitor cell activity to efficiently generate diverse progeny requires mechanisms that enable, propel, or restrict cell fate transitions to achieve specific developmental outputs. For multipotent cells, it is instructive to consider if the distinct fate potentials are acquired concomitantly, via independent steps, with each dedicated to an individual fate, or via a hybrid mechanism. Using hematopoietic progenitors and an enhancer mutant allele of a gene encoding the master regulator GATA2, we demonstrated that GATA2 primes the progenitor genome to generate a transcriptome and proteome that endow erythroid, megakaryocytic, granulocytic, and monocytic fates. The enhancer deletion renders GATA2 limiting, corrupting the GATA2-dependent transcriptome and proteome that confer multifate potential, leading to a predominant monocytic fate. Because restoring GATA2 reverses the transcriptomic aberrations, this powerful system was used to elucidate mechanisms that construct and deconstruct multifate systems. The −77 enhancer builds the multifate system by establishing a fate-promoting circuit and concomitantly negating a fate-suppressive circuit. The fate-promoting circuit requires the downstream target GATA1 with its coregulator FOG1 to establish erythroid- and megakaryocyte-primed transcriptomes (Fig. 8). The fate-suppressing circuit revealed a link between GATA2 and innate immune machinery, IFN signaling pathway components and sensors of pathogen constituents and activities, which defend against pathogen intruders. The analysis therefore unveiled mechanistic underpinnings of a multicell fate system construction process.
Foundational insights into circuitry underlying fate mechanisms have emerged from a transdifferentiation system in which pre–B cells acquire the capacity to form macrophages (Bussmann et al., 2009). Pre–B cells are heterogeneous vis-à-vis their potential to transdifferentiate. Single cells that generate macrophages rapidly are more refractory to reprogramming into induced pluripotent cells than are cells exhibiting slow transdifferentiation (Francesconi et al., 2019). The levels of a single protein, c-Myc, determine rapid transdifferentiation/low reprogramming (low c-Myc) efficiency versus slow transdifferentiation/high reprogramming (high c-Myc) efficiency states. Analogous to c-Myc, variable GATA2 levels control circuits that alter fate output. Gata2 +9.5−/− aorta gonad mesonephros is quantitatively defective in its capacity to generate HSPCs, and +9.5+/− embryos exhibit intermediate phenotypes in aorta gonad mesonephros and fetal liver hematopoiesis (Gao et al., 2013; Johnson et al., 2012). Gata2+/− embryos and adults also exhibit intermediate phenotypes (Ling et al., 2004; Rodrigues et al., 2005). Contrasting with these quantitative differences, GATA2 levels dictate establishment of a multifate system with qualitatively distinct fate outputs by induction of a fate-promoting circuit concomitant with negating a fate-suppressing circuit (Fig. 8).
The deconstruction of a multifate system, with emergence of a dominant fate-suppressing circuit, was unpredictable on the basis of transcriptional or developmental paradigms. Why would this circuit consist of innate immune machinery including IFN signaling components and pattern recognition receptors? In GATA2 deficiency syndrome, mechanisms that trigger MDS and AML are enigmatic (Churpek and Bresnick, 2019). GATA2 loss creates a disease predisposition, and we proposed that GATA2-low cells are vulnerable to genetic and/or environmental insults that launch HSPCs on a pathogenic path (Soukup et al., 2019). By instructing an efficient monocytic program, IRF8 induction in GATA2-deficient cells during embryogenesis may consume progenitors vulnerable to genetic or environmental insults. Pathogen infection of progenitors can elevate NLRP1A, an inflammasome component that senses pathogen enzymatic activity (Martinon et al., 2002) and triggers cell death via pyroptosis to clear pathogen-harboring progenitors (Masters et al., 2012); Nlrp1a is upregulated in −77−/− progenitors. Alternatively, IFN response gene expression may confer pathogen resistance and preserve progenitors, analogous to viral resistance of human embryonic stem cells and differentiated neural stem cell progeny (Wu et al., 2018). By extrapolation, innate immune machinery upregulation in −77−/− progenitors may ensure integrity of the progenitor pool. Pattern recognition receptors, which confer pathogen resistance (Ronald and Beutler, 2010), would provide sensors for progenitors to respond to bacterial (and Mycobacterium, a common pathogen in GATA2 deficiency syndrome; Dickinson et al., 2014; Spinner et al., 2014), viral, and fungal pathogens to evade deleterious consequences of infection. Without the upregulated innate immune machinery safety net, infection-induced stress might constitute a pathogenic trigger for an otherwise silent GATA2 mutation. 
Our results revealed some differences from IFN actions on embryonic stem cells (Wu et al., 2018). Embryonic stem cells with upregulated IFN components exhibit an attenuated IFNγ response, and oligoadenylate synthase family members, important determinants of pathogen immunity, were not upregulated (Wu et al., 2018). By contrast, −77−/− progenitors retain normal IFNγ responsiveness, and oligoadenylate synthase family members were upregulated. Given direct GATA2 activation of HDAC11 transcription described herein (Fig. S5 and Fig. 7) and established links between HDAC11 and reduced IFN signaling (Cao et al., 2019; Woods et al., 2017), it is attractive to propose that the fate-suppressing circuit involves a GATA2–HDAC11–innate immune axis.
In summary, we elucidated a mechanism in which an enhancer constructs a multifate system via coordinating fate-promoting and -suppressing circuits. The enhancer deletion instigates an ectopic innate immune response that deconstructs the multifate system, which unveils new dimensions to GATA factor, hematopoiesis, and immune mechanisms. Further dissecting the system will almost certainly yield additional principles governing broadly operational fate-regulatory circuits and transformative insights into innate immune machinery function in multipotent cells in physiology and pathology. In addition, the innovative multiomics resource of WT and enhancer mutant primary progenitor cells, involving quantitative proteomics, scRNA-seq, and population RNA-seq coupled with GATA2 genetic rescue, will uniquely enable diverse molecular/cellular discoveries beyond those described herein.
Materials and methods
Contact for reagent and resource sharing
A detailed list of reagents and resources is provided in Table S3. Further information and requests for resources and reagents should be directed to and will be fulfilled by the lead contact, E.H. Bresnick.
Patients with germline GATA2 mutations were enrolled in clinical protocols approved by the institutional review board at the National Institute of Allergy and Infectious Diseases (ClinicalTrials.gov identifier, NCT01905826) and in accordance with the Declaration of Helsinki. Patient-specific information is provided in Fig. 1. Bone marrow biopsies were performed with informed consent.
Hematopoietic progenitor cells were obtained from fetal livers of staged embryos from timed mated Gata2 −77+/− or C57BL/6J mice. All animal protocols were approved by the University of Wisconsin–Madison Institutional Animal Care and Use Committee in accordance with the Association for Assessment and Accreditation of Laboratory Animal Care (AAALAC International) regulations.
Primary cell culture
Fetal livers were harvested on ice in PBS containing 2% FBS, 10 mM glucose, and 2.5 mM EDTA. Cells were dissociated and passed through a single-cell strainer. Cells expressing lineage markers were removed using biotin-conjugated antibodies CD3e, CD11b, CD19, CD45R (B220), GR-1, Ter119, CD71, and MojoSort Streptavidin Nanobeads, all purchased from BioLegend. The remaining Lin− cells were cultured for up to 3 d in Gibco IMDM (Life Technologies) containing 20% FBS, 4% stem cell factor (SCF)–conditioned media, 4% IL-3–conditioned media, and 1% penicillin-streptomycin (Gemini Bio). IFN responsiveness was assayed by treating Lin− cells with either IFNγ, IFNα, IFNβ, or PBS with 0.1% BSA (vehicle) at the time of culture for 24 h. Alternatively, unfractionated E14.5 fetal liver cells were plated in M3434 methylcellulose media at 20,000 cells per 35-mm dish with 20 ng/ml IFNγ or an equal volume of PBS, 0.1% BSA. After 7–8 d, cells were recovered for Giemsa staining. For JAK1/2 inhibition, ruxolitinib was added to the Lin− cells at 0.25 or 1 µM and cultured for 48 h, at which point RNA was isolated with TRIzol reagent (Thermo Fisher Scientific). Cells were cultured in a humidified 5% CO2 incubator at 37°C.
For generation of ER-HoxB8-immortalized (hi) progenitors, fetal liver Lin− cells were immortalized by retroviral expression of estrogen-regulated HoxB8 as described previously (Wang et al., 2006). Cells were cultured in OPTI-MEM supplemented with 10% FBS, 1% penicillin-streptomycin, 1% SCF-conditioned medium, 30 mM 2-mercaptoethanol, 1 µM β-estradiol, and 500 µg/ml G418. For transient expression of IRF8, hi−77+/+ cells were resuspended in 100 µl of Nucleofector Solution R and transfected with 20 µg of IRF8 expression vector or control vector using the G-016 program of Nucleofector II (Lonza). Cells were harvested for RNA extraction 72 h after transfection.
Total RNA was purified from cells with TRIzol reagent and treated with DNase I (Thermo Fisher Scientific) for 15 min at room temperature. Following heat inactivation for 10 min at 65°C, RNA was incubated with 125 ng of a 5:1 mixture of oligo(dT) primers and random hexamers at 68°C for 10 min. RNA/primers were incubated with Moloney murine leukemia virus reverse transcriptase (Thermo Fisher Scientific), 10 mM dithiothreitol (Thermo Fisher Scientific), RNAsin (Promega), and 0.5 mM deoxynucleoside triphosphates (New England Biolabs) at 42°C for 1 h and then heat inactivated at 98°C for 5 min. Quantitative gene expression analysis was performed by real-time PCR using Power SYBR Green Master Mix (Applied Biosystems) and run on a ViiA 7 Real-Time PCR System (Applied Biosystems).
Biopsies were fixed in buffered formalin, decalcified, paraffin embedded, cut into 4-μm sections, and stained with H&E. Paraffin-embedded biopsy sections (4 μm) were stained for CD68 and for CD14 in a Ventana Benchmark Ultra automated staining instrument (Ventana Medical Systems) according to the manufacturer’s protocols. Images were captured on an Olympus BX41 microscope equipped with an Olympus DP72 camera using Olympus cellSens Entry software.
Flow cytometry and cell sorting
Flow cytometric analysis of bone marrow aspirates was performed using a FACSCanto II analyzer (BD Biosciences) equipped with three lasers and eight fluorescence detectors. Antibodies used to identify monocytic populations were CD14 APCH7 (clone M7P9) and CD64 PE (clone 10.1). Cells were stained with the above-mentioned antibodies in appropriate dilutions for 15 min. RBCs were lysed with BD FACS lysing solution, and cells were washed with PBS containing 1% albumin. Cells were fixed in a 1% paraformaldehyde solution, and 10⁵ events were acquired using FACSDiva software (BD Biosciences). The list mode files were analyzed with FCS Express (DeNovo Software).
Proteomic and single-cell transcriptomic analyses were performed on a CMP/GMP pool (Lin−Sca1−c-Kit+CD34+) sorted from E14.5 fetal livers using a FACSAria II cell sorter (BD Biosciences). Lineage markers were stained with FITC-conjugated B220, CD3, CD4, CD5, CD8, CD19, IgM, Il7Ra, AA4.1, and TER-119 antibodies. Other surface proteins were detected with PE-conjugated FcγR, eFluor 660–conjugated CD34, peridinin chlorophyll (PerCP)-Cy5.5–conjugated Sca1, and PE-Cy7–conjugated c-Kit. After staining, cells were washed with PBS, 2% FBS, 10 mM glucose, and 2.5 mM EDTA, then resuspended in the same buffer containing DAPI and passed through 25-µm cell strainers to obtain single-cell suspensions for sorting. For proteomic analysis, sorted cell pellets were frozen in a dry ice/ethanol bath and stored at −80°C until the time of processing. For single-cell analysis, cells were sorted in PBS with 10% FBS and adjusted to 1,000 cells/µl, and cell viability was measured with a Countess II automated cell counter (Thermo Fisher Scientific) before processing. Quantitation of Lin−Sca1−cKit+CD34+FcγRHiLy6C+Flt3−CD115Hi monocyte progenitors and Lin−Sca1−cKit+CD34+FcγRHiLy6C+Flt3−CD115Lo granulocyte progenitors was performed using the LSR Fortessa flow cytometer (BD Biosciences). Antibodies are listed in Table S3.
Cells sorted by flow cytometry were frozen for storage at −80°C. Cell pellets were resuspended in 20–30 µl of lysis buffer (100 mM Tris, 8 M urea, 10 mM tris(2-carboxyethyl)phosphine, 40 mM 2-chloroacetamide) and sonicated in a Qsonica Q700 sonicator at an amplitude of 35 and 4°C for 20 s on/10 s off with a total processing time of 10 min. The contents of two or three tubes were then combined on the basis of the number of cells counted during sorting to obtain enough material for proteomic analysis (5–6 × 10⁵ cells). Protein concentrations of the samples were determined at 280 nm using the NanoDrop OneC Microvolume UV-Vis spectrophotometer (Thermo Fisher Scientific). Lysates were diluted with 50 mM Tris to a final urea concentration of ∼1.5 M before the addition of LysC in a 1:50 ratio (enzyme/protein; FUJIFILM Wako Chemicals) and overnight digestion at room temperature, followed by additional digestion with trypsin (1:50 ratio of enzyme/protein; Promega) for 3 h, acidification with 10% trifluoroacetic acid, desalting over 10-mg StrataX solid-phase extraction columns (Phenomenex), and lyophilization to dryness in a SpeedVac (Thermo Fisher Scientific). Peptides were resuspended in 0.2% liquid chromatography–mass spectrometry (LC-MS) grade formic acid (Pierce; Thermo Fisher Scientific), and the resultant peptide concentrations were determined using the NanoDrop spectrophotometer.
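The dilution and enzyme amounts in the digestion step follow from simple proportions. The Python sketch below works through that arithmetic under stated assumptions: the 30-µl lysate volume and the 35-µg protein amount are hypothetical inputs chosen for illustration, and the function names are not part of the original protocol.

```python
def tris_volume_to_add_ul(lysate_ul, start_urea_molar=8.0, target_urea_molar=1.5):
    """Volume of 50 mM Tris (µl) needed to dilute urea from start to target molarity."""
    # C1 * V1 = C2 * V2, so the final volume is V1 * C1 / C2.
    final_ul = lysate_ul * start_urea_molar / target_urea_molar
    return final_ul - lysate_ul


def enzyme_mass_ug(protein_ug, protein_per_enzyme=50):
    """Protease mass (µg) for a 1:50 enzyme/protein ratio."""
    return protein_ug / protein_per_enzyme


# Hypothetical example inputs (not taken from the study).
tris_ul = tris_volume_to_add_ul(30.0)   # 30 µl of lysate in 8 M urea
lysc_ug = enzyme_mass_ug(35.0)          # 35 µg of protein
print(f"Add {tris_ul:.0f} µl Tris; use {lysc_ug:.2f} µg LysC")
```

The same 1:50 calculation applies to the subsequent trypsin digestion.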
A 1260 Infinity II High Pressure Liquid Chromatography (HPLC) system with an Analytical-Scale Fraction Collector (Agilent) was used to separate ∼35 µg of peptides across an XBridge Peptide Ethylene Bridged Hybrid C18 column (130 Å, 3.5 µm, 4.6 mm × 150 mm; Waters) at a flow rate of 0.8 ml/min over a 25-min gradient into 16 fractions. For high pH reverse-phase fractionation, mobile phase A consisted of 10 mM ammonium formate (Sigma-Aldrich) in HPLC-MS–grade water (Thermo Fisher Scientific), buffered to pH 10.0 with ammonium hydroxide (Sigma-Aldrich), and mobile phase B contained 10 mM ammonium formate at pH 10.0 in 80% HPLC-MS–grade methanol (Thermo Fisher Scientific). Fractions were collected into conical bottomed 96–deep-well plates (Analytical Sales & Services) and concatenated by hand into eight final fractions. The plates were lyophilized to dryness in a SpeedVac, and peptides were resuspended in 0.2% formic acid for analysis by LC with tandem MS (LC–MS/MS).
For nanoscale LC–MS/MS, capillary columns were fabricated in-house. Self-pack PicoFrit 75–360-µm inner–outer diameter bare-fused silica capillary columns with 10-µm electrospray emitter tips (New Objective) were packed using an in-house–built ultrahigh-pressure column packing station (Shishkova et al., 2018) with 1.7-µm, 130-Å pore size Ethylene Bridged Hybrid C18 particles (Waters) to a final length of ∼40 cm and installed on a Dionex Ultimate 3000 nano-HPLC system (Thermo Fisher Scientific). Mobile phase buffer A was composed of 0.2% formic acid in water; mobile phase B was composed of 0.2% formic acid in 70% HPLC-MS grade acetonitrile (Thermo Fisher Scientific). Peptides (∼1 µg) from each fraction were loaded onto a column, which was kept at 50°C inside an in-house–made heater and separated at a flow rate of 300 nl/min over a 120-min gradient, including column wash and reequilibration time. Peptide ions were analyzed on a quadrupole ion trap hybrid Orbitrap Fusion Lumos mass spectrometer (Thermo Fisher Scientific). During data-dependent acquisition (whole-proteome analysis), Orbitrap survey scans were performed at a resolving power of 240,000 at 200 m/z with an automatic gain control target of 1.5 × 10⁶ ions and maximum injection time of 50 ms. The instrument was operated in the top-speed mode with 1-s cycle times using an advanced precursor determination algorithm (Hebert et al., 2018) for monoisotopic precursor selection. Precursors were isolated using a quadrupole with an isolation window of 0.7 Th. MS/MS scans were performed in the ion trap using the rapid scan rate on precursors with two to four charge states using higher-energy collisional dissociation fragmentation with normalized collision energy of 25 and dynamic exclusion of 20 s. The ion trap MS/MS ion count target was set to 3 × 10⁴ with a maximum injection time of 18 ms and fixed m/z range of 200–1,200.
For parallel reaction monitoring analysis (Peterson et al., 2012), Orbitrap MS2 scans of targeted peptides were performed over a 200–2,000 m/z scan range at a resolving power of 60,000 at 200 m/z with an automatic gain control target of 1.5 × 10⁶ ions and maximum injection time of 425–750 ms, depending on the number of peptides targeted simultaneously. Precursor ions were isolated using a quadrupole with a 1.6-dalton isolation window and fragmented using higher-energy collisional dissociation with normalized collision energy of 25.
The whole-proteome data were processed using the MaxQuant quantitative software package (version 126.96.36.199) and searched against the UniProt Mus musculus database (downloaded on June 18, 2018), containing protein isoforms. If not specified, default MaxQuant settings were used. Label-free quantitation (LFQ) was performed using an LFQ minimum ratio count of 1 and no MS/MS requirement for LFQ comparisons. Carbamidomethylation of cysteine residues was included as a fixed modification; oxidation of methionine and acetylation of protein N-termini were set as variable modifications. Match between runs was performed using default settings. Ion trap mass spectrometry MS/MS tolerance was set to 0.3 daltons, and first search tolerance was set to 27 ppm. Lists of quantified proteins were filtered to remove reverse identifications, potential contaminants, and proteins identified only by a modification site. Missing quantitative values were imputed for proteins that were observed in at least four of seven samples by randomly drawing values from the low end of the distribution of all measured protein abundance values (Cox et al., 2014).
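The imputation step can be illustrated with a minimal Python sketch. It is not the software used in the study, and it rests on several assumptions not stated above: abundances are log-transformed, missing values are represented as None, and the "low end of the distribution" is modeled as a normal distribution shifted down by 1.8 standard deviations with a width of 0.3 standard deviations. The function name, parameter names, and example values are hypothetical.

```python
import random
import statistics


def impute_low_abundance(table, min_observed=4, shift=1.8, width=0.3, seed=0):
    """Filter proteins by observation count, then fill gaps with low-end draws.

    table maps protein name -> list of log-scale abundances (None = missing).
    shift and width are ASSUMED values, expressed in standard deviations.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    # Pool every measured value to describe the overall abundance distribution.
    observed = [v for row in table.values() for v in row if v is not None]
    mu = statistics.mean(observed)
    sd = statistics.stdev(observed)
    result = {}
    for protein, row in table.items():
        # Keep only proteins quantified in enough samples (four of seven here).
        if sum(v is not None for v in row) < min_observed:
            continue
        result[protein] = [
            v if v is not None else rng.gauss(mu - shift * sd, width * sd)
            for v in row
        ]
    return result


# Hypothetical log-scale LFQ values for seven samples.
example = {
    "P1": [25.1, 24.8, 25.3, 25.0, 24.9, 25.2, 25.1],
    "P2": [22.0, None, 21.8, 22.3, None, 22.1, 21.9],
    "P3": [None, None, 19.5, None, None, 19.9, None],
}
imputed = impute_low_abundance(example)
print(sorted(imputed))  # P3 is dropped: observed in only two of seven samples
```

Drawing from a down-shifted distribution treats missing values as signals below the detection limit, which is the rationale for sampling from the low end.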
GATA2 rescue assay
The reduced levels of GATA2 observed in −77−/− hematopoietic progenitors were rescued by infection of E14.5 Lin− fetal liver cells with retrovirus carrying the murine Gata2 cDNA in the murine stem cell virus plasmid (pMSCV; Katsumura et al., 2018). Ecotropic retrovirus was packaged in 293T cells, and retrovirus-containing supernatants were collected 24 and 48 h after transfection. Cells were infected with infectious supernatant by spinoculation for 90 min at 1,315 × g, followed by 3 d of culturing in IMDM containing 20% FBS, 1% penicillin-streptomycin, 4% IL-3–conditioned media, and 4% SCF-conditioned media. RNA was purified with TRIzol. Global changes in gene expression were determined by RNA-seq of four biological replicates each of −77+/+ infected with empty vector, −77−/− infected with empty vector, and −77−/− infected with the Gata2 pMSCV-PIG expression vector. RNA libraries were prepared by the University of Wisconsin Gene Expression Center and sequenced using an Illumina HiSeq 2500 sequencer. Sequencing reads were aligned by STAR (version 2.5.2b) to the mouse genome (mm10; chromosomes 1 to 19; X, Y, and M) with GENCODE comprehensive gene annotation (version M16) on the reference chromosomes only. Gene expression levels were quantified by RSEM (version 1.3.0). STAR and RSEM runs followed the RNA-seq quantification protocol from ENCODE (https://github.com/ENCODE-DCC/long-rna-seq-pipeline/blob/master/dnanexus/quant-rsem/resources/usr/bin/lrna_rsem_quantification.sh).
scRNA-seq was performed by the University of Wisconsin Gene Expression Center using the Chromium Single Cell Gene Expression platform (10x Genomics). Single-cell suspensions of CMP/GMP pools sorted from E14.5 fetal livers of two −77+/+ and two −77−/− embryos were loaded onto the Chromium Controller to generate single-cell barcoded gel bead emulsions for preparation of cDNA libraries and sequencing. Sequences were obtained from 5,167 −77+/+ and 10,028 −77−/− cells. Mean reads per cell were as follows: −77+/+ (37,365 and 25,498) and −77−/− (11,599 and 12,368). CellRanger (Zheng et al., 2017) was used with default parameter settings for alignment of sequencing reads, quantification of unique molecular identifier (UMI) counts, and filtering of empty barcodes. Cells with unusually high or low total UMI counts, a low number of detected genes, and a high proportion of UMI counts originating from mitochondrial genes were filtered with the isOutlier function in the R package scater (version 1.10.1; McCarthy et al., 2017) with nmads = 3. Genes that did not have UMI counts >4 in more than five cells were filtered out as well. Scran (version 1.10.2; Lun et al., 2016a) was used to normalize data across cells, and PCA, t-SNE (van der Maaten and Hinton, 2008), and SPRING (Weinreb et al., 2018) were used for dimension reduction. For PCA and t-SNE, the Seurat function (version 2.3.4; Butler et al., 2018) ScaleData was used to regress out total UMI count and average mitochondrial gene expression to ensure that principal components and t-SNE components were independent of these sources of variation. PCA and t-SNE were fit using the most variable genes via Seurat’s FindVariableGenes function. For trajectory analysis, SPRING was used with default settings after subsampling cells to ensure equal numbers of −77+/+ and −77−/− cells. Clustering analysis was performed with k-means using the R package cluster. 
To determine the number of clusters, we maximized the average silhouette, which quantifies how similar a cell is to its own cluster compared with other clusters, using the function silhouette from the R package cluster.
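The silhouette criterion for choosing the number of clusters can be sketched in a few lines of Python. This is an illustration only: the study used the R package cluster, whereas the code below is a plain re-implementation with a simple deterministic k-means, a toy two-dimensional dataset, and function names that are all assumptions made for the example.

```python
import math


def distance(a, b):
    """Euclidean distance between two points given as tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans(points, k, max_iter=100):
    """Basic k-means; initial centers are evenly spaced points (an assumption)."""
    centers = [points[i * len(points) // k] for i in range(k)]
    labels = []
    for _ in range(max_iter):
        # Assign each point to its nearest center.
        labels = [min(range(k), key=lambda c: distance(p, centers[c])) for p in points]
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                updated.append(tuple(sum(axis) / len(members) for axis in zip(*members)))
            else:
                updated.append(centers[c])  # keep the center of an empty cluster
        if updated == centers:
            break
        centers = updated
    return labels


def average_silhouette(points, labels):
    """Mean of (b - a) / max(a, b) over all points; singletons score 0."""
    clusters = sorted(set(labels))
    scores = []
    for i, p in enumerate(points):
        def mean_dist(cluster):
            d = [distance(p, q) for j, q in enumerate(points)
                 if labels[j] == cluster and j != i]
            return sum(d) / len(d) if d else None
        a = mean_dist(labels[i])  # cohesion: mean distance within own cluster
        if a is None:
            scores.append(0.0)
            continue
        # Separation: mean distance to the nearest other cluster.
        b = min(mean_dist(c) for c in clusters if c != labels[i])
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


# Toy data: three well-separated groups of three points each.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10),
          (20, 0), (20, 1), (21, 0)]
best_k = max(range(2, 5), key=lambda k: average_silhouette(points, kmeans(points, k)))
print("Number of clusters with the highest average silhouette:", best_k)
```

For single-cell data, the points would be cells in a reduced-dimension space rather than raw two-dimensional coordinates.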
Quantification and statistical analysis
Results are presented as either the mean ± SEM or as box-and-whisker plots with whiskers ranging from minimum to maximum values. Multiple independent cohorts were used in each experiment. Statistical comparisons were performed using two-tailed Student’s t tests (significance cutoff of P value <0.05), with correction for statistical overrepresentation of functions using the two-stage step-up method of Benjamini, Krieger, and Yekutieli, as implemented in Prism software (GraphPad Software).
Statistical significance of changes in protein abundance between WT and mutant samples was determined using two-tailed Student’s t test followed by the correction for multiple hypothesis testing according to the Benjamini-Hochberg method (q < 0.05).
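For readers unfamiliar with the Benjamini-Hochberg procedure, the following Python sketch shows how adjusted values (q values) are derived from a list of P values. The function name and the example P values are hypothetical, and the sketch is not the software used in the study.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(pvalues)
    # Indices of the P values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonic q values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        qvalues[i] = running_min
    return qvalues


# Hypothetical P values from four protein-level t tests.
pvals = [0.01, 0.04, 0.03, 0.005]
qvals = benjamini_hochberg(pvals)
significant = [q < 0.05 for q in qvals]
print(qvals, significant)
```

Each P value is scaled by the number of tests divided by its rank, and the running minimum keeps the adjusted values from decreasing as the raw P values increase.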
RNA-seq for GATA2 rescue
Differentially expressed genes were detected by DESeq2 (version 1.16.1), requiring at least a twofold change and an adjusted P value <0.05. Heatmaps of gene expression levels were prepared using ComplexHeatmap (version 1.99.7). A pseudocount of 10⁻⁴ was added to fragments per kilobase of transcript per million mapped reads values to avoid taking the logarithm of zero.
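The thresholds and the pseudocount described above can be expressed compactly in Python. This sketch assumes a base-2 logarithm, which the text does not specify, and the function names and example values are hypothetical; the actual differential expression testing was done in DESeq2.

```python
import math

PSEUDOCOUNT = 1e-4  # added to FPKM values so that zero can be log-transformed


def log2_fpkm(fpkm):
    """Log-transform an FPKM value (base 2 is an assumption)."""
    return math.log2(fpkm + PSEUDOCOUNT)


def is_differentially_expressed(log2_fold_change, adjusted_p,
                                min_fold=2.0, alpha=0.05):
    """Apply the fold-change and adjusted P value cutoffs to one gene."""
    return abs(log2_fold_change) >= math.log2(min_fold) and adjusted_p < alpha


# Hypothetical genes: (log2 fold change, adjusted P value).
genes = {"geneA": (1.0, 0.01), "geneB": (-1.5, 0.01), "geneC": (0.5, 0.001)}
hits = [name for name, (lfc, padj) in genes.items()
        if is_differentially_expressed(lfc, padj)]
print(hits)
```

A twofold change corresponds to an absolute log2 fold change of at least 1, so both up- and downregulated genes pass the same test.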
For differential expression analysis, MAST (version 1.8.2; Finak et al., 2015) was used along with the Benjamini-Hochberg false discovery rate control to adjust for multiple comparisons. Genes with an adjusted P value <0.05 were considered differentially expressed.
Data and software availability
All raw files associated with proteomic analysis were deposited in the Proteomics Identifications Database (PRIDE) archive (Vizcaíno et al., 2016) under project accession no. PXD013855. RNA-seq raw files and RSEM quantification results were deposited in the Gene Expression Omnibus database under accession no. GSE133606. scRNA-seq raw files were deposited in the Gene Expression Omnibus database under accession no. GSE134439.
Online supplemental material
Fig. S1 shows STRING analysis of differentially expressed proteins in −77−/− CMP/GMP cells. Fig. S2 shows GATA2 Western blot and hierarchical clustering related to population RNA-seq datasets. Fig. S3 shows scRNA-seq cluster number optimization and cluster-specific Gene Ontology term analysis. Fig. S4 shows a comparison of cell proliferation features of −77+/+ and −77−/− progenitors mined from scRNA-seq data. Fig. S5 shows responsiveness of Tlr9 and Irf8 to IFN treatment and evidence for regulation of Hdac11 by GATA2. Table S1 reports dysregulated expression of immune machinery proteins in −77−/− progenitors. Table S2 shows the relationship between −77-regulated proteins and mRNAs. Table S3 lists resources and reagents used in the study.
E.H. Bresnick was supported by the National Institutes of Health (grant DK68634), the Evans MDS Foundation, Midwest Athletes Against Childhood Cancer, and the University of Wisconsin Carbone Cancer Center (grant P30CA014520). D.J. Conn was supported by a National Library of Medicine training grant (NLM 5T15LM007359) to the University of Wisconsin Computation and Informatics in Biology and Medicine Training Program. This work was also supported by the University of Wisconsin Carbone Cancer Center (support grant P30 CA014520).
Author contributions: Conceptualization: K.D. Johnson and E.H. Bresnick; Methodology: K.D. Johnson, D.J. Conn, S. Shen, E. Shishkova, K.R. Katsumura, J.J. Coon, S.G. Kraus, K.R. Calvo, E.A. Ranheim, and E.H. Bresnick; Investigation: K.D. Johnson, D.J. Conn, E. Shishkova, K.R. Katsumura, K.R. Calvo, W. Wang, and A.P. Hsu; Formal analysis: K.D. Johnson, D.J. Conn, E. Shishkova, K.R. Katsumura, P. Liu, S. Keles, and E.H. Bresnick; Writing (original draft): K.D. Johnson and E.H. Bresnick; Writing (review and editing): K.D. Johnson, D.J. Conn, E. Shishkova, K.R. Katsumura, P. Liu, J.J. Coon, S. Keles, K.R. Calvo, S.M. Holland, and E.H. Bresnick; Funding acquisition: E.H. Bresnick; Supervision: J.J. Coon, S. Keles, S.M. Holland, and E.H. Bresnick.
Disclosures: The authors declare no competing interests exist.
K.D. Johnson and D.J. Conn contributed equally to this paper.