Aging is linked to deficiencies in immune responses and increased systemic inflammation. To unravel the regulatory programs behind these changes, we applied systems immunology approaches and profiled chromatin accessibility and the transcriptome in PBMCs and purified monocytes, B cells, and T cells. Analysis of samples from 77 young and elderly donors revealed a novel and robust aging signature in PBMCs, with simultaneous systematic chromatin closing at promoters and enhancers associated with T cell signaling and a potentially stochastic chromatin opening mostly found at quiescent and repressed sites. Combined analyses of chromatin accessibility and the transcriptome uncovered immune molecules activated/inactivated with aging and identified the silencing of the IL7R gene and the IL-7 signaling pathway genes as potential biomarkers. This signature is borne by memory CD8+ T cells, which exhibited an aging-related loss in binding of NF-κB and STAT factors. Thus, our study provides a unique and comprehensive approach to identifying candidate biomarkers and provides mechanistic insights into aging-associated immunodeficiency.
As we age, our immune system undergoes a broad range of functional changes, including two hallmarks: (a) immunosenescence (i.e., functional decline), which especially affects the adaptive arm of immunity (Pawelec, 2008; Goronzy and Weyand, 2013; Goronzy et al., 2013) and (b) “inflamm-aging” (i.e., a persistent systemic inflammatory state; Franceschi et al., 2000; Pawelec et al., 2014). These changes lead to diminished ability of the immune system to generate protective responses to immunological threats, predisposing older adults to infection and raising the risk of many chronic diseases (Dorshkind et al., 2009; Shaw et al., 2013; Tchkonia et al., 2013). Chromatin accessibility is emerging as an essential component of gene regulation and genome stability. Moreover, changes in chromatin accessibility patterns are thought to play a critical role in human diseases (Philip et al., 2017) and aging (Moskowitz et al., 2017) by altering the accessibility of key proteins to regulatory regions of the genome. Despite this crucial role, assessment of chromatin accessibility in human immune cells lags behind other genome-wide measurements such as transcription or DNA modifications.
Aging-associated changes in epigenomic patterns have been reported across diverse cell types and organisms (Rando and Chang, 2012; López-Otín et al., 2013; Benayoun et al., 2015). In human immune cells, transcriptomic profiling of human PBMCs and purified immune cells revealed genes that are differentially expressed with aging (Cao et al., 2010; Harries et al., 2011; Reynolds et al., 2015). Moreover, DNA methylation studies demonstrated that human immune system aging is associated with methylation changes at specific CpG sites (Rakyan et al., 2010; Martino et al., 2011; Horvath et al., 2012; Tserel et al., 2015; Yuan et al., 2015; Zheng et al., 2016). A recent study (Moskowitz et al., 2017) reported that CD8+ T cells go through significant chromatin changes with aging. However, whether these changes are restricted to the CD8+ T cell population and whether analysis of PBMCs rather than purified CD8+ T cells can be used to detect these changes remains to be determined.
The assay for transposase-accessible chromatin with sequencing (ATAC-seq; Buenrostro et al., 2013; Qu et al., 2015) is a recent technology that enables genome-wide profiling of chromatin accessibility patterns at base pair resolution using limited cell numbers. This technology offers remarkable opportunity to define aging-associated disruptions to transcriptional regulatory programs in human immune cells with increased precision, including changes in noncoding cis-acting sequences (e.g., enhancers) and transcription factor (TF) activity. Studying chromatin accessibility in blood-derived human immune cells should provide the blueprint to better understand how transcriptional programs are disrupted in immune cells with aging and to develop potential treatments for rejuvenation. Thus, herein we profiled and analyzed chromatin accessibility and transcriptome profiles in PBMCs and purified monocytes, B cells, and T cells from 77 healthy volunteers.
An epigenomic signature of aging in PBMCs
PBMCs, a composite of immune cells, represent a tissue resource to assess and monitor an individual’s immune health and responses longitudinally. We have successfully applied PBMC profiling in earlier studies as a means of identifying transcriptomic signatures of autoimmune diseases and of immune responses to infectious agents (Chaussabel et al., 2008; Berry et al., 2010; Banchereau et al., 2016). To examine aging-associated chromatin accessibility profiles, we collected blood and isolated PBMCs from 77 healthy, community-dwelling research volunteers: 51 healthy young (HY, 22–40 yr) and 26 healthy old (HO, 65+ yr) subjects (Fig. 1 A and Table S1). As the changes captured in PBMC epigenomes could be attributable to both differences in the frequency of certain cell types and changes in genomic patterns that are intrinsic to specific cell subsets (Kowalczyk et al., 2015), we also examined the cell composition of PBMCs using flow cytometry from a subset of these subjects (Table S2). Proportions of naive CD4+ T cells, naive CD8+ T cells, and CD19+ B cells significantly decreased with aging, consistent with age-related decline in thymus and bone marrow activity (Fig. S1, A–C). The most significant aging-associated decline was observed in the naive CD8+ T cell population (Fig. S1 C), where the percentage of naive CD8+ T cells in PBMCs decreased from ∼7% to ∼3% with age (P = 1e−04, Wilcoxon rank-sum test).
ATAC-seq profiles were generated from 49 subjects (28 HY, 21 HO) by incubating the purified nuclei with Tn5 transposase to cut and “tag” accessible chromatin and sequencing the resulting “tags” to identify genome-wide open chromatin patterns. This approach identified 140,172 open chromatin sites (i.e., peaks) associated with 22,124 genes based on their distance to transcription start sites (TSSs). Only high-quality samples passing quality control criteria were used in downstream analyses (44 samples, 25 HY and 19 HO; Methods). Using a generalized linear model (GLM), 12,626 differentially accessible peaks (9% of those tested, false discovery rate [FDR] < 0.05) were identified between age groups. Of these, 6,977 showed a decrease and 5,649 showed an increase in chromatin accessibility with aging, hereafter referred to as “closing” and “opening” peaks, respectively (Fig. 1 B). Principal component analysis and hierarchical clustering analyses confirmed that differential peaks discriminated between age groups with high accuracy, where a majority of samples (42 out of 44 in hierarchical clustering) clustered based on the age group alone (Fig. 1, C and D; Fig. S1 D).
The Roadmap Epigenomics Project (Roadmap Epigenomics Consortium et al., 2015) profiled reference PBMC samples and defined functional states such as promoters and enhancers in these cells. To determine the location of ATAC-seq open chromatin sites (e.g., promoters, enhancers) we annotated them using these Roadmap-defined (i.e., ChromHMM) chromatin states. As expected, the most accessible peaks mostly overlapped with promoter and enhancer states (Fig. S1 E). In contrast, less accessible peaks were annotated with repressed or quiescent states (Fig. S1 E). For example, the most accessible 10% among all peaks (14,222) were mostly at promoters (55.6%) and enhancers (36.1%), whereas the least accessible 10% (14,199) were mostly at quiescent sites (61.3%). As shown in Fig. 1 E, there was a remarkable difference in the functional annotation of differential peaks, with closing peaks mostly found at promoters and enhancers and opening peaks mostly found at repressed and quiescent sites. Moreover, closing peaks were more accessible (i.e., larger peaks) on average than opening peaks and all ATAC-seq peaks (Fig. S1 F).
Analysis of the frequency of differentially accessible chromatin regions across the cohort revealed that closing peaks consist of open chromatin sites that are common between subjects (i.e., high-frequency peaks), whereas opening peaks consist of low-frequency and often subject-specific peaks (Fig. 1 F). Thus, examination of chromatin accessibility led to the identification of an epigenomic signature of aging in PBMCs composed of (a) chromatin closing at the most accessible promoter/enhancer regions of the genome across the population and (b) a chromatin opening of less accessible regions of the genome in a subject-specific manner.
Chromatin closing at promoters and enhancers with aging
Differential peaks were annotated to the nearest gene based on their distance to TSS, thereby linking 4,567 and 3,816 genes to “closing” and “opening” peaks, respectively (see Table S3 for genes associated with differentially open peaks). Genes annotated to differentially accessible peaks were further characterized using gene ontology (GO) terms, which revealed 622 and 379 immune-related genes that are closing and opening, respectively (Fig. 2 A). ClueGO (Bindea et al., 2009) enrichment analyses revealed that chromatin closing is significantly associated with genes involved in T cell activation-related GO terms (n = 161 genes) and T cell receptor signaling pathway (n = 59 genes; Fig. 2 B, Fig. S2 A, and Table S4). In contrast, chromatin opening is associated significantly with genes involved in myeloid leukocyte (n = 48 genes) and osteoclast differentiation (n = 29 genes) processes (Fig. 2 B, Fig. S2 B, and Table S4).
To further interpret the immunological implications of these chromatin changes, we compared them against previously described gene sets from transcriptional profiles (Chaussabel et al., 2008), where each module represents a coordinately expressed gene set across many PBMC expression profiles. These modules are functionally characterized and linked to pathways or cell types involved in immune processes, which enabled us to systematically define transcriptional fingerprints of diverse immune diseases and responses (Berry et al., 2010; Guiducci et al., 2010). We further characterized the modules with unknown functionality using ClueGO (Bindea et al., 2009) annotations (Methods; Table S5) and calculated the mean fold change for immune module genes, where negative numbers represent chromatin closing with aging (Fig. 2 C, blue bars) and positive ones represent chromatin opening with aging (Fig. 2 C, red bars). This analysis revealed a systematic loss of chromatin accessibility at the promoters and enhancers of immune module genes. This loss was accompanied by a gain of chromatin accessibility around these genes at loci classified as repressed or quiescent by the Roadmap (Roadmap Epigenomics Consortium et al., 2015) profiles (Fig. 2 C), consistent with observed genome-wide patterns (Fig. 1 E). Modular analyses of the chromatin accessibility data revealed that the T cell module genes exhibit the most significant chromatin closing with aging (Fig. 2 C, first row). Fig. 2 D shows that at the subject level, most genes in the T cell module exhibited chromatin silencing with aging, resulting in a striking separation of subjects into their respective age groups based on the chromatin accessibility of these genes. Focusing on the genes within the inflammation I module revealed that aging has a dual effect on inflammation-related genes (Fig. 2 E). A set of inflammation-related genes are repressed via chromatin closing mostly at their enhancers, including hypoxia-inducible factor HIF1A, which modulates hypoxia responses in immune cells (Palazon et al., 2014). Meanwhile, a mutually exclusive gene set was associated with chromatin opening mostly at quiescent sites, including DUSP10, a molecule that is known to play an essential role in local and systemic inflammation (Lang et al., 2006). A gain in chromatin accessibility was not detected around the IL-6 molecule, a key mediator of systemic inflammation (Ershler and Keller, 2000), suggesting that age-associated increases in serum IL-6 levels might originate from cells other than PBMCs (Maggio et al., 2006). Thus, our analysis shows a widespread chromatin closing at promoters and enhancers related to immune functions, especially T cell functions.
Aging-associated gene expression and chromatin accessibility changes
To link aging-associated chromatin changes to transcript levels, RNA-seq profiles of PBMCs from 39 subjects (24 HY, 15 HO) were generated to match the ATAC-seq samples. Fig. 3 A shows a significant positive correlation between age-related changes in gene expression levels and chromatin accessibility at gene promoters (r = 0.34, P < 1e−16). We identified immune modules that undergo transcriptional, epigenetic, and concordant (i.e., transcription and chromatin accessibility are remodeled together in the same direction) changes with aging (Fig. 3 B). Analysis of genes showing concordant remodeling (Fig. 3 B, third column) revealed the most significant chromatin closing with aging as well as declines in gene expression in the T cell module (Fig. S3, A and B). Many genes associated with T cell functions, including TFs involved in lymphocyte development and activation, such as LEF1 and TCF7, exhibited correlated decreases in chromatin and expression profiles (Fig. 3 C and Fig. S3, C and D). Meanwhile, other immune modules, most notably cytotoxic cells, were activated both at the chromatin and gene expression levels with aging (Chaussabel et al., 2008; Fig. 3 B, red bars in third column). These included activation of granzymes (GZMH and GZMB) and granulysin (GNLY; Fig. 3, D and E) and might originate from natural killer cells (Hayhoe et al., 2010); however, multiple other cell types can express GZMB, including plasmacytoid dendritic cells (Matsui et al., 2009), CD4+ T cells (Namekawa et al., 1998; Appay et al., 2002), and plasma cells (Xu et al., 2014).
Our analyses also revealed chromatin remodeling that was not accompanied by changes in gene expression (Fig. S3 E). For example, chromatin opening at genes associated with inflammation was not accompanied by changes in gene expression, suggesting that transcriptional activation of these genes might depend on additional stimuli. Thus, the combined analyses of the epigenome and transcriptome increased the power of each assay and enabled us to identify immune effector molecules that are activated/inactivated with aging. Moreover, chromatin accessibility profiles provided more precise and genome-wide information than expression data alone, including epigenomic changes at promoters and enhancers.
Epigenomic silencing of T cell signaling pathways with aging
IL7R, a gene critical for lymphocyte development and healthy immune responses (Schluns et al., 2000), was among the top genes linked to multiple closing peaks (n = 12; Fig. 4 A; Fig. S4, A and B). The loss of chromatin accessibility around the IL7R locus was also accompanied by aging-associated decreases in IL7R expression (Fig. 4 B). Moreover, the IL7R expression and the chromatin accessibility of its promoter were significantly correlated at the individual level (r = 0.59, P < 0.001; Fig. 4 C). Additional genes in the IL-7 signaling cascade, including JAK1, JAK3, STAT5A, and STAT5B, also exhibited closing in aged individuals (Fig. 4, D and E), possibly explaining the impaired signaling and responses to IL-7 in the elderly (Kim et al., 2006). Moreover, these results revealed IL7R and the members of the IL-7 signaling pathway as potential biomarkers of healthy aging. WikiPathways enrichment analyses (Kelder et al., 2012) of genes annotated with differentially open peaks confirmed systematic and significant chromatin closing of IL-7 signaling pathway genes and other signaling pathways, including TCR, IL-2, and IL-9 signaling (Fig. 4 F, Fig. S4 C, and Table S4). In addition, 27 out of 70 genes in the “histone modifications” pathway were also associated with significant chromatin closing with aging (Fig. S4 C and Table S4). These included histone genes (e.g., HIST1H3D, HIST1H3E, HIST4H4) as well as histone modifiers (e.g., EZH1, SETD7), in alignment with the known reduced expression of core histones and disruption of histone modification patterns associated with cellular aging (Benayoun et al., 2015). Collectively, our results suggest that aging is associated with the chromatin closing of multiple pathways related to T cell signaling that might explain impaired T cell responses in the elderly. Moreover, these results establish IL7R and the members of the IL-7 signaling pathway as potential biomarkers of healthy aging whose predictive ability as immune health indicators needs to be assessed in longitudinal studies.
CD8+ T cells account for the PBMC aging signature
Flow cytometry data from 23 subjects (12 HY and 11 HO) revealed that the age-related decrease in IL7R expression was limited to CD8+ T cells (Fig. 5, A and B; and Table S6), indicating that chromatin alterations captured in PBMCs were not equal across profiled T cell subsets. Furthermore, when we measured pSTAT5 induction upon IL-7 stimulation, we noted major differences between CD4+ and CD8+ T cells. These data revealed a clear decline in IL-7 responsiveness with aging in CD8+ T cells (both percent and magnitude) that was not observed in CD4+ T cells (Fig. 5 C). Flow cytometry analyses suggested that significant reductions in IL7R-expressing cells occur in both naive and memory CD8+ compartments, including central memory (CM) and effector memory RA (EMRA) subpopulations (Fig. 5 D). Based on these data, we profiled the chromatin accessibility of sorted naive and memory CD4+ and CD8+ T cells from eight donors (4 HY, 4 HO; Materials and methods). A similar number of open chromatin sites was captured in these four subsets (∼45,000–50,000 peaks). However, each subset exhibited different changes in its open chromatin sites with aging. CD4+ T cells showed minimal chromatin remodeling with aging: 44 peaks in memory and 216 peaks in naive CD4+ T cells. In contrast, CD8+ T cells showed extensive chromatin remodeling with aging (Fig. 5 E). Specifically, memory CD8+ T cells displayed 8,503 (19.7% of those tested) differential peaks, whereas naive CD8+ T cells displayed 2,925 (6.4% of those tested) differential peaks. A recent study (Moskowitz et al., 2017) reported aging-associated closing of chromatin accessibility at gene promoters in CM and naive CD8+ T cells. Our data align with this observation and suggest that chromatin at gene promoters closes with aging in both memory and naive CD8+ T cells, although this pattern is more evident in memory CD8+ T cells. More specifically, functional state annotations from Roadmap T cell data sets (Roadmap Epigenomics Consortium et al., 2015) revealed that chromatin closing in memory CD8+ T cells occurred mostly at promoters (>50%) and enhancers (>30%), whereas chromatin opening was associated less with promoters (∼10%) and more with quiescent (∼30%) and enhancer sites (∼40%; Fig. 5 F), similar to the PBMC signature (Fig. 1 E). In naive CD8+ T cells, chromatin remodeling was mostly observed at enhancers, including >75% of closing peaks and >50% of opening enhancer peaks.
Comparing aging-induced chromatin remodeling in T cell subsets to that of PBMCs indicated that chromatin remodeling in PBMCs correlates positively with the chromatin remodeling of CD8+ T cell subsets (Fig. S4 D). These results suggest that CD4+ and CD8+ T cell populations go through very different cellular changes with aging. Analysis of gene promoters known to be expressed in memory and naive CD4+ and CD8+ T cells revealed that the chromatin closing of promoters of genes encoding certain surface (e.g., CD28, IL7R) and signaling (STAT4) molecules was more pronounced in memory CD8+ T cells (Fig. 5 G and Fig. S4 E). Moreover, chromatin closing in PBMCs around T cell signaling pathways, including the IL7 signaling pathway, mostly stemmed from memory CD8+ T cells (Fig. 5 H), including the IL7R locus itself (Fig. 6 A). These results identify memory CD8+ T cells as the subpopulation with the most profound chromatin remodeling with aging. Moreover, the silencing of promoters and enhancers in PBMCs likely stems from CD8+ T cells, even though CD8+ T cells typically constitute <10–15% of PBMCs.
We also profiled the chromatin accessibility of purified monocytes (n = 20) and naive B cells (n = 7) and found that these cell types do not display significant chromatin accessibility changes with aging (Fig. S4 F), further confirming that aging differentially affects distinct immune cell subsets. Moreover, chromatin changes detected in purified immune subsets (i.e., memory and naive CD8+ T cells) suggest that the aging signature of PBMC is not merely a consequence of cell composition changes with aging, which is in agreement with similar observations based on single-cell transcriptomic profiling (Kowalczyk et al., 2015).
Potential regulators of IL7R are silenced in memory CD8+ T cells
Our data revealed that whereas chromatin closing around the IL7R locus (promoters and enhancers) was observed in CD8+ T cells, both memory and naive, chromatin closing at the IL7R promoter was specific to memory CD8+ T cells (Fig. 6 A, gray bar showing the IL7R promoter peak). To uncover potential regulators of these changes, we analyzed TF binding motifs located around the IL7R TSS (10 kb upstream, 1 kb downstream). After filtering out the TFs based on their expression in PBMCs, several TFs and TF families emerged, including LEF1, ETS2, BACH1/2, JUN, the NF-κB family, and the STAT family (Fig. 6 B and Table S7). Among these, NF-κB, JUN, and STATs are known as rapid-acting factors, which can be present in an inactive state not requiring protein synthesis to be activated. Enrichment of these TFs at the IL7R promoter suggests a role for these rapid-acting TFs in ensuring the rapid activation of IL7R and modulating IL-7 responsiveness in T cells. In fact, NF-κB directly controls the expression of the IL7R gene in T cells through an enhancer control region close to the promoter (Miller et al., 2014). Furthermore, the chromatin around the promoters of these factors (e.g., NF-κB and STAT family members) closed with aging, specifically in memory CD8+ T cells (Fig. 6, C–E). Our data and analyses suggest that these TFs are likely to play a role in regulating the activity of IL7R in T cells and lose their chromatin accessibility—and hence functionality—with aging, specifically in memory CD8+ T cells. In alignment with these findings, Moskowitz et al. (2017) also reported aging-related disruptions in TF binding patterns in CD8+ T cells.
To determine whether silencing of TF promoters might correspond to changes in TF binding activity, we conducted TF footprinting analyses using PBMCs and T cell ATAC-seq samples. After pooling the ATAC-seq samples by cell type and age group and normalizing with respect to the library depth, significant TF footprint calls were obtained using the PIQ algorithm (Sherwood et al., 2014). These analyses showed that several TFs with significant footprints around the IL7R promoter, including RXRA, NFKB1, ETS1, and TCF7, lost their footprints with aging, specifically in memory CD8+ T cells (Fig. S5 A). Globally, there was also an aging-related decrease in TF footprinting calls for all TFs in memory CD8+ T cells (Fig. 6 F), including footprints for NF-κB factors, STATs, and TFs with important roles in T cell functions (Fig. 6 G and Table S8). Collectively, these results indicate that memory CD8+ T cells undergo aging-associated silencing of regulatory elements and interactions, as evident from the loss of chromatin accessibility at TF gene promoters and the decrease in TF binding estimates.
Aging-specific chromatin accessibility profiles are not linked with CMV
In chronic infections, most notably with CMV, CD8+ T cells enter an “exhausted” state of reduced functionality and stop responding to further stimulation (Wherry, 2011; Sansoni et al., 2014), and an aging-related increase in CMV seropositivity is associated with increased mortality in the elderly (Fülöp et al., 2013; Savva et al., 2013). To study whether the observed aging-associated chromatin signature in PBMCs is attributable to CMV seropositivity, we measured CMV IgG antibody status in a subset of our cohort (n = 26, 17 HY, 9 HO), of which n = 21 (12 HY, 9 HO) also had ATAC-seq profiles. As reported (Fülöp et al., 2013; Savva et al., 2013), we observed that more elderly subjects were CMV positive (70%) compared with young subjects (45%; Table S9). Moreover, there was a significant correlation between an individual’s age and his or her CMV antibody levels (correlation coefficient = 0.65, Fig. 7 A). Both aging and CMV status were correlated with changes in PBMC composition, most notably with changes involving CD8+ T cell subpopulations (Fig. S5, B–D). The decrease of the naive CD8+ T cell number is more dependent on aging than on the CMV status (Fig. S5, B–D), as observed earlier in large cohorts (Wertheimer et al., 2014). Differential chromatin accessibility analyses between CMV-positive and CMV-negative subjects revealed that CMV seropositivity by itself is not associated with significant chromatin remodeling (Fig. 7 B), even when differential analyses are stratified by age group (Fig. S5 E). Moreover, CMV-related and aging-related changes were not correlated (correlation coefficient = −0.02, Fig. 7 C). Principal variance component analysis (PVCA; Boedigheimer et al., 2008) confirmed that the variation in aging-associated ATAC-seq peaks cannot be explained by CMV status or sex, whereas age contributed ∼30% of the variation in these data (Fig. S5 F). Thus, although elderly subjects are more likely to have higher CMV infection rates, CMV seropositivity is neither associated with significant chromatin accessibility changes nor with the epigenomic aging signature observed in PBMCs.
Aging-associated chromatin accessibility profiles are stable over seasons
Seasonal variation affects immune cell counts, gene expression patterns (Dopico et al., 2015), and immune responses (Aguirre-Gamboa et al., 2016; Ter Horst et al., 2016). PVCA analyses showed that the season is among the biggest factors introducing variation in the chromatin accessibility data (∼20% of the variation; Fig. S5 G); therefore, the season is used as a covariate in our models (Methods). We separately analyzed PBMC samples collected in June–November (“summer”) and December–May (“winter”). By using “season” as a blocking factor in the GLMs, we defined aging-associated chromatin accessibility changes in summer (10 HY, 7 HO) and winter samples (15 HY, 12 HO). This analysis resulted in 8,744 ATAC-seq peaks that are differentially accessible between HY and HO in summer and winter samples (Fig. 7 D), 87% of which were also captured as differential in the combined cohort. Similar to the global signature, closing peaks are mostly found at promoters and enhancers, and opening peaks are at quiescent and repressed sites (Fig. S5 H). Fold changes between HY and HO samples that were obtained from two seasons correlated highly with the fold changes obtained from the whole cohort, although winter samples had a slightly higher correlation score (Fig. 7 E; r = 0.81 for summer, r = 0.86 for winter). These results indicate that the chromatin accessibility aging signature is robust and is not significantly affected by seasonal variation.
This study integrates chromatin accessibility (ATAC-seq) and transcriptomes (RNA-seq) of blood-derived human immune cells and demonstrates, in healthy elderly individuals, an epigenomic alteration that is essentially borne by the memory CD8+T cell compartment. By applying systems immunology approaches, we defined for the first time a chromatin accessibility signature of aging-related reduced immune responses in PBMCs and uncovered genes and proteins that can serve as potential biomarkers of immunodeficiency. The signature contained simultaneous chromatin closing at promoters and enhancers associated with T cell signaling and chromatin opening mostly found at quiescent and repressed sites. Regions associated with chromatin opening were stochastically distributed across the cohort, where they were observed in a single subject or a small number of subjects. This chromatin opening might be associated with the aging-related loss of histone proteins leaving short strands of DNA accessible to the Tn5 transposase (Pal and Tyler, 2016). However, the functional significance of this observation remains to be determined. In contrast, a widespread loss of chromatin accessibility at promoters and enhancers especially affecting T cell signaling pathways was observed across all subjects. Matching chromatin accessibility and gene expression by genomic region (i.e., promoter and transcript) and by subject helped us identify immune effector molecules that exhibit concordant alterations in their expression and chromatin accessibility levels with aging. Among these molecules, a substantial number of genes related to T cell functions were silenced epigenomically and transcriptionally, most notably the IL7R gene and other genes encoding the IL-7 signaling pathway.
Chromatin accessibility profiles of purified cells revealed that this aging-related signature in PBMCs stems from CD8+ T cells, which together account for 10–15% of PBMCs. We found that memory CD8+ T cells undergo a widespread silencing of promoters with aging, whereas naive CD8+ T cells exhibit such a loss mostly at enhancers, a dichotomy that is not yet understood. Parallels between PBMCs and memory CD8+ T cells were particularly notable for genes in the T cell signaling pathways, including IL-7 signaling, where a strong chromatin closing around the promoters of these genes was observed in both PBMC and memory CD8+ T cells. The alteration of the IL-7 pathway might explain the loss of the homeostatic proliferation of CD8+ T cells (Briceño et al., 2016) as well as their reduced antigen-driven proliferation, thus curtailing responses to infectious agents and cancer cells. Indeed, when compared with CD8+ T cells from HY adults, those from elderly individuals responded less well to IL-7, as measured by STAT5 phosphorylation. The defect in the response to IL-7 is specific to CD8+ T cells, as CD4+ T cells from elderly and young individuals responded similarly to IL-7 stimulation.
A previous study established decreased expression of IL7R on naive (CD45RA+CCR7+) and EMRA (CD45RA+CCR7−) CD8+ T cells as one of the hallmarks of aging (Kim et al., 2006). However, the most dramatic impact of reduced IL7R levels, including BCL2 upregulation and STAT5 phosphorylation, was detected in the EMRA subpopulation. This has important functional consequences, as signaling initiated by IL-7 plays a crucial role in the maintenance and homeostasis of naive and memory T cells (Schluns et al., 2000). In CD8+ T cell subsets, IL-7 stimulation leads to a more homogenous response in terms of BCL2 upregulation and STAT5 phosphorylation in naive and CM subpopulations, whereas the response is more heterogeneous in effector memory (EM) and EMRA subpopulations in both young and elderly subjects (Kim et al., 2006). Indeed, IL7R is an important marker of functional heterogeneity observed in mouse effector memory CD8+ T cells (Kaech et al., 2003). Epigenetic mechanisms have been implicated in regulating IL7R levels in human CD8+ T cells. For example, increased DNA methylation levels have been observed around the IL7R promoter in IL-7Ralow cells in comparison to IL-7Rahigh cells (Kim et al., 2007). A similar mechanism was identified in mouse cells where the IL7R expression in memory precursor CD8+ T cells was inhibited by HDAC1-mediated histone deacetylation of the gene promoter (Chandele et al., 2008).
Memory CD8+ T cells also exhibited an epigenomic silencing of TF promoters and a loss in TF footprints with aging. This aging-associated loss of TF footprints included fast-acting factors such as NF-κB and STATs, which might play a role in regulating rapid T cell responses. This loss in TF footprints was particularly noticeable around the IL7R locus. A recently published study (Moskowitz et al., 2017) similarly reported the aging-associated erosion of chromatin accessibility around gene promoters in naive and CM CD8+ T cells and showed that aging disrupts the TF binding patterns in these cells. Similarly, chromatin accessibility changes in CD8+ T cells have been recently reported in cancer (Philip et al., 2017). By establishing a chromatin accessibility signature of aging-related immunodeficiency and delivering potential biomarkers in PBMCs (rather than in purified cells), our study addresses a significant gap and motivates future longitudinal studies.
The epigenomic signature of aging in PBMCs described herein was robust and was associated with neither CMV seropositivity nor seasonal variation. As part of this signature, IL7R emerged as a potential biomarker of reduced immune responses, where aging-associated loss of this molecule is observed at the chromatin, transcriptome, and protein levels, especially in memory CD8+ T cells. Such biomarkers could be instrumental in identifying individuals who might benefit the most from therapies to rejuvenate declining immune functions due to aging or diseases, such as HIV (Feinberg, 2007; Kennedy et al., 2014). Indeed, clinical studies of recombinant human IL-7 (rhIL-7) suggested a possible rejuvenation of the circulating T cell profile upon administration of rhIL-7, especially in individuals with limited naive T cells and diminished TCR repertoire diversity, as in the case of the elderly (Sportès et al., 2008, 2010). Chromatin accessibility profiling of immune cells for individuals would also be helpful in quantifying whether rejuvenation therapies administrated to individuals, such as rhIL-7, are effective and can lead to measurable genomic changes around relevant genes/pathways.
The demonstration that aging has a profound impact on the epigenomes of human CD8+ T cells as presented herein and in the study by Moskowitz et al. (2017) opens the door to profiling of chromatin accessibility in a clinical context. In this study, we show for the first time that measuring chromatin accessibility from whole blood samples is sensitive enough to detect aging-associated changes, even if these changes stem from a subpopulation of cells. PBMCs are easy to obtain and profile; hence, this opens the door to the assessment of healthy immune system responses in diverse clinical conditions such as diseases or response to therapeutic intervention, including vaccination. Our study is the first demonstration that leukocyte chromatin accessibility profiling can serve as an integral and powerful immune monitoring tool for reduced immune responses.
Materials and methods
All studies were conducted after receiving approval by the Institutional Review Board (IRB) of the University of Connecticut Health Center (IRB 14-194J-3). After receiving informed consent, blood samples were obtained from 75 HY (22–40 yr) and 26 HO (65+ yr) research volunteers residing in the Greater Hartford, CT, region using services from the University of Connecticut Center on Aging Recruitment and Community Outreach Research Cores (http://health.uconn.edu/aging/research/research-cores/). Recruitment criteria were selected to identify individuals who are experiencing “healthy” aging and are thus representative of the mean or typical normal health status of the local population within the corresponding age groups (Robertson and Williams, 2009). Selecting this type of cohort increases the generalizability of our studies and the likelihood that these findings can be translated to the general population (Robertson and Williams, 2009).
Subjects were carefully screened to exclude potentially confounding diseases and medications as well as frailty. Individuals who reported chronic or recent (i.e., within 2 wk) infections were also excluded. Subjects were deemed ineligible if they reported a history of diseases such as congestive heart failure, ischemic heart disease, myocarditis, congenital abnormalities, Paget’s disease, kidney disease, diabetes requiring insulin, chronic obstructive lung disease, emphysema, or asthma. Subjects were also excluded if they were undergoing active cancer treatment, took prednisone above 10 mg day, took other immunosuppressive drugs, took any medications for rheumatoid arthritis other than NSAIDs, or had received antibiotics in the previous 6 mo.
In addition to these steps to exclude specific chronic conditions, we also undertook further additional efforts to exclude older adults with any significant frailty. Because declines in self-reported physical performance are highly predictive of frailty, subsequent disability, and mortality (Hardy et al., 2011), all subjects were also questioned as to their ability to walk 1/4 mile (or two to three city blocks). For those who self-reported an inability to walk 1/4 mile (Hardy et al., 2011), the “timed up and go” (TUG) test was performed and measured as the time taken to stand up from the sitting position, walk 10 ft, and return to sitting in the chair (Podsiadlo and Richardson, 1991). A TUG score of >10 s was considered an indication of increased frailty and resulted in exclusion from the study (Rockwood et al., 2000).
Medication usage did increase with age. Nevertheless, these medications all reflected their use for common and controlled chronic conditions unlikely to influence our findings, such as hypertension, hyperlipidemia, hypothyroidism, degenerative joint disease, seasonal allergies, headaches, atrial fibrillation, depression, anxiety, or attention deficit hyperactivity disorder (ADHD). Finally, smoking history data are not typically collected in these studies—including ours—because smoking is a rare habit in the elderly population.
Cell sorting and phenotypic analysis
PBMCs were isolated from fresh whole blood using Ficoll-Paque Plus (GE) density gradient centrifugation. For cell sortings, we used fluorochrome-labeled antibodies specific for CD3 (UCHT1), CD27 (M-T271; Biolegend); CD4 (RPA-T4), CD45RO (UCHL1), CD45RA (HI100), CD19 (HIB19), CD16 (B73.1), and IgD (IA6-2); CD11c (S-HCL-3; BD Biosciences); and CD8 (SCF121Thy2D3) and CD19 (J3-119; Beckman-Coulter). Naive CD4 (CD4+CD8−CD45RO−CD45RA+), naive CD8 (CD4−CD8+CD45RO−CD45RA+), memory CD4 (CD4+CD8−CD45RO+CD45RA−), and memory CD8 (CD4−CD8+CD45RO+CD45RA−) T cells were sorted from the CD19−CD16−CD11c− fraction (DUMP channel). Naive B cells (CD19+IgD+CD27−) were sorted from the CD3−CD16−CD11c−) fraction (DUMP channel). Cell sorting was performed using FACSAria Fusion (BD Biosciences). Monocytes were isolated from fresh PBMCs by positive selection using magnetic CD14 microbeads (Miltenyi Biotech). For phenotypic analysis, PBMCs were stained with fluorochrome-labeled antibodies specific for CD3 (UCHT1), CD4 (RPA-T4), CD8 (SCF121Thy2D3), CD45RA (HI100), CD19 (HIB19), CD14 (MSE2), CCR7 (150503), and CD127 (HIL-7R-M21). For the analysis of the frequencies of naive T cells (CD45RA+CCR7+), CM T cells (CD45RA−CCR7+), EM T cells (CD45RA−CCR7−), EMRA (CD45RA+CCR7−), B cells, and monocytes, PBMCs were stained with fluorochrome-labeled antibodies specific for CD3 (UCHT1), CD4 (RPA-T4), CD8 (SCF121Thy2D3), CD45RA (HI100), CD19 (HIB19), CD14 (MSE2), CCR7 (150503), and CD127 (HIL-7R-M21). The stained cells were acquired with BD Fortessa and analyzed with FlowJo software (Tree Star).
CMV seropositivity measurements
Anti-CMV IgG titers were determined in frozen sera by commercially available ELISA (Genway Biotech Inc.) with an interassay coefficient of variance of 5.2%. A titer of 1.2 ELISA units/ml or greater in a sample was predetermined by the manufacturer as CMV seropositive.
PBMC stimulation and phospho-STAT5 detection
Cryopreserved PBMCs isolated from HY and HO individuals were recovered and allowed to rest in complete media at 37°C for 2 h. Then, cells were washed and incubated for 15 min at 37°C in prewarmed complete culture medium supplemented or not with 100 ng/ml IL-7 (PeproTech). After stimulation, cells were fixed for 10 min with prewarmed fixation buffer (BD Cytofix; BD Biosciences), washed, and permeabilized on ice for 30 min with ice-cold Phosphoflow Perm Buffer III (BD Biosciences). Cells were washed, stained for 60 min with a cocktail containing fluorochrome-labeled antibodies specific for CD3 (UCHT1), CD4 (RPA-T4), CD8 (RPA-T8), and pSTAT5 (pY694; BD Biosciences), and analyzed by flow cytometry. The P value calculations on these values were conducted using a one-sided Wilcoxon test after the medium levels were subtracted from each subject.
ATAC-seq library generation and preprocessing
ATAC-seq was performed as previously described (Buenrostro et al., 2013). 50,000 unfixed nuclei were tagged using Tn5 transposase (Nextera DNA sample prep kit; Illumina) for 30 min at 37°C, and the resulting library fragments were purified using a Qiagen MinElute kit. Libraries were amplified by 10–12 PCR cycles, purified using a Qiagen PCR cleanup kit, and finally sequenced on an Illumina HiSeq 2500 with a minimum read length of 75 bp to a minimum depth of 30 million reads per sample. At least two technical replicates (mean = 2.4 replicates) were processed per biological sample. Table S10 summarizes the depth, peak number, and fragments in reads (FrIP) scores for ATAC-seq samples.
ATAC-seq sequences were quality filtered using trimmomatic (Bolger et al., 2014), and trimmed reads were mapped to the GRCh37 (hg19) human reference sequence using bwa-mem (Li and Durbin, 2009). After alignment, technical replicates were merged and all further analyses were performed on these merged data. For peak calling, MACS2 (Zhang et al., 2008) was used with no-model, 100-bp shift, 200-bp extension, and broad peaks options. Only peaks called with a peak score (q-value) of 1% or better were kept from each sample, and the selected peaks were merged into a consensus peak set using the Bedtools multiinter tool (Quinlan and Hall, 2010). Only peaks called on autosomal chromosomes were used in this study. We further filtered consensus peaks to avoid likely false positives by only including those peaks overlapping more than 20 short reads in at least one sample and peaks for which the maximum read count did not exceed 500 cpm to account for regions that are potential artifacts. Finally, we excluded peaks overlapping blacklisted regions as defined by the ENCODE mappability criteria developed for DNase assays (July 2015 version; http://hgdownload.cse.ucsc.edu/goldenpath/hg19/encodeDCC/wgEncodeMapability/).
An additional quality control step was developed to filter out samples with a consistently poor signal, consisting of an algorithm to discover and characterize a series of relatively invariant “benchmark peaks,” defined as a set of peaks expected to be called in all samples. Samples that consistently miss calls for a significant portion of these benchmark peaks are flagged as having poor quality. A benchmark peak is defined based on three criteria: (a) that it remains approximately invariant between the two groups of interest (i.e., young and old samples); (b) that it captures a substantial number of reads; and (c) that it is called in most samples. For each peak, the absolute value of the log of the ratio of HO:HY mean normalized read counts (log fold change [logFC]) was used to assess the first criteria, whereas the maximum read count over all samples (maxCt) was used to assess the second one. In this study, a peak was considered apt for benchmarking when (a) its absolute logFC was in the bottom decile of the distribution over all peaks; (b) its maxCt was in the top decile of the distribution over all peaks; and (c) the peak was called in at least 90% of the samples. Using these parameters, 273 (out of 169,636) peaks were selected as a benchmark; only samples for which at least 95% of these peaks were called were selected for analyses, which excluded five samples from further analyses. We examined the effects of each of these parameter choices and found that the same samples were consistently chosen as poor quality for a range of values chosen to assess the benchmark criteria. Before performing statistical analyses, ATAC-seq read counts were normalized to each sample’s effective library size (i.e., the sum of reads of overlapping peaks) using the trimmed mean of M-values normalization method (TMM; Robinson and Oshlack, 2010).
RNA-seq library generation and preprocessing
Total RNA was isolated from PBMCs using RNeasy (Qiagen) or Arcturus PicoPure (Life Technologies) kits following the manufacturer’s protocols. During RNA isolation, DNase treatment was additionally performed using the RNase-free DNase set (Qiagen). RNA quality was checked using an Agilent 2100 Expert bioanalyzer (Agilent Technologies). RNA quality was reported as a score from 1 to 10, and samples falling below the threshold of 8.0 were omitted from the study. cDNA libraries were prepared using a TruSeq Stranded Total RNA LT Sample Prep kit with Ribo-Zero Gold (Illumina), a Kapa Stranded mRNA-Seq Library Prep kit (Kapa Biosystems), or NuGEN Ovation RNA-seq v2 (NuGEN) according to the manufacturer’s instructions using 100 or 500 ng of total RNA. Final libraries were analyzed on a Bioanalyzer DNA 1000 chip (Agilent Technologies). Paired-end sequencing (2 × 75 bp or 2 × 100 bp) of stranded total RNA libraries was performed in an Illumina HiSeq2500 using SBS v3 sequencing reagents.
Quality control of the raw sequencing data was performed using the FASTQC tool, which computes read quality using a summary of per-base quality defined using the probability of an incorrect base call (Ewing et al., 1998). According to our quality criteria, reads with more than 30% of their nucleotides with a Phred score less than 30 were removed, whereas samples with more than 20% of such low-quality reads were dropped from analyses. None of the samples used in this study were dropped after quality control. Reads from samples that passed the quality criteria were quality trimmed and filtered using trimmomatic (Bolger et al., 2014). High-quality reads were then used to estimate transcript abundance using RSEM (Li and Dewey, 2011). Finally, to minimize the interference of nonmessenger RNA in our data, estimate read counts were renormalized to include only protein-coding genes. Table S10 summarizes the depth and alignment rate of our PBMC RNA-seq samples.
To identify differentially open chromatin regions from ATAC-seq and differentially expressed genes from RNA-seq data, the R package edgeR was used to fit a GLM to test for the effect of aging between HY and HO samples. In addition to age group (old vs. young), our models included sex and the season in which the sample was collected (summer vs. winter) as covariates (Robinson et al., 2010) because it was determined using PVCA (Boedigheimer et al., 2008) that these factors account for a sizeable fraction of the variance in read counts. Furthermore, we used surrogate variable analysis (SVA; Leek et al., 2012) to capture unknown sources of variation (e.g., batch effects, subject-level heterogeneity, variation in library preparation techniques) statistically independent from age group assignments. SVA decomposes the variation that is not accounted for by known factors such as age group or sex into orthogonal vectors that can then be used as additional covariates when fitting a model to test for differential accessibility or expression. Using the built-in permutation-based procedure in the R package sva, we chose to retain three SVs to include as covariates in the GLM for PBMC ATAC-seq and RNA-seq data analyses (Qu et al., 2015). Further examination of the pattern of variation captured by SVs derived from ATAC-seq data jointly seemed to capture anomalies in libraries with particularly large or small read counts and residual seasonal variation. In the case of RNA-seq data, SVs seemed to capture both differences in library size and differences in library preparation methods.
GLMs were implemented using a negative binomial link function, including both genome-wide and peak-specific dispersion parameters, estimated using edgeR’s “common,” “trended,” and “tagwise” dispersion components, calculated using a robust estimation option. Benjamini-Hochberg P value correction was used to select differentially open peaks at an FDR of 5%. To generate a set of model-adjusted peak estimates of chromatin accessibility (i.e., sex-, season-, and SV-adjusted) for downstream analyses and visualization, we used edgeR to fit a “null” model excluding the age group factor and then subtracted the resulting fitted values from this model from the original TMM-normalized reads.
An equivalent approach was used to analyze the effects of CMV seropositivity and seasonal variation (i.e., winter- vs. summer-acquired samples) in PBMC data. For CMV analysis, the subset of samples for which this information was available (i.e., n = 21, 12 HY and 9 HO) was fit to a model including sex as a factor and CMV status (positive, negative) as a blocking factor. In this analysis, the season factor was not taken into consideration because all subjects for whom CMV status was available were collected in the same season. For seasonal analysis, we used season (summer, winter) as a blocking factor. In both analyses, we tested both separately and jointly for the significance of age group by CMV status or season. In addition, we fitted the converse models (CMV status or aging nested within age group) to test for and calculate fold change estimates for CMV+/CMV− and winter/summer stratified by age group.
Peak annotation and downstream analyses
Multiple data sources were used to annotate ATAC-seq peaks with regard to functional and positional information. HOMER (Heinz et al., 2010) was used to annotate peaks as “promoter” (i.e., within 2 kb of known TSS), “intergenic,” “intronic,” and other positional categories. For functional annotation of peaks, we implemented a simplified version of the 18-state ChromHMM-derived chromatin states obtained from Roadmap Epigenomics data for PBMC and T cell subsets (Roadmap Epigenomics Consortium et al., 2015). We first intersected the Roadmap-generated states with our set of consensus peaks and solved conflicting cases where multiple chromatin states overlap the same ATAC-seq peak so that each peak was assigned a single annotation, according to the following priority rules: Active TSS > Active Enhancer 1 > Active Enhancer 2 > Genic Enhancer 1 > Genic Enhancer 2 > Weak Enhancer > Strong Transcription > Flanking Active TSS > Flanking Upstream Active TSS > Flanking Downstream Active TSS > Weak Transcription > Bivalent Poised TSS > Bivalent Enhancer > Weakly Repressed Polycomb > Repressed Polycomb > ZNF Genes and Repeats > Heterochromatin > Quiescent/Low Signal. Then, to facilitate interpretation and visualization, we simplified the set of 18 chromatin states to a scheme with six pooled meta-states: (a) TSS, collecting active, flanking, and bivalent TSS states; (b) Enhancer, pooling active, weak, and bivalent enhancer states; (c) Repressed Polycomb, combining both weak and strong Polycomb states; (d) Transcription, including both weak and strong transcription states; (e) the quiescent chromHMM state; and (f) other states (ZNF, heterochromatin) combined.
For gene-based analyses, HOMER was used to assign each ATAC-seq peak to the nearest TSS, as measured from the peak center. To improve confidence on these annotations, gene-based analyses were further restricted to include only peaks located within 50 kb of their corresponding TSS. ATAC-seq peaks were also annotated using gene sets provided by curated immune function–related coexpression modules (Chaussabel et al., 2008). These gene sets comprise 28 modules defined from multiple compiled transcriptomic data sets, which were originally annotated based on expert knowledge of representative functions of the gene ensemble in each module. In this study, we have preserved and used these annotations to test for enrichment of these modules in gene sets of interest, such as the set of genes associated to chromatin peaks gaining or losing accessibility with aging. We assessed enrichment using the hypergeometric test followed by Benjamini-Hochberg FDR adjustment for P values. In addition, we summarized the representation of GO terms among gene annotations for all peaks after solving for multiple GO annotations for the same gene by prioritizing terms according to the following order: Immunity > Metabolic > Transcription, Translation > Migration > Mitochondria > Axon > Development.
Further functional enrichment analyses were performed using ClueGO (Bindea et al., 2009) to test for overrepresentation of GO:Immune System Process terms using GO term fusion option and WikiPathways pathways (Kelder et al., 2012) among genes associated to differentially open peaks. In addition to testing for enriched gene sets, ClueGO combines GO terms and pathways into functionally relevant meta-sets based on the rate of shared genes among terms, allowing for an efficient assessment of enriched categories as well as their potential interactions, as inferred from sets of shared genes. We applied these methods separately to peaks significantly closing and opening between age groups to investigate the degree to which these two sets of peaks were associated to unique signatures. We only listed terms that are significant at a P value of 0.05 after Bonferroni step-down correction. In addition, we used ClueGO to annotate the aforementioned immunological coexpression modules that were originally associated to unknown function. The most salient enriched functional categories for these modules are listed in Table S5. Visualization of signaling pathways were generated using ClueGO and PathVisio (Kutmon et al., 2015) tools.
Congruence between chromatin accessibility and transcription data
Gene expression (mRNA-seq, see above) data were generated for a subset of subjects with ATAC-seq profiles (n = 39, 24 HY and 15 HO). These data were normalized to protein-coding transcripts and annotated to ENSEMBL GRCh37 gene symbols. Genes for which at least three normalized reads per million were obtained in at least two samples were considered as expressed, and all others were removed before analysis. This resulted in a total estimate of 11,311 expressed genes in PBMCs.
We built a data set comprising paired ATAC-seq and RNA-seq samples by matching promoter peaks to the nearest gene (TSS) annotations. First, we retrieved the complete list of refSeq TSS coordinates for the hg19 genome reference (n = 34,783) and defined promoters as the regions within 1,000-bp flanks of each TSS. The final set of promoters was defined by merging overlapping flanked TSS regions annotated to the same gene (n = 34,700). We then selected ATAC-seq peaks overlapping these promoters and annotated them to the corresponding gene. Only the peak closest to the TSS was kept. Finally, the resulting data set was filtered to only include promoter peaks for genes that were transcribed, as defined above. Whenever multiple expressed genes were matched to the same promoter peak, all of them were retained for analysis.
To study the concordance between promoter accessibility and gene expression, we subdivided the space defined by aging-related fold changes derived from ATAC-seq and RNA-seq data into gene sets defined by the direction and magnitude of change along both dimensions, such as genes with both upregulated expression and increasing accessibility in elderly subjects or genes for which expression is upregulated but accessibility remains unchanged with aging. To capture enough genes to enable functional enrichment analysis of these gene sets, fold changes between HO and HY subjects for matching promoter peaks and transcripts were estimated empirically as the difference between the mean normalized values of each group and plotted against each other (Fig. 3 A). Specifically, we defined a gene or promoter as being significantly “up” or “down” if the empirical log fold change of the HO mean relative to the HY mean was above or below zero, respectively, and if the adjusted empirical p-value was <0.01 for that gene. Empirical P values were computed by randomly permuting the HO and HY sample labels 1,000 times for each promoter peak and gene. Genes for which P < 0.01 were considered significantly different between age groups, whereas all others were considered to have “stable” expression and/or accessibility relative to aging. Here, we focus on a subset of the combined accessibility–expression gene sets generated by this method: (a) genes with both increased or (b) both decreased promoter accessibility and expression with aging and (c) genes with increased or (d) decreased promoter accessibility but stable aging-related expression. For each gene set, we tested for enrichment in immune modules and WikiPathways pathways using the hypergeometric test against a background defined by the set of genes that are expressed, as determined by RNA-seq data, or potentially expressed, as given by promoter accessibility, in PBMCs. We used the Benjamini-Hochberg FDR multiple test correction to assess the significance of hypergeometric P values.
TF motif and footprinting analysis
ATAC-seq data from PBMCs and T cells were scanned for TF footprints using the PIQ algorithm (Sherwood et al., 2014). This method integrates genome-wide TF motifs (i.e., position weight matrices [PWMs]) with chromatin accessibility estimates profiled at base pair resolution to generate a list of possible footprint matches for a motif. The method also produces a probability estimate for each footprint’s reproducibility, called the “purity score.” Here, we compiled a set of 1,273 distinct motifs comprising the curated (CORE) list available in the JASPAR 2016 database (n = 466, http://jaspar.genereg.net) in addition to the complete set of HT-SELEX motifs made available in Jolma et al. (2013) (n = 819). Altogether, these motifs represent binding sites for 381 distinct TFs. Before footprint calling, we merged samples belonging to the same cell type and age group to maximize our ability to find highly reproducible footprints. In addition, we used SAMtools v. 0.1.19 (Li and Durbin, 2009) to randomly downsample aligned reads from each merged data set to approximately match the mapped library depth of the least deeply sequenced sample (i.e., 113 Mb). This normalization step is included to minimize the impact of the high correlation observed between library depth and footprint purity scores. Only footprints with a purity score of 90% or more were retained for further analysis. Finally, footprint calls were further filtered to include in analyses only those associated to TFs determined that are expressed in immune cells. Footprinting analyses are conducted by pooling young and old samples from each cell type and using random read resampling to ensure that these pooled old and young data have approximately the same sequencing depth (∼113 million reads). Pooling individual samples from a group (i.e., CD8+ memory T cell samples from young subjects) increases the effective library depth; this therefore allows for more controlled and robust footprinting calls and leads to an unbiased starting point for comparative analyses (no impact due to the differences in sequencing depth).
To examine the regulatory landscape of IL7R, a potential aging biomarker, we focused on footprints called on the promoter region (±1 kb from TSS) of this gene separately by age group and cell type. To complement this set of footprints, we also performed de novo motif discovery using HOMER by searching for motifs enriched in peaks annotated to IL7R relative to all peaks in PBMCs and T cell subsets. Each enriched motif was annotated to the best-fitting known TF, as found by HOMER, with the added requirement that the annotated TF should be expressed in the appropriate cell type. We then used the PIQ algorithm to call footprints of the enriched motifs and combined those overlapping IL7R promoters with the previously selected footprints. Finally, in addition to footprint and motif enrichment analyses, known TF motifs were retrieved for the region around IL7R TSS (−10 kb upstream, +1 kb downstream) using the MotifMap tool (Daily et al., 2011) at a 20% FDR.
All sequence data (RNA-seq and ATAC-seq) have been deposited in the European Genome-phenome Archive (EGA), which is hosted by the EBI and the CRG, under accession number EGAS00001002605.
Online supplemental material
Fig. S1 shows cell composition data and changes in these with aging. Fig. S2 shows GO enrichment results for genes associated to closing/opening peaks in PBMCs. Fig. S3 shows chromatin accessibility and gene expression changes associated with aging for genes in immune modules. Fig. S4 shows chromatin accessibility changes associated with aging in immune cell subsets. Fig. S5 shows aging-associated changes in TF footprints as well as CMV- and season-related epigenomic changes. Tables S1–S10 are provided as Excel files. Table S1 summarizes cohort details. Table S2 tabulates cell compositions of PBMC samples. Table S3 shows the genes associated with differentially closing and opening ATAC-seq peaks in PBMCs. Table S4 lists GO and pathway enrichment results for opening/closing peaks. Table S5 lists functional enrichments for immune modules with unknown functions. Table S6 represents IL7R+ protein levels in different cell subsets. Table S7 lists the TF motifs near the IL7R gene promoter. Table S8 lists TF footprint call rates in different cell types. Table S9 tabulates CMV measurements. Table S10 shows the data quality metrics for ATAC-seq and RNA-seq samples.
This work was supported by National Institutes of Health grants R01 AG048023, P01 AG021600, U19AI089987, and R01 AG052608 and the Jackson Laboratory Director’s Innovation Fund JAX-DIF-FY15-DU-JB. G.A. Kuchel also serves as the Travelers Chair in Geriatrics and Gerontology. JAX Genome Technologies provided help with the sequencing data generation.
The authors declare no competing financial interests.
Author contributions: J. Banchereau, D. Ucar, and G.A. Kuchel designed the research. C.-H. Chung, R. Marches, R.J. Rossi, and T.-C. Wu performed the experiments. D. Ucar, E.J. Márquez, and J. Banchereau analyzed the data. A. Uyar and J. George preprocessed the sequencing data. J. Banchereau, D. Ucar, E.J. Márquez, and G.A. Kuchel wrote the paper. M.L. Stitzel and A.K. Palucka assisted with data interpretation and with writing the paper.
D. Ucar, E.J. Márquez, and C.-H. Chung contributed equally to this paper.