The mutational repertoire of cancers creates the neoepitopes that make cancers immunogenic. Here, we introduce two novel tools that identify, with relatively high accuracy, the small proportion of neoepitopes (among the hundreds of potential neoepitopes) that protect the host through an antitumor T cell response. The two tools consist of (a) the numerical difference in NetMHC scores between the mutated sequences and their unmutated counterparts, termed the differential agretopic index, and (b) the conformational stability of the MHC I–peptide interaction. Mechanistically, these tools identify neoepitopes that are mutated to create new anchor residues for MHC binding, and render the overall peptide more rigid. Surprisingly, the protective neoepitopes identified here elicit CD8-dependent immunity, even though their affinity for Kd is orders of magnitude lower than the 500-nM threshold considered reasonable for such interactions. These results greatly expand the universe of target cancer antigens and identify new tools for human cancer immunotherapy.
The idea that neoepitopes created by random mutations in tumor cells, termed as individually specific tumor antigens or unique antigens, are responsible for the immunogenicity of tumors has been around for over 20 yr (Srivastava, 1993). There has been strong experimental evidence for their existence and activity in murine and human tumors (Duan et al., 2009; van der Bruggen et al., 2013), and mathematic modeling has predicted the existence of tens to hundreds of neoepitopes in individual human tumors (Srivastava and Srivastava 2009). The recent revolution in high-throughput DNA sequencing and accompanying bioinformatics approaches has finally made it possible to actually identify the individually specific neoepitopes in individual cancers. Using this methodology, Segal et al. (2008) showed that human breast and colon cancers harbor tens of putative mutational neoepitopes; Rajasagi et al. (2014) have also published similar data for chronic myelogenous leukemia and other human cancers. A genomic/bioinformatic approach to identify such epitopes in a mouse melanoma also led to identification of hundreds of epitopes (Castle et al., 2012). A similar approach led to identification of a neoepitope in a methylcholanthrene-induced sarcoma in an immunocompromised mouse; transplantation of this tumor into an immune-competent animal led to epitope-dependent tumor regression (Matsushita et al., 2012). In human studies, association of favorable clinical course of disease with a dominant immune response to mutated neoepitopes has been demonstrated (Lennerz et al., 2005; Lu et al., 2013; van Rooij et al., 2013). These growing numbers of studies strongly suggest that the host immune response to mutant neoepitopes plays the dominant role in protection of the host from tumor growth.
The opportunity to identify a vast number of putative neoepitopes from individual human tumors creates a corresponding problem: how does one differentiate and identify actual tumor protective neoepitopes from among the large number of putative neoepitopes identified in silico? The problem is daunting in scale because an examination of the tumor transcriptomes and their comparison with normal exomes in the TCGA database shows that many tumors harbor hundreds of putative neoepitopes. Presumably, only a small fraction of these virtual neoepitopes are immunoprotective against cancer.
This question has been addressed before in viral systems. Assarsson et al. (2007) performed a systematic analysis of the putative and real epitopes of the vaccinia virus. This study revealed the magnitude of the problem: starting from all possible 9–10 amino acid peptides encoded by the vaccinia genome, only 2.5% are high-affinity binders to a given HLA allele. Of the high-affinity binders, half elicit a CD8 response. Of these, only 15% are naturally processed and presented. Finally, they observed little correlation between the dominance of an epitope with HLA–peptide affinity, HLA–peptide stability, TCR avidity, or the quantity of processed epitope. Thus, without the benefit of information from T cell responses, one would be unable to start from the vaccinia genome and identify useful epitopes.
The problem is orders of magnitude more complex for identifying useful epitopes from cancer genomes because the mammalian genome is considerably larger than that of vaccinia. Moreover, viral genomes are entirely nonself, whereas the cancer genomes are mutated-self; hence the neoepitopes may be cross-reactive with self, and tolerance or suppression mechanisms are highly likely to come into play.
Here, we make the first systematic analysis of the transcriptomes and CD8 immunomes of tumors, and seek to understand the rules that govern the immunogenicity and tumor-protective ability of mutation-generated neoepitopes. This effort has led to surprising observations regarding MHC I–peptide interactions that distinguish the recognition of neoepitopes from that of viral epitopes, and a recognition that the proportion of putative mutational neoepitopes that is translatable in vivo is far smaller than the corresponding proportion for viral systems.
From transcriptome to immunome
A methylcholanthrene-induced fibrosarcoma, CMS5 (BALB/c, d haplotype) was used as the primary workhorse, and the results were cross-tested to varying degrees with another chemically induced (Meth A) mouse tumor, as well as several human cancers. The CMS5 sarcoma is well characterized, as is the primary host in which this tumor first arose. It is a progressively growing tumor, which is lethal in a syngeneic host. CTLs against a CMS5 line have led to the identification of a single immunogenic and tumor-protective neoepitope (Ikeda et al., 1997; Hanson et al., 2000).
Transcriptome sequencing was chosen over genome or exome sequencing to identify mutations specifically in the genes expressed in CMS5 and Meth A. Broadly speaking, the cDNA sequences obtained were compared with the normal mouse sequences, and single-nucleotide variants (SNVs) were identified (Table 1). This pipeline is named Epi-Seq. The SNVs were analyzed for their potential to generate MHC I–restricted epitopes of the murine H-2 K, D, or L alleles using the NetMHC algorithm. The complete list of epitopes was filtered based on default NetMHC 3.0 PWM peptide binding score thresholds for weak binders of 8.72, 8.08, and 8.19 for Kd, Dd, and Ld, respectively. Using these thresholds, CMS5 and Meth A were observed to harbor 112 and 823 potential epitopes, respectively (see Table S1 for the total output of the pipeline for both tumors); the difference in the number of epitopes identified between these two lines is a reflection of the depth to which their transcriptomes were sequenced (Table 1). The putative neoepitopes are randomly distributed over the entire genome.
We made an attempt to determine if the numbers of MHC I–restricted neoepitopes in these mouse tumors are within the range expected in actual primary human tumors. An analysis of exome sequences of several human melanomas and their comparison with corresponding normal sequences (Wei et al., 2011) through our bioinformatic pipeline reveals hundreds of putative neoepitopes per melanoma (Table 2). Similarly, the list of mutations derived from transcriptome sequencing of 14 human prostate cancers and normal tissues (Ren et al., 2012), led to identification of a median of 14 putative epitopes (range, 2–82) for the common HLA alleles (Table 3). The smaller number of neoepitopes in prostate cancers is related to the fact that they harbor relatively smaller numbers of mutations as compared with melanomas (Vogelstein et al., 2013). The published data for the B16 melanoma line of spontaneous origin also reveal the presence of >100 MHC I–restricted neoepitopes. These measurements indicate that regardless of their murine or human origin, and regardless of etiology, tumors harbor a significant number of candidate MHC I–restricted neoepitopes.
Heterogeneity of neoepitopes
Meth A cells were cloned and 30 distinct clones were tested for four SNVs picked at random (Tnpo3, NFkb1, Prp31, and Psg17). To our surprise, all 30 clones harbored three of the four SNVs and fourth SNV was detected in 29/30 clones. We attribute this apparent lack of antigenic heterogeneity to the relatively shallow depth of sequencing. It is also possible that cancer cell lines show less antigenic heterogeneity than primary tumors. Most importantly, these results suggest it is possible to use a relatively shallow sequencing as a methodology to identify the neoepitopes that are the most broadly distributed among cancer cells.
Immunogenicity of neoepitopes identified in silico
To reduce the complexity of analyses, attention was directed toward the 218 Kd-restricted epitopes (for Meth A and CMS5 combined) from a total list of 935 for all three alleles (Table 1). All the mutations used for immunological analyses were confirmed individually by Sanger sequencing.
The neoepitopes were ranked in descending order of their NetMHC scores for Kd binding. The top 7 neoepitopes from CMS5 and the top 11 from Meth A are shown in Table 4, which also shows the NetMHC score of the WT peptide corresponding to the neoepitope. Peptides corresponding to these 18 putative neoepitopes and their WT counterparts were synthesized, and the affinity of all peptides for Kd (IC50 values) was determined experimentally as described in Materials and methods. (NetMHC scores and experimentally determined IC50 values were significantly correlated; r = −0.44, two-tailed Student’s t test; P < 0.001). 10 of 18 top-ranked neoepitopes (56%) bound Kd with an affinity of 500 nM or better, and 7/18 (39%) bound with an affinity of 100 nM or better.
All 18 peptides were used to immunize BALB/c mice. The draining LNs of immunized mice were harvested 1 wk after the single immunization, and the cells were stimulated in vitro for 16 h without any added peptide, the mutant peptide used for immunization, or the corresponding WT peptide. The CD8+ cells were analyzed for activation (CD44+) and effector function (intracellular IFN-γ+; see Materials and methods for details of FACS analysis.) All possible patterns of immunoreactivity were observed (Fig. 1 A): no immune response (12/18), a mutant peptide-specific, i.e., tumor-specific immune response (5/18), and a cross-reactive response between the mutant and corresponding WT peptides (1/18). Altogether, 6/18 or 33% of the neoepitope candidates identified in silico actually elicited functional effector T cells in vivo. When analyzed by an IFN-γ ELISpot assay, four additional neoepitopes showed immunogenicity, bringing the total to 10/18 or ∼56%.
Of the 10 peptides with a Kd-binding affinity of 500 nM or better, 7 or 70% were determined to be immunogenic experimentally. Of the 7 peptides with a Kd-binding affinity of 100 nM or better, 4 or 57% were immunogenic. Three peptides with a Kd-binding affinity of 500 nM or worse were immunogenic.
Lack of immunogenicity of WT peptides
While testing for immunogenicity of neoepitopes, their WT counterparts were similarly tested (Fig. 1 B). Surprisingly, as with the mutant neoepitopes (Fig. 1 A), all possible patterns of immunoreactivity were observed (Fig. 1 B): no immune response (11/18), a WT peptide-specific immune response (5/18), and a cross-reactive response between the mutant and corresponding WT peptides (2/18). Altogether, 7/18 or 39% of the WT counterparts of neoepitope candidates identified in silico elicited functional effector T cells in vivo (Alms1.1, Alms1.2, Abr, Ccdc85c, Farsb, Mapk1.2, and Ufsp2; Table 4).
This was still a surprisingly large proportion of self-reactive peptides in view of the strong role of negative selection in sculpting of the T cell repertoire. We tested the possibility that the WT peptides that are immunogenic are functionally tolerized, even though they show immunogenicity after a 16-h stimulation in vitro. We reasoned that if a small proportion of low affinity autoreactive T cells escaped negative selection, they may still be clonally expanded to a degree upon immunization with a self-peptide, but that such an expansion would prove self-limiting. Hence, naive mice were immunized with the peptides, followed by a second immunization, and the responses in naive mice, once-immunized mice, and twice-immunized mice were compared. The ovalbumin-derived Kb-binding epitope SIINFEKL was used as a positive control in C57BL/6 mice, and indeed the magnitude of the anti-SIINFEKL CD8 response was amplified by a second immunization (Fig. 1 C). This same phenomenon was observed with the mutant neoepitope Tnpo3 in BALB/c mice. However, second immunization with 4/4 WT peptides tested did not elicit an amplification of the response. Indeed responses detected after the second immunization were significantly diminished as compared with the response after the first immunization. By this stringent criterion, not a single WT peptide was observed to be immunogenic.
Lack of immunoprotective activity of the strongest Kd-binding neoepitopes
All 18 neoepitopes (Table 4) were tested for their ability to elicit tumor rejection of CMS5 or Meth A. BALB/c mice were immunized with the individual peptides and were challenged with the appropriate tumor 1 wk after the last immunization. None of the peptides elicited tumor rejection (Fig. 2 A). One of the CMS5 neoepitopes FarsB shows significant protection from tumor growth in this panel; however, this result could not be reproduced unambiguously, leading us to assign this as a negative neoepitope with respect to protection from tumor growth. Interestingly, one of the neoepitopes identified by us Mapk1 (listed as Mapk1.1, 1.2, and 1.3; Table 4), was also identified by Ikeda et al. (1997) in the CMS5 sarcoma as a tumor rejection antigen. Immunization with it does not elicit tumor rejection in our hands, just as it did not in the original paper. The authors of the original paper noted that “IL-12 treatment was essential to show antitumor immunity in this system, because mice vaccinated with 9m-pulsed spleen cells in the absence of exogenous IL-12 showed no resistance to CMS5 challenge.” This is an unintended validation of our pipeline and also highlights the fact that we have used a very stringent tumor rejection assay in our analyses.
From the aforementioned results, it is evident that the NetMHC score is not a valuable predictor of immunogenicity or tumor rejection. A close examination of the data in Table 4 suggests an underlying possibility: for each neoepitope, both the neoepitope and its WT counterpart have similar NetMHC scores characteristic of high-affinity peptide binding. Moreover, examining the experimental IC50 values, 4/7 WT peptides of CMS5 and 7/11 WT peptides of Meth A have stronger affinity for Kd than the mutant peptides. Thus, unless the mutations alter TCR contacts or the structural properties of the peptides in the Kd binding groove, T cells potentially reactive to the neoepitopes may have been centrally deleted or peripherally tolerized.
We therefore created a new algorithm wherein the NetMHC scores of the unmutated counterparts of the predicted mutated epitopes were taken into consideration by subtracting them from the corresponding NetMHC scores of the mutated epitopes. We refer to this new property of an epitope as the differential agretopicity index (DAI; agretopicity is the ability of a peptide to act as an agretope), and we expect it to reflect the degree to which the peptide-binding determinants of the neoepitopes differ from those of their WT counterparts. In the search for the rules for immunogenicity of viral or other clearly nonself-epitopes, such a parameter is not necessary, nor possible, and has therefore never been sought. The putative epitopes were ranked on basis of the DAI (Table 5). A review of this DAI-ranked list for both tumors shows curiously that all the neoepitopes in this new ranking are mutated at one of the two primary anchor residues at position 2 or the C terminus (we did not identify any neoepitope with changes at both anchor residues.) Consistent with the Kd preferences gleaned from structural analyses (Mitaksov and Fremont, 2006), all of these mutations involve aspartic acid to tyrosine at position 2 or proline/arginine to leucine at the C terminus. Although the DAI was not crafted with this intent, it is a perfectly reasonable outcome in hindsight.
The top DAI-ranking 20 epitopes of CMS5 and 28 epitopes of Meth A were tested in tumor rejection assays. The results (Fig. 2 B) show that 6/20 or 30% CMS5 epitopes and 4/28 or 14% Meth A epitopes showed statistically significant tumor protective immunogenicity. The six CMS5 neoepitopes are particularly impressive in that they elicited near complete or complete protection from a lethal tumor challenge. Fig. 2 C shows representative tumor rejection curves of the two protective and one unprotective CMS5 epitope from Table 5. Corresponding WT peptides did not mediate any tumor rejection. Detailed data on tumor rejection elicited by a Meth A neoepitope Tnpo3 are shown in Fig. 5. Statistical comparison of the NetMHC alone versus the DAI algorithms in predicting antitumor protective immunity, using one-tailed Fisher’s exact test, shows the DAI to be far superior (0/18 vs. 10/48, for NetMHC and DAI, respectively, one-sided Fisher’s exact test, P = 0.031).
Although the DAI algorithm yielded a far richer harvest of tumor-protective epitopes than the reliance on the highest NetMHC or MHC-binding scores, most epitopes identified by DAI still fail to elicit tumor protection. Further, the DAI-ranked neoepitopes (Table 5) that elicit protection from tumor growth are not necessarily the highest ranking in DAI. Clearly, other properties of the putative epitopes (see Discussion) contribute to the tumor rejection activity of individual neoepitopes. Regardless of its imperfection, the DAI is a statistically significant and novel predictor of tumor-protective immunogenicity, and more importantly, permits a dissection of the other potential criteria for antitumor immunogenicity in vivo.
The DAI algorithm also uncovers a new paradox of fundamental significance. All the six neoepitopes that elicit protection against CMS5 have NetMHC scores between 6.8 and 1.2, which are well below 8.72, the PWM peptide binding score threshold for weak binders for Kd (NetMHC3.0). Consistent with that observation, their measured IC50 values for binding to Kd are >70,000 nM (for all except Slit3, for which is IC50 is ∼50,000 nM). This observation is surprising because epitopes are typically considered good MHC I–binders if they have an IC50 value of <100 nM, or at least <500 nM. The IC50 values for the CMS5 neoepitopes are so high (i.e., their binding to Kd is so poor) that these neoepitopes would never be considered suitable candidates for being epitopes based simply on their Kd-binding characteristics. For Meth A as well, all the four neoepitopes that elicit tumor immunity have a NetMHC below the 8.72 threshold for weak binders for Kd; the measured Kd-binding affinities of only two of the four Meth A neoepitopes are <100 nM. To explore the possibility that these neoepitopes may be binding to another allele, Dd or Ld, the six tumor-protection eliciting neoepitopes for CMS5 were tested for binding to these two alleles by direct peptide-binding studies; none of the peptides showed significant binding (unpublished data). Because the requirement for MHC I binding to be <500 nM is so well established (Assarsson et al., 2007), the possibility that the peptides identified as being potent in eliciting protection from tumor challenge (Fig. 2, B and C) may do so through nonimmunological means was investigated.
CD8 dependence of tumor protection
Mice immunized with five of the six active neoepitopes of CMS5 (Alkbh6.2, Slit3, Atxn10.1, Atxn10.2, and Ccdc136) were tested for immunogenicity; as shown in Table 5, Alkbh6.2, Slit3, and Atxn10.1 elicited modest CD8 T cell response that was undetectable by the intracellular cytokine assay, and detectable only by the ELISpot assay (Fig. 3 A), whereas Atxn10.2 and Ccdc136 did not elicit a detectable response at all. Immunized mice were depleted of CD8 or CD4 cells in the effector phase only (i.e., after immunization but before tumor challenge) and were challenged with CMS5 cells as in Fig. 2. The results (Fig. 3 B) show that, compared with naive mice, each mutant peptide elicited potent tumor rejection. Depletion of CD8 cells completely abrogated the tumor-rejecting ability of each peptide. Depletion of CD4 cells had no such effect. These data show that the CMS5 neoepitopes that elicit tumor rejection do so through elicitation of specific CD8+ T cells, and not through nonimmunological means. However, the data do not imply that CD4+ T cells do not have a role in tumor rejection; an examination of the kinetics of tumor rejection shows that in the absence of CD4+ cells, tumors actually do begin to grow in almost all the immunized mice, but begin to regress by days 7–10.
Conformational stability as an indicator of immunogenicity
The CD8-dependence of neoepitopes that have low predicted and measured affinity for their restricting allele led us to seek other determinants of immunogenicity. Although the DAI algorithm selects for mutations at primary anchor positions that improve peptide–MHC binding affinity, unless the mutations alter the structural properties of the peptide in the binding groove as discussed above, potentially responding T cells may be centrally or peripherally tolerized. Studies have shown that anchor modification can have a range of effects on MHC-bound peptides, in some cases having no apparent influence (Borbulevych et al., 2005, 2007), and in others leading to alterations in structural and motional properties (Sharma et al., 2001; Borbulevych et al., 2007; Insaidoo et al., 2011).
To examine the consequences of anchor modification, we used computational modeling to examine the structural properties of pairs of mutant and WT peptides bound to Kd. We adapted an approach recently used to model the structures of peptides bound to HLA-A*0201 (Park et al., 2013), using the crystallographic structure of a viral peptide–H-2Kd complex as a template (Mitaksov and Fremont, 2006). As described in Materials and methods, we used a workflow that included homology modeling, simulated annealing, and molecular dynamics simulations to predict structural properties. Because the conformations of peptide backbones in MHC I proteins vary considerably with peptide length (Borbulevych et al., 2007), we restricted the modeling to pairs of nonamers in Table 5, matching the length of the peptide in the template structure. This necessary restriction leads to noninclusion in our modeling analyses of some of the more immunogenic and protective neoepitopes in Table 5, simply because they are 8- or 10-mers, and not 9-mers. To ascertain how well the modeling procedure was transferable from HLA-A*0201 to H-2Kd, we applied the modeling procedure to the complex of the immunodominant and highly immunogenic HBV core peptide with Kd, a complex for which the crystallographic structure is known (Zhou et al., 2004).
We began by looking for the presence of structural differences between the mutant and WT peptides. In each instance, there were differences between the predicted conformations of the mutant and WT peptides bound to H-2Kd. In all cases, these differences propagated away from the site of the mutations (Fig. 4 A). These observations were not surprising, as the mutations were all quite drastic, and even conservative mutations at class I MHC-presented peptide anchor residues can influence downstream conformation (Sharma et al., 2001; Borbulevych et al., 2007).
However, although T cell receptors can be exquisitely sensitive to changes in peptide conformation (Ding et al., 1999), peptide immunogenicity (defined as having positive ELISpot or ICS results) did not correlate with the magnitude of structural differences, expressed as root mean square deviations (RMSD) in Ångstroms when all common atoms of each peptide pair were superimposed (Fig. 4 B). Examining the modeling data more closely, however, revealed a correlation, albeit imperfect, with conformational stability. In 10 out of 14 cases, the mutant peptides were more rigid, sampling fewer conformations during the molecular dynamics phase of the modeling, expressed as the average root mean square fluctuations (RMSF) of all nine α carbons of each peptide (Fig. 4, C–D; and Fig. S1). Of the 10 instances in which the mutant peptides were more rigid, eight of these had positive ICS or ELISpot results. Of the four instances in which the WT peptides were more rigid, one did not yield any positive immunological outcomes. Altogether, greater structural stability was a predictor of immunogenicity in 9 out of 14, or 65% of the nonameric, high DAI-ranking neoepitopes.
In some instances though, the differences in the fluctuations between the mutant and WT peptides were very small (for example, Ckap5, Dhx8.3, and Zfp236.2 have differences ≤0.03 Å). This led us to question the fidelity of the mean RMSF as a predictive tool. Examining the data for all peptides more closely, we observed a tendency for high structural instability in the peptide C terminus, particularly in the WT peptides (Fig. 4 C and Fig. S1). Previous studies have shown that weak binding peptides can detach and dissociate from MHC I proteins at the C terminus (Winkler et al., 2007; Narzi et al., 2012), suggesting that in these cases the high structural instability reflects at least partial peptide dissociation. However, some of the mutant peptides also had high instability at the C-terminus. Most notably, all three mutant peptides which failed to elicit an immunological response had high C-terminal instability. To quantitate this, we calculated the mean C-terminal RMSF for all mutant peptides. The C-terminal RMSF of the nonimmunogenic peptides were all above this mean value (Fig. 4 E). Only three of the 14 immunogenic peptides had C-terminal RMSF values above the mean (Stau1.4, Dhx8.3, and Tpst2.2). Thus, the presence or absence of C-terminal instability was a predictor of immunological outcome in 11 out of 14, or 79% of the nonameric high DAI ranking neoepitopes. (We note that the HBV control peptide was relatively stable in the Kd binding groove, with a mean Cα RMSD from the starting coordinates of 1.7 Å and an average C-terminal RMSF of 0.57 Å, as would be predicted from an immunogenic viral peptide).
Natural presentation of a neoepitope
The tumor-protective immunogenicity of the epitope syMLQALCI (WT LDMLQALCI), the mutated Transportin 3 (Tnpo3)-derived epitope, the highest ranking (by DAI) epitope of Meth A (Table 5), was investigated in more detail. Tnpo3 is a nuclear import receptor and is not a driver protein for any tumor type reported thus far (Brass et al., 2008). The mutant Tnpo3 epitope was shown to elicit strictly tumor-specific CD8+ immune response, as seen by the ability of mutant Tnpo3-immunized mice to show strong tumor-specific CD8+, CD44+, IFN-γ+ response to the mutant but not the WT peptide ex vivo or after stimulation in vitro (Fig. 5 A). A Tnpo3-specific immune response was also detectable ex vivo upon staining of cells with a Tnpo3-specific tetramer (Fig. 5 B). Conversely, CD8+ CD44+ IFN-γ+ cells isolated from mice immunized with Meth A cells recognize mutant Tnpo3-pulsed cells but not cells pulsed with an irrelevant Kd-biding peptide Prpf31 ex vivo, as well as after stimulation in vitro (Fig. 5 C). Although <0.1% Tnpo3-specific CD8+ T cells seen ex vivo is a truly small response, we consider it real because ex vivo responses are bound to be weak, the response is statistically significant, and the response amplifies very significantly (>1.2%), while maintaining specificity, after weekly in vitro stimulation with Tnpo3 peptide for 19 d (Fig. 5 C, right). Interestingly, Meth A tumor-bearing mice (day 21 after inoculation) harbor a low frequency of T cells recognizing two Kd-binding peptides (Tnpo3 and Nfkb1, measured by tetramer staining) in the tumor-draining LNs (Fig. 5 D). These observations confirm that the mutant Tnpo3 peptide is naturally presented by Meth A cells and also that immune response to it is elicited upon immunization by whole tumor cells, as well as in tumor-bearing mice. Attempts to identify this mutant peptide by mass spectroscopic analysis of MHC I–eluted peptides from Meth A were unsuccessful, presumably because of the higher sensitivity of the T cell assays as compared with mass spectroscopy. The structural modeling predicts that the mutant Tnpo3 peptide is substantially more stable the WT peptide across the center of the peptide (Fig. 4, A and C).
Enhancement of tumor-protectivity of neoepitopes by immune modulators
Combination of immunization with mutant neoepitopes was tested using the Meth A neoepitope Tnpo3. This neoepitope is only modestly tumor protective in monotherapy, thus allowing more dynamic range for testing of an enhanced effect by combination therapy. Combination of immunization with mutant Tnpo3 with antagonistic antibodies to CD25, which has been shown to target regulatory T cells, showed synergy; the anti-CD25 alone showed complete regression in all mice (P = 0.008) and Tnpo3 alone also elicited significant protection (P = 0.03). The combination showed more significant protection than either agent alone (P = 0.016; Fig. 5 E, left): although tumors eventually regressed in all mice in both groups, the kinetics of tumor regression was significantly steeper in the combination group. A similar result was obtained with anti-CTLA4 antibody, which releases T cells from checkpoint blockade (Egen et al., 2002; Callahan et al., 2010). Each agent alone elicited statistically significant protection and the combination was significantly more effective than Tnpo3 alone (P = 0.04) or anti-CTLA4 antibody alone (P = 0.04; Fig. 5 E, right). Only a single tumor regressed in the anti-CTLA4 antibody group, and no tumors regressed in the Tnpo3 alone group (although the tumor growth was very significantly retarded); the combination group showed complete regression of two tumors and a sustained trend toward regression in two additional tumors.
Our results uncover a trove of truly tumor-specific antigenic epitopes. Using novel tools reported in this study, we are able to identify with high accuracy the small number of neoepitopes (among the vast numbers of potential neoepitopes) that truly elicit immunological protection against tumor growth. The application of our method is described here for two independent tumors. In actuality, the pipeline, including the DAI algorithm, was first derived empirically on the data from the Meth A tumor, and was then tested on CMS5. Clearly, the antitumor activity predicted from the DAI algorithm is significantly stronger in CMS5 than in Meth A; this variation is most likely a reflection of the immuno-suppressive mechanisms unique to the Meth A tumor (Levey et al., 2001), and thus unrelated to the merits of the DAI algorithm per se. The DAI algorithm has since been tested in yet another mouse tumor, the B16 melanoma, and data on T cell responses in this line as well, are consistent with significant superiority of DAI over NetMHC alone (unpublished data). Although the present study is focused on identification of MHC I–restricted epitopes of CD8 T cells, the analysis can also be extended to MHC II-restricted epitopes of CD4 T cells.
Although T cells play an unambiguously central role in cancer immunity, they have been poor probes for identification of immunoprotective epitopes thus far. Extensive and laborious analyses of T cell–defined tumor-specific antigens of Meth A and CMS5 sarcomas over the years managed to yield a total of five epitopes (Buckwalter and Srivastava, 2008), none of which elicit particularly robust tumor rejection; in contrast, this single study has uncovered nearly a dozen potent tumor-protective epitopes of these two tumors. It is instructive to ponder the reason for this discrepancy. The use of T cells as probes inherently requires generation of T cell lines or clones, which is itself a highly selective process. It would appear that the diversity of effector T cells in vivo is not readily captured by the T cell lines or clones generated in vitro, leading to a distorted, and sparse, view of the T cell immunomes of tumors. The genomics-driven analysis of the immunome described in this study cuts through the bias in selection of T cells, and thus illuminates the entire field of neoepitopes.
We introduce here the DAI score (the numerical difference between the NetMHC scores of the mutated epitope and its unmutated counterpart), which allows significant enrichment for the extremely small number of truly immunoprotective neoepitopes from among the hundreds of putative neoepitopes identified by the NetMHC algorithm. The demonstrated utility of the DAI score underscores the validity of its premise: a tumor-protective immune response requires neoepitopes that differ from their WT counterparts, and the DAI score is a means to quantify and rank such differences. Understandably, because our existing ideas about immunogenicity are derived entirely from the study of viral and model antigens, which have no self-counterparts, there was no necessity to devise a DAI for their studies. As follows from the design of the NetMHC algorithm, amino acid substitutions at primary anchor residues make for the biggest contributions to the DAI. Indeed, every neoepitope with a high DAI ranking replaces aspartic acid with tyrosine at position 2 or proline/arginine at the C terminus with leucine. From structural considerations, these substitutions would be expected to significantly impact peptide binding, as tyrosine at P2 and leucine at the peptide C terminus are the most optimal Kd anchor residues (indeed, aspartic acid at P2 or arginine at the C terminus would be expected to be considerably unfavorable due to substantial charge repulsion; Mitaksov and Fremont, 2006). Peptide conformational stability, expressed as the fluctuations observed during molecular dynamics simulations, is another tool that suggests a novel correlate with immunogenicity. The majority of the neoepitopes with high DAI rankings are predicted to interact with Kd in a more stable fashion than their wild-type counterparts; in these cases, alteration of the anchor residues yields a more rigidly bound peptide. For mutations that replace charges at the P2 or C-terminal positions, greater conformational stability may follow from elimination of charge repulsion as noted above. Effect of anchor modification on peptide conformational stability has been noted previously, and notably, increased peptide flexibility correlates with a loss of immunogenicity. This may occur by reducing the opportunities for productive interactions with T cell receptors (Sharma et al., 2001; Borbulevych et al., 2007; Insaidoo et al., 2011) or increasing the lifetime of the MHC-bound peptide.
In a recent study, Fritsch et al. (2014) have performed a retrospective of 40 neoepitopes of human cancers, “that induced immune responses, most of which were associated with regression or long-term disease stability.” Interestingly, in all of these neoepitopes, the mutations were located in the TCR-facing residues of the neoepitopes rather than the anchor residues. At first look, these data appear to contradict our finding that the vast majority of bonafide neoepitopes arise due to changes in anchor residues. However, a careful look at the data shows that the comparison cannot really be made: the criterion “associated with aggression or long-term disease stability” as determined subjectively by a dozen different groups of clinical investigators, is a very different criterion from actually testing tumor regression, (as we have done experimentally with mice). The true test of this question in a human study would require a randomized trial where patients would be immunized based upon one set of criteria or the other. Fritsch et al. (2014) have just begun a study using their criteria in human melanoma, and we are starting a clinical study in epithelial ovarian cancer based on our criteria. These studies will allow us to compare the two approaches.
A most surprising observation that emerges from our study is that 10/10 neoepitopes that elicit protective immunity are classified as nonbinders of Kd by NetMHC (cut-off value of 8.72). Correspondingly, the affinity of 8/10 neoepitopes for Kd is well over 500 nM, the traditional threshold for fruitful interaction of viral epitopes with MHC I molecules. In five of five instances tested, these presumed “non-binders” elicit classical CD8 T cell–dependent tumor immunity. This observation challenges some of our basic assumptions about MHC I–peptide–T cell receptor interactions, and exposes a far wider universe of potential neoantigens than assumed thus far.
The observed dissociation between detectable CD8 responses and immunoprotection (Table 5) from tumor growth merits comment. The neoepitopes that elicit immunoprotection and a CD8 response are straightforward and require no comment. The neoepitopes that elicit immunoprotection but not a detectable CD8 response (Fig. 3) may also be understood with the explanation that the CD8 response elicited is too weak to be detected by the ELISpot assay, thus highlighting the need for developing more sensitive assays for CD8 cells and their activities. It is, however, the epitopes that elicit potent CD8 responses but not immunoprotection that are difficult to understand. However, data in Table 5 may provide guidance in thinking about this dissociation. Note the neoepitopes Tnpo3.1, 3.2, 3.3, and 3.4 for Meth A in Table 5. They share the same N-terminal mutations (sy/LD), but differ in the extent of their extension on the C termini. Whereas the first three elicit strong CD8 responses, Tnpo3.4 does not. More interestingly, of the three neoepitopes that are immunogenic, only one, Tnpo3.1, elicits protection from tumor growth. It is conceivable that Tnpo3.1 is the only neoepitope that is naturally presented, while the others are not. This entirely testable hypothesis provides a framework for testing the dissociation between T cell responses and immunoprotection.
We note with satisfaction, but also with some surprise that not a single WT epitope among the more than 100 tested (66 epitopes listed in Tables 4 and 5, and >35 additional epitopes) elicited a measurable, amplifiable CD8 immune response. The immune responses, when detected after a first immunization, were abrogated rather than enhanced after a second immunization, consistent with them being peripherally tolerized responses. This study represents perhaps the largest in which the immune responses to such a large number of self-epitopes have been systematically tested, and testifies strongly to the powerful scope of mechanisms of negative selection and peripheral tolerance.
These results present clear opportunities for clinical translation in human cancer immunotherapy. With the advent of high-throughput and inexpensive DNA sequencing, it is now possible to routinely sequence the exomes of cancers and normal tissues of each cancer patient, and compare the two to identify cancer–specific mis-sense mutations. The NetMHC or other such commonly available algorithms can then be used to identify the potential neoepitopes generated by the mis-sense mutations, for each of the three to six HLA I alleles of each patient. Peptides corresponding to the neoepitopes can then be chemically synthesized and used to immunize patients. However, the numbers of potential neoepitopes can be vast, and it is impractical to immunize patients with such vast numbers of peptides. The pioneering study of van Rooij et al. (2013) is an excellent case in point. Using exome comparison, indexed to the tumor transcriptome, these authors identified 448 potential neoepitopes in a melanoma patient; of these, only two were truly immunogenic. The combination of the NetMHC algorithm with the DAI and the C-terminal stability algorithms, as identified here, now makes it possible to reduce the large numbers of potential neoepitopes to a much smaller number of truly immunogenic epitopes, which can now be used to immunize patients in a realistic manner.
Our observations also present new opportunities to address some long-standing questions in immunology. Antigenic heterogeneity of cancers has been the subject of much discussion, but due to the lack of bona fide tumor-specific antigens in significant numbers, the debate has been largely theoretical. Uncovering of a large repertoire of true tumor-specific neoepitopes now allows the questions regarding antigenic heterogeneity (and antigen escape) to be asked in an unprecedentedly robust manner. Inherently linked to the issue of antigenic heterogeneity is the role of immunoediting of growing cancers. Since the classical studies on immune surveillance against tumors (Burnet, 1970), Dunn et al. (2004) have suggested that tumors go through elimination, equilibrium, and escape phases of immunoediting and have demonstrated evidence supporting the idea (Matsushita et al., 2012). The availability of a large repertoire of tumor-specific neoepitopes of any given tumor allows immuno-editing to be addressed in far more granularity. Epitope spreading is one of the more exciting and underexplored ideas in cancer immunity (Markiewicz et al., 2001; Corbière et al., 2011). The major reason that it is underexplored is the paucity of a suitable number of antigenic neoepitopes. Identification of a large number of neoepitopes for practically any mouse or human tumor now permits a vigorous exploration of epitope spreading, along with its clinical utilization.
MATERIALS AND METHODS
Mice and tumors.
The BALB/cJ mice (6–8-wk-old female) were purchased from The Jackson Laboratory. Mice were maintained in the virus-free mouse facilities at the University of Connecticut Health Center and their use was approved and monitored by the Institutional Animal Care and Use Committee.
Samples were prepared using the Illumina protocol outlined in “Preparing Samples for Sequencing of mRNA” (Part# 1004898 Rev. A September 2008). The protocol consists of two parts: cDNA synthesis and paired-end library preparation. First, mRNA was purified from total RNA using magnetic oligo(dT) beads, and then fragmented using divalent cations under elevated temperature. cDNA was synthesized from the fragmented mRNA using Superscript II (Invitrogen), followed by second strand synthesis. cDNA fragment ends were repaired and phosphorylated using Klenow, T4 DNA Polymerase, and T4 Polynucleotide Kinase. Next, an ‘A’ base was added to the 3′ end of the blunted fragments, followed by ligation of Illumina Paired-End adapters via T–A-mediated ligation. The ligated products were size selected by gel purification and then PCR amplified using Illumina Paired-End primers. The library size and concentration were determined using an Agilent Bioanalyzer.
GAII run conditions.
The RNA-seq library was seeded onto the flowcell at 8 pM, yielding ∼282–384 K clusters per tile. The library was sequenced using 61 cycles of chemistry and imaging.
Analysis of sequencing data.
Initial data processing and base calling, including extraction of cluster intensities, was done using RTA (SCS version 2.6 and SCS version 2.61). Sequence quality filtering script was executed in the Illumina CASAVA software (ver 1.6.0, Illumina).
Epi-Seq bioinformatics pipeline.
The pipeline starts by mapping RNA-Seq reads against the strain-specific genome sequences downloaded from the Sanger Mouse Genomes Project (Keane et al., 2011) and a strain-specific haploid transcript library derived from CCDS annotations (Li et al., 2009). We used BALB/c genome/transcriptome sequences for CMS5 and Meth A cell lines. DatabaseSNP polymorphisms were removed. Instead of comparing to the mm9 reference genome, and then excluding SNPs in dbSNP, we created a strain-specific genome by applying strain SNVs to the mm9 reference genome. The SNVs were downloaded from the mouse genomes. The created genome was used to map the reads and call the mutations. Reads were mapped using Bowtie (Langmead et al., 2009) with the default seed length of 28, maximum of 2 mismatches in the seed, and maximum sum of phred quality scores at mismatch positions of 125. After an initial round of mapping, we calculated mismatch statistics for each read position and each sample. Based on this analysis, 2 bases from the 5′ end and 10 bases from the 3′ end were clipped from all aligned reads. The resulting read alignments were merged using the HardMerge algorithm (Duitama et al., 2012). HardMerge discards reads that align to multiple locations in the genome and/or transcriptome, as well as reads that align uniquely to both, but at discordant locations. To reduce the effect of bias introduced by PCR amplification during library preparation (Aird et al., 2011), we replaced multiple reads with alignments starting at the same genomic location with their consensus. The SNVQ algorithm (Aird et al., 2011) was then used to call single nucleotide variants (SNVs) from the filtered set of aligned reads. SNVQ uses Bayes’ rule to call the genotype with the highest probability while taking base quality scores into account. High confidence SNVs were selected by requiring a minimum phred quality score of 50 for each called genotype, a minimum of 3 reads supporting the alternative allele, with at least one read mapping on each strand. Haplotype inference over called SNV genotypes was performed using the RefHap Single Individual Haplotyping algorithm in (Duitama et al., 2012) that uses read evidence to phase blocks of proximal SNVs. Because residual heterozygosity in the inbred mice used in our experiments is predicted to be low (Bailey 1978 [specifically, Table 1 therein]), unique heterozygous SNVs were considered to be novel somatic mutations. Homozygous SNVs as well as heterozygous SNVs shared by more than one tumor with the same genome background were assumed to be germ-line mutations and were not used for epitope prediction unless located near a unique heterozygous SNV. For each unique heterozygous SNV, reference and alternative peptide sequences were generated based on the two inferred haplotypes for each CCDS transcript. Generated amino acid sequences were then run through the NetMHC 3.0 epitope prediction program (Lundegaard et al., 2008) and scored using the Profile Weight Matrix (PWM) algorithm with default detection thresholds.
Binding of peptides to H-2 Kd was determined using quantitative assays based on the inhibition of binding of a radiolabeled standard peptide to purified MHC molecules essentially as described previously (van der Most et al., 1996; Sidney et al., 2013). Peptides were typically tested at six different concentrations covering a 100,000-fold dose range, and in three or more independent assays. Under the conditions used, where [label] < [MHC] and IC50 ≥ [MHC], the measured IC50 values are reasonable approximations of the dissociation constant values (Gulukota et al., 1997).
Intracellular IFN-γ assay by FACS and ELISpot.
Lymphocytes were incubated either with or without 1–10 µg/ml peptide. GolgiPlug (BD) was added 1 h later. After incubation of 12 to 16 h, cells were stained for CD44 (clone IM7), CD4 (clone GK1.5) and CD8 (clone 53–6.7; BD), fixed and permeabilized using the Cytofix/Cytoperm kit (BD), and stained for intracellular IFN-γ using Phycoerythrin-conjugated anti-mouse IFN-γ (clone XMG1.2, BD). Cells were stained with 1 µl antibody/million cells in 50 µl staining buffer (PBS with 1% bovine serum albumin) and incubated for 20 min at 4°C in the dark, or according to the manufacturer’s instructions. Cells without peptide stimulation were used as a negative control, and the values for these controls was very close to the values seen with the negative control peptide. Typically, 95,000–129,000 lymphocytes (14,500–17,000 CD8+CD4− cells) were acquired. Our background is consistently very low (10% of the signal).
For the ELISpot assays, our negative controls are CD8+ cells from immunized mice without peptide stimulation. We consider the peptides to be positive or immunogenic when spots from peptide-stimulated wells are significant higher by Mann-Whitney test, compared with wells without cognate peptide stimulation. We rate the magnitude of responses by mean spot numbers per million CD8+ cells: 5–10(+); 11–20 (++); 21–50 (+++); 51–100(++++); and >100(+++++).
Tumor challenge and representation of tumor growth.
AUC as a tool to measure tumor growth has been described previously (Duan et al., 2012). In brief, AUC was calculated by selecting “Curves & Regression” and then “Area under curve” from the “analyze” tool, using the Prism 5.0 (GraphPad). Grubbs’ test was used to remove up to one outlier from each group.
Depletion of T cell subsets.
Immunized mice were depleted of CD8 cells using anti-CD8 rat IgG2b monoclonal antibody 2.43, or depleted of CD4 cells using anti-CD4 rat IgG2b monoclonal antibody GK1.5. Depleting antibodies were given in PBS i.p. 2 d before tumor challenge and every 7 d for the duration of the experiment. The first 3 injections of depleting antibodies were 250 µg per mouse; later injections were 500 µg per mouse. For treatment with antagonistic antibodies, mice were treated with anti-CD25 antibody (clone PC61, 250 µg, 2 d before tumor challenge) or anti–CTLA-4 antibody (clone 9D9; 100 µg; 7 d before and every 3 d after tumor challenge). The appropriate T cell subsets were depleted by >95%.
Modeling of peptide/H-2Kd complexes.
Models of peptide/H-2Kd complexes were built by adapting a previous method by Park et al. (2013) used to identify immunogenic epitopes. Although developed on HLA-A*0201, the approach is generally applicable to class I MHC proteins in general, taking advantage of common class I MHC structural features.
An initial model of each complex was generated by MODELLER and the ‘build mutants’ functionality implemented in Accelrys Discovery Studio (Feyfant et al., 2007; Webb and Sali 2014). The protocol uses a heavy-atom representation of the protein and includes homology-derived restraints combined with energy minimization and molecular dynamics/simulated annealing. The structure of the Flu peptide bound to H-2Kd was used as a template (Mitaksov and Fremont 2006). Atoms within 4.5 Å of each altered residue were allowed to repack during the modeling. For each pMHC, one hundred initial models were generated, and the lowest energy model from this first set was subjected to a more exhaustive, second phase of fully atomistic simulated annealing and molecular dynamics. For the second phase, after adding hydrogens, the structure was heated to 1500 K over 200 ps of simulation followed by cooling to 300 K over 800 ps. The annealed structure was then subjected to five independent 10 ns molecular dynamics runs at 300 K, each time beginning with the structure that resulted from the second phase annealing step.
All second phase dynamics calculations were performed using AMBER 12 running on Nvidia GPU accelerators (Case et al., 2005). The ff99SB force field was used with a 2 fs time step. Solvent was treated implicitly using the generalized Born model to accelerate sampling (Götz et al., 2012). The SHAKE algorithm was used to constrain all bonds to hydrogens. A 20-kcal/mol harmonic restraint was applied to the α1 and α2 helices (residues 56–85 and 138–175). Two additional 20 kcal/mol distance restraints were applied to hydrogen bonds at the N- and C-terminal ends of the peptide. The first was between the P1 backbone oxygen and the hydroxyl of Tyr 159 of H-2Kd. The second was between the P8 backbone oxygen and the ring nitrogen of Trp 147 of H-2Kd. As a positive control, we performed the simulated annealing and molecular dynamics steps of the procedure on the structure of an HBV peptide presented by Kd (Zhou et al., 2004). As shown in Figs. 4 B and 4D, the viral peptide was predicted to be relatively rigid in the Kd binding groove.
For the data in Fig. 4 B, average peptide structures were calculated from the 50 ns of simulation data for each pMHC, and all common atoms of pairs of mutant and WT peptides superimposed to generate the RMSD. For the data in Figs. 4 (C–E) and Fig. S1, RMSFs were computed for the α carbons of the peptides from the 50 ns of simulations.
P-values for group comparisons were calculated using a two-tailed nonparametric Mann–Whitney test, using GraphPad Prism 5.0 (GraphPad). For tumor rejection assays, Grubb’ test was used to remove up to one outlier from each group. Fisher’s exact test was used to test association between pairs of categorical parameters. Statistical significance of a Pearson correlation coefficient was computed using two-sided Student’s t test as described in (Cohen et al., 2003).
Online supplemental material.
Table S1 contains the entire output of the EpiSeq pipeline for Meth A and CMS5 tumors. Fig. S1, which accompanies Fig. 4, shows the complete data for root mean square fluctuations for the α carbons of all top DAI ranked nonamers from the structural modeling. Figs. S2 and S3 show the FACS gating strategies and representative primary data for Fig. 1 and Fig. 5 C, respectively.
We acknowledge Drs. Anthony T. Vella and Adam Adler of UConn School of Medicine for helpful discussions.
This research was funded in part by an award from the Cancer Research Institute, New York (P.K. Srivastava), the Northeastern Utilities Chair in Experimental Oncology (P.K. Srivastava), the Personalized Immunotherapy Core Interest Group of the Connecticut Institute for Clinical and Translational Science (P.K. Srivastava), a SPARK award (P.K. Srivastava, I.I. Mandoiu, F. Duan, S. Al Seesi), award IIS-0916948 (I.I. Mandoiu) from National Science Foundation, awards GM067079 and GM103773 from National Institutes of Health (B.M. Baker), Collaborative Research Grant AG110891 on “Software for Robust Transcript Discovery and Quantification” from Life Technologies (I.I. Mandoiu), and the Agriculture and Food Research Initiative Competitive Grant no. 2011-67016-30331 from the USDA National Institute of Food and Agriculture (to I.I. Mandoiu).
This paper is dedicated to the cherished memory of Lloyd J. Old for his mentorship and friendship to P.K. Srivastava.
P.K. Srivastava has a significant equity interest in Accuragen Inc., which has obtained an option to license the intellectual property relating to the results described in this manuscript. The authors have no additional financial interests.
F. Duan, J. Duitama, and S. Al Seesi contributed equally to this paper.
J. Duitama’s present address is Agrobiodiversity Research Area, International Center for Tropical Agriculture (CIAT), 6713 Cali, Colombia.