The recent discovery of dideoxymycobactin (DDM) as a ligand for CD1a demonstrates how a nonribosomal lipopeptide antigen is presented to T cells. DDM contains an unusual acylation motif and a peptide sequence present only in mycobacteria, but its discovery raises the possibility that ribosomally produced viral or mammalian proteins that commonly undergo lipidation might also function as antigens. To test this, we measured T cell responses to synthetic acylpeptides that mimic lipoproteins produced by cells and viruses. CD1c presented an N-acyl glycine dodecamer peptide (lipo-12) to human T cells, and the response was specific for the acyl linkage as well as the peptide length and sequence. Thus, CD1c represents the second member of the CD1 family to present lipopeptides. lipo-12 was efficiently recognized when presented by intact cells, and unlike DDM, it was inactivated by proteases and augmented by protease inhibitors. Although lysosomes often promote antigen presentation by CD1, rerouting CD1c to lysosomes by mutating CD1 tail sequences caused reduction in lipo-12 presentation. Thus, although certain antigens require antigen processing in lysosomes, others are destroyed there, providing a hypothesis for the evolutionary conservation of large CD1 families containing isoforms that survey early endosomal pathways.
CD1 proteins are expressed together with β2-microglobulin on the surface of B cells, myeloid DCs, Langerhans cells, and other APCs, where they capture and display lipid antigens to T cells. The types of lipid antigens presented by the CD1 system represent many classes of lipids that are biosynthetic products of mammalian cells or bacteria, including diacylglycerols (1–4), sphingolipids (5, 6), polyisoprenols (7), polyketides (8), mycolic acid derivatives (9–11), sulfoglycolipids (12), and other antigens (13). Most of these antigens are glycolipids that insert their aliphatic hydrocarbon chains into the hydrophobic groove of CD1 so that the carbohydrate moieties protrude from the groove, making them available for direct contact with TCRs (14–18). A recent ternary crystal structure of an αβ TCR bound to a CD1d–α-galactosyl ceramide complex shows how the variable regions of the TCR contact a galactose moiety lying on the surface of CD1d (19). Although T cells had previously been thought to recognize mainly linear primary amino acid sequences, these studies show how T cells with rearranged αβ TCRs can also discriminate the structure of carbohydrate rings.
However, the larger family of CD1 antigen–presenting molecules (CD1a, CD1b, CD1c, and CD1d) presents not only antigens with head groups that are sugars, but also antigens that are composed of free fatty acids (10), polyaromatic hydrocarbons (13), hydrophobic peptides (20), and a lipopeptide. The only known lipopeptide antigen is a mycobacterial product that binds to CD1a and is named dideoxymycobactin (DDM) (21). The crystal structure of human CD1a bound to a DDM analogue shows that its single acyl group is inserted into the groove of CD1a. Its peptide chain, composed of four amino acids and two organic acids, protrudes from the groove such that the termini of the peptide are available to contact TCRs (22). These studies provide proof of principle that a CD1 protein, like MHC proteins, can present a peptide sequence to T cells. However, it is currently unclear whether the range of CD1-presented peptide antigens is limited only to this unusual and invariant mycobacterial sequence. We hypothesized that lipopeptide recognition by αβ T cells might involve any of the multitude of structurally varied self- or viral proteins that are encoded in DNA, assembled by ribosomes, and posttranslationally lipidated (23).
In considering the possible diversity of CD1-presented lipopeptides, there are relevant evolutionary and biochemical differences between ribosomal and nonribosomal peptide synthesis pathways. DDM is made by the sequential action of enzymes known as mycobactin synthases (24). The resulting peptide is invariant in sequence. However, the chemical linkages are diverse and include esters, oxazoline rings, and amide bonds. In contrast, ribosomal translation of peptides produces a nearly infinite diversity of peptide sequences by linking each of the 20 standard amino acids by repeated amide (peptidic) bonds. Such peptide sequences are then modified in a limited number of ways on glycine, lysine, or cysteine side chains by enzymatically mediated processes known as thioacylation, lysine acylation, or myristoylation.
According to the crystal structure, the A′ pocket of CD1a accepts the fatty acyl group, and the F′ pocket can bind either a second fatty acyl chain or accept the peptidic moiety of DDM (15, 22). Because of the low levels of polymorphism in CD1 heavy chains, the F′ pocket is thought to be invariant in structure among individual humans, so it might be specialized to bind only this 5-mer peptide of DDM. Alternatively, the F′ pocket of CD1a or other CD1 isoforms instead might bind to structurally diverse molecules, as suggested by the ability of CD1a to present sulfolipids using a mechanism whereby an alkane chain resides in the F′ pocket (15, 25). Ribosomally encoded peptides are highly diverse and, therefore, potentially represent a more general immune recognition pathway that could involve any peptide that contains a myristoylation, thioacylation, or prenylation site. Toll-like receptor 2-1 heterodimers recognize N-terminal cysteine-based triacylation motifs on peptides, but there is currently no evidence that acylated proteins are the targets of T cell responses (26, 27). As a first step toward testing this general hypothesis, we prepared several synthetic lipopeptides that mimic acylated sequences contained within larger proteins produced by mammalian cells (IL-1α) or viruses (HIV and EBV), and carry an acyl group on the N-terminal glycine or cysteine or internal lysine residues.
Although antigen screens initially focused on CD1a because it is the only protein known to present a lipopeptide, we unexpectedly found that cellular CD1c proteins presented an N-terminal acylated glycine dodecamer peptide to T cells, which we designate lipo-12. This antigen mimics structures of proteins made through cellular myristoylation of proteins. We found that intact APCs deactivated antigen recognition in a process that involved peptidases. This discovery raised questions about how endosomal peptidases, which are known to control MHC-restricted peptide recognition, might also control CD1 and lipopeptide antigen complex recognition. Functional studies provided evidence for two means by which antigens can be protected from degradation in late endosomes or lysosomes. First, DDM provides an example of evolution of atypical, peptidase-insensitive linkages that allow it to persist in infected endosomes. Second, the escape of CD1a and CD1c from early endosomes to the surface allows peptidase-sensitive antigens to avoid exposure to the activated proteases in lysosomes. Most models of endosomal lipid antigen processing emphasize the sequential passage of CD1 through early and late endosomal compartments en route to the cell surface. In contrast, these results point to a model whereby the ability of CD1c to exit the endosomal pathway, before traversing lysosomes, improves antigen recognition. Although recent studies of CD1b and CD1d emphasize the importance of lysosomal factors in promoting lipid presentation, these results may explain why most mammalian species have evolved large families of CD1, including those that do not efficiently enter lysosomes.
CD1c presents a lipopeptide to T cells
To determine whether T cells can respond to acylated peptide sequences, we synthesized pools of peptides that were modified on N-terminal glycine, internal cysteine, or internal lysine moieties to mimic structures produced by myristoylation, thioacylation, or lysine acylation pathways (Fig. S1). These candidate antigens were synthesized using standard solid-phase peptide synthesis techniques, followed by acylation with saturated (C16:0 and C18:0) or unsaturated (C18:1) fatty acids. After purification by HPLC, mass spectrometry (MS) analysis demonstrated that products were of the expected mass (Fig. S1 and not depicted). After stimulating human T cells with acylpeptides, we screened human T cells from patients with or without viral infection for responses to acylpeptide-pulsed, monocyte-derived DCs. One of the resulting T cell lines (1A3) was found to respond to an N-acylated lipopeptide mixture when presented by DCs expressing all five human CD1 proteins (unpublished data).
Because CD1a is the only CD1 isoform previously known to bind or present a lipopeptide, testing for the cell-surface molecule that mediates the response focused on the use of anti-CD1a blocking antibodies. Unexpectedly, we found that activation was blocked by anti-CD1c antibodies but not by anti-CD1a, -CD1b, or -CD1d (Fig. 1 A). Further studies using lymphoblastoid cells transfected with any of the four human CD1 antigen-presenting proteins confirmed that T cell activation was dependent on lipopeptide dose and absolutely dependent on expression of CD1c on APCs (Fig. 1 B). Last, we found that 1A3 T cells did not cross react with other lipopeptides bearing distinct sequences and lysine acylation and thioacylation modifications, demonstrating specificity of the T cell response for the pool of N-terminal glycine acylation products (Fig. S1).
We next determined the TCR variable segments of the T cell clone by PCR using a set of primers that covered most known TCR human variable segments. This analysis reproducibly led to a strongly positive PCR signal for the TRBV12-3 segment, used in the β chain, but products obtained with TRAV segment–specific primers were not observed (unpublished data). Because the PCR primer sets do not cover all possible variable segments, we performed inverse PCR on circularized cDNA primed with TCR α and β constant region primers, followed by cloning and sequencing of the PCR product. This method confirmed that the TRBV12-3 segment was present and identified the TCR α chain as containing TRAV25. Determination of the complete sequences of the 1A3 TCR α and β chains (available from GenBank/EMBL/DDBJ under accession nos. EU599571 and EU599572) confirmed that they were distinct from the TCR α and β chains of previously reported CD1c-restricted T cell lines (28, 29), and different from the Vα24 chains that define invariant CD1d-restricted NK T cells (Fig. S2).
Identification of the antigen as an N-terminally acylated 12-mer peptide
To identify the precise structure of the lipopeptide antigen, the mixture of synthetic compounds was fractionated by preparative HPLC with a split interface to allow for simultaneous monitoring by UV absorbance and electrospray ionization MS (ESI-MS) and collection of samples for T cell assays. Three major lipopeptides were detected at 26, 27, and 29 min of elution time (Fig. 1 C) and were initially named lipopeptide 1, 2, and 3, respectively. Lipopeptides 1 and 3 were detected as doubly charged species at mass-to-charge (m/z) 867.4 and 868.5, corresponding to monoisotopic molecular weights of 1,732.8 and 1,735 u, respectively. Lipopeptide 2 was observed as a singly charged species with [M+H]+ at m/z 900.7 u. Testing of the purified compounds found that only lipopeptides 1 and 3 stimulated a T cell response, with lipopeptide 3 being approximately fivefold more potent on a molar basis (Fig. 1 D).
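The correspondence between the observed m/z values and the quoted molecular weights follows from the usual relation for a positive ion carrying z protons, M = z × (m/z − m_proton). A minimal sketch of that arithmetic is given here for illustration; the function name and the proton mass constant (1.007276 u) are assumptions introduced for this example and are not part of the original analysis.

```python
PROTON_MASS = 1.007276  # u; standard value, assumed here for illustration


def neutral_mass(mz: float, charge: int) -> float:
    """Neutral monoisotopic mass from the m/z of an [M+zH]z+ ion."""
    return charge * (mz - PROTON_MASS)


# m/z values and charge states as reported in the text
observed = {
    "lipopeptide 1": (867.4, 2),
    "lipopeptide 2": (900.7, 1),
    "lipopeptide 3": (868.5, 2),
}

for name, (mz, z) in observed.items():
    print(f"{name}: {neutral_mass(mz, z):.1f} u")
```

Under these assumptions, the doubly charged ions at m/z 867.4 and 868.5 convert to roughly 1,732.8 and 1,735 u, in line with the values quoted above.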
To assign the compositions and elucidate the structures of the antigenic lipopeptides, we performed collision-induced dissociation MS using quadrupole ion trapping (QIT; Fig. 2 and Fig. S3) and Fourier transform ion cyclotron resonance (FTICR) MS (Table I). In the QIT MS2 spectrum for lipopeptide 3, the [M+2H]2+ at m/z 868.5 generated product ions that constitute a series of peptide sequence–related fragments at m/z 1,316.8, 1,502.9, and 1,589.9. These ions correspond to the b9, b10, and b11 fragments of a C-terminal sequence, WSK (Fig. 2 B). MS3 analysis of the product ion at m/z 928.6 was consistent with an N-stearoyl hexapeptide with the amino acid sequence GGKWSK and a C-terminal hydroxyl terminus (Fig. 2 B, bottom). Formation of this fragment is best rationalized by the elimination of a peptide with kynurenine, an oxidized form of tryptophan, at its N terminus that had been released from an ester linkage to lysine at position six. Thus, the structure of lipopeptide 3 was assigned as C18:0-GGKWSKXSKWSK, where X is kynurenine with ester linkage (Fig. 2 A).
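The assignment of the m/z 928.6 product ion as an N-stearoyl GGKWSK hexapeptide with a free C terminus can be checked by summing monoisotopic residue masses. The sketch below is illustrative only: the residue, water, proton, and stearoyl (C18H34O) masses are standard reference values supplied here as assumptions, and the function name is hypothetical.

```python
# Monoisotopic residue masses (u) for the amino acids in GGKWSK; standard values.
RESIDUE = {"G": 57.02146, "K": 128.09496, "W": 186.07931, "S": 87.03203}
WATER = 18.01056      # H2O, added once for the free N and C termini
PROTON = 1.00728      # mass of the charge-carrying proton
STEAROYL = 266.26097  # C18H34O, net mass added by N-acylation with C18:0


def acylpeptide_mh(sequence: str, acyl_mass: float) -> float:
    """[M+H]+ m/z of an N-acylated peptide with a free C-terminal carboxyl."""
    peptide = sum(RESIDUE[aa] for aa in sequence) + WATER
    return peptide + acyl_mass + PROTON


print(f"{acylpeptide_mh('GGKWSK', STEAROYL):.2f}")
```

With these values the calculated [M+H]+ is close to 928.6, consistent with the fragment assignment described above.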
This conclusion is supported in more detail and with higher mass resolution by the FTICR MS, which includes the series b3–b6 and b8–b11 and their secondary fragments that arise via loss of the acyl moiety as the corresponding ketene (CHRCO, R = C14H29 or C16H33), as well as the y9, y10, and y11 fragments accompanied by water loss peaks from all of these and from the y6 ion. Internal fragment ions that include the modified, ester-linked tryptophan residue (W*) are also plentiful (Table I).
An analogous interpretation of both FTICR and QIT spectra from lipopeptide 1 indicates that it is the same peptide as found in lipopeptide 3, except that the Gly1 of lipopeptide 1 is acylated with C18:1 rather than C18:0 (Fig. S3). This interpretation is in agreement with the synthesis scheme in which C18:0 and C18:1 fatty acids were included in the mixture used to generate the lipopeptides. The fivefold lower potency of lipopeptide 1 (Fig. 1 D) correlates with the presence of an unsaturation in its fatty acyl chain, indicating that the lipidic portion of the molecule strongly influences its antigenicity. More generally, we conclude that CD1c can be considered the second member of the CD1 family that has been shown to present a lipopeptide to T cells. It is noteworthy that the position, amino acid residue, and chemical linkage of the fatty acyl substituent on this synthetic lipopeptide are chemically identical to those formed in N-terminal glycine acylation reactions (myristoylation) in mammalian cells.
Lipid, peptide, and linkage specificity of the T cell response
After solving the complete structures of lipopeptides 1, 2, and 3, we next determined T cell specificity for analogues that differed in the lipid saturation, peptide length, and peptide sequence using additional synthetic lipopeptide analogues that were purified and analyzed by HPLC-MS (Figs. S4 and S5). Lipopeptide 2 was similar to the core structure of lipopeptide 3, except for a small difference in lipid length and that six residues of its C terminus were absent. Therefore, the failure of T cells to recognize lipopeptide 2 likely indicated that the longer peptide was required for recognition. The terminal amino acids may have contributed to recognition in some general way, such as promoting solubility, or represent an amino acid sequence–specific recognition event. To directly test this hypothesis, we generated a lipopeptide that retains the acyl unit and overall 12-mer length but differs from lipo-12 (C18-GGKWSKXSKWSK) in its C-terminal sequence (C18-GGKWSKSSIVGW; Fig. S4). Also, we produced a point mutant at position 7 in which kynurenine is substituted with tryptophan (C18-GGKWSKWSKWSK) along with the nonacylated version (Fig. S5). T cells did not detectably respond to the multiply substituted 12-mer lipopeptide or the isoleucine-acylated analogue, demonstrating the importance of N- and C-terminal elements of sequence in the response (Fig. 2 C).
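To make the relationship between lipo-12 and the two 12-mer analogues explicit, the sequences given above can be compared position by position. The following sketch is illustrative; the labels and the helper function are hypothetical names introduced here, and X denotes kynurenine as in the text.

```python
LIPO12 = "GGKWSKXSKWSK"  # X denotes kynurenine, as in the text

# Peptide sequences of the two acylated 12-mer analogues described above
analogues = {
    "C-terminal variant": "GGKWSKSSIVGW",
    "tryptophan point mutant": "GGKWSKWSKWSK",
}


def differing_positions(reference: str, variant: str) -> list[tuple[int, str, str]]:
    """1-based positions at which two equal-length sequences differ."""
    if len(reference) != len(variant):
        raise ValueError("sequences must have the same length")
    return [
        (i, a, b)
        for i, (a, b) in enumerate(zip(reference, variant), start=1)
        if a != b
    ]


for name, seq in analogues.items():
    print(name, differing_positions(LIPO12, seq))
```

The comparison shows that the C-terminal variant differs from lipo-12 at positions 7 and 9 through 12, whereas the point mutant differs only at position 7.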
Interestingly, the tryptophan point mutant was recognized in a dose-dependent manner, albeit at a lower titer when tested in sensitive single-cell cytokine capture ELISA (ELISPOT) assay (Fig. 2 D). This result provides evidence from a second, independently synthesized molecule that 12-mer lipopeptides can activate T cells. We conclude that a single point mutation strongly influences the potency of the response, but kynurenine is not absolutely required. Further, testing of the nonacylated peptide (GGKWSKWSKWSK) produced no detectable response (Fig. S5 and not depicted), further emphasizing the role of the acyl group in recognition.
To determine whether these lipopeptides interact with CD1c, we measured their ability to compete against a known CD1c antigen, mannosyl phosphomycoketide (MPM), for presentation by recombinant CD1c proteins. Based on a construct for a CD1d-Ig fusion protein (2), we produced CD1c-Ig fusion protein, bound it to plastic, treated it with MPM, and washed before adding MPM-reactive T cells. MPM-treated CD1c proteins activated the MPM-reactive line CD8-1 to release IFN-γ at high levels, providing evidence that the response requires MPM contact with CD1c (7, 30). Lipopeptide antigens did not activate CD8-1 (not depicted) but could completely inhibit the response to CD1c-MPM (Fig. 2 E). Specifically, lipo-12, as well as the tryptophan point mutant, can effectively diminish the presentation of MPM, albeit at different molar ratios. In contrast, the nonacylated form of the tryptophan mutant does not inhibit T cell response at any of the molar ratios tested. We conclude from these experiments that the acyl modification of the lipopeptides is necessary for binding of lipopeptides to CD1c, and that the difference in potency between lipo-12 and its tryptophan mutant is mainly caused by altered recognition by the 1A3 TCR.
Antigenic glycine-1–acylpeptides are sensitive to protease degradation
The loading of exogenous antigens by CD1 proteins occurs within the endosomal network of APCs. This process requires that antigens are taken up by phagocytosis or receptor-mediated endocytosis, and that CD1 proteins are transported from the surface to endosomes after sorting interactions mediated by clathrin adaptor protein (AP) complexes (31–36). Although human CD1a proteins transit endosomes less extensively than other isoforms, they do reach sorting endosomes and can present lipids from intracellular infection (32, 37, 38). CD1c proteins recycle to endosomes based on the interactions of their cytoplasmic tail sequences with the AP-2 complex, but they enter lysosomes and colocalize with lysosome-associated membrane protein 1 (LAMP1) only at very low levels compared with human CD1b and mouse CD1d (34, 35, 39). These endosomally localized loading reactions serve to concentrate antigens, lower pH, and provide loading cofactors that promote presentation of lipid antigens other than those studied in this paper. However, endosomal antigen processing necessarily exposes both CD1 and antigens to endosomal proteases, so the discovery of CD1 antigens that contain peptide sequences raises questions about how endosomal proteases may generate or even destroy antigenic lipopeptides. Therefore, we undertook studies to measure the influence of defined proteases and natural proteolytic compartments on lipopeptide antigen recognition by T cells.
When the CD1a-presented, 6-mer lipopeptide DDM was pretreated with either of two broad spectrum proteases, proteinase K and pronase, it retained its ability to stimulate T cells at the same level as mock-treated samples (Fig. 3 A). In contrast, pronase and proteinase K nearly abolished recognition of the N-terminal N-acyl lipopeptide antigen presented by CD1c (Fig. 3 B). The reduced T cell activation after protease treatment was likely caused by action on the 12-mer lipopeptides rather than nonspecific or toxic effects on cells, because adding fresh lipo-12 to the protease–antigen mixture after protease inactivation restored T cell responses (Fig. 3 C). Therefore, we conclude that DDM has intrinsic resistance to broad-spectrum proteases but the 12-mer lipopeptides, which contain typical peptide bonds, were inactivated.
We next considered whether the naturally occurring proteases, which are normally expressed within endosomes of B cells or immature myeloid DCs, might also influence lipopeptide recognition. First, we treated DCs with the endoprotease inhibitor LHVS, an inhibitor of the lysosomal protease cathepsin S (40, 41). We reproducibly found two- to fourfold increases in the presentation of lipo-12 (Fig. 3 D). Although the augmentation in recognition was relatively small in magnitude, it was considered to be important because LHVS does not block all types of endoproteases and also because any nonspecific effects of protease inhibitors would be expected to reduce rather than to augment recognition. This result indicated that endosomal endoproteases can degrade this antigen and raised questions about how proteases distributed among the diverse subcompartments of the endosomal pathway might influence recognition.
Redirection of CD1c trafficking to lysosomes reduces lipo-12 presentation
Proteinase K, pronase, and the targets of inhibition by LHVS may not produce the same effects as the more complex array of endogenous proteases normally present in the endosomal network. To test the role of naturally occurring endosomal factors in lipopeptide presentation, we devised a means to reroute CD1c trafficking from its normal steady-state location in early endosomes so that it more efficiently reaches late endosomal and lysosomal compartments. AP-3 complexes localize broadly within the endosomal pathway and promote directed trafficking of cargo proteins to lysosomes (42). Previous studies had shown that the cytoplasmic tail sequence in CD1c (KKHCSYQDIL) binds to AP-2 but not AP-3 (32, 35). In contrast, the cytoplasmic tail sequence in CD1b (RRRSYQNIP) binds to both AP-2 and AP-3. Therefore, to reroute the antigen binding domain of CD1c to lysosomes, we produced a construct encoding a chimeric protein with CD1c extracellular and transmembrane domains fused to the CD1b cytoplasmic tail sequence (CD1cextra/CD1btail).
To facilitate visualization by two-color immunofluorescence, we cotransfected a plasmid encoding this chimera along with one encoding wild-type CD1b into C1R B lymphoblastoid cells. As a control, we produced a construct encoding the converse chimera with CD1b extracellular and transmembrane domains fused to the CD1c cytoplasmic tail (CD1bextra/CD1ctail) and coexpressed this with wild-type CD1c proteins. Confocal immunofluorescence studies were then performed using antibodies specific for the extracellular domains of CD1b or CD1c. For wild-type CD1c, we found that antibodies recognizing the extracellular domain stained predominantly at the rim of lymphoblastoid cells, which is a pattern consistent with the known distribution of CD1c at the cell surface and early endosomes. In contrast, antibodies against the extracellular domain of CD1b stained with a rim pattern as well as at many punctate bodies located adjacent to the rim, which likely represented late endosomes (Fig. 4 A, top row). This interpretation is in agreement with previous studies performed in myeloid cells and HeLa cells showing that the cytoplasmic tail of CD1b confers efficient trafficking from the surface to endosomes and a higher endosomal-to-cell surface ratio than is seen for CD1c (31, 34). Using anti-CD1c to evaluate the staining pattern of the CD1cextra/CD1btail chimera, we found that the chimera had clear increases in the nonrim, punctate staining as compared with wild-type CD1c (Fig. 4 A, left column). In fact, the chimera strongly colocalized with coexpressed CD1b wild-type proteins, demonstrating that the chimeric CD1c protein was rerouted to compartments normally enriched with CD1b (Fig. 4 A, bottom row). Conversely, the CD1bextra/CD1ctail chimera lost the punctate staining pattern and appeared mainly at the rim of cells (Fig. 4 A, middle row).
To better visualize the compartments and compare CD1c and CD1cextra/CD1btail steady-state localization in lysosomes, we performed higher resolution confocal studies in comparison to LAMP1. We analyzed stably expressing clones that were matched for equivalent levels of surface anti-CD1c staining in flow cytometry. As predicted by previous studies (31, 32, 34, 35), we observed a higher ratio of punctate-to-rim staining in the chimeric protein, suggesting redistribution from the surface to endosomes. More importantly, addition of the CD1b tail to the antigen binding domain of CD1c caused much stronger colocalization with LAMP1, indicating that this mutation drives the antigen binding domain of CD1c into lysosomes (Fig. 4 B). Therefore, chimeric CD1c proteins represent a tool to measure lysosomal effects on lipopeptide antigen presentation.
To measure antigen presentation, we used transfected C1R clones with equivalent cell-surface levels of wild-type or chimeric CD1c (Fig. 4 C, right). T cell stimulation assays showed that chimeric proteins were 5- to 10-fold less efficient in presenting lipo-12. This was not caused by a general loss of antigen-presenting capacity of the C1R clone or the mutant CD1c protein because MPM, a glycolipid that is not sensitive to proteases, was presented equally well by wild-type and mutant CD1c (Fig. 4 C, middle). We conclude that rerouting CD1c proteins to later endosomal compartments, which are normally used by CD1b, reduces efficiency of presentation of a lipopeptide antigen.
Further controls using wild-type and chimeric CD1b proteins showed that expression of CD1bextra/CD1ctail provides ∼100-fold less efficient presentation of C80 glucose monomycolate (GMM) when compared with presentation by wild-type CD1b proteins. The C80 GMM antigen is a long-chain lipid whose loading is known to depend on low pH and, thus, to occur more efficiently in late endosomal and lysosomal compartments with pH <5.5 (43, 44). The presentation efficiency loss resulting from expressing the CD1b with the CD1c tail was smaller when tested with C32 and C54 GMM antigens, whose loading is less dependent on the low pH of lysosomes based on the less stringent requirements for loading antigens with smaller lipid tails. Collectively, these additional controls provide evidence for two conclusions. First, the presence or absence of AP-3 interactions and the resulting redirection to either early or late endosomes substantially influences the antigen-presenting function of CD1b and CD1c, confirming the idea of functional segregation of antigen presentation events into early and late endosomal compartments. Second, whether redirection increases or decreases the absolute efficiency of presentation depends on the identity of the antigen and the extent to which late endosomal and lysosomal events influence its processing and loading.
The discovery of lipo-12 represents a new class of lipopeptide antigen presented by CD1 and provides the first evidence that CD1c presents lipopeptides to T cells. lipo-12 resembles ribosomally produced peptides modified by N-myristoyl transferase. In fact, the linkage of the N-terminal acylation and the first six amino acids are chemically identical to the naturally occurring acylated-Nef sequence produced by HIV-1 during cellular infection (45). Thus, at least two members of the CD1 family (CD1a and CD1c) bind and present lipopeptides. Because all human CD1 isoforms (CD1a, CD1b, CD1c, CD1d, and CD1e) traffic through endosomal compartments containing peptidases, and cathepsin S and cathepsin L have been shown to affect CD1d-restricted T cell activation, these findings raise questions about how endosomal peptidases might interact with CD1-presented antigens composed of peptides (46, 47).
The observed inability of even broadly acting peptidases to cleave DDM likely results from atypical (nonpeptidic) linkages between individual amino and organic acids, including an oxazoline ring and an internal ester. DDM and structurally related mycobactins and carboxymycobactins undergo transport to the mycobacterial surface and release into the phagosomal space, where they directly interface with the host and bind iron for uptake into the mycobacterium (48). These unusual chemical linkages among amino acids were likely evolutionarily selected for their ability to remain intact within the phagosomal space while carrying out DDM's normal iron scavenging function.
Although the site of loading of DDM onto CD1a within infected cells and tissues has not been formally established, infection of intact human DCs with live Mycobacterium tuberculosis leads to activation of CD1a-restricted and DDM-specific T cells (21, 49). Further, CD1a is normally expressed on myeloid cells in lepromatous and tuberculous lesions in humans (50), and CD1a is up-regulated upon mycobacterial infection of immature DCs in vitro (51). These considerations and new data shown in this paper suggest a model whereby an intrinsically protease-resistant lipopeptide antigen is shed from the mycobacterial surface, traverses the phagosomal space, and contacts CD1a proteins. Such protease resistance derives directly from the nonribosomal nature of DDM biosynthesis and evolutionary pressure to function as an iron scavenger in a protease-rich compartment. Such considerations do not apply to ribosomally translated proteins containing repeating amide bonds formed between the C and N termini of amino acids, as in lipo-12.
The discovery of T cell recognition of the synthetic lipopeptide lipo-12 expands the known reactivity of CD1 to include an N-terminally acylated peptide. This synthetic molecule is related in structure to naturally produced lipopeptides made through N-terminal glycine acylation (myristoylation) of proteins that are widely distributed in eukaryotic cells and viruses. N-terminal N-acyl glycine–modified proteins comprise 0.7% of mammalian proteins and 3.5% of viral proteins (http://mendel.imp.ac.at/myristate/myrbase/). Such acyl proteins participate in intercellular signaling (Wnt and integrins), intracellular signaling (G proteins and neurotransmitter receptors), cell-cycle control (Src family members), and other processes. Also, viral genomes encode proteins that are normally acylated, including Nef proteins, which are important for viral budding (52).
In contrast to DDM, lipo-12 is sensitive to degradation by proteases. This feature is not shared by previously known CD1-presented lipid antigens and raises the question of a possible role of endosomal peptidases in generating or destroying epitopes in larger acyl-proteins. Many studies of antigen processing by MHC class II found that protein degradation is not rapid and not complete during the early phases of uptake into macrophages or DCs, but instead occurs in a stepwise fashion as proteins transit from sorting to early and late endosomes and lysosomes (53–55). Immature myeloid DCs partially preserve peptide structure so that they are cleaved internally but not fully destroyed (56). In humans, immature myeloid DCs normally express CD1 proteins, so we speculate that this phenomenon may also influence lipopeptide generation and production of antigens for the CD1 system. Our studies of lipo-12 show that a lipopeptide is composed of conventional peptidic linkages and that the peptide sequence determines T cell activation. Further, lipo-12 recognition is abolished by proteases, enhanced by a cathepsin inhibitor, and reduced after redirecting CD1c protein trafficking to lysosomes.
Individual members of the human CD1 family (CD1a, CD1b, CD1c, and CD1d) differ with regard to the extent that they enter early, intermediate, or late compartments of the endosomal network (57, 58). Most previous studies have emphasized how factors present in late endosomes and lysosomes contribute in a positive way to antigen loading and recognition by T cells. Specifically, the late endosomal antigen presentation pathway facilitates antigen recognition through pH-mediated changes in CD1 conformation that increase antigen access to CD1 grooves (9, 44, 59), pH-activated lipid transfer proteins (3, 60–62), and pH-activated glycosidases that trim antigens (63, 64). Also, in theory, if all CD1 isoforms were to use late endosomal pathways, thereby traversing all endosomal compartments, this would enable them to encounter the widest possible range of antigens and load them under all available conditions.
Contrary to this simple model, CD1a and CD1c do not efficiently enter late endosomes and lysosomes, where they might take advantage of such specialized lipid loading or processing cofactors (32, 34, 35). Instead, CD1a and CD1c predominantly encounter antigens in the secretory pathway and early endosomes and then return to the surface without localizing in lysosomes at steady state (32, 34, 35). Furthermore, inspection of the sequences of nonhuman CD1 genes suggests that most mammals have large CD1 families containing examples of two types of CD1 proteins: those that are and those that are not predicted to enter lysosomes. What, then, are the potential selective advantages that underlie the evolutionary retention of isoforms that enter early but not late endosomal pathways?
One candidate mechanism that is illustrated in studies shown in this paper is that early endosomes might concentrate antigens and CD1 proteins for loading, yet expose the antigens to chemically and enzymatically mild conditions that allow antigens to remain intact during and after loading reactions. Certain antigens, such as lipopeptides and sulfatides, may be most efficiently presented if they load at the cell surface (25, 32) or if the complex escapes from the early endosomal environment to the surface before entry into the degradative environment of late endosomes and lysosomes.
More so than the peptide ligands for MHC class II, the ligands for CD1 proteins are diverse in their structures and chemical reactivity. They can be composed of lipids, peptides, phosphate esters, sulfoesters, and carbohydrates. Known antigens range from lipopeptides, which are cleaved by ubiquitous enzymes, to free mycolic acid, which is a chemically inert compound that is catabolized by specialized, multistep enzymatic systems. As contrasted with models in which CD1 is thought to sequentially sample all endosomal compartments, these studies emphasize the distinction between functionally separate early pathways used by CD1a and CD1c, and late pathways used more extensively by CD1b and CD1d. This dual system could allow presentation of those antigens requiring chemically mild loading conditions, and other antigens needing loading factors and pH-driven changes in CD1 or antigen structure found in late endosomes. We speculate that the chemical diversity of CD1-restricted antigens provides evolutionary pressure for any given mammalian species to retain larger numbers of CD1 genes, including CD1 proteins that sample lysosomes and those that do not.
Collectively, the discovery of lipo-12 as a CD1-presented antigen for human αβ T cells and the identification of the N-terminal acylation as the key unit that facilitates presentation of the antigen by CD1c raise the possibility that other N-terminally acylated peptides, including posttranslationally modified products of ribosomal translation, might also be antigens for the CD1 system. None of the previously known CD1-presented antigens is encoded by a host or pathogen genome; instead, they are generated by a series of enzymatic steps that produce antigens of nearly invariant structure, which has led to the view that CD1-presented antigens always have conserved structures. The data we present in this paper raise the possibility that chemically more diverse and mutable antigens composed of amino acid sequences encoded in host or pathogen genomes could also be antigens for the CD1 system. Interestingly, the viral N-terminally acylated proteins HIV Nef and hepatitis B virus large surface antigen lose their function if mutated such that they lose their N-terminal acylation (52, 65). This provides a candidate mechanism for coupling of pathogenicity and binding capacity to the antigen-presenting element.
MATERIALS AND METHODS
Lipopeptides were prepared by automated peptide synthesis followed by N-acylation of a peptidic backbone with a mixture of fatty acids (C14:0, C16:0, C18:0, C18:1, and C20:0; Anaspec Corporation). The saturated fatty acids were from Sigma-Aldrich, and C18:1 (trans-2-octadecenoic acid) was from Matreya. Analogues of lipo-12 were prepared by first synthesizing the peptide backbones on a solid-phase peptide synthesizer (ABI 433A; Applied Biosystems) using Wang resin (Dana-Farber Cancer Institute Core Facility). Lysine and tryptophan were Boc protected, whereas the rest of the amino acids were Fmoc protected on their side chains. Finally, the N-stearoyl modification (Sigma-Aldrich) of the peptidic backbone was performed, and the lipopeptide was released, dried under nitrogen gas, resuspended in CHCl3/CH3OH (1:1 vol/vol), and analyzed by HPLC-MS for purity and confirmation of the expected mass.
Antigenic compounds and analogues of lipo-12 were fractionated by HPLC with a sample-splitting interface to allow for simultaneous preparative collection of samples for T cell assays and detection based on ultraviolet light absorbance (254 and 280 nm), and ESI-MS with QIT (Finnigan LCQ Advantage [Thermo Fisher Scientific] or Accurate Mass QTOF [Agilent Technologies]). A C18 (Vydac) column was used with a gradient elution based on solvent A (80:20 vol/vol H2O/CH3CN with 0.02% TFA and 0.1% HCOOH) and solvent B (50:30:20 methanol/acetonitrile/water with 0.02% TFA and 0.1% formic acid) using a flow rate of 0.7 ml/min, with the gradient starting at 50% B, running to 95% B over 20 min, and holding at 95% B for the final 10 min. More detailed MSn experiments were performed by nano-ESI using borosilicate glass capillaries pulled to a final orifice of 1–2 µm and an internal stainless steel electrode. For biological testing, fractions were collected at 15-s intervals with an automatic fraction collector, evaporated to dryness under nitrogen, and tested for stimulation of T cells. To yield highly pure analogues of lipo-12, fractions with the desired m/z value were combined, dried under nitrogen gas, resuspended in solvent A/solvent B (9:1), and again subjected to the described HPLC method. DDM was purified from M. tuberculosis, as previously described (21).
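The gradient and fraction-collection settings above can be restated as a short calculation. The following Python sketch is illustrative only and is not part of the published method: it assumes the ramp from 50% to 95% B is linear (the text gives only the endpoints and duration) and it ignores the sample-splitting interface, so the per-fraction volume is an upper bound on what reaches the collector. All function and constant names are hypothetical.

```python
# Illustrative restatement of the HPLC settings in the text.
# Assumptions (not stated in the source): linear ramp; no correction for the flow split.

FLOW_ML_PER_MIN = 0.7        # flow rate from the text
FRACTION_INTERVAL_S = 15     # fraction collection interval from the text
RAMP_END_MIN = 20.0          # ramp duration from the text
RUN_END_MIN = 30.0           # 20-min ramp plus 10-min hold
START_B, END_B = 50.0, 95.0  # percent solvent B at start and end of ramp


def percent_b(t_min: float) -> float:
    """Percent solvent B at time t_min, assuming a linear ramp then a hold."""
    if not 0.0 <= t_min <= RUN_END_MIN:
        raise ValueError("time is outside the 30-min run")
    if t_min <= RAMP_END_MIN:
        return START_B + (END_B - START_B) * t_min / RAMP_END_MIN
    return END_B


def fraction_volume_ml() -> float:
    """Maximum eluent volume per fraction (before any flow splitting)."""
    return FLOW_ML_PER_MIN * FRACTION_INTERVAL_S / 60.0


def fraction_index(t_min: float) -> int:
    """Zero-based index of the 15-s fraction that contains time t_min."""
    return int(t_min * 60 // FRACTION_INTERVAL_S)


if __name__ == "__main__":
    for t in (0, 10, 20, 30):
        print(f"t = {t:2d} min: {percent_b(t):.1f}% B, fraction #{fraction_index(t)}")
    print(f"volume per fraction <= {fraction_volume_ml():.3f} ml")
```

A helper of this kind makes it easy to map a bioactive fraction number back to an approximate solvent composition when comparing T cell responses with the ultraviolet and MS traces.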
Structural characterization of lipo-12
The lipo-12 mixture obtained in the initial synthesis was subjected to nano–ESI-MS analysis using a home-built qQq FTICR MS instrument (66) fitted with a 7-Tesla actively shielded magnet (Cryomagnetics Inc.). Accurate masses were measured for the [M+3H]3+, [M+2H]2+, and/or [M+H]+ ions, and the products were obtained for selected precursors that were subjected to collision-induced decomposition. The front-end quadrupoles were controlled using the program LC2Tune 1.5 (MDS Analytical Technologies), and the program IonSpec99 (IonSpec Corp.) controlled data acquisition in the ion cyclotron resonance cell. Spectra were analyzed using the Boston University Data Analysis software, developed in house.
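The relationship between a neutral monoisotopic mass and the m/z values of the [M+H]+, [M+2H]2+, and [M+3H]3+ ions follows from standard mass spectrometry arithmetic: each added proton contributes about 1.007276 Da of mass and one unit of charge. The Python sketch below illustrates this; the example mass of 1500.0 Da is a hypothetical placeholder, not a measured value for lipo-12, and the function names are assumptions made for illustration.

```python
# Standard m/z arithmetic for positive-mode, proton-adduct ions.
# The example neutral mass is hypothetical and is NOT a value from the paper.

PROTON_MASS_DA = 1.007276466  # mass of a proton in daltons


def mz_protonated(neutral_mass: float, charge: int) -> float:
    """m/z of the [M + zH]z+ ion for a given neutral monoisotopic mass."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (neutral_mass + charge * PROTON_MASS_DA) / charge


def neutral_mass_from_mz(mz: float, charge: int) -> float:
    """Invert the relation: recover the neutral mass from an observed m/z."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return charge * (mz - PROTON_MASS_DA)


if __name__ == "__main__":
    example_mass = 1500.0  # hypothetical neutral mass in Da
    for z in (1, 2, 3):
        print(f"[M+{z}H]{z}+ : m/z = {mz_protonated(example_mass, z):.4f}")
```

Observing the same neutral mass from two or three charge states, as in the measurements described above, gives an internal consistency check on the assignment.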
The lipopeptide sample was diluted 1:40 from its original concentration using 1:1 CH3OH/H2O (vol/vol), 1% HCOOH solution, and was sprayed using a home-built nanospray source. The solution was loaded into a Kwik-Fil borosilicate glass capillary tip (1-µm orifice diameter) pulled in house with a micropipette puller (P-97 Flaming/Brown; Sutter Instrument Co.). Kwik-Fil borosilicate glass capillaries, HPLC-grade methanol, and formic acid were obtained from VWR International. All water was filtered using a filtration system (Milli-Q Gradient A10; Millipore).
The [M+3H]3+ ions of lipopeptides 1 and 3 and the [M+H]+ ion of lipopeptide 2 were isolated in the resolving quadrupole and accelerated at 10 eV into the LINAC quadrupole for collision with N2 gas. Fragment ions were accumulated for 1000 ms and transmitted to the ICR cell for detection. All spectra were analyzed without apodization and with two zero fills, and were internally calibrated based on the m/z values calculated for the [M+3H]3+ ions, the y9 and c6 fragments, and their isotopes.
Compounds of interest were analyzed by isolating the parent ions in the QIT MS, collisionally activating them with the helium buffer present at low pressure (∼10⁻⁵ Torr), and, finally, sequentially ejecting the product ions from the trap for mass analysis. These experiments were performed both during the HPLC-MS runs with ESI for initial MS/MS analysis and also using offline nano–ESI-MS for more detailed MSn analysis with multiple stages of fragmentation.
Derivation of T cell lines
To generate CD1-restricted T cells, primary human lymphocytes from HIV+ patients were handled under biosafety level 2+ conditions and stimulated with monocyte-derived immature DCs and a synthetic antigen mixture at a concentration of 1 µg/ml. 1.25 × 10⁵ T cells and 0.25 × 10⁵ DCs per well were cultivated in round-bottom 96-well plates. For the first three rounds of stimulation, autologous DCs were used, and heterologous DCs were used for subsequent stimulations. Lines were cultivated in T cell medium made by supplementing 500 ml RPMI 1640 medium with 50 ml of fetal calf serum (Hyclone), penicillin (Invitrogen), streptomycin (Invitrogen), 20 mM Hepes (Invitrogen), and 4 ml 1 N NaOH solution. The IL-2 concentration was initially 0.1 nM and gradually increased to 1 nM during subsequent rounds of stimulation. T cell clones were derived by limiting dilution, using 0.6 × 10⁵ EBV-transformed and irradiated B cells (10,000 R) and 1.3 × 10⁵ heterologous irradiated PBMCs (3,300 R) as feeder cells, and 1 µg/ml PHA (Difco) in medium containing 2 nM IL-2.
T cell assays
T cell activation was measured by incubating 5 × 10⁴ T cells with 3 × 10⁴ DCs or CD1-transfected C1R cells. Proliferation was measured after co-culture for 3 d with antigen, followed by a 6-h pulse of 1 µCi [3H]thymidine before harvesting and counting β emissions. Alternatively, supernatants were tested for the presence of IL-2 using the HT-2 bioassay. Anti-CD1a (OKT6), -CD1b (BCD1b.3), -CD1c (F10/21A3.1), -CD1d (CD1d42), and -IgG1 control (P3) were used for blocking studies at 20 µg/ml. For studies with C1R cells expressing wild-type or chimeric CD1 proteins, a panel of clones was tested for surface expression by flow cytometric analysis after staining with antibodies to CD1c (F10/21A3.1) or CD1b (BCD1b.3) followed by goat anti–mouse PE on nonpermeabilized cells. Clones with equivalent levels of CD1 expression as measured by mean fluorescence intensity (MFI) of nonpermeabilized cells within 1 d of the assay were used to compare antigen presentation function.
For enzyme-linked immunosorbent spot (ELISPOT), polyvinylidene difluoride–backed 96-well plates (Millipore) were coated with 2.5 µg/ml anti–IFN-γ mAb 1-D1K (Mabtech) at 4°C for 16 h and blocked with RPMI 1640 (Invitrogen) supplemented with 10% fetal bovine serum for 1 h at 37°C. 2.5 × 10⁴ monocyte DCs were added to wells with serial dilutions of lipopeptide antigens and coincubated with 10⁴ T cells for 16 h at 37°C. After incubation, the plates were washed six times with PBS containing 0.05% Tween 20, and incubated with 0.3 µg/ml of biotinylated anti–IFN-γ mAb 7-B6-1-biotin (Mabtech) for 2 h at room temperature. Plates were washed again and incubated with streptavidin-conjugated alkaline phosphatase (Sigma-Aldrich) for 1 h. Individual cytokine-producing cells were identified as spots after a 10–20-min reaction with 5-bromo-4-chloro-3-indolyl phosphate and nitro blue tetrazolium (SIGMAFAST BCIP/NBT; Sigma-Aldrich). An ImmunoSpot S5 Macro Analyzer (Cellular Technology Ltd.) was used for enumeration of the spots.
Competition assays for recombinant CD1c
Soluble human CD1c fusion proteins covalently linked to human β2-microglobulin and to the Fc portion of mouse IgG2a were constructed, produced, and purified as previously reported (2). 96-well protein G–coated plates (Thermo Fisher Scientific) were incubated with 1.25 µg of CD1c fusion protein and 125 ng anti-CD11a (AbD Serotec) per well in PBS, pH 7.4, overnight at room temperature. After the plates were washed three times with PBS buffer, lipopeptide competitors were sonicated into PBS and added to the wells in molar excess. The plates were incubated for 8 h at 37°C before adding MPM for an additional 18-h incubation at room temperature. The plates were then washed three times with 200 µl/well of sterile PBS before adding 10⁵ CD8-1 T cells in a total volume of 200 µl of T cell medium per well. The plates were incubated for 24 h at 37°C, after which culture supernatants were collected for IFN-γ ELISA analysis (Thermo Fisher Scientific). Control experiments demonstrated that CD8-1 was restricted by CD1c, reactive to MPM, and not directly reactive to lipopeptides (unpublished data).
Generation of C1R (double) transfectant cell lines and confocal microscopy
To create tail swap mutants, human CD1c cDNA was amplified with the 5′ primer VB06 (5′-ATCAGCAAACAGCTTTTCTGAGAG-3′) and the 3′ primer VB05 (5′-TCATGGGATATTCTGATATGACCGGCGCCTCATAAACCATAACACAAGGACTATTAG-3′) to generate the CD1cextra/CD1btail mutant, and human CD1b cDNA was amplified with the 5′ primer VB07 (5′-ACCAGCTCTGCCAGTAAGAAGTTGC-3′) and the 3′ primer VB04 (5′-TCACAGGATGTCCTGATATGAGCAGTGCTTCTTATACCATAATGCAAGGCATAG-3′) to generate the CD1bextra/CD1ctail mutant. PCR reactions were performed using High Fidelity Platinum Taq DNA polymerase (Invitrogen), and the PCR fragments were cloned into the pEF6-TOPO TA vector (Invitrogen). The cloned inserts were confirmed by DNA sequencing. Transfection of C1R cells expressing either wild-type CD1b or CD1c was performed via electroporation, as previously described (31), and cells expressing the new chimeric or point mutant CD1 molecules were selected in tissue culture medium containing 10 µg/ml blasticidin.
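Because the 3′ primers are written 5′→3′ on the antisense strand, their reverse complement gives the sense-strand sequence at the end of each amplicon. The Python sketch below is an illustrative check only, with hypothetical helper names, that reports length, GC fraction, and reverse complement for the four primers listed above. Both 3′ primers begin with TCA, so their reverse complements end in TGA; this is consistent with each primer supplying the stop codon of the swapped tail, although the text does not state this explicitly.

```python
# Basic sanity checks on the tail-swap primers. Sequences are copied from the text;
# the helper functions are illustrative and not part of the published method.

PRIMERS = {
    "VB06": "ATCAGCAAACAGCTTTTCTGAGAG",  # 5' primer, CD1c
    "VB05": "TCATGGGATATTCTGATATGACCGGCGCCTCATAAACCATAACACAAGGACTATTAG",  # 3' primer
    "VB07": "ACCAGCTCTGCCAGTAAGAAGTTGC",  # 5' primer, CD1b
    "VB04": "TCACAGGATGTCCTGATATGAGCAGTGCTTCTTATACCATAATGCAAGGCATAG",  # 3' primer
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def gc_fraction(seq: str) -> float:
    """Fraction of bases in the sequence that are G or C."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(f"{name}: {len(seq)} nt, GC = {gc_fraction(seq):.2f}")
        print(f"  reverse complement: {reverse_complement(seq)}")
```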
The transfected C1R cells were attached to cover slides (poly–l-lysine coated), fixed with 4% paraformaldehyde, and permeabilized with 0.05% saponin, as previously described (34). For double staining of CD1b and CD1c, the extracellular domain of CD1b was detected with 5 µg/ml of the mAb 4A7 and revealed with a mouse IgG2a-specific, Alexa Fluor 647–conjugated antibody (Invitrogen). The CD1c extracellular domain was detected with 5 µg/ml of the monoclonal mouse antibody F10/21A3.1 and revealed by the mouse IgG1-specific, Alexa Fluor 555–conjugated antibody (Invitrogen). The slides were analyzed with a laser scanning confocal microscope (Radiance 2000; Bio-Rad Laboratories), and representative images are shown in the figures. For CD1c and LAMP1 double staining, the F10/21A3.1 antibody and the anti-LAMP1 H4A3 antibody (BD) were directly conjugated to Alexa Fluor 546 and Alexa Fluor 647, respectively (using a labeling kit; Invitrogen), as well as the IgG1 isotype control antibody P3. Slides were analyzed on a confocal laser scanner (TE2000-U C1; Nikon).
In vitro protease treatment
Pronase is a mixture of endopeptidases and exopeptidases (carboxypeptidases and aminopeptidases) that cleave denatured and native proteins down into individual amino acids. Proteinase K is an endopeptidase that preferentially cleaves between an aliphatic, aromatic, or hydrophobic amino acid and any other amino acid; however, it will digest any peptidic bond if added in excess and/or allowed to interact over long incubation periods. Pronase and proteinase K were used to digest lipo-12 and DDM in protease buffer (10 mM CaCl2, 10 mM Hepes buffer, 25 mM ammonium bicarbonate) for 4 h at 40°C, followed by 10 min of inactivation at 85°C. Mock treatment of antigens was performed in the same buffer and at the same temperatures, but without addition of the proteases.
TCR sequence analysis
mRNA was isolated from 10⁶ clonal or polyclonal T cells using an Oligotex Direct mRNA kit (QIAGEN), followed by first-strand cDNA synthesis using SuperScript RT (Invitrogen). PCR primer sets as described on the Immunogenetics website (http://imgt.cines.fr) covering most of the TCR Vα and Vβ families were used to determine the Vα and Vβ usage. Second-strand cDNA synthesis was performed using Escherichia coli DNA ligase (Invitrogen), E. coli DNA pol I (Invitrogen), and RNase H (New England Biolabs, Inc.) in E. coli ligase buffer (Invitrogen), followed by blunting of the material with T4 polymerase and circularization using T4 ligase. Inverse PCR using the constant regions of the TCR α and TCR β (Cα and Cβ) was performed using the following primers: CircularCαForward, 5′-GACCTCATGTCTAGCACAGTTTTG-3′; CircularCαReverse, 5′-GCCCTGCTATGCTGTGTGTCT-3′; CircularCβForward, 5′-ACACAGCGACCTCGGGAGGG-3′; and CircularCβReverse, 5′-GATGGCCATGGTCAAGAGAAAGGA-3′. Primers for full-length TCR chains were: FLTRBV12-3, 5′-GCCATGGACTCCTGGACCTTCTGCT-3′; and FLTRAV25, 5′-GGGAGATGCTACTCATCACATCAATGTTG-3′. PCR products were cut from an agarose gel, purified, and ligated in a Topo4blunt vector that was used to transform one-shot Top10 cells (Invitrogen). Vector DNA of single colonies was sequenced by Baseclear.
Online supplemental material
Fig. S1 shows T cell recognition of synthetic lipopeptides mimicking acylated proteins from mammalian cells or viruses. Fig. S2 depicts the amino acid sequence of the 1A3 TCR. Fig. S3 shows MS2 data supporting the structure of lipopeptides 1 and 2. Fig. S4 depicts the synthesis and analysis of 12-mer lipopeptides. Fig. S5 shows the synthesis and analysis of the tryptophan point mutant at position 7.
We thank W. Peng for providing the construct for the CD1c-Ig fusion protein.
This work was supported by the National Institute of Allergy and Infectious Diseases (grant R01 AI049313 to D.B. Moody and grant AI45889 to S.A. Porcelli), the National Institute of Arthritis and Musculoskeletal and Skin Diseases (grant R01 AR048632 to D.B. Moody), the American Lung Association (grant RT-95-N to I. Van Rhijn), the National Institutes of Health/National Center for Research Resources (grant P41 RR10888 to C.E. Costello), the Pew Foundation Scholars in the Biomedical Sciences, and the Burroughs Wellcome Fund Clinical Scientist Award in Translational Research.
The authors have no conflicting financial interests.
Abbreviations used: AP, adaptor protein; DDM, dideoxymycobactin; ESI, electrospray ionization; FTICR, Fourier transform ion cyclotron resonance; GMM, glucose monomycolate; LAMP1, lysosome-associated membrane protein 1; m/z, mass-to-charge; MFI, mean fluorescence intensity; MPM, mannosyl phosphomycoketide; MS, mass spectrometry; QIT, quadrupole ion trap.
I. Van Rhijn's present address is Division of Infectious Diseases and Immunity, Faculty of Veterinary Medicine, Utrecht University, 3584CL Utrecht, Netherlands.