Lymphocyte antigen receptors are not encoded by germline genes, but rather are produced by combinatorial joining between clusters of gene segments in somatic cells. Within a given cluster, gene segment usage during recombination is thought to be largely random, with biased representation in mature T lymphocytes resulting from protein-mediated selection of a subset of the total repertoire. Here we show that T cell receptor Dβ and Jβ gene segment usage is not random, but is patterned at the time of recombination. The hierarchy of gene segment usage is independent of gene segment proximity and is instead influenced by the ability of the flanking recombination signal sequences (RSS) to bind the recombinase and/or to form a paired synaptic complex. Importantly, the relative frequency of gene segment usage established during recombination is very similar to that found after protein-mediated selection, suggesting that in addition to targeting recombinase activity, the RSS may have evolved to bias the naive repertoire in favor of useful gene products.
TCR diversity is generated by gene rearrangement in somatic cells 1,2,3 using clusters of V, D, and J gene segments (for reviews, see references 4, 5). Together with enzymatic modification of the coding ends before joining 4,5, the permutational nature of this process permits extensive diversity while using minimal genetic resources. Many details of the somatic recombination process have recently been elucidated 6. Nonetheless, it is not clear how gene segments within a given cluster are chosen for use by the recombinase, or even whether such usage is random or preferential. Although the distribution of gene segments used by mature T cells is known to be nonrandom, this is thought to result mainly from intrathymic screening based on receptor specificity 7,8,9. However, recent evidence has suggested that certain receptor specificities can be enhanced or diminished before protein-mediated screening 10,11,12,13,14, implying the presence of genetic influences on this process, although potential mechanisms have not been revealed.
To assess whether gene segment usage and receptor specificity can be influenced during recombination, we focused on the D-J region of the TCR-β locus (sequence data are available from EMBL/GenBank/DDBJ under accession no. AE000665). Several factors make this gene region ideal for such analysis. First, the small size of this region (∼11 kb) facilitates direct DNA analysis by Southern blotting. Second, the upstream Dβ segment (Dβ1) can rearrange to either the proximal Jβ1 cluster (seven coding segments) or to the more distal Jβ2 cluster (also seven segments), thus allowing the influence of gene segment proximity to be directly assessed. Finally, partial (D-J) rearrangements are apparently not translated into protein; thus, any patterns found in D-J rearrangement should not be influenced by selection, and therefore must be imposed during recombination. Using this system, we show that individual TCR-β gene segments are used in strict hierarchical patterns during recombination. The proximity between recombining segments plays no role in the patterning process. Rather, variations in degenerate positions of the recombination signal sequences (RSS) appear to determine the frequency with which individual gene segments are used. Most importantly, comparison of our data with those of others shows that these recombinatorial biases skew the naive repertoire in favor of selectable TCR-β proteins, suggesting that the RSS have evolved to enhance the efficiency of the selection process.
Materials And Methods
Probes hybridizing to noncoding sequences upstream of Dβ1 (5′Dβ1, 460 bp) or Dβ2 (5′Dβ2, 250 bp) were cloned by PCR amplification using C57BL/6 kidney DNA as template. Primer sequences were as follows (5′→3′): 5′Dβ1 forward, gagggatccaccgttctaagaagt; 5′Dβ1 reverse, ggcggatcctcccataggtcta; 5′Dβ2 forward, tgtggagtctcctggtagggacc; 5′Dβ2 reverse, gactgagaggggctgggaaaag. Genomic DNA was prepared from purified thymocytes embedded in low melting point agarose as described 15. DNA was digested with either ApaLI-SacI or ApaLI-ClaI (10 U/μg) as indicated in the figures. Digested DNA (5–10 μg per lane) was electrophoresed in agarose gels, transferred to nylon membranes, hybridized with [α-32P]dCTP–labeled probes, and quantitated as described 15; in this case, percent hybridization was calculated by the formula $[X_t\,(C_g/C_t)]/X_g \times 100$, where C represents the intensity of the DNA loading band, X represents 5′Dβ1 or 5′Dβ2 intensity, and the subscripts g and t denote germline and thymocyte samples, respectively. The probe used for DNA quantitation recognizes a nonrearranging intronic sequence located 3′ of the murine TCR-α locus, as described 15.
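The loading-corrected normalization described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the analysis software used for the study; the function name `percent_hybridization`, its argument names, and the example intensities are assumptions introduced here for demonstration.

```python
def percent_hybridization(x_t: float, x_g: float, c_t: float, c_g: float) -> float:
    """Return [X_t * (C_g / C_t)] / X_g * 100.

    x_t, x_g: probe (5'Dbeta1 or 5'Dbeta2) band intensity in the
              thymocyte and germline lanes, respectively.
    c_t, c_g: DNA loading-control band intensity in the same two lanes.
    """
    if x_g <= 0 or c_t <= 0:
        raise ValueError("germline probe and thymocyte loading intensities must be positive")
    # Scale the thymocyte signal to the amount of DNA loaded in the germline lane.
    loading_correction = c_g / c_t
    return (x_t * loading_correction) / x_g * 100.0


# Invented intensities (arbitrary densitometry units), for illustration only.
example = percent_hybridization(x_t=300.0, x_g=1000.0, c_t=800.0, c_g=1000.0)
```

The ratio C_g/C_t corrects for unequal loading between lanes before the thymocyte signal is expressed as a percentage of the germline signal.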
Substrates for In Vitro Cleavage Assays.
The parental substrate pLP3 (the gift of L. Ptaszek, Yale University, New Haven, CT) was synthesized to contain canonical 12-mer and 23-mer RSS 16 and consensus spacer sequences 17. These were ligated onto ∼700 bp of intervening sequence lying between Dδ2 and Jδ1 18, which was amplified by PCR and cloned into pBSK. For the studies described here, this parental construct was further modified by replacing the canonical 23-mer with synthetic oligomers corresponding to the Dβ1 23-mer RSS, and likewise replacing the canonical 12-mer with a pair of synthetic 12-mers corresponding to various Jβ2 RSS and separated by ∼100 bp. When native RSS were used, six bases of the corresponding endogenous coding sequence were included adjacent to the RSS heptamer. In some cases, chimeric RSS were constructed, using heptamer/nonamer sequences homologous to one Jβ sequence and the spacer from another; in these cases, the chimeric RSS were all flanked by the same six bases of the Jβ2.5 coding sequence. Cleavage substrates were released from pBSK before cleavage assays using PvuI and AflIII, followed by gel purification.
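The logic of the chimeric RSS design (heptamer/nonamer from one element, spacer from another) can be illustrated with a small Python sketch. All names here (`RSS`, `make_chimera`, `rss_a`, `rss_b`) are hypothetical, and the sequences are placeholders: the consensus heptamer CACAGTG and nonamer ACAAAAACC are used for one element, while the second element and both spacers are invented and do not correspond to any actual Jβ2 RSS.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RSS:
    """A recombination signal sequence: heptamer, 12- or 23-bp spacer, nonamer."""
    name: str
    heptamer: str
    spacer: str
    nonamer: str

    def __post_init__(self) -> None:
        if len(self.heptamer) != 7 or len(self.nonamer) != 9:
            raise ValueError("heptamer must be 7 bp and nonamer 9 bp")
        if len(self.spacer) not in (12, 23):
            raise ValueError("spacer must be 12 or 23 bp")

    @property
    def sequence(self) -> str:
        # Written heptamer -> spacer -> nonamer; orientation relative to the
        # coding segment is not modeled in this sketch.
        return self.heptamer + self.spacer + self.nonamer


def make_chimera(core: RSS, spacer_donor: RSS) -> RSS:
    """Combine the heptamer/nonamer of `core` with the spacer of `spacer_donor`."""
    return RSS(
        name=f"{core.name}-core/{spacer_donor.name}-spacer",
        heptamer=core.heptamer,
        spacer=spacer_donor.spacer,
        nonamer=core.nonamer,
    )


# Placeholder 12-mer RSS; NOT the real Jbeta2.2 or Jbeta2.5 sequences.
rss_a = RSS("A", heptamer="CACAGTG", spacer="AAAAAAAAAAAA", nonamer="ACAAAAACC")
rss_b = RSS("B", heptamer="CACAGTA", spacer="CCCCCCCCCCCC", nonamer="TCAGAAACA")

chimera_ab = make_chimera(core=rss_a, spacer_donor=rss_b)  # A core, B spacer
chimera_ba = make_chimera(core=rss_b, spacer_donor=rss_a)  # reverse combination
```

Building both reciprocal chimeras, as in the sketch, is what allows an effect to be attributed either to the heptamer/nonamer or to the spacer.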
In Vitro Cleavage Assay.
The enzymatic cleavage reaction was carried out as described previously 19,20,21. In brief, epitope-tagged murine recombination activating gene (RAG)-1/2 were expressed in M12 B lymphoma cells, followed by chromatographic purification. Purified RAG-1/2 and high mobility group 2 proteins 21 were added to 2–5 ng of purified substrate and incubated in the presence of 10 mM Mg2+ for 2 h at 37°C. Reaction products were deproteinated and resolved by agarose gel electrophoresis, followed by transfer to nylon membranes and hybridization with an [α-32P]dCTP–labeled probe corresponding to the Dδ-Jδ intervening sequence.
Results And Discussion
The general strategy for analysis of D-J recombination at the TCR-β locus is illustrated in Fig. 1. To exclude protein-mediated influences resulting from complete (V-DJ) rearrangements, we took advantage of the finding that excised DNA circles produced by recombination contain an ApaLI site not encoded in the germline 22, formed by blunt-ended ligation of RSS heptamers. Probes hybridizing to noncoding sequences upstream of Dβ1 (5′Dβ1) or Dβ2 (5′Dβ2) were used to detect various products of DJβ rearrangement by Southern blotting. In germline genes, 5′Dβ1 hybridizes to a 4.7-kb fragment flanked by germline ApaLI and SacI sites (Fig. 1). Rearrangements between Dβ1 and the Jβ1 cluster yield six progressively shorter fragments (4.2–2.5 kb), corresponding to excision of D-J intervening sequences; no rearrangements to Jβ1.7 are expected, as this gene segment has a known RSS defect 23. Rearrangement of any Vβ segment to Dβ1 would result in hybridization of 5′Dβ1 to a non–germline-encoded fragment, flanked at the 5′ end by the germline ApaLI site, and at the 3′ end by a de novo ApaLI site formed by RSS heptamer ligation (Fig. 1). A similar strategy was also devised to assess rearrangements at the second D-Jβ cluster (Fig. 1).
Results from several of these types of analyses, using DNA from early precursor thymocytes or mature lymph node T cells, are presented in Fig. 2. Rearrangements involving Dβ1 and the Jβ1 cluster are analyzed in Fig. 2 a. Consistent with previous findings 15,24, almost no detectable TCR-β rearrangement occurs in DN2 cells (CD4−CD8−CD25+CD44+); thus, only the germline fragment is apparent. Cells at the next stage of development (DN3; CD4−CD8−CD25+CD44lo) show all of the predicted products of TCR-β recombination (see Fig. 1), including the de novo product of V-DJ recombination. This excised circle is neither degraded after excision nor replicated during cell division, as it persists in peripheral T cells (Fig. 2, a–c), but is lost after cell proliferation in vitro (data not shown). This reconciles our present estimate of the extent of V-Dβ recombination in DN3 precursors (44% of alleles; see the legend to Fig. 2) with the lower value published previously 25, as most V-Dβ rearrangements generate large excised products that cannot be distinguished from chromosomal DNA using the previous approach. Partial rearrangements involving Dβ1 to the Jβ1 cluster are clearly resolved by ApaLI-SacI digestion (Fig. 2 a); to resolve rearrangements between Dβ1 and the Jβ2 cluster, which migrate as a group after ApaLI-SacI digestion (Fig. 2 a), ApaLI-ClaI digests were used, followed by hybridization to the same probe (Fig. 2 b). Finally, rearrangements between Dβ2 and the Jβ2 cluster were also resolved, using probe 5′Dβ2 and ApaLI-ClaI digestion, as shown in Fig. 2 c. Unexpectedly, neither Dβ1 nor Dβ2 was found to rearrange to Jβ2.6; the significance of this finding is discussed below.
In Fig. 2 d, the D-J regions from the blots in Fig. 2, a–c are shown, aligned and scaled to the same vertical dimensions. Densitometric analysis (Fig. 2 e) reveals the clear presence of nonrandom gene usage patterns. Dβ1-Jβ1 rearrangements exhibit a bias towards proximal gene segment usage, whereas Dβ2-Jβ2 rearrangements are generally more distal. Most remarkably, the profile of Jβ2 gene segment usage is indistinguishable for rearrangements involving either the nearby Dβ2 or the more distal Dβ1. These findings reveal two important characteristics of the recombination process. First, gene segments are not selected randomly during recombination, but rather are used in distinctly hierarchical patterns. Second, as the distance between the Jβ2 cluster and Dβ1 versus Dβ2 is quite different (Fig. 1), these findings exclude a preeminent role for gene segment proximity in the patterning process, and thus imply the presence of other regulatory influences.
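As a rough illustration of how densitometric band intensities could be turned into a usage profile and hierarchy, consider the following Python sketch. It is not the procedure used to generate Fig. 2 e; the function names `relative_usage` and `usage_hierarchy`, the segment labels, and the intensity values are all invented for this example.

```python
from typing import Dict, List


def relative_usage(band_intensities: Dict[str, float]) -> Dict[str, float]:
    """Express each rearranged band as a fraction of the total rearranged signal."""
    total = sum(band_intensities.values())
    if total <= 0:
        raise ValueError("total band intensity must be positive")
    return {segment: value / total for segment, value in band_intensities.items()}


def usage_hierarchy(band_intensities: Dict[str, float]) -> List[str]:
    """Rank segments from most to least frequently used."""
    usage = relative_usage(band_intensities)
    return sorted(usage, key=usage.get, reverse=True)


# Invented intensities for three hypothetical segments (arbitrary units).
bands = {"J_a": 10.0, "J_b": 30.0, "J_c": 60.0}
profile = relative_usage(bands)
ranking = usage_hierarchy(bands)
```

Under random usage, each segment would be expected to contribute roughly equally to the total, so a strongly skewed profile of this kind is what signals a hierarchy.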
Other than the coding sequences, the only structured motifs known to exist in D-J gene clusters are the RSS. To determine whether the RSS might directly influence the frequency of gene segment usage during recombination, competitive RAG-mediated cleavage assays were conducted in vitro (Fig. 3). Artificial DNA substrates (∼1.4 kb) were synthesized to recapitulate part of the D-Jβ cluster, including the Dβ1 23-mer RSS, as well as two tandem 12-mer RSS (Fig. 3 a). The 12-mer RSS were taken from Jβ2.2 and Jβ2.5, the least and most frequently used segments in the Jβ2 cluster, respectively (see Fig. 2), and were arranged in various permutations. When subjected to RAG protein–mediated cleavage in vitro (Fig. 3 b), these substrates gave results consistent with those obtained ex vivo (Fig. 2), i.e., RAG-mediated cleavage at the Jβ2.5 RSS predominated over that of Jβ2.2 (Fig. 3 b). These results were further confirmed by additional experiments in which other Jβ2 RSS were included in such substrates (Fig. 3 c); the efficiency of coupled cleavage between Dβ1 and various Jβ2 RSS was found to be nearly identical to that seen ex vivo (i.e., Jβ2.2 < 2.1 < 2.7 < 2.5). Taken together, these ex vivo and in vitro data show that properties related to the RSS are critical for establishing the relative frequency of RAG-mediated cleavage, and thus gene segment usage, during recombination.
Examination of the differences between the Jβ2.2 and 2.5 RSS (Table) revealed no obvious explanation for the patterns of recombination seen, as each substitution found in the less used Jβ2.2 RSS (position 7 in the heptamer and positions 1, 4, and 9 in the nonamer) can also be found in other more frequently used RSS. Given this finding, it was of interest to test the possibility that sequence motifs within the nonconserved RSS spacer might influence the frequency of gene segment usage during recombination. Consequently, chimeric RSS were constructed, using the heptamer/nonamer from Jβ2.2 and the spacer from Jβ2.5 or the reverse combination, and these were inserted into the second 12-mer position of the cleavage substrate diagrammed in Fig. 3 a. As is shown in Fig. 3 d, cleavage by RAG proteins very clearly correlated with the heptamer/nonamer sequences, but not with the spacer. Taken together, these results show that in addition to the critical role played by certain highly conserved residues of the heptamer/nonamer (in bold in Table) for RAG recognition, the overall sequence acts in a higher order fashion to modulate the frequency at which gene segments within a given cluster are used during recombination.
What is the biological significance of hierarchical patterns of gene segment usage? Important clues are provided by the findings of others, which indicate that the relative frequencies of Jβ gene segment usage found in postselection thymocytes and peripheral T cells 26,27 are remarkably similar to those established during recombination (Fig. 2). Taken together, these findings indicate that RSS-mediated influences on recombination frequency may serve to bias the preselection repertoire in favor of useful gene products, which in turn suggests evolutionary selection for beneficial mutations in these noncoding segments. This conclusion is also consistent with the absence of rearrangements involving gene segment Jβ2.6, both in vivo (Fig. 2) and in vitro (Fig. 3 e), despite the presence of an apparently intact RSS (Table). As the coding sequence for this gene segment is sterile 23, rearrangements to this segment would be nonproductive; consequently, not only would deleterious mutations in this RSS not be selected against, they may in fact be favored by evolutionary pressure.
Successful enrichment of useful receptors during recombination would ultimately require that similar mechanisms operate at other gene clusters, e.g., in the TCR-β V regions. The size of this region (>300 kb) precludes direct quantitative analysis at present. However, analogous mechanisms have already been predicted to operate at the immunoglobulin 28,29,30 and TCR-γ/δ loci 10, suggesting that this principle may be widely applicable. Consequently, our data also provide evidence to suggest that evolutionary pressure may operate at the level of noncoding sequences to select proteins that confer a biological advantage.
The authors thank D. Sant'Angelo (Memorial Sloan-Kettering Cancer Center) for manuscript review, J. Wayne (Memorial Sloan-Kettering Cancer Center) for help in cloning probes, L. Ptaszek (Yale University, New Haven, CT) for the LP3 recombination substrate and helpful advice, C.L. Tsai and I. Villey (Yale University) for purified RAG proteins, and J. Ashwell (National Institutes of Health, Bethesda, MD) for invaluable advice.
This work was supported by Public Health Service grants AI33940 (to H.T. Petrie) and CA08748 (to Memorial Sloan-Kettering Cancer Center), funds from the DeWitt Wallace Foundation (Memorial Sloan-Kettering Cancer Center), a Presidential Award from the National Science Foundation (to D.G. Schatz), and a Bressler Intramural award (to F. Livak). D.G. Schatz is an Associate Investigator of the Howard Hughes Medical Institute.
F. Livak and D.B. Burtrum contributed equally to this work.
F. Livak's present address is the Department of Microbiology and Immunology, University of Maryland School of Medicine, Baltimore, MD 21201.