Activation-induced cytosine deaminase preferentially deaminates C in DNA on the nontranscribed strand in vitro, which theoretically should produce a large increase in mutations of C during hypermutation of immunoglobulin genes. However, a bias for C mutations has not been observed among the mutations in variable genes. Therefore, we examined mutations in the μ and γ switch regions, which can form stable secondary structures, to look for C mutations. To further simplify the pattern, mutations were studied in the absence of DNA polymerase (pol) η, which may produce substitutions of nucleotides downstream of C. DNA from lymphocytes of patients with xeroderma pigmentosum variant (XP-V) disease, whose polymerase η is defective, had the same frequency of switching to all four γ isotypes and of hypermutation in μ-γ switch sites (0.5% mutations per basepair) as DNA from control subjects. There were fewer mutations of A and T bases in the XP-V clones, similar to variable gene mutations from these patients, which confirms that polymerase η produces substitutions opposite A and T. Most importantly, the absence of polymerase η revealed an increase in C mutations on the nontranscribed strand. These data show for the first time that C is preferentially mutated in vivo and that pol η generates hypermutation in the μ and γ switch regions.
Immunoglobulin diversity is achieved in mammals at three molecular levels: joining of variable (V), diversity, and joining gene segments; hypermutation of rearranged V genes; and switching of heavy chain constant (C) genes. The latter two processes depend on the activation-induced cytosine deaminase (AID) protein. Mice and humans without AID have neither V gene mutations nor C gene switching (1, 2). This implies that the two processes, which involve different mechanisms that generate point mutations and double strand breaks, share common enzymes. Biochemical and genetic experiments indicate that AID functions to deaminate cytosine in DNA to uracil (3–10). Using gene-deficient mice and humans, several other proteins have been shown to alter the pattern of mutation and switching. First, uracil DNA glycosylase is required to remove uracil lesions in DNA. Mice and humans deficient in the enzyme have altered V gene mutations and deficient heavy chain switching (5, 11). Second, the mismatch repair proteins Msh2, Msh6, Pms2, and Mlh1 participate in an unknown way to change the pattern of V gene mutations and C gene switching in gene-deficient mice (12–18). Third, DNA polymerase (pol) η plays a role in generating mutations in V genes at A and T nucleotides. Humans with xeroderma pigmentosum variant (XP-V) disease, whose pol η is defective, have fewer A:T substitutions (19).
Mutations have also been detected in introns containing the recombination sites of switched C genes (20). Mutations are even found in the μ switch region before switching in murine B cells stimulated in culture, and they are dependent on AID expression (21, 22). Although mutations in V genes have the potential to change the coding sequence to produce high affinity antibodies, mutations in the switch regions likely reflect the footprints of AID deamination that precede DNA strand breaks. Actual recombination requires additional factors because defects in the carboxy-terminal region of the AID protein do not affect hypermutation but eliminate switching (23, 24).
In vitro, AID preferentially deaminates C on single stranded DNA or on the nontranscribed strand during transcription (6–10, 25). This leads to the conundrum that there should be a large increase in mutations of C compared with those of G, A, and T, as recorded from the nontranscribed strand. However, in vivo, mutations of all four nucleotides in V genes are approximately equal in frequency (26), suggesting that other proteins, e.g., mismatch repair proteins and pols, generate mutations downstream of deaminated cytosines. In this study, we looked for a C bias in mutations from switch regions, which can form stable secondary structures that expose single strands. To further simplify the pattern, we removed pol η and analyzed mutations in the μ-γ switch joins from three XP-V patients. We show that XP-V patients have normal switching, but as in V genes, there were fewer mutations of A and T nucleotides. Furthermore, the absence of pol η revealed preferential targeting of C bases for mutation.
Materials And Methods
DNA Preparation from Peripheral Blood Lymphocytes.
Libraries of μ-γ Switch Regions.
Hybrid switch sites containing μ and γ switch (S) regions were amplified by PCR using primers flanking the repetitive core regions of Sμ and Sγ. The Sγ primers were specific for a conserved region downstream of all Sγ sequences, and thus would detect γ3, γ1, γ2, and γ4 switches (28). The following sets of nested primers were used: first set, Sμ forward (nucleotides 91–110 from GenBank/EMBL/DDBJ under accession no. X54713), 5′CAAGCAGGTCTGGTGGGCTG; Sγ reverse (nucleotides 2911–2933 from GenBank/EMBL/DDBJ under accession no. U39737), 5′CTTGCCAACTGCTCAGTGGGATG; second set, Sμ forward (nucleotides 118–141 from GenBank/EMBL/DDBJ under accession no. X54713) with EcoRI addition in italics, 5′GCCGGAATTCCTGGCCATGACAACTCCATCCAGC; Sγ reverse (nucleotides 2861–2882 of U39737) with BamHI addition in italics, 5′GCGGGATCCGGCTGCACTGCACTTTCACCAG. 15 ng genomic DNA was amplified with Pfu polymerase in 50 μl volume using the first set of primers for 30 cycles of 95°C for 45 s, 55°C for 1 min, and 72°C for 2 min, followed by a final incubation at 72°C for 10 min. 5 μl of this reaction was then reamplified in 50 μl with the second set of primers for an additional 30 cycles. The PCR products were cloned into EcoRI and BamHI-digested pBluescript and plasmids containing unique inserts were sequenced.
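The cloning additions on the nested primers can be checked with a short script. The sketch below is illustrative only and was not part of the original analysis; it uses the second-round primer sequences as printed above and the standard EcoRI (GAATTC) and BamHI (GGATCC) recognition sequences. The function and variable names are hypothetical.

```python
# Recognition sequences for the two enzymes named in the text.
SITES = {"EcoRI": "GAATTC", "BamHI": "GGATCC"}

# Second-round (nested) primers exactly as printed in the methods.
NESTED_PRIMERS = {
    "Smu_forward": "GCCGGAATTCCTGGCCATGACAACTCCATCCAGC",
    "Sgamma_reverse": "GCGGGATCCGGCTGCACTGCACTTTCACCAG",
}


def find_sites(primer):
    """Return {enzyme: 0-based offset} for each recognition site found in the primer."""
    hits = {}
    for enzyme, site in SITES.items():
        offset = primer.find(site)
        if offset != -1:
            hits[enzyme] = offset
    return hits


def annealing_part(primer, site):
    """Return the primer sequence 3' of the recognition site (the template-matching part)."""
    offset = primer.find(site)
    if offset == -1:
        raise ValueError("recognition site not present in primer")
    return primer[offset + len(site):]


for name, seq in NESTED_PRIMERS.items():
    print(name, find_sites(seq))
```

Under these assumptions, the template-matching portions are 24 and 22 nucleotides long, which agrees with the coordinate ranges 118–141 and 2861–2882 quoted for the two primers.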
Determination of PCR Error.
To assess PCR error, unrearranged Sγ1 regions from all six subjects were sequenced from the same DNA using the same procedures as described above. The following sets of primers were used to generate a 1.37-kb product: first set, Sγ forward (nucleotides 1430–1449 from GenBank/EMBL/DDBJ under accession no. U39737), 5′AAGCAGAAAGATCAGGGGTC; Sγ reverse as above; second set, Sγ forward (nucleotides 1505–1523 from GenBank/EMBL/DDBJ under accession no. U39737) with EcoRI addition in italics, 5′CGGAATTCCTCAGCCTCAGGGAGCCAGG; Sγ reverse with BamHI as above. 20 ng genomic DNA was amplified with Platinum Pfx polymerase and PCRx enhancer (Invitrogen) in 50 μl volume using the first set of primers for 30 cycles of 95°C for 30 s, 55°C for 1 min, and 68°C for 2 min, followed by a final incubation at 68°C for 10 min. Nested PCR was performed with 5 μl of the first reaction and the second set of primers with an annealing and extension step of 68°C for 2 min for another 30 cycles. Products were digested, cloned, and sequenced.
Online Supplemental Material.
Figs. S1–S6 present the data from three XP-V and three control groups of clones. They contain a summary of clone length, isotypes, and microhomology, sequences of the μ and γ switch regions, and examples of microhomology at the μ-γ junctions. Figs. S1–S6 are available.
XP-V Patients Have Normal Class Switch Recombination.
To analyze Cμ to Cγ switching and identify mutations in the switch regions from DNA pol η–deficient humans, peripheral blood was obtained from three XP-V patients. The DNA repair defects in these patients have been described, and the mutations in their POLH genes were expected to inactivate pol η (19). Recombined switch sites associated with Sμ and Sγ regions were sequenced from cloned PCR products derived from the DNA of the XP-V patients and three control individuals. To avoid nonspecific priming in the repetitive sequences, DNA was amplified with nested primers. Inserts ranged in size from 100 to 600 bp. The γ isotypes and mutations in the switching sites were identified by comparing the recombined Sμ-Sγ regions to the germline Sμ sequence and to all four Sγ sequences (28). As shown in Table I, usage of the γ isotypes for XP-V and control clones was similar, with over half of the clones containing μ to γ1 switches. Breakpoints in the μ and γ regions occurred randomly, and there was no difference between the two groups in the lengths of microhomology at the joining sites. Approximately half of the clones had insertions or no microhomology at the μ-γ joins, and half had homology of one to five nucleotides. Summaries of all the clones, sequences, and examples of microhomology are included in Figs. S1–S6.
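The text does not state how microhomology at the joins was scored. One common definition, sketched below as an assumption rather than as the method used here, counts the bases flanking the junction that are identical in the two germline donors, so that they could have come from either Sμ or Sγ. The sequences and breakpoints in the example are hypothetical.

```python
def microhomology_length(mu_germline, mu_break, gamma_germline, gamma_break):
    """Count junction bases that match both germline donors.

    The clone is assumed to equal mu_germline[:mu_break] + gamma_germline[gamma_break:].
    Bases are counted leftward and rightward from the breakpoints for as long as
    the two germline sequences agree.
    """
    left = 0
    while (left < mu_break and left < gamma_break
           and mu_germline[mu_break - 1 - left] == gamma_germline[gamma_break - 1 - left]):
        left += 1
    right = 0
    while (mu_break + right < len(mu_germline)
           and gamma_break + right < len(gamma_germline)
           and mu_germline[mu_break + right] == gamma_germline[gamma_break + right]):
        right += 1
    return left + right


# Hypothetical donors sharing the trinucleotide CCG immediately 5' of the breakpoints.
mu = "AAACCGTTT"
gamma = "GGGCCGAAA"
print(microhomology_length(mu, 6, gamma, 6))
```

A join with an insertion of nontemplated nucleotides would need separate handling, because the clone then no longer equals a direct concatenation of the two donors.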
Mutations Are Located throughout μ and γ Switch Regions.
The sequenced Sμ region was nonpolymorphic among the six subjects, so that mutations were readily identified. The Sγ regions were polymorphic, and germline Sγ1 sequences were identified for each human. To avoid errors in assigning mutations for the other isotypes, only unique mutations were counted. Over two thirds of the clones from both study groups had mutations. As seen in Table II, the overall frequency of mutation was similar between XP-V and control clones, with an average of 0.5% mutations per bp. This was significantly higher than the mutation frequency in unrearranged Sγ1 regions (P < 10⁻⁴, Fisher's exact test), which is reported to be similar to the frequency of PCR error (29). As in V genes, the majority of mutations were base substitutions, although the frequency of deletions was much higher in the switch region (∼13% of mutations) compared with that in introns flanking rearranged V genes (1%). Approximately 2% of the mutations within the switch sequences were insertions and 16% of the clones had insertions at the junction site. The distance and frequency of mutations from the recombination sites of Sμ and Sγ are shown in Fig. 1. Mutations were scattered throughout the switch regions and occurred as far as 200 bp from the joining site. Preferential clustering of mutations around the break points was not observed.
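The comparison between switch-region clones and unrearranged Sγ1 sequences can be illustrated with a small implementation of the two-sided Fisher's exact test. This is a sketch under stated assumptions: the counts below are hypothetical placeholders chosen only to give a frequency near 0.5% mutations per bp, not the values from Table II, and the function names are ours.

```python
from math import comb


def mutation_frequency_percent(mutations, bases_sequenced):
    """Mutations per base pair, expressed as a percentage."""
    return 100.0 * mutations / bases_sequenced


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell with fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one
    # (the small tolerance guards against floating-point ties).
    return sum(p for p in (prob(x) for x in range(lo, hi + 1))
               if p <= p_obs * (1 + 1e-9))


# HYPOTHETICAL counts: (mutated, unmutated) base pairs in each data set.
switch_mut, switch_bp = 50, 10_000          # recombined switch clones
germline_mut, germline_bp = 3, 10_000       # unrearranged S-gamma-1 control

print(mutation_frequency_percent(switch_mut, switch_bp))
print(fisher_exact_two_sided(switch_mut, switch_bp - switch_mut,
                             germline_mut, germline_bp - germline_mut))
```

Treating each sequenced base pair as an independent trial is itself a simplification, since mutations in related clones are not independent events.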
Mutations May Occur Before, During, and After Switching.
Related clones were identified in all the libraries and likely derive from clonal progeny that have undergone sequential mutations. Approximately one third of the clones were related as defined by the same length of μ and/or γ sequences and the same site of joining. Based on the pattern of shared and unique mutations in related clones, it is possible to estimate when the mutations occurred. As shown in Fig. 2 A, mutations could happen before switching in the unrecombined μ region, as demonstrated by two clones with identical substitutions that were joined to different Sγ sequences. Mutations in the Sμ region before switching have been well documented in the literature (21, 22, 29), indicating that they precede recombination. In Fig. 2 B, mutations could occur during switching as shown by insertions of nontemplated nucleotides at the site of joining. In Fig. 2 C, mutations probably occurred after switching in three sets of clones that had identical μ and γ joins and shared mutations, as well as unique mutations.
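The reasoning about shared and unique mutations in related clones can be sketched as a simple set operation. The example below is illustrative only: the clone names, positions, and substitutions are hypothetical, and each mutation is represented as a (position, germline base, mutant base) tuple.

```python
from collections import Counter


def split_shared_unique(clone_mutations):
    """Partition mutations in a set of related clones.

    clone_mutations maps a clone name to a set of (position, ref, alt) tuples.
    Returns (shared, unique): mutations present in every clone, and a dict of
    mutations found in exactly one clone.
    """
    counts = Counter(m for muts in clone_mutations.values() for m in muts)
    n_clones = len(clone_mutations)
    shared = {m for m, k in counts.items() if k == n_clones}
    unique = {name: {m for m in muts if counts[m] == 1}
              for name, muts in clone_mutations.items()}
    return shared, unique


# HYPOTHETICAL related clones with the same mu-gamma join.
related = {
    "clone_a": {(39, "C", "T"), (102, "G", "A")},
    "clone_b": {(39, "C", "T"), (140, "C", "G")},
}
shared, unique = split_shared_unique(related)
print(shared)
print(unique)
```

In clones with an identical join, shared mutations are consistent with events in a common ancestor, and unique mutations are consistent with events that followed divergence of the progeny.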
Pol η Is an A-T Mutator in the μ-γ Switch Regions.
The types of base substitutions were examined to see if there was a difference between XP-V and control clones. All mutations were recorded from the nontranscribed strand. There was a decrease in mutations of A and T, and a corresponding increase in mutations of G and C in the XP-V clones (Fig. 3 A). All three XP-V patients had clones with decreased A:T mutations (Fig. 3 B), indicating that pol η is involved in generating mutations in the switch regions. Comparing mutations of A or T versus mutations of G or C, there is a highly significant difference between the XP-V and control clones (P = 0.0035, using a Monte Carlo analysis; reference 30). In addition to transitions of G and C, there was a strong bias for G to C and C to G transversions in both groups of clones. For example, of the mutations of G and C, 58% were G:C to A:T transitions, 8% were G:C to T:A transversions, and 34% were G:C to C:G transversions. There was no difference in the types of mutations located within 15 bp of the switch junction and those located farther away.
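The categories used above can be made explicit with a small classifier. The sketch below is illustrative and uses hypothetical input: it labels a substitution read from the nontranscribed strand as a transition or transversion, and collapses complementary changes (for example C to T and G to A) into the base-pair notation used in the text.

```python
from collections import Counter

PURINES = {"A", "G"}
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def substitution_class(ref, alt):
    """Return 'transition' (purine<->purine or pyrimidine<->pyrimidine) or 'transversion'."""
    if ref == alt:
        raise ValueError("not a substitution")
    return "transition" if (ref in PURINES) == (alt in PURINES) else "transversion"


def base_pair_change(ref, alt):
    """Express a substitution as a base-pair change, e.g. C>T and G>A both give 'G:C to A:T'."""
    if ref in ("C", "T"):
        # Report pyrimidine changes from the complementary strand so that
        # each base-pair change has a single label.
        ref, alt = COMPLEMENT[ref], COMPLEMENT[alt]
    return f"{ref}:{COMPLEMENT[ref]} to {alt}:{COMPLEMENT[alt]}"


def spectrum_percent(substitutions):
    """Percentage of each base-pair change in a list of (ref, alt) pairs."""
    counts = Counter(base_pair_change(r, a) for r, a in substitutions)
    total = sum(counts.values())
    return {change: 100.0 * n / total for change, n in counts.items()}


# HYPOTHETICAL substitutions recorded from the nontranscribed strand.
example = [("C", "T"), ("G", "A"), ("C", "G"), ("G", "T")]
print(spectrum_percent(example))
```

Collapsing to base pairs discards strand information, so a strand bias such as preferential mutation of C over G has to be tallied from the uncollapsed (ref, alt) pairs.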
C on the Nontranscribed Strand Is Targeted for Mutation in XP-V Clones.
Because the Sμ sequence was nonpolymorphic, we analyzed mutations there in more detail. Some 130 nucleotides of the Sμ region located upstream of the repetitive core sequences are shown in Fig. 4. The location of mutations reveals a hotspot at C in position 39 (position 180 in GenBank/EMBL/DDBJ under accession no. X54713) in both XP-V and control clones. The position is in a WRC motif (31), although it is unclear why this particular position is favored over the other WRC sequences in Fig. 4. When all the substitutions in the 350-nucleotide Sμ region were tabulated (Fig. 5 A), it became apparent that C was preferentially mutated compared with G in the XP-V clones (Fig. 5 B). Some 70% of the 52 mutations are at C nucleotides in XP-V clones compared with 47% of 50 mutations in the control clones (P = 0.05). Targeting of C bases is also observed when the Sγ substitutions are included (Fig. 3 A). 58 and 45% are at C in XP-V and control clones, respectively, compared with 29 and 28% at G, respectively. Half of the C mutations in XP-V clones were C to T transitions and half were C to G transversions.
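Locating WRC motifs and tallying the share of mutations at C are both simple to express in code. The sketch below is illustrative only; it assumes the standard IUPAC meanings W = A or T and R = A or G, and the sequence and mutation list are hypothetical rather than taken from Fig. 4 or Fig. 5.

```python
def wrc_cytosines(seq):
    """Return 0-based positions of the C in every WRC motif (W = A/T, R = A/G)."""
    seq = seq.upper()
    return [i + 2 for i in range(len(seq) - 2)
            if seq[i] in "AT" and seq[i + 1] in "AG" and seq[i + 2] == "C"]


def percent_at_base(mutated_bases, base):
    """Percentage of mutations whose germline base (nontranscribed strand) is `base`."""
    if not mutated_bases:
        raise ValueError("no mutations supplied")
    return 100.0 * sum(1 for b in mutated_bases if b == base) / len(mutated_bases)


# HYPOTHETICAL sequence and list of germline bases at mutated positions.
print(wrc_cytosines("AGCTTACG"))
print(percent_at_base(["C", "C", "G", "C", "A"], "C"))
```

Because positions are 0-based here, a motif position reported by this function must be shifted by one before it is compared with numbering such as position 39 in the text.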
Switch Regions Are Hotspots for Hypermutation.
The importance of the switch region as a hypermutation target has only recently been recognized. Mutations have been reported in Sμ regions from mice and humans (21, 22, 32), and as we describe here, extensive mutations are found in the Sγ regions as well. The overall frequency of mutation in these human clones is 0.5% mutations per bp, which is identical to the frequency of mutation seen in intron regions adjacent to rearranged VH genes from humans (unpublished data). Therefore, the question of how AID is targeted must take into account the sequences of two very different regions of DNA. Targeting to the V region might be initiated by association with the transcription complex near the V gene promoter (33). Similarly, targeting to the distal switch sites may occur by association with the transcription complexes that form upstream of unrearranged switch regions (34). The location of mutations in the 130-nucleotide Sμ sequence from both XP-V and control clones (Fig. 4) indicates a hotspot at 39C, which is also observed in another report (32) and may represent a major entry point for AID. This hotspot, located at the 5′ beginning of the Sμ pentamer core motifs, is not in a palindromic stem-loop structure, but might be favored in the formation of other secondary structures.
Successive mutations in clonal progeny have been documented in V gene clones and indicate that mutations are generated over several rounds of division of B cells in germinal centers. Here we show that mutations also occur sequentially in the switch regions. As demonstrated in Fig. 2, related clones have the same site of joining in Sμ and Sγ, and contain both shared and unique mutations. It appears that mutations can occur before, during, and after joining in Sμ and Sγ. Mutations before switching have been reported by others (21, 22, 29) and suggest that switch regions undergo multiple deaminations, and repair of the lesions produces mutations and occasional deletions caused by strand breaks. Internal deletions in switch regions have been previously noted (35–37). Among the mutations in this study, 13% were internal deletions in Sμ and Sγ. Presumably breaks of this type could also trigger recombination between different switch regions. Mutations during switching (20, 38) are clearly identified at the junctions between Sμ and Sγ, where insertions are commonly found. Even after recombination, the hybrid Sμ-Sγ sequences continue to sustain more mutations, more breaks, and likely sequential switching (28, 39) to downstream C genes. Successive mutations would explain why the mutational pattern is spread out over 200 bp from the junction site (Fig. 1; reference 32) in these human clones. In contrast to short-term cultures of murine lymphocytes where the mutational pattern is more clustered around the junction site (40), human peripheral B cells are long-lived and could accumulate mutations long after switching.
Absence of DNA Pol η Reveals Increased C Mutations.
DNA pol η is a low fidelity polymerase that preferentially generates mutations opposite A and T nucleotides in vitro (41, 42). Humans with XP-V disease are deficient in pol η, and their V genes have fewer mutations of A and T, which is consistent with pol η being an A-T mutator in hypermutation (19). To determine if pol η is also involved in generating mutations in switch regions, we examined the spectra of mutations in clones from three XP-V patients and three control subjects. Both groups had the same frequency of mutation and similar γ isotypes, but the XP-V clones had fewer mutations of A and T bases, suggesting that pol η synthesizes mutations in both the V and switch regions. The mutation frequency is not decreased in the XP-V clones, perhaps because AID repeatedly deaminates C bases and produces more C:G substitutions in the absence of pol η.
Biochemical evidence indicates that AID preferentially deaminates C on the nontranscribed strand, which would be single stranded during transcription (7, 8, 10). However, the mutational spectra in V gene introns show an equal frequency of C and G mutations on the nontranscribed strand (5, 43). This suggests that the separation of DNA in V genes during transcription might be too transient to favor deamination on one strand versus the other, or that pols insert mutations downstream of the initial C lesion. In contrast, DNA in switch regions is rich in G and C nucleotides and has been suggested to form stable secondary structures including R loops (44, 45), G-quartets (46), and stem loops (47). These structures might be induced by transcription (48) and expose C on the nontranscribed strand for deamination for a length of time. In support of this model, we observed an increase in mutations of C on the nontranscribed strand in human Sμ and Sγ regions. A slight increase was seen in the mutations from control clones, and this effect was dramatically augmented in XP-V clones where some 70% of mutations were at C bases in Sμ (Fig. 5).
The three major categories of hypermutation in the switch regions from control subjects are substitutions of A and T, transversions of C:G to G:C, and transitions of C:G to T:A. We propose a model to explain how these mutations are generated (Fig. 6). Substitutions of A and T could occur during gap-filling repair of the abasic site on the nontranscribed strand by pol η, which synthesizes DNA downstream of the initial C lesion to generate mutations opposite all four nucleotides. Pol η could synthesize a gap produced by exonuclease 1 (49) and/or perform strand displacement (43). In the absence of pol η, mutations would occur primarily at sites of deaminated cytosines, which is why C is targeted in the XP-V clones. In the second category, transversions of C:G to G:C could occur during replication past the abasic site by a translesion polymerase. Half of the mutations of C in XP-V clones were C to G transversions, and a high frequency of transversions was also found in the control clones (Fig. 5; reference 32). C to G transversions could be specifically generated by Rev1, which is a deoxycytidyl transferase that inserts C opposite an abasic site (50, 51). The chicken DT40 B cell line makes a preponderance of C:G to G:C transversions in immunoglobulin V genes (52), and deletion of Rev1 eliminates these mutations (53), indicating that this polymerase is involved in chicken hypermutation. Our data support the notion that Rev1 participates in human switch mutations as well. Rev1 associates with several polymerases, including pol η, pol ι, and pol ζ (50, 54), and might be recruited to the mutation site in a multiprotein complex. The third category of mutations is C:G to T:A transitions. Transitions could arise during error-prone repair by preferential insertion of T opposite G by pol η, or during replication past uracil by any polymerase.
In addition to pol η, the mismatch repair proteins Msh2 and Msh6 participate in generating mutations of A and T bases in V genes. It remains to be determined how the low fidelity pols and mismatch repair proteins are recruited to deaminated cytosines in the immunoglobulin locus, and how pol η and Msh2/Msh6 produce A:T mutations.
We are grateful to Robert Tarone and Igor Rogozin for statistical analyses. We also thank Ed Max for PCR strategies, Roger Woodgate and John Tainer for comments, and Vilhelm Bohr for support.
The online version of this article contains supplemental material.
Abbreviations used in this paper: AID, activation-induced cytosine deaminase; pol, DNA polymerase; XP-V, xeroderma pigmentosum variant.