Expression of killer cell Ig-like receptors (KIRs) diversifies human natural killer cell populations and T cell subpopulations. Whereas the major histocompatibility complex class I binding functions of inhibitory KIR are known, specificities for the activating receptors have resisted analysis. To understand better activating KIR and their relationship to inhibitory KIR, we took the approach of reconstructing their natural history and that of Ly49, the analogous system in rodents. A general principle is that inhibitory receptors are ancestral, the activating receptors having evolved from them by mutation. This evolutionary process of functional switch occurs independently in different species to yield activating KIR and Ly49 genes with similar signaling domains. Selecting such convergent evolution were the signaling adaptors, which are older and more conserved than any KIR or Ly49. After functional shift, further activating receptors form through recombination and gene duplication. Activating receptors are short lived and evolved recurrently, showing they are subject to conflicting selections, consistent with activating KIR's association with resistance to infection, reproductive success, and susceptibility to autoimmunity. Our analysis suggests a two-stage model in which activating KIR or Ly49 are initially subject to positive selection that rapidly increases their frequency, followed by negative selection that decreases their frequency and leads eventually to loss.
NK cells are effector lymphocytes of innate immunity that respond to infection (1, 2), malignancy (3), and allogeneic hematopoietic transplantation (4); they also facilitate placentation in reproduction (5). NK cell responses are determined by batteries of activating and inhibitory receptors (6). Ligands for several NK cell receptors are MHC class I and structurally related molecules. The NK cell receptors that recognize polymorphic MHC class I molecules are themselves encoded by diverse, polymorphic, and rapidly evolving gene families that contribute to the diversity and repertoire of NK cell populations and T cell subpopulations (7, 8). Further emphasizing the evolutionary plasticity and versatility of these NK cell receptors, the analogous functions are performed by structurally unrelated glycoproteins in different species, as exemplified by the killer cell Ig-like receptors (KIR) of primates and the Ly49 receptors of rodents (9).
In contrast to MHC polymorphism, KIR polymorphism can affect a receptor's signaling function as well as its binding to ligands. Activating function is effected by a positively charged residue in the transmembrane region, whereas inhibitory function is conferred by inhibitory tyrosine-containing immunomotifs (ITIM) in the cytoplasmic tail. Of the 14 human KIR, seven are inhibitory, six are activating, and one has dual function. The balance between activating and inhibitory receptors at the NK cell surface is reflected in the population genetics: KIR haplotypes divide into two functionally distinct groups according to their complexity and the content of genes encoding activating KIR (10). Group A haplotypes have only one activating KIR gene (KIR2DS4), and it is frequently disabled (11, 12). The more complicated group B haplotypes can have up to five of the six genes encoding activating receptors (KIR2DS1–5 and KIR3DS1) as well as additional genes encoding inhibitory receptors (KIR2DL5A and KIR2DL5B). Consequently, human genotypes vary widely in their content of activating KIR, as do the frequencies in human populations (13). These distributions point to distinctive and balancing selection for haplotypes that are rich or poor in activating KIR.
Certain pairs of KIR have similar extracellular domains but differ in their signaling function. Inhibitory receptors specific for HLA-C, KIR2DL1 and KIR2DL2/3, pair with the activating receptors KIR2DS1 and KIR2DS2, respectively. Likewise, the inhibitory receptor for HLA-B, KIR3DL1, is paired with the activating receptor KIR3DS1. These relationships suggest how KIR signaling function might be switched, from activating to inhibitory, or vice versa, in the course of evolution (14). The ligand-binding specificities of the inhibitory receptors are well characterized, but similar studies of the activating receptors have met with limited success (15–18), although clinical correlations point to their interaction with HLA class I. For HIV infection, the combination of KIR3DS1 and certain HLA-B allotypes was correlated with slower progression to AIDS (19), and certain combinations of HLA-C with KIR2DS1 and KIR2DS2 correlate with autoimmune conditions (20–22). For infectious disease, the plausibility of such mechanisms is shown by the demonstration that the activating Ly49H variant provides specific resistance to cytomegalovirus infection in a mouse model (23).
Because of the evolutionary plasticity of MHC class I–specific NK cell receptors and the unprecedented species-specific differences they exhibit, it becomes important to understand the general principles by which these receptors—particularly the activating receptors—evolve. The overall similarity of activating and inhibitory KIR demonstrates their common origin but leaves unanswered questions as to which came first—activating or inhibitory KIR—and how the signaling function is switched. Using phylogenetic analysis and evolutionary reconstruction, we have examined the natural history of KIR and can now answer these questions with confidence. Similar analysis of rodent Ly49 demonstrated that these evolutionary processes are remarkably parallel.
Hominoid-specific evolution of activating KIR
Signaling functions are determined by basic residues of the transmembrane region (TM) that bind to activating adaptor molecules and by ITIMs in the cytoplasmic tail (CYT). In this paper, we define the combination of TM and CYT as the KIR signaling domain. Phylogenetic comparison of the sequences encoding the signaling domains of mammalian KIR shows that all primate KIR signaling domains form a monophyletic group (Fig. 1 A) and thus derive from one unique common ancestor. Within the primate KIR, the signaling domains of all the activating, short-tailed KIR (STK) of the hominoids (i.e., KIR2DS and KIR3DS) form a monophyletic group (Fig. 1 B). In contrast, the signaling domains of the long-tailed KIR (LTK; i.e., KIR2DL and KIR3DL) segregate into several groups, corresponding chiefly to the lineages defined from cDNA sequences (24). In the same phylogenetic tree, the STK all group with the LTK of lineage III, which includes the HLA-C–specific KIR (Fig. 1 B). In summary, the signaling domains of all modern hominoid KIR2DS and KIR3DS derive from a unique hominoid ancestor that was a lineage III KIR.
Activating KIR evolved from inhibitory KIR
Trees constructed for the primate signaling domain comprise two deep clades, one containing KIR2DL4 and related KIR (lineage I-A) and the other that divides into related groups of KIR corresponding to the lineage I-B, II, III, IV, and rhesus monkey KIR2DL5 and KIR1D (Fig. 1 B). Whereas all these subclades contain inhibitory LTK, only lineage III also contains activating STK. That the STK are deeply nested within the lineage III KIR indicates that the ancestral lineage III KIR was a LTK rather than a STK. The ancestral STK would then have evolved from a lineage III LTK with accompanying functional switch. To assess the validity of this model, we reconstructed the sequences for the signaling domain of the last common ancestor of the STK, the lineage III LTK, and of all lineage III KIR (Fig. 2). Because the relative positions of three sequences (KIR3DL3 and Popy-KIR2DLA/B) within the lineage III were uncertain, two alternative trees were used to reconstruct the ancestral sequences. For each alternative tree, the results of two independent methods of analysis indicated that the last common ancestor of the lineage III KIR was an inhibitory LTK (Fig. 2).
These reconstructions of ancestral sequence were also used to predict the amino acid substitutions that formed the ancestral STK. Two amino acids were identically predicted by the two analytical methods (Fig. 2). These substitutions are precisely those that switch the receptor's function from inhibition to activation. Activating function was gained by introduction of a positive charge (K) into the TM, through which STK interact with the DAP12 adaptor (25, 26). Inhibitory function was lost by introduction of a stop codon within the membrane proximal ITIM motif of the CYT. Because both changes occurred in the same evolutionary branch, it is not possible to determine which occurred first. That the evolution of the ancestral STK involved only substitutions that directly affected signaling function is indicative of selection.
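The functional rule used throughout this analysis (a positively charged residue in the TM confers activating potential, and an ITIM in the CYT confers inhibitory potential) can be illustrated with a minimal sketch. It is not the procedure used in this study. The ITIM consensus pattern, the choice of K and R as the basic residues, and the function name are assumptions made for illustration only.

```python
import re

# Assumption: a commonly cited ITIM consensus, (I/V/L/S)-x-Y-x-x-(L/V).
# The text does not state the exact motif definition that was applied.
ITIM_PATTERN = re.compile(r"[IVLS].Y..[LV]")

# Assumption: lysine (K) and arginine (R) are treated as the basic TM residues.
BASIC_RESIDUES = frozenset("KR")


def classify_signaling_domain(tm: str, cyt: str) -> str:
    """Label a signaling domain from its TM and CYT amino acid sequences."""
    has_basic_tm = any(res in BASIC_RESIDUES for res in tm.upper())
    has_itim = ITIM_PATTERN.search(cyt.upper()) is not None
    if has_basic_tm and has_itim:
        return "dual"        # both motifs present
    if has_basic_tm:
        return "activating"  # charged TM residue, no ITIM
    if has_itim:
        return "inhibitory"  # ITIM, no charged TM residue
    return "unknown"
```

With hypothetical toy sequences, a TM lacking K or R paired with a CYT containing `VTYAQL` is labeled inhibitory, whereas introducing a K into the TM and truncating the CYT before the ITIM yields the activating label, mirroring the two substitutions inferred for the ancestral STK.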
To assess the impact of natural selection on KIR signaling domains, we estimated their mean number of synonymous substitutions per synonymous nucleotide site (dS) and number of nonsynonymous substitutions per nonsynonymous nucleotide site (dN) (Table I). Although seven of the nine groups analyzed show an excess of synonymous substitutions (dN/dS < 1), for six groups it did not reach statistical significance. Such evidence for weak purifying selection can be attributed to the small size of the signaling domain (63 aa) and the relatively few residues in the TM and CYT domains that are functionally critical. Many other positions, however, are likely to tolerate a range of substitutions, for example hydrophobic substitutions in the TM. The KIR group showing statistically significant purifying selection (P < 0.05) was the STK. Because the analysis was performed on present-day STK sequences, it did not include the functionally important substitutions that produced the functional shift; rather, it involved only substitutions that occurred after the shift. The result shows that, in the subsequent evolution and expansion of the STK family, purifying selection has preserved the structure of the signaling domain and its activating function.
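To make the dN/dS comparison concrete, the following minimal sketch computes both rates from counts of differences and sites. It assumes a Jukes-Cantor correction of the observed proportions, as in the Nei-Gojobori approach; the text does not state which estimator was used, and the counts in the usage note are hypothetical.

```python
import math


def jukes_cantor(p: float) -> float:
    """Correct an observed proportion of differences for multiple hits."""
    if not 0.0 <= p < 0.75:
        raise ValueError("proportion must lie in [0, 0.75)")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def dn_ds(syn_diffs: float, syn_sites: float,
          nonsyn_diffs: float, nonsyn_sites: float):
    """Return (dN, dS, dN/dS) from difference and site counts."""
    d_s = jukes_cantor(syn_diffs / syn_sites)
    d_n = jukes_cantor(nonsyn_diffs / nonsyn_sites)
    # The ratio is undefined when no synonymous change is observed.
    ratio = d_n / d_s if d_s > 0 else math.nan
    return d_n, d_s, ratio
```

For a hypothetical 63-codon comparison with 6 synonymous differences over 45 synonymous sites and 8 nonsynonymous differences over 144 nonsynonymous sites, `dn_ds(6, 45, 8, 144)` gives a ratio below 1, the direction expected under purifying selection. Whether such a ratio differs significantly from 1 requires a separate statistical test.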
To date, STK have been characterized only in hominoids. To assess the likelihood that other primate species possess STK, we estimated a time range for occurrence of the functional switch that gave rise to the STK ancestor (Table II). This analysis was performed for the two models shown in Fig. 2. For both models, the lower limit of the range, which corresponds to the last common ancestor of all STK, was estimated at ∼13.5 million years ago (MYA). The upper limit of the range, which corresponds to the last common ancestor of all lineage III KIR, was estimated at ∼18 MYA. The range of 13.5 to 18 MYA indicates that STK may exist in hylobatids (gibbons and siamangs), because their divergence from other hominoids is estimated to have occurred 16.5–18.5 MYA (27). On the other hand, STK are unlikely to be found in Old World monkeys. Although estimates of the Old World monkey–hominoid divergence time vary, most evidence points toward a minimum of 23 MYA (28), corresponding to the upper limit of the 95% confidence interval for the upper range of the STK emergence (Table II).
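The reasoning about which lineages could carry STK reduces to comparing two time ranges: a lineage can have inherited the STK ancestor only if the functional switch could have occurred before that lineage split off. A minimal sketch of this comparison follows, using the point estimates quoted above. The type and function names are hypothetical, and the sketch ignores the confidence intervals reported in Table II.

```python
from typing import NamedTuple


class TimeRange(NamedTuple):
    """A time range in million years ago (MYA)."""
    youngest: float
    oldest: float


def switch_may_predate_split(switch: TimeRange, split_youngest: float) -> bool:
    """True if the oldest possible switch date is at least as old as the
    youngest possible date of the lineage split."""
    return switch.oldest >= split_youngest


# Values taken from the text.
STK_SWITCH = TimeRange(youngest=13.5, oldest=18.0)
HYLOBATID_SPLIT_YOUNGEST = 16.5        # hylobatid divergence, 16.5-18.5 MYA
OLD_WORLD_MONKEY_SPLIT_YOUNGEST = 23.0  # minimum divergence estimate
```

Here `switch_may_predate_split(STK_SWITCH, HYLOBATID_SPLIT_YOUNGEST)` is true and the same call with `OLD_WORLD_MONKEY_SPLIT_YOUNGEST` is false, matching the conclusions drawn for hylobatids and Old World monkeys. Because 23 MYA coincides with the upper 95% confidence limit of the emergence estimate, the second result is best read as "unlikely" rather than "excluded".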
Expansion of the STK occurred mainly by recombination
After the formation of the first STK gene in a hominoid ancestor, a family of hominoid STK genes was formed by expansion, as exemplified by the presence of several STK genes in modern humans, orangutans, and common chimpanzees. Although duplication of a complete gene is a necessary mechanism for the expansion of gene families, we previously showed that recombination between genes has been the main mechanism generating new KIR (24). To investigate the mechanisms that diversified and expanded the STK, we performed a domain-by-domain phylogenetic analysis. Trees constructed from the signaling domain of the lineage III KIR revealed four subclades within the STK clade. These subclades divide along species-specific lines: human, gorilla, orangutan, and the two chimpanzees (Fig. 3 A). Within each species or pair of species, the modern STK derive from a single common ancestor. The tight clustering of the STK in the signaling-domain tree does not extend to trees constructed for the extracellular domains D2 (Fig. 3 B), D1 (Fig. 3 C), and D0 (Fig. 3 D). Here the STK are distributed among various branches of the lineage III KIR and are also found in other KIR lineages (Fig. 3). From domain to domain, the phylogenetic relationships between the STK differ, evidence for their diversification by recombination.
Three STK—bonobo Pp-KIR3DSA, orangutan Popy-KIR3DS2, and human KIR3DS1—have extracellular domains that cluster with lineage II KIR. Each of these KIR was formed by an interlineage recombination in which the Ig domains of a lineage II KIR combined with the signaling domain of an activating lineage III KIR2D. That Pp-KIR3DSA, Popy-KIR3DS2, and KIR3DS1 are not nearest neighbors in the signaling domain tree (Fig. 3 A) argues against their being orthologous KIR and for their formation by independent recombination events in bonobo, orangutan, and human evolution. That independent acquisition of an activating KIR3D occurred in each of these species provides a striking example of parallel (convergent) evolution. A further product of interlineage recombination is Pt-KIR3DS2, for which the D0 domain clusters with lineage V KIR (Fig. 3 D).
For the remaining 10 STK, all the domains are of lineage III. Reconstruction of their histories is hindered by lack of resolution in the D2 domain tree (Fig. 3 B) and, to a lesser extent, in the D1 domain tree (Fig. 3 C). The D0 domain tree (Fig. 3 D) is well resolved and shows that five STK (Gg-KIR2DSa, KIR2DS3, KIR2DS5, Pt-KIR2DS4, and KIR2DS4) form a cluster resembling that for the signaling domain (Fig. 3 A). Gg-KIR2DSa, Pt-KIR2DS4, and KIR2DS4 may represent orthologues, whereas KIR2DS3 and KIR2DS5 are probably products of gene duplication. In contrast, KIR2DS1 and KIR2DS2 are recombinants, as seen from their distinctive positions in the trees for the D0 domain and the signaling domain. In summary, of the 14 STK, six are clearly products of recombination, four are possibly orthologues, and two are the products of gene duplication. The remaining two STK are the orangutan KIR2D, for which the genomic sequences corresponding to the D0 domains are not known; they could be products of either gene duplication or recombination.
Distinctive signaling domain in activating monkey KIR
In hominoids, KIR2DL4 seems to be unique because it is a LTK with activating function (29–31) and because of its ubiquitous expression by NK cells. KIR2DL4 is also the only KIR that is shared by hominoids and Old World monkeys, represented in our study by the rhesus macaque. Besides the KIR2DL4 orthologue, the rhesus macaque has several Mm-KIR3DH (32, 33) for which the signaling domain resembles that of KIR2DL4 (Fig. 4 A). This similarity does not extend to the Ig-like domains (Fig. 4, B–D) where the four Mm-KIR3DH cluster in lineage IV with the other Mm-KIR3D. The Mm-KIR3DH are activating receptors formed by recombining the signaling domain of Mm-KIR2DL4 with the extracellular domains of Mm-KIR3D. Unlike KIR2DL4, they do not express exon 8 (32, 33), which causes a frame shift that eliminates the two ITIMs of the Mm-KIR2DL4 tail and leaves Mm-KIR3DH with only activating potential. The similarity of the signaling domain in the four Mm-KIR3DH shows that recombination between Mm-KIR2DL4 and Mm-KIR3DL occurred only once. We estimate this event took place ∼10.5 MYA (Table II), a time close to the divergence of Cercopithecini (guenons, green monkeys, and patas monkeys) from Papionini (macaques and Papionina), ∼8–9.5 MYA (27), probably after the separation of colobines (leaf-eating monkeys) and cercopithecines (cheek-pouched monkeys), ∼13–14 MYA (27), and before the macaques diverged from the Papionina (baboons, mangabeys, and mandrills), ∼7–8 MYA (27). We predict, therefore, that KIR3DH genes are present in all Papionini and possibly in all living cercopithecines but not in colobines. The Ig domain sequences of Mm-KIR3DH, particularly D0 (Fig. 4 D), are not monophyletic, suggesting the existence of more than one Mm-KIR3DH locus. Because the Mm-KIR3DH sequences do not cluster tightly in the Ig domain trees, the different loci probably evolved by recombination rather than gene duplication.
Activating KIR coopted more ancient signaling pathways
Parallel evolution formed activating NK receptors from inhibitory ones in hominoids and Old World monkeys. For hominoid STK, DAP12 is the signaling adaptor (25, 26), whereas for rhesus monkey KIR3DH it is probably FCER1G, which is also the signaling adaptor for KIR2DL4 (34). To determine whether the adaptor molecules evolved before, after, or at the same time as the activating KIR, we performed phylogenetic analyses on sequences from various species that were reported to resemble the signaling adaptors of human and murine immune receptors. This analysis revealed that DAP12, FCER1G, and the related DAP10 and CD3Z are present and well conserved in several orders of mammals, as well as in amphibians and bony fishes (Fig. 5). These data strongly support a scheme in which DAP12 and FCER1G emerged before separation of mammals and bony fishes, more than 400 MYA. Consequently, when activating KIR were formed in primates, ∼13.5–18 MYA for the STK and ∼10.5 MYA for the rhesus monkey KIR3DH (Table II), they were able to coopt a preexisting signaling pathway. The implication is that the constraints imposed by the existing signaling pathways guided the evolution of the activating KIR by selecting for variants that could coopt such pathways.
Parallel evolution of activating rodent Ly49 and primate KIR
Although rodent genomes contain KIR genes, their properties and functions seem to be dissimilar to the hominoid KIR gene family (35). In mice and rats, the functional equivalent of KIR is Ly49, a structurally divergent family of lectinlike glycoproteins that includes both inhibitory and activating receptors (36). Although Ly49 receptors have their signaling domain (CYT-TM) at the amino-terminus, compared with the carboxy-terminus for KIR, their activating and inhibitory motifs are similar. It was therefore intriguing to see how the evolution of activating Ly49 compared with that of the activating KIR.
Phylogenetic analysis of the signaling domains of mouse and rat Ly49 family members also included sequences from other species, including the single, nonfunctional human Ly49 gene. With two exceptions, all of the activating Ly49 genes form a unique monophyletic group that includes no inhibitory Ly49. (The two exceptions are the rat Ly49s3 and the predicted activating Ly49C of the horse; Fig. 6 A.) Thus, almost all signaling domains of the activating Ly49 receptors of mouse and rat derive from a single ancestral activating signaling domain. Reconstruction of ancestral sequences for the four nodes indicated by arrows in Fig. 6 A supports a model in which this ancestral activating signaling domain was produced by mutation and functional switch of a rodent inhibitory signaling domain. We estimate that this functional switch occurred 18.5–31 MYA (Table II), indicating it was rodent specific, because the earliest divergence within rodents occurred ∼75 MYA (37, 38). The activating Ly49 could also be specific to Murinae, because they separated from their closest relatives, the Cricetidae and the Gerbillidae, ∼31 and ∼34 MYA (38), respectively, a time at the upper range of our estimate for the functional switch (Table II).
An independent and more recent functional switch must be proposed to account for the formation of the activating rat Ly49s3 receptor. The clustering of the Ly49s3 signaling domain with four other rat Ly49 sequences that have activating (charged residue in the TM) and inhibitory (ITIM in CYT) motifs points to this evolution having first involved the acquisition of a charged residue in the transmembrane region followed by loss of ITIM from the cytoplasmic tail. This model is also supported by reconstruction of the ancestral signaling domain shared by Ly49s3 and its close relatives. That these are all rat sequences raises the possibility of a rat-specific emergence. However, an analysis of the time of emergence of this receptor indicates a range of ∼3.5 to 15.5 MYA (Table II), leaving open the alternative possibility that receptor emergence preceded the mouse and rat separation, ∼13–21 MYA (37).
The tight clustering observed for the signaling domains of most rat and mouse activating Ly49 receptors (Fig. 6 A) does not extend to trees made from the sequences of the extracellular stalk (not shown) or lectinlike domain (Fig. 6 B). The overall pattern is very similar to that seen for the activating KIR (Figs. 1 and 3) and shows that new activating Ly49 receptors have been formed principally through recombination that replaces the signaling domain of an inhibitory receptor with the signaling domain of an existing activating receptor. In this and all other aspects of their evolution, the parallels between the activating receptors of the rodent Ly49 family and the primate KIR family are striking.
Before the evolution of adaptive immunity, lymphocyte-like cells probably existed and, like modern NK cells, used receptors encoded by nonrearranging genes (39). Also, the diverse functions of MHC class I–like molecules point to their existence before adaptive immunity. It is therefore plausible that lymphocytes of innate immunity have been regulated by class I–like receptors for more than 500 million years. Although this general function was conserved, its manifestations in different mammalian species are notably divergent. The functionally analogous but structurally dissimilar KIR and Ly49 receptors seem to be extreme in this regard: both the receptors and their MHC class I ligands are highly polymorphic and rapidly evolving (36, 40). Despite the differences, many aspects of the biology of the primate KIR and rodent Ly49 families are remarkably similar, including a membership of both inhibitory and activating receptors (6, 7).
Here we demonstrate that the inhibitory KIR and Ly49 receptors were ancestral and that the activating receptors are derived from them. Thus, all modern primate KIR derive from a common ancestral primate KIR that had inhibitory function. Of more recent origin are the activating STK, which are specific to the hominoids and have notably expanded in the human species to give the diversity of group B KIR haplotypes. The STK share a common ancestor that was an inhibitory KIR. Reconstructing such evolution of an activating KIR from an inhibitory KIR shows that it occurred only once in hominoids and involved as few as five to seven nucleotide changes. These changes introduced a charged residue into the transmembrane domain and altered both the length and the sequence of the cytoplasmic tail to eliminate the ITIMs. By this process of functional shift, the first hominoid STK emerged 13.5–18 MYA. Subsequently, additional STK were formed through the action of two other evolutionary mechanisms: recombination, which replaced the signaling domain of inhibitory receptors with the activating domain, and, to a lesser extent, duplication of STK genes with subsequent divergence of the daughter genes (Fig. 7 A). The activating Ly49 genes evolved by similar processes from inhibitory Ly49 genes.
The signaling domains of the activating KIR in rhesus monkey and cattle are unrelated to those of the hominoids, indicating that this form of signaling domain has evolved independently in different species at different times. The formation of activating receptors from inhibitory receptors is seen to be a recurrent process. This recurrence is no coincidence, because of the competitive advantage of new variant receptors that can engage an adaptor molecule (e.g., DAP12), which is part of an existing pathway of activating signal transduction. Such activating receptors therefore will tend to be selected.
Although the signaling adaptors have been conserved for more than 400 million years, and both KIR and Ly49 have existed for at least 100 million years, none of the modern-day activating KIR or Ly49 are older than ∼18 and ∼31 million years, respectively. The recurrent creation of activating receptors and the absence of ancient lineages of activating receptors indicates that activating receptors are short lived: they are periodically made, lost, and reinvented, rather than being preserved by strong, persistent selection. Further evidence for this dynamic comes from the distribution of STK in human populations. Overall, the STK have lower frequencies than the inhibitory KIR (41–43), and in some populations (e.g., the Japanese) the frequencies of STK and group B KIR haplotypes are very low (43). None of the STK genes is fixed, and because a common allele of KIR2DS4—the single STK of the group A haplotypes—is probably nonfunctional, many millions of healthy people have no STK at all. These observations all point to natural selection having the capacity to select for or against activating KIR and Ly49, depending on circumstance.
The functions of inhibitory KIR are better understood than those of activating KIR. They are MHC class I receptors that help NK cells be tolerant of healthy autologous cells but reactive toward cells having perturbed MHC class I expression (44). In humans, HLA-C is the major inhibitory ligand; all HLA-C allotypes engage inhibitory KIR2DL, and most KIR haplotypes and genotypes encode inhibitory KIR specific for two broad groups of HLA-C ligand, C1 and C2 (45). By contrast, a minority of HLA-A and -B allotypes are inhibitory ligands, and a significant proportion of KIR haplotypes do not encode a functional inhibitory HLA-B receptor.
The interaction of HLA-C1 with its cognate receptors KIR2DL2 and -3 is weaker than that for HLA-C2 and KIR2DL1 (15, 46). Such polymorphism, combined with independent segregation of KIR and HLA-C, means that individuals differ qualitatively and quantitatively in their NK cell regulation by HLA-C. Consequently, individuals having only the weaker inhibition mediated by C1 binding to KIR2DL3 are more likely to resolve acute hepatitis C virus infections (47), presumably because their NK cells are more readily activated. Other correlative studies show that weaker inhibitory KIR–HLA-C interactions and/or the presence of activating STK are associated with slower progression of HIV infections to AIDS (19) and reduced risk of preeclampsia in pregnancy (48). Conversely, group B KIR haplotypes, with their STK, are also associated with an increased likelihood of autoimmune diseases such as psoriasis (49, 50), psoriatic arthritis (20, 51), type I diabetes (22), scleroderma (52), and vascular complications of rheumatoid arthritis (21).
These clinical correlations indicate that the potential advantages of STK are improvement in the response to infection and increased reproductive success; they also indicate that their potential disadvantage is increased disability resulting from autoimmunity. Although autoimmunity has often been dismissed as a factor in natural selection, because it usually affects people past reproductive age, such conjecture ignores the important contributions that healthy older relatives, notably grandparents, could have made to raising children. During episodes of epidemic viral infection, genotypes containing STK are more likely to be advantageous and selected, whereas in periods when infections are less pressing, genotypes without STK could gain the competitive advantage. The short life of activating KIR and Ly49 can therefore be explained by a two-stage model: a first stage, in which a low-frequency activating receptor is rapidly driven to high frequency because of its beneficial effect, is followed by a second stage during which the benefit is outweighed by the detriment, leading to decreasing frequency and eventual loss of the activating receptor gene (Fig. 7 B).
Although the clinical correlations point to their influence, the manner by which the STK exert their effects is uncertain, because their ligands have yet to be defined. Although the ligand-binding domains of KIR2DS1 and KIR2DL1 are similar, as are those of KIR2DS2 and KIR2DL2/3, the STK have little, if any, affinity for HLA-C. Indeed, mutational and structural analysis shows that KIR2DS1 and KIR2DS2 have each acquired a single–amino acid substitution that prevents the binding of C2 and C1, respectively (15–18). One possibility is that KIR2DS binding is much more dependent on the peptide bound by HLA-C than is KIR2DL binding, as reported for KIR3DL2 (53). Alternatively, receptors that were originally selected by infection as activating HLA-C receptors may have subsequently become attenuated or inactivated when the disease was no longer a threat. Support for this alternative is found with the Ly49 family, because several Ly49 pseudogenes would encode activating receptors in the absence of frameshifts (see Fig. 6 A for example).
The hominoid STK emerged at a time, ∼13.5–18 MYA, soon after the previously isolated continents of Africa and Eurasia were connected, ∼18 MYA (54). This event facilitated the migration of many African mammals to Eurasia, including hominoids, for which fossils unearthed in Europe date to as early as 16–17 MYA (55). Several authors have proposed that the hominoids who migrated to Eurasia eventually gave rise to humans and the other surviving hominoid species (55). In that circumstance, the period starting ∼18 MYA would have been one of environmental upheaval for hominoid ancestors. The resulting stress and pressure on their immune systems might well have contributed to the selection of a new variety of activating KIR. Through recombination and gene duplication, this ancestral STK would eventually lead to the expanded family of STK present in modern hominoids. The expansion also witnessed striking parallels in evolution (e.g., the independent formation of activating KIR with three Ig domains in bonobos, orangutans, and humans). This parallelism emphasizes the importance of natural selection, not only in the emergence of activating KIR, but also in their expansion and diversification.
Materials and methods
Datasets and alignments.
Our KIR dataset included all known KIR loci and sequences having >1% divergence. The signaling domain analyses were performed using the sequences from exon 7–9. Sequences with the longest 3′ untranslated regions were chosen. Gg-KIR2DL6, Gg-KIR2DLc, Pt-KIR3DL6, Pp-KIR3DL4, and Pt-KIRC1 were not used for analysis of the signaling domain, because they are recombinants that give divergent phylogenetic signals for the TM and CYT (24). KIR3DS1 and KIR3DL1 were also found to have divergent signals within the CYT domain (unpublished data), and part of the 3′ untranslated region sequence was removed to avoid bias. Bta-KIR2DS1 was too short for analysis of the TM and CYT domains. The 3′ ends of the nonprimate KIR sequences were trimmed to conserve only the well-aligned parts.
To obtain a DAP12 dataset, BLAST (56) searches were made of the National Center for Biotechnology Information's nonredundant and expressed sequence tag databases. This search also revealed DAP10, CD3γ/δ, and CD3ε sequences, which we included in the analysis. The FCER1G dataset was obtained similarly to the DAP12 dataset but also includes CD3ζ sequences.
An Ly49 dataset was assembled following BLAST searches. Full-length rat sequences predicted based on genomic DNA were also included (57). Groups of sequences with <1% divergence were represented by one sequence. Alignment columns containing single sequences were discarded, as were the insertions found in several sequences that did not align (divergent splice variants).
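The step of representing groups of near-identical sequences by a single sequence can be sketched as follows. This is only an illustration: the text does not describe how the groups were formed, so the greedy selection, the use of uncorrected p-distance over ungapped columns, and the function names are all assumptions.

```python
from typing import Dict, List


def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites, ignoring columns with a gap in either sequence."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)


def pick_representatives(aligned: Dict[str, str],
                         threshold: float = 0.01) -> List[str]:
    """Greedily keep a sequence only if it diverges by at least `threshold`
    from every representative kept so far."""
    representatives: List[str] = []
    for name, seq in aligned.items():
        if all(p_distance(seq, aligned[rep]) >= threshold
               for rep in representatives):
            representatives.append(name)
    return representatives
```

The result of a greedy pass depends on input order, so a study relying on such a filter would normally fix the order (for example, longest sequence first) or use a formal clustering method.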
Before the analysis of each dataset, we checked for composition bias using a χ2 test (α = 0.05). For nucleotide sequence analyses, we selected the model of DNA substitution using the Akaike information criterion as implemented in Modeltest 3.06 (60). Neighbor-joining (NJ) analyses were performed with MEGA3 (61) using 1,000 replicates, pairwise comparisons, midpoint rooting, and, in all but one case, the Tamura-Nei method. Because of sequence composition bias with three sequences in Fig. 1 A (mouse and rat KIR3DL1: P < 0.001; horse KIR3DL001: P < 0.05), we used the LogDet method. To define groups, we performed the interior branch test (62). For Bayesian analyses we used MrBayes3b4 (63) and conducted three independent runs for each dataset (see supplemental Materials and methods, available at http://www.jem.org/cgi/content/full/jem.20042558/DC1). The three topologies obtained were compared statistically using the Shimodaira-Hasegawa test of alternative phylogenetic hypotheses, as implemented in PAUP*4.0b10 (Sinauer), with resampling estimated log-likelihood optimization and 10,000 bootstrap replicates. This comparison was made with the maximum likelihood model defined by Modeltest. For all the analyses presented, the test failed to reject any of the alternative tree topologies (α = 0.05). PAUP*4.0b10 and the tree bisection-reconnection branch swapping algorithm were used for maximum likelihood and parsimony analyses, with 500 replicates and Modeltest parameters for the former and 1,000 replicates and a heuristic search for the latter.
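The composition check mentioned at the start of this section can be illustrated with a minimal sketch of a χ2 test that compares the base counts of one sequence with the counts expected from the pooled composition of the alignment. The exact implementation used in this study is not described; the function name is hypothetical, and the sketch compares the statistic with the standard critical value for 3 degrees of freedom at α = 0.05 (7.815) instead of computing a P value.

```python
from collections import Counter
from typing import Iterable

BASES = "ACGT"
CHI2_CRITICAL_DF3_ALPHA_05 = 7.815  # standard table value, 3 degrees of freedom


def composition_chi2(seq: str, alignment: Iterable[str]) -> float:
    """Chi-square statistic of one sequence's base counts against the
    composition pooled over all sequences in the alignment."""
    pooled = Counter(b for s in alignment for b in s.upper() if b in BASES)
    total = sum(pooled.values())
    observed = Counter(b for b in seq.upper() if b in BASES)
    n = sum(observed.values())
    statistic = 0.0
    for base in BASES:
        expected = n * pooled[base] / total
        if expected > 0:  # skip bases absent from the whole alignment
            statistic += (observed[base] - expected) ** 2 / expected
    return statistic
```

A sequence whose statistic exceeds `CHI2_CRITICAL_DF3_ALPHA_05` would be flagged as compositionally biased, which in this study prompted the use of the LogDet method for the tree in Fig. 1 A.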
The analyses of amino acid sequences were performed similarly to the analyses of the nucleotide sequences with the following differences: NJ analyses were performed using a Poisson correction; the Bayesian analyses were conducted using a BLOSUM matrix and gamma distances; and the resulting tree topologies were statistically compared using the Templeton test with a parsimony model (α = 0.05).
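For reference, the Poisson correction converts the observed proportion p of differing amino acid sites into an estimated number of substitutions per site, d = −ln(1 − p). A minimal sketch, assuming that alignment columns with a gap in either sequence are skipped (the function name is hypothetical):

```python
import math


def poisson_distance(a, b):
    """Poisson-corrected amino acid distance between two aligned sequences."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    p = sum(x != y for x, y in pairs) / len(pairs)
    if p >= 1.0:
        raise ValueError("distance undefined when every site differs")
    return -math.log(1.0 - p)
```

The correction grows faster than p itself, reflecting multiple substitutions at the same site that an uncorrected count would miss.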
In the analysis of each dataset, the tree topologies obtained with the different methods were compared using the Shimodaira-Hasegawa test or the Templeton test as described previously. In all analyses but one, the test failed to reject any topology. The NJ tree presented in Fig. 6 A was less likely than the other two trees (Shimodaira-Hasegawa test, P < 0.05). Because the difference was not significant at a 1% level, and because the nodes responsible affected only terminal branches, which were not discussed in the analysis, the NJ tree was kept.
Ancestral sequence reconstruction.
Ancestral sequences were reconstructed using the marginal reconstruction approach of PAML (64) and Bayesian analysis using MrBayes2.01. Marginal reconstruction used the model of DNA substitution defined by Modeltest. Bayesian analyses were as described previously, except that nodes for which ancestral sequence reconstruction was sought were constrained. More than five runs were performed, and a consensus was generated.
Estimation of divergence time.
Mean, standard deviation, and 95% confidence interval values for divergence times were estimated using the Bayesian relaxed molecular clock approach implemented in the Multidistribute program package (65; see supplemental Materials and methods).
For the KIR dataset, the two topologies of Fig. 2 were used, the mean of the prior distribution of the root of the ingroup tree was set to 27 ± 3 MYA (hominoid–Old World monkey separation), and two nodes in the lineage II and STK groups were constrained to between 10 and 18 MYA (separation of orangutans from humans and African apes). For the Mm-KIR3DH dataset, the tree topology of model 1 in Fig. 2 was used, and the root of the ingroup tree was set to 27 ± 3 MYA (hominoid–Old World monkey separation). For the Ly49 dataset, a modified version of the tree topology obtained in the Bayesian analysis performed for Fig. 6 A was used; the root of the ingroup tree was set to 87.5 ± 3.5 MYA (primate–rodent separation), and three internal calibrations corresponding to the divergence between mouse and rat (13–21 MYA) were used (see online supplemental Materials and methods).
Analysis of functional constraints.
Sequences containing insertions or deletions that changed the reading frame were removed, as were sequences having premature stop codons. The final stop codons were removed, as were the positions with gaps. The numbers of synonymous substitutions per synonymous site (dS) and nonsynonymous substitutions per nonsynonymous site (dN) were estimated using MEGA3 with the modified Nei-Gojobori method (p-distance). The transition/transversion ratio was estimated using Treepuzzle (66). The SEs were estimated by the bootstrap method (10,000 replicates). The statistical significance of the departure of the dN/dS ratios from the neutrality hypothesis (dN = dS) was assessed with a two-tailed Z-test.
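The form of the neutrality test can be sketched as follows. This is an illustrative sketch, not the MEGA3 implementation: it assumes that the dN and dS estimates are treated as independent, so that the variance of their difference is the sum of the squared bootstrap SEs, and that the statistic is referred to a standard normal distribution. The function name and the input values in the usage note are hypothetical.

```python
import math


def z_test_dn_ds(dn, ds, se_dn, se_ds):
    """Two-tailed Z-test of the null hypothesis dN = dS.

    Assumes independent estimates, so Var(dN - dS) = SE(dN)^2 + SE(dS)^2.
    Returns the Z statistic and the two-tailed P value.
    """
    z = (dn - ds) / math.sqrt(se_dn ** 2 + se_ds ** 2)
    # For a standard normal variate, P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value
```

With hypothetical inputs dN = 0.20, dS = 0.10, SE(dN) = 0.03, and SE(dS) = 0.04, the statistic is Z = 2.0 with a two-tailed P of about 0.046. A significantly positive Z (dN > dS) is consistent with positive selection and a significantly negative Z with purifying selection.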
Online supplemental material.
Detailed methods for the Bayesian phylogenetic analyses, the divergence time analyses, and the analysis presented in Fig. 3 A, together with the accession numbers for the sequences used, are described in supplemental Materials and methods.
This study was supported by National Institutes of Health grant no. 5 R01 AI24258 (to P. Parham).
The authors have no conflicting financial interests.
Abbreviations used: CYT, cytoplasmic tail; ITIM, inhibitory tyrosine-containing immunomotif; KIR, killer cell Ig-like receptor; LTK, long-tailed KIR; MYA, million yr ago; NJ, neighbor joining; STK, short-tailed KIR; TM, transmembrane domain.