If the information age arose with the popularization of the personal computer, then the emergence of super-powered versions of these machines completely integrated into our lives and into the superstructure of society via the Internet has led us into the super-information age. Information is pouring into our lives and onto our desktops with unprecedented speed and volume. Genomics technology and the super-information age have led to a similar proliferation of information in the study of human genetics and immunobiology. In this issue of The Journal of Experimental Medicine, through innovation, persistence, and technological prowess, Matsuda et al. have completed sequencing the entire human VH gene locus, including all of the immunoglobulin VH genes as well as the hundreds of kilobases of intervening sequence (1). This landmark work and the efforts of several other groups of pioneering immunogeneticists in the study of the human VH locus (2–5) have produced an extraordinary resource for the study of B cell biology and the important issues of immunoglobulin diversity. It appears that the study of immunoglobulin VH genes in the human has also entered into the super-information era.
Tonegawa and the Information Age.
With the final proof by Tonegawa (2) that the incredible diversity of immunoglobulin V gene sequences encoding the antibody repertoire is formed by the somatic recombination of relatively few genetic elements (for review see reference 6), the information age of V gene genetics began. In the almost twenty years since, a steady stream of monumental discoveries concerning the production and fine tuning of immunoglobulins and TCRs from their respective V genes has ensued. There have been extensive analyses of the molecular processes of V gene recombination and the factors involved, and of the intriguing but elusive process of somatic hypermutation. Analyses of the expressed V gene repertoire in the normal healthy subject and in association with immune defects and autoimmune syndromes have yielded steady and fruitful results as well. Although the VH gene locus provided the framework for many of these important analyses, until now there has not been a complete sequence of this complex locus.
The Mapping Effort.
In 1979, it was found through analyses of somatic cell hybrids between human lymphocytes and mouse myeloma cells that chromosome 14 was the only human chromosome present in all independent hybrids producing immunoglobulin heavy chains (3). Following this work, the quest to map and sequence the human VH locus was underway. In the early 1980s, the VH locus was further mapped to chromosome band 14q32.33 (7, 8). The Honjo lab (whose work is highlighted in this commentary), along with several other key groups, was instrumental in mapping, deciphering, and sequencing the human VH locus. By the mid-1980s, Honjo's group began to piece together the VH locus through Southern blot hybridizations to a library of cosmid clones encompassing 61 VH genes (9), and, using progressively more advanced genome analysis techniques such as pulsed field gel electrophoresis and two-dimensional electrophoresis, several groups helped to frame the VH locus by the beginning of this decade (10–12). With the addition of a second haplotype from isolated YAC clones, the Honjo lab constructed a map of the 3′ end of the VH locus in 1991 (13), which was extended in 1993 to 0.8 megabase encompassing the 64 most 3′ VH segments (4). Genomic Southern blot hybridization analyses provided an estimate at the time of the total complement of VH genes, determined to be between 60 and 200 genes (10, 14). During this period, the expressed VH gene repertoire was being compiled and appreciated as many of the VH genes recognized today were cloned from a number of sources. With these advances, it was recognized that the VH genes consist of seven families based on homology of >80% among VH genes within each family (9, 10, 15–18). In 1992, in a significant contribution to this effort, Tomlinson et al. used a PCR-based approach with various sets of V gene family–specific primers to sequence 74 V genes designated as DP-1 through DP-74 from a single individual (17).
These sequences became instrumental in the efforts of this group to map the VH locus (see discussion of Cook and colleagues, below), and provided a good approximation of the total VH gene complement of a single individual. The ongoing mapping efforts demonstrated that there is no significant clustering of the individual VH gene families on the locus because they were found intermixed throughout. It was also recognized, primarily through cDNA sequence analysis of numerous autoantibodies performed in laboratories around the world, that although there is certainly allelism involving the individual human VH genes (19–21), the human VH locus exhibits relatively little polymorphism compared with other multigene families such as the HLA complex and with the mouse VH genes (22–24). This trait was an important factor in the efforts to sequence the human locus. However, it should be realized that there are differences between people in the presence or absence of single VH genes or blocks of VH genes due to insertional/deletional polymorphism of the VH locus (11, 13, 25).
Finally, in 1994, Cook and colleagues completed the map of the human VH locus with the analysis of a second haplotype at the 3′ end (25) and through an extension to the telomeric end of chromosome 14q32.3 (5). The final map of the VH locus at this time was ∼1,100 kb and included an estimated 95 VH genes (the number varying with the haplotype), of which ∼51 were functional and the remainder pseudogenes. In addition, 24 orphan VH genes not believed to contribute to the production of functional antibodies were found on chromosomes 15 and 16 (26–28).
About a Million Basepairs.
In this issue of The Journal of Experimental Medicine, Matsuda et al. report having independently mapped the telomeric end of the VH locus, and have completed sequencing the entire span of 957,090 bp (1). In their analysis, the VH locus contains 123 VH genes, including 39 functional genes known to produce heavy chains, 5 genes that appear functional but have not been reported as heavy chain proteins, and 79 pseudogenes. It is striking that approximately two-thirds of the VH genes in the human locus are nonfunctional. Again, it should be appreciated that the human VH locus can contain insertional/deletional polymorphism depending on the particular haplotype (11, 13, 25). Previous estimates report that approximately half of the VH genes are functional, and we have found transcripts for 10–12 VH4 family genes in analyses of tonsils from five different individuals (Wilson, P.C., Y.J. Liu, J. Banchereau, V. Pascual, and J.D. Capra, unpublished results), compared with the seven transcribed VH4 genes reported by Matsuda et al. in this issue (1). The extent and importance of VH gene allelism between particular V genes or the total complement of VH genes in different individuals, disease states, or racial groups is an area of interest that should not be superseded. Of particular interest concerning the differential complexity of V gene loci are the recent surprising findings of Green and Jakobovits that V gene complexity not only affects diversity, but is also important for the efficient development of the B cell lineage (29). With such observations in mind, it is clear that the sequence of a single or even a few human haplotypes does not tell the entire story, but it does tell the majority of the story and provides a vital framework for future analyses.
Of Man Not Mice.
The idea that the human VH locus would be completely mapped, let alone sequenced, long before the murine locus would have seemed preposterous to many immunogeneticists in the early 1980s. The human genome initiative provided both funds and intellectual cover for this huge amount of work. Additional impetus came from the association of various VH genes with diseases ranging from autoimmune syndromes to various lymphoid leukemias and lymphomas. Compounding these practical issues was the unforeseen and not yet fully appreciated complexity of the murine VH locus relative to that of humans. In the mouse, there is extensive polymorphism both of the V gene locus, in terms of the V gene complement of a particular haplotype, and between the individual genes. Thus, despite the considerably greater ease of genetic manipulation and analysis in the mouse, sequencing of the murine locus is not nearly as complete.
What the Future Holds.
The super-information age of genomic analysis means that millions of bases of genetic code are being converted to bytes, and the tools to analyze this information are increasingly being found on the desktop, not at the bench. In analyzing the development and expression of human V genes and somatic mutation, the bench work is increasingly just a matter of course and the real excitement comes only after stepping into the office. Having “our-favorite-locus” in its entirety only a URL away is a provocative thought, and the list of analyses and old questions to be addressed using this resource is long and promising. With the entire one million nucleotides deposited in the database, immunologists and geneticists from around the world have a unique resource at their fingertips. This most complex of loci, unique in containing so many pseudogenes and gene fragments and (apart from other V gene loci) unique in being so dynamic with gene segment recombination events, can now be studied in even more detail. With the single caveat of allelic polymorphism, we now “know what's in the germline.” There can no longer be issues concerning the number of V, D, and J gene segments. Indeed, the great debate of germline versus somatic is now fully laid to rest. Another generation of scientists is looking at promoters, enhancers, tissue-specific factors, chromosomal end points, and the like. All of this is now available thanks to this pioneering work by Matsuda et al. (1). The study of human VH genes has now entered the postgenomics era, into which all human bioscience will be propelled in the near future by the human genome project.
Address correspondence to J. Donald Capra, Molecular Immunogenetics, The Oklahoma Medical Research Foundation, 825 NE 13th St., Oklahoma City, OK 73112. Phone: 405-271-7393; Fax: 405-271-8237; E-mail: email@example.com