The molecular basis of genetic predisposition to pulmonary tuberculosis in adults remains largely elusive. Few candidate genes have consistently been implicated in tuberculosis susceptibility, and no conclusive linkage was found in two previous genome-wide screens. We report here a genome-wide linkage study in a total sample of 96 Moroccan multiplex families, including 227 siblings with microbiologically and radiologically proven pulmonary tuberculosis. A genome-wide scan conducted in half the sample (48 families) identified five regions providing suggestive evidence (logarithm of the odds [LOD] score >1.17; P < 0.01) for linkage. These regions were then fine-mapped in the total sample of 96 families. A single region of chromosome 8q12-q13 was significantly linked to tuberculosis (LOD score = 3.49; P = 3 × 10⁻⁵), indicating the presence of a major tuberculosis susceptibility gene. Linkage was stronger (LOD score = 3.94; P = 10⁻⁵) in the subsample of 39 families in which one parent was also affected by tuberculosis, whereas it was much lower (LOD score = 0.79) in the 57 remaining families without affected parents, supporting a dominant mode of inheritance of the major susceptibility locus. These results provide direct molecular evidence that human pulmonary tuberculosis has a strong genetic basis, and indicate that the genetic component involves at least one major locus with a dominant susceptibility allele.
Mycobacterium tuberculosis, the causal agent of tuberculosis, infects an estimated one third of the world's population and is the second leading cause of death from infectious disease worldwide (1). Only ∼10% of individuals infected with M. tuberculosis eventually develop tuberculosis, resulting in over eight million new cases and two million deaths each year (2). About half the individuals who develop overt clinical disease are diagnosed within 2 yr of infection (1). Other patients—adults in particular—may remain clinically well for long periods, in some cases even decades, before reactivation occurs from a silent primary lesion. Complex interactions between environmental factors (microbial and nonmicrobial) and human factors (genetic and nongenetic) determine the clinical outcome of M. tuberculosis infection, accounting for the lack of symptoms in most individuals and the development of disseminated disease in childhood or pulmonary disease in adulthood in a minority of individuals (3). Considerable genetic epidemiological evidence, in particular coming from twin studies, has accumulated to support a major role for human genetic factors in the development of tuberculosis (4, 5).
Progress toward understanding the molecular basis of pediatric tuberculosis has resulted from the study of genetic defects of the IL-12/23–IFN-γ circuit, typically associated with susceptibility to weakly virulent mycobacteria (3). Unrelated children with IL-12Rβ1 deficiency and bona fide tuberculosis as their sole infectious phenotype have provided proof-of-principle that disseminated tuberculosis may reflect a Mendelian predisposition (6). The proportion of children with disseminated tuberculosis caused by Mendelian predisposition remains to be experimentally determined but has been estimated at 3–30% by Bayesian statistics (7). Even in the absence of Mendelian defects, strong genetic factors of susceptibility to pediatric tuberculosis were identified in an association study of variants of the natural resistance–associated macrophage protein 1 (NRAMP1, alias SLC11A1) gene taking into account gene-environment interactions in a well-defined clinical and epidemiological setting (8).
The study of genetic predisposition to adult tuberculosis has proved more difficult. A large number of polymorphisms in numerous candidate genes have been reported to be associated with pulmonary tuberculosis, but only a few of these results have been replicated in independent studies (5). The most consistently replicated risk factors to date are certain HLA class II (9, 10) and NRAMP1 (11–13) alleles. Recently, a strong association of pulmonary tuberculosis with a common promoter polymorphism of the gene encoding monocyte chemoattractant protein 1 (MCP1, alias CCL2) was reported in two populations (14). However, these three susceptibility genes do not provide a comprehensive understanding of genetic predisposition to tuberculosis. Hence, candidate gene approaches and association studies may have failed to identify major genetic risk factors. We report here a genome-wide linkage scan for pulmonary tuberculosis in adult patients from Morocco. We provide evidence for the presence of a single, major susceptibility locus on chromosome region 8q12-q13, conferring dominant susceptibility to tuberculosis.
Results and discussion
Genome-wide primary screen of predisposition to pulmonary tuberculosis in Moroccan families
We studied 96 multiplex families, each containing at least two siblings presenting pulmonary tuberculosis (Table I). Pulmonary tuberculosis was documented on clinical, radiological, and microbiological grounds (see Materials and methods). Most (82/96, 85%) families were of Arab origin, with the remainder (14/96, 15%) of Berber origin. The 96 families included 227 affected subjects among the offspring: 131 male and 96 female subjects. The age at onset of pulmonary tuberculosis varied from 6 to 49 yr (mean = 22.6 yr), with only 19 patients under the age of 15 yr at onset. A substantial proportion (44/192, 23%) of the parents also had tuberculosis. Both parents were affected in 5 families, one parent was affected in 34 families, and neither parent was affected in 57 families. Genomic DNA was available from both parents in 52 families, from one parent in 40 families, and from neither parent in 4 families. Five of the 48 parents for whom no DNA sample was available were affected. When possible, unaffected siblings (a total of 57) were included to infer the missing parental genotypes.
In the first stage of this genome-wide screen, 48 families (Table I) were genotyped for 388 microsatellite markers covering the entire genome and providing a 10–15 centimorgans (cM) primary genetic map. Five chromosomal regions (1q22, 3q27-q28, 8q12-q13, 10p15, and 19p12) provided multipoint maximum likelihood binomial (MLB) LOD scores with P < 0.01 (LOD score > 1.17) (Table II). We were unable to confirm the involvement of other regions found to provide suggestive evidence for linkage to tuberculosis (LOD scores ∼ 2) in two previous genome-wide scans for susceptibility to tuberculosis. These regions include the two regions on chromosomes 15 and X identified in African families from The Gambia and South Africa (15), and the two regions on chromosomes 11 and 20 previously identified in Brazilian families (16). As the evidence for linkage in these four regions was only suggestive, further studies are required to determine whether these previous results reflect true linkage signals. Finally, maximum MLB LOD score was also close to 0 for the 17q11-q21 region, for which weak evidence for linkage to tuberculosis and leprosy was obtained for the Brazilian sample (17), and which contains the MCP1 gene recently found to be strongly associated with pulmonary tuberculosis (14).
A major susceptibility locus for pulmonary tuberculosis maps to chromosome region 8q12-q13
We investigated the five chromosomal regions (Table II) further by analyzing the remaining 48 families (Table I) and genotyping 72 supplementary microsatellite markers in these regions in all the families. A decrease in maximum LOD scores was observed in regions 10p15 and 19p12 (Table II). Two other regions, 1q22 and 3q27-q28, showed a slight increase in multipoint LOD scores, resulting in maximum LOD scores of 2.00 (P = 0.001) and 1.93 (P = 0.002) at D1S1595 and D3S2748, respectively. The LOD score increased substantially only in chromosomal region 8q12-q13, which had also provided the highest LOD score in the primary screen. Fig. 1 presents the multipoint analysis of the 8q12-q13 chromosomal region. The maximum MLB LOD score was 3.49 (P = 3 × 10⁻⁵) at marker D8S1723. The 90% confidence interval for placement of the susceptibility locus is approximately defined by markers D8S2332 and D8S1763, corresponding to 4 megabases (Mb) of chromosomal DNA, including 26 genes with known or predicted function.
We then investigated possible gene–gene interactions between the 8q12-q13 major locus and chromosomal regions 1q22 and 3q27-q28, which gave multipoint LOD scores of ∼2. A conditional analysis, performed as previously described (18), provided no significant evidence (P > 0.05) of positive or negative interactions with the long arm of chromosome 8 for either 1q22 or 3q27-q28 (unpublished data). Finally, we observed no significant heterogeneity (P > 0.05) in linkage results between families of different ethnic origins (Arab/Berber). Overall, the proportion of parental alleles shared by affected siblings at the 8q12-q13 locus, π, was estimated at 0.62 (versus 0.5 in the absence of linkage). In addition, the locus-specific λS, defined as the risk for siblings of patients divided by that for the general population, was estimated at 1.69. This value was calculated by dividing the expected proportion of affected sibpairs sharing zero alleles identical by descent (i.e., 0.25) by the observed proportion (19). Overall, these results provide evidence for a single major locus on 8q12-q13, predisposing subjects in the Moroccan sample to pulmonary tuberculosis.
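As a rough cross-check of the sharing estimates above, the following Python sketch relates π to λS. It assumes that transmissions from the two parents are independent, so that the proportion of affected sibpairs sharing zero alleles identical by descent is (1 − π)². This simplification and the function names are illustrative assumptions, not the estimation procedure used in the study.

```python
def prop_sharing_zero_ibd(pi_shared: float) -> float:
    """Proportion of affected sibpairs sharing zero alleles IBD.

    Assumption: sharing at the paternal and maternal alleles is independent,
    each with non-sharing probability (1 - pi_shared).
    """
    return (1.0 - pi_shared) ** 2


def lambda_s(z0_observed: float, z0_expected: float = 0.25) -> float:
    """Locus-specific sibling risk ratio: expected / observed zero-IBD proportion."""
    return z0_expected / z0_observed


if __name__ == "__main__":
    # Sharing estimates reported for the whole sample and for families
    # with at least one affected parent.
    for label, pi_hat in [("all families", 0.62), ("affected parent", 0.69)]:
        z0 = prop_sharing_zero_ibd(pi_hat)
        print(f"{label}: z0 = {z0:.4f}, lambda_s = {lambda_s(z0):.2f}")
```

Under this simplification, π = 0.62 gives λS of about 1.73 and π = 0.69 gives about 2.60, close to the reported values of 1.69 and 2.64; the small differences are plausibly due to rounding of π and to the independence assumption.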
Evidence for a dominant effect of the tuberculosis susceptibility locus
We investigated the 8q12-q13 region further by carrying out a model-based linkage analysis. We assumed a simple dominant model, as previously described (12), with the frequency of the risk allele fixed at 0.05 and a relative risk (comparing homozygous “resistant” subjects and carriers of at least one copy of the risk allele) of 10. Model-based LOD scores were very similar to those obtained in model-free MLB analysis, with a maximum LOD score of 3.38 (P = 4 × 10⁻⁵) at D8S1723, demonstrating that the dominant model fitted the data well. Furthermore, for dominant genetic effects, multiplex families with at least one affected parent would be expected to display stronger linkage to the susceptibility locus than families with unaffected parents.
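To make the parameters of this dominant model concrete, the Python sketch below works out what a risk allele frequency of 0.05 and a relative risk of 10 imply. It assumes Hardy-Weinberg proportions in the population, which is an assumption added here for illustration, and the function names are hypothetical.

```python
def carrier_frequency(risk_allele_freq: float) -> float:
    """Frequency of carriers of at least one risk allele.

    Assumption: genotypes are in Hardy-Weinberg proportions.
    """
    return 1.0 - (1.0 - risk_allele_freq) ** 2


def fraction_of_cases_in_carriers(risk_allele_freq: float, relative_risk: float) -> float:
    """Share of all cases expected to occur in carriers under a dominant model."""
    carriers = carrier_frequency(risk_allele_freq)
    non_carriers = 1.0 - carriers
    # Carriers contribute cases in proportion to frequency x relative risk.
    return carriers * relative_risk / (carriers * relative_risk + non_carriers)


if __name__ == "__main__":
    q, rr = 0.05, 10.0  # values fixed in the model-based analysis
    print(f"carrier frequency: {carrier_frequency(q):.4f}")
    print(f"fraction of cases in carriers: {fraction_of_cases_in_carriers(q, rr):.3f}")
```

With these values, slightly fewer than 10% of individuals would carry the risk allele, and under the stated assumptions roughly half of all cases would be expected among carriers.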
The 96 families studied were therefore split into two subsamples according to the affected/unaffected status of the parents. We found significant heterogeneity of linkage (P < 0.03) with the 8q12-q13 region between the two subsamples. The maximum multipoint MLB LOD score was 0.79 (P = 0.03) at D8S1464 in the 57 families without affected parents, whereas it was 3.94 (P = 10⁻⁵) at D8S1113 in the 39 families with at least one affected parent. The effect of the susceptibility locus was stronger in families with at least one affected parent than in the whole sample, with π and λS estimated at 0.69 and 2.64, respectively. These results provide strong evidence for the existence of an autosomal dominant allele of a major susceptibility gene controlling pulmonary tuberculosis in these Moroccan families, particularly in those families including pulmonary tuberculosis cases in multiple generations.
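The heterogeneity P value can be roughly cross-checked from the reported LOD scores. The Python sketch below converts the gain in LOD from analyzing the two subsamples separately into a χ² statistic with one degree of freedom. It is only an approximation: it plugs in the reported maxima, which occur at different markers, whereas the actual test was presumably carried out on the likelihoods themselves. The function names are illustrative.

```python
import math

LN10 = math.log(10.0)


def chi2_sf_1df(x: float) -> float:
    """Upper-tail probability of a chi-square distribution with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


def heterogeneity_test(lod_whole: float, lod_subsamples: list[float]) -> tuple[float, float]:
    """Likelihood-ratio test of linkage homogeneity expressed in LOD units.

    statistic = 2 * ln(10) * (sum of subsample LODs - whole-sample LOD)
    """
    statistic = 2.0 * LN10 * (sum(lod_subsamples) - lod_whole)
    statistic = max(statistic, 0.0)  # guard against small negative values
    return statistic, chi2_sf_1df(statistic)


if __name__ == "__main__":
    # Reported maxima: whole sample 3.49; subsamples 0.79 and 3.94.
    stat, p = heterogeneity_test(3.49, [0.79, 3.94])
    print(f"chi2 = {stat:.2f}, P = {p:.3f}")
```

With the reported maxima, the statistic is about 5.7 and the corresponding P value is below 0.03, in line with the heterogeneity result stated above.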
This work provides proof-of-principle that major susceptibility loci (defined as loci that can be detected by genome-wide linkage studies) are involved in the control of adult pulmonary tuberculosis. This finding is consistent with the concept of a continuous spectrum in the genetic control of clinical tuberculosis (20), with major genes bridging the gap between established Mendelian susceptibility to severe tuberculosis in children and more complex polygenic predisposition in adults. It is also consistent with the previous identification of major susceptibility loci in other common infectious diseases, such as schistosomiasis (21), leishmaniasis (18), and leprosy (19, 22). These results go against the prevailing model of a highly dispersed genetic component of susceptibility to infectious diseases (23), based on multiple susceptibility genes, each of which confers a minor risk of disease. The documented impact of major genes does not exclude the role of other genetic factors, and genetic heterogeneity is probably involved in predisposition to tuberculosis. We show here that the genetic control of tuberculosis is heterogeneous, as the chromosome 8 locus was not involved in all families and, in particular, had little or no effect in families with no affected parent. The extent to which this reflects pure genetic heterogeneity (e.g., predominant polygenic control in families without affected parents) and/or the impact of unmeasured environmental risk factors (e.g., M. tuberculosis variable virulence; reference 24) remains to be determined.
Our model-based analysis showed that the same dominant genetic model provided evidence for linkage to two independent regions in two entirely different epidemiological settings: Moroccan nuclear families living in a highly endemic area and a large Canadian aboriginal pedigree during an outbreak (12). This indicates that dominant alleles with a relatively strong genetic effect may be a common feature of tuberculosis susceptibility, and provides novel insight into evolutionary population genetics. It has been suggested that prolonged exposure to tuberculosis resulted in the genetic adaptation of exposed populations, accounting for the rapid decline in tuberculosis mortality rates in Europe during the 19th century, before any specific measures against the disease were taken (25). Not all genetic models of tuberculosis susceptibility are consistent with this theory. For example, if resistant alleles were rare at the beginning of the European epidemic (around the early 17th century), 300 yr would not have been long enough for substantial genetic selection (26). In contrast, if rare dominant susceptibility alleles, like those identified here, accounted for a substantial proportion of cases, then natural selection would have had a strong impact on the genetic composition of an exposed host population, even over a short period of time. Regardless of its historical implications, our work may open up new possibilities for the care and treatment of future patients with tuberculosis. An understanding of the molecular genetic basis of tuberculosis is critical for the development of vaccines specifically designed to protect the minority of individuals genetically predisposed to tuberculosis, and for the development of novel immunomodulatory drugs (7) (http://www.nap.edu/catalog/11471.html).
Following the strategy successfully used for the identification of PARK2/PACRG polymorphisms conferring susceptibility to leprosy (27), we are currently carrying out linkage disequilibrium mapping of the 8q12-q13 region for identification of the tuberculosis-causing gene variants.
Materials and methods
Subjects and families.
We collected nuclear families (parents and offspring) in which at least two of the offspring presented pulmonary tuberculosis (multiplex families). Families were identified at several hospitals and tuberculosis diagnosis centers located in highly endemic areas of Casablanca and Salé (Morocco), where the annual incidence of tuberculosis has been estimated at ∼150 cases per 100,000 inhabitants. The diagnostic criteria for pulmonary tuberculosis were (a) persistent cough and, in most cases, common symptoms of tuberculosis, such as fever, night sweats, and weight loss and (b) pathological findings on chest X ray (upper-lobe infiltrates, cavity infiltrates, and/or hilar or paratracheal adenopathy). For inclusion, all patients had to have positive sputum smear microscopy results (Ziehl-Neelsen staining) and/or to have provided at least one positive sputum culture (Lowenstein-Jensen medium). Negative bacteriological results for both direct and culture examinations were an exclusion criterion. The study was approved by the Moroccan Ministry of Public Health (Délégation du Ministère de Santé publique à Casablanca et Direction d'Epidémiologie et de Lutte contre les Maladies Infectieuses), the Institutional Review Board of the Medical School of Casablanca, and the ethics review board at the Research Institute of the McGill University Health Centre in Montreal. Families of patients with pulmonary tuberculosis were contacted by a primary health care worker, and informed consent was obtained from all subjects enrolled in the study.
A previously described panel of 388 highly informative microsatellite markers with an average intermarker spacing of 10 cM was derived from a modified version of the Cooperative Human Linkage Centre (CHLC) Screening Set/version 6.0. These markers also included Genethon markers (19). All markers were amplified with GeneAmp PCR System 9700 (ABI) thermocyclers under uniform conditions as follows: 5 ng of genomic DNA was added to a reaction mixture containing 3.0 mM MgCl₂, 0.1 mM dNTPs, 0.1 μM of each primer, and 1.0 unit of Taq polymerase. The reaction was initiated by denaturing the samples at 96°C for 10 min, followed by a touchdown procedure consisting of 40 cycles of denaturation for 30 s at 94°C, annealing for 30 s at 60°C (3×), 59°C (2×), and 54°C (35×), and extension for 1 min at 72°C. Samples were subjected to a final extension at 72°C for 10 min. We genotyped an additional 72 microsatellite markers under the same conditions to generate high-density marker maps of the genomic regions selected for fine mapping. We determined the size of the amplified products on the ABI3700 platform. The genetic positions of the markers were obtained from the sex-average map of the Marshfield Medical Research Foundation (http://research.marshfieldclinic.org/genetics/). Allele binning and Mendelian error checking were performed with PEDMANAGER software (http://www.genome.wi.mit.edu/ftp/distribution/software/pedmanager). The overall genotyping error rate was estimated to be <0.2%.
We used the MLB model-free method (28) for linkage analysis. This approach is based on the number of affected siblings receiving a given parental allele being binomially distributed, and does not require the breaking down of sibships with more than two affected siblings into constitutive sibpairs. The likelihood of the data depends on a single parameter, α, the probability parameter of this binomial distribution. There is a direct relationship between α and the proportion of alleles shared by affected sibs, π, which is equal to 1 − 2α(1 − α), regardless of the number of affected sibs (28). The α parameter is estimated by maximum likelihood, and the test for linkage is a simple likelihood ratio test assessing the departure of α from 0.5. The resulting statistic is asymptotically distributed as a 50%:50% mixture of χ² distributions with 0 and 1 degrees of freedom, and can be expressed as a LOD score with the same distribution as a classical model-based LOD score estimating the recombination fraction. A multipoint version of the MLB method has been implemented (29) in an extension of the GENEHUNTER program (30), and large simulation studies have shown that the MLB statistic provides very consistent type I errors using asymptotic distributions, particularly when analyzing families with more than two affected siblings (29). Linkage in the chromosome 8 region was also analyzed with a model-based approach, as implemented in GENEHUNTER software. For this analysis, we used the simple dominant model described in a previous study (12), with a frequency of the risk allele fixed at 0.05 and a relative risk of 10 between homozygous “resistant” subjects and carriers of at least one copy of the risk allele.
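The correspondence between LOD scores and P values used throughout can be illustrated with a short Python sketch. It applies the asymptotic distribution described above, a 50%:50% mixture of χ² distributions with 0 and 1 degrees of freedom, together with the relationship between α and π. The function names are illustrative, and the sketch is not the GENEHUNTER implementation.

```python
import math


def lod_to_chi2(lod: float) -> float:
    """Convert a LOD score (log10 likelihood ratio) to a likelihood-ratio chi-square."""
    return 2.0 * math.log(10.0) * lod


def mlb_p_value(lod: float) -> float:
    """One-sided P value under a 50:50 mixture of chi-square with 0 and 1 df."""
    if lod <= 0.0:
        return 1.0
    # Upper tail of chi-square with 1 df is erfc(sqrt(x / 2)); halve it for the mixture.
    return 0.5 * math.erfc(math.sqrt(lod_to_chi2(lod) / 2.0))


def pi_from_alpha(alpha: float) -> float:
    """Proportion of parental alleles shared by affected sibs: 1 - 2*alpha*(1 - alpha)."""
    return 1.0 - 2.0 * alpha * (1.0 - alpha)


if __name__ == "__main__":
    for lod in (1.17, 3.49, 3.94):
        print(f"LOD = {lod:.2f} -> P = {mlb_p_value(lod):.1e}")
    print(f"pi at alpha = 0.5: {pi_from_alpha(0.5):.2f}")
```

Applied to the scores reported here, a LOD score of 1.17 corresponds to P of about 0.01, 3.49 to about 3 × 10⁻⁵, and 3.94 to about 10⁻⁵, and α = 0.5 (no linkage) gives π = 0.5.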
The genome-wide screen was conducted using a two-step strategy, with a first subsample of 48 families genotyped for the whole set of 388 microsatellite markers (primary scan) and a second subsample of 48 families only genotyped in the chromosomal regions of interest (Table I). The whole sample was randomly split into these two subsamples, and no significant difference (P > 0.05) was observed in the distribution of the number of affected siblings between the two subsamples. Within the linked region, we also tested for sample heterogeneity for a binary criterion (e.g., affected status of parents), by carrying out linkage analysis for the whole sample (96 families), and separately for the two subsamples consisting of families with no affected parents (57 families) and families with at least one affected parent (39 families). Under the hypothesis of linkage homogeneity, twice the difference between the summed maximized log-likelihoods of the two subsamples and the maximized log-likelihood of the whole sample is asymptotically distributed as a χ² statistic with one degree of freedom. Finally, we investigated possible gene–gene interactions between the detected 8q12-q13 major gene locus and the other two regions showing some evidence of linkage, using a conditional analysis procedure, as previously described (18). More specifically, we analyzed linkage to regions 1q22 and 3q27-q28 by weighting the families based on their multipoint linkage results at marker D8S1723. A positive relationship was assessed by assigning weights of 0 and 1 to families with LOD scores of ≤0 and >0, respectively, whereas a negative relationship (heterogeneity) was assessed by assigning weights of 1 and 0 to families with LOD scores of ≤0 and >0, respectively.
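The weighting scheme used in the conditional analysis can be sketched in Python as follows. The per-family LOD scores in the example are hypothetical, and the function name and the "positive"/"negative" labels are illustrative choices rather than part of the published procedure.

```python
def conditional_weights(family_lods: list[float], relationship: str) -> list[int]:
    """Assign 0/1 family weights from per-family LOD scores at the conditioning marker.

    relationship = "positive": weight 1 for families with LOD > 0, else 0.
    relationship = "negative": weight 1 for families with LOD <= 0, else 0.
    """
    if relationship not in ("positive", "negative"):
        raise ValueError("relationship must be 'positive' or 'negative'")
    linked = [1 if lod > 0 else 0 for lod in family_lods]
    if relationship == "positive":
        return linked
    return [1 - w for w in linked]


if __name__ == "__main__":
    # Hypothetical per-family LOD scores at D8S1723 (not data from the study).
    lods_at_d8s1723 = [0.30, -0.12, 0.00, 0.45]
    print(conditional_weights(lods_at_d8s1723, "positive"))
    print(conditional_weights(lods_at_d8s1723, "negative"))
```

Linkage to the second region would then be analyzed with each family's contribution multiplied by its weight, so that a positive interaction is tested only in families showing evidence of linkage to chromosome 8, and a negative one only in the remaining families.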
We sincerely thank all the families that agreed to participate in this study. We thank Alexandre Alcaïs, Stéphanie Dupuis, Daniel Nolan, Natascha Remus, and members of the Laboratory of Human Genetics of Infectious Diseases for fruitful discussions. We also thank Mohamed Ouaaline (Military Hospital Mohamed V) and Andrei Verner (McGill University and Genome Québec Innovation Centre) for their helpful collaboration.
This work was supported by grants from Institut National de la Santé et de la Recherche Médicale/Centre National de Coordination et de Planification de la Recherche Scientifique et Technique, the Canadian Genetic Disease Network, Foundation Banque Nationale de Paris-Paribas, and the Gene Cure Foundation Canada. A. Alter holds a graduate studentship from the Natural Science and Engineering Research Council of Canada. A. Alter and M. Orlova are supported by the Canadian Institutes of Health Research Strategic Training Centre in the Integrative Biology of Infectious Diseases and Autoimmunity. J.-L. Casanova is an International Scholar of the Howard Hughes Medical Institute. E. Schurr is a Chercheur National of the Fonds de Recherche en Santé du Québec. L. Abel was supported in part by Programme de Recherche Fondamentale en Microbiologie Maladies Infectieuses et Parasitaires and Agence Nationale de la Recherche of the Ministère Français de l'Education Nationale de la Recherche et de la Technologie.
The authors have no conflicting financial interests.