Multisystem inflammatory syndrome in children (MIS-C) is a rare condition following SARS-CoV-2 infection associated with intestinal manifestations. Genetic predisposition, including inborn errors of the OAS-RNAseL pathway, has been reported. We sequenced 154 MIS-C patients and utilized a novel statistical framework of gene burden analysis, “burdenMC,” which identified an enrichment for rare predicted-deleterious variants in BTNL8 (OR = 4.2, 95% CI: 3.5–5.3, P < 10−6). BTNL8 encodes an intestinal epithelial regulator of Vγ4+γδ T cells implicated in regulating gut homeostasis. Enrichment was exclusive to MIS-C, being absent in patients with COVID-19 or bacterial disease. Using an available functional test for BTNL8, rare variants from a larger cohort of MIS-C patients (n = 835) were tested which identified eight variants in 18 patients (2.2%) with impaired engagement of Vγ4+γδ T cells. Most of these variants were in the B30.2 domain of BTNL8 implicated in sensing epithelial cell status. These findings were associated with altered intestinal permeability, suggesting a possible link between disrupted gut homeostasis and MIS-C-associated enteropathy triggered by SARS-CoV-2.
Introduction
Severe presentation of COVID-19 in pediatric patients has remained rare throughout the SARS-CoV-2 pandemic. However, 2–6 wk after exposure, ∼1 in 10,000 such children develop a postinfectious syndrome resembling Kawasaki disease (KD), termed multisystem inflammatory syndrome in children (MIS-C) (Whittaker et al., 2020; Sancho-Shimizu et al., 2021; Dionne et al., 2022). Patients often present with similar clinical features to KD including persistent fever, rash, and conjunctivitis, although there was more frequent and profound gastrointestinal involvement (Miller et al., 2020). Epidemiological differences also differentiate KD and MIS-C, with the median age of MIS-C being 9 years compared to KD patients who are typically under 5 years of age (Sancho-Shimizu et al., 2021; Noval Rivas and Arditi, 2023), and with the incidence of KD being highest in Asia whereas MIS-C is more prevalent in Europe and the USA. Moreover, whereas the etiology for KD remains unclear, MIS-C was attributable to SARS-CoV-2 (Sancho-Shimizu et al., 2021; Noval Rivas and Arditi, 2023; Whittaker et al., 2020), with evident SARS-CoV-2 antibodies and/or a history of exposure to the virus despite there being no detectable SARS-CoV-2 infection in their upper respiratory tract upon presentation. Clinical features most commonly included fever, abdominal pain, rash, myocardial dysfunction, and elevated inflammatory biomarkers including C-reactive protein (CRP), NT-pro-BNP, ferritin, TNFα, and IL-6 (Whittaker et al., 2020; Carter et al., 2020; Gruber et al., 2020).
MIS-C is characterized by a distinct IFN-γ and NFκB-dependent signature (Sacco et al., 2022; Consiglio et al., 2020). Signatures of T cell activation and inflammation unique to MIS-C have also been detected, coupled with a specific transient polyclonal expansion of T cell receptor (TCR) Vβ21.3+ CD4+ and CD8+ T cells (Sacco et al., 2022; Moreews et al., 2021; Porritt et al., 2021; Jackson et al., 2023; Hoste et al., 2022). Low T cell and natural killer cell numbers have been reported, with the latter having elevated expression of cytotoxicity genes (Moreews et al., 2021; Gruber et al., 2020). Increased plasma IL-18 and surface expression of CD64 are suggestive of monocyte and neutrophil activation, respectively, in the acute phase of MIS-C (Carter et al., 2020). The mechanism underlying MIS-C remains unclear yet various possibilities have been discussed including antigen persistence, T cell exhaustion, and superantigen-mediated T cell activation (Consiglio et al., 2020; Beckmann et al., 2021; Porritt et al., 2021; Hsieh et al., 2022; Yonker et al., 2021), albeit that T cell super-activation and exhaustion were likewise implicated in COVID-19 pathology that presents very differently (Laing et al., 2020). A genetic predisposition has also been considered on the basis of epidemiological evidence, with a low incidence of MIS-C reported in East Asian (EAS) countries despite a high number of COVID-19 cases (Sancho-Shimizu et al., 2021; Carter et al., 2020; Esposito and Principi, 2021).
Monogenic inborn errors of immunity (IEI) confer increased susceptibility to infections and inflammatory diseases and have been shown in many instances to underlie COVID-19 pneumonia consequent to the ineffective production of the type I IFN response (Bousfiha et al., 2022; Zhang et al., 2020; Casanova and Anderson, 2023). A genetic predisposition has also been reported to underlie susceptibility to MIS-C in previously healthy children (Lee et al., 2020, 2023; Sancho-Shimizu et al., 2021). Rare biallelic variants impairing the 2′5′oligoadenylate synthetase-ribonuclease L pathway have been shown to lead to exacerbated inflammatory responses in 1% of MIS-C patients, owing to an inability to regulate the mitochondrial antiviral-signaling protein-mediated cytokine response in mononuclear phagocytes (Lee et al., 2023). Additionally, several children with existing IEIs with variants in CFH, UNG, and TREX1 have reportedly developed MIS-C (Abolhassani et al., 2022). Other reported genetic risk factors for MIS-C include variants in XIAP, CYBB, and SOCS1 which were shown to impair the negative regulation of IFN and inflammatory signaling (Chou et al., 2021; Lee et al., 2020). Additionally, enrichment of NUMB and NUMBL variants have been identified that seem to disrupt T cell function consequent to Notch1 upregulation (Benamar et al., 2023). Here, we exome-sequenced a new cohort of 154 MIS-C patients and carried out gene burden analysis to further identify genes that underlie susceptibility to MIS-C.
Results
Characterization of the MIS-C cohort
Pediatric patients fulfilling the WHO criteria for MIS-C were recruited from the UK, the Netherlands, and the USA between March 2020 and December 2021 (WHO, 2020). The clinical and demographic characteristics of 154 recruited MIS-C patients are summarized in Table 1. Features of the MIS-C cohort were similar to those previously described with a median age of 9 years, predominately SARS-CoV-2 IgG seropositive (81.2%) and presenting with non-specific symptoms such as fever and gastrointestinal (GI) discomfort (79.2%) (Sancho-Shimizu et al., 2021; Hoste et al., 2021; Sacco et al., 2022). There were 63.6% of patients requiring intensive care with one death. Our cohort had diverse ancestry comprising 29.2% Hispanic/American (AMR), 26.0% European (EUR), 26.6% African (AFR), 11.7% South Asian (SAS), and 3.2% EAS. These patients were whole-exome sequenced (WES), and their exomes were screened for rare non-synonymous variants that have been previously reported in IEIs (Bousfiha et al., 2022). Pathogenic variants identified and consistent with the mode of inheritance as reported by Bousfiha et al. (2022) and genotypic International Union of Immunogical Societies (IUIS) companion paper (Tangye et al., 2022) are reported in Table S2 (CASP10, CD46, FANCC, PRF1, and TBX1). Variants in TBX1 have been previously reported in MIS-C but the impact remains unclear (Abolhassani et al., 2022). These variants may indeed increase the risk of developing MIS-C however further functional and clinical assessment is required to fully implicate them. Patient exomes were also screened for rare non-synonymous variants in the OAS-RNAseL pathway as biallelic variants in this pathway have been shown to lead to exacerbated inflammatory responses in MIS-C patients (Lee et al., 2023). Only monoallelic variants were identified in our cohort (Table S3).
Gene burden testing
We deployed our novel statistical framework, burdenMC, to perform rare variant enrichment analysis to discover genes that may underly the susceptibility to MIS-C. Given the low incidence of MIS-C in the population (Sancho-Shimizu et al., 2021), we focused our burden analysis on rare non-synonymous variants (allele frequency <1% in each population group). Furthermore, we used the CADD score, a widely adopted meta-predictor of variant deleteriousness (Rentzsch et al., 2019), to stratify our analysis. To that end, we imposed a Phred-scaled CADD threshold of 20 to select the top 1% most deleterious variants in the dataset including predicted loss of function (pLOF) and nonsynonymous variants which we collectively considered as rare predicted-deleterious variants. As described in Table 1 and confirmed through principal component analysis (PCA) (Fig. S1), our cohort is highly diverse, with all major ancestral groups represented. For the purposes of rare deleterious burden testing, we focused on ancestral groups comprising at least 10 individuals, which excluded five EAS individuals (3% of the cohort) in additon to five individuals with ambiguous ancestry anssignment (Table 1). For the remaining groups, burdenMC testing was performed separately for each ancestry to account for differences in allelic distributions and then combined.
Increased rare variant burden in BTNL8
This analysis revealed butyrophilin-like 8 (BTNL8), encoding a protein expressed on healthy intestinal epithelial cells modulating intestinal Vγ4+γδ T cells, to be significantly enriched for rare predicted-deleterious variants at an exome-wide significance level (P < 10−6) (Fig. 1, A and B; and Table S4) (Di Marco Barros et al., 2016). A total allele count of 20 representing 8 variants including 1 pLOF variant was identified across 12 patients, accounting for 8.3% of our cohort. BTNL8 was the only gene exhibiting a strong signal across multiple ancestries (Fig. 1, A–C). A further 12 genes were significantly burdened in the combined analysis, with support from only a single ancestral group (Table S4). Fig. 1, B and C demonstrates how individual ancestries contribute to the BTNL8 enrichment. Notably, two of the largest ancestral groups in our MIS-C cohort (EUR and AMR) exhibit a standalone statistically significant burden, while the other two (AFR and SAS) are nominally significant. Based on gnomAD frequencies, burdenMC expects on average only one rare variant to be present in BTNL8 in a cohort of the size analyzed. Of note, a variant in BTNL8 has also been described as a risk modifier for inflammatory bowel disease (IBD) (Dart et al., 2023).
BTNL8 burden is not found in COVID-19 cohort
To examine whether our BTNL8 findings were specific to MIS-C, we performed targeted burden testing in external sequencing cohorts. First, we set out to confirm that BTNL8 is not associated with susceptibility to SARS-CoV-2 infection or severity of COVID-19. To that end, we queried the results of a large-scale rare variant burden study on COVID-19 outcomes (Butler-Laporte et al., 2022). This study meta-analyzed 21 WES and whole-genome sequencing cohorts comprising an aggregate >28,000 COVID-19 cases. In that analysis, BTNL8 did not appear to be associated with any of the standardized COVID-19 outcomes tested, including susceptibility, severity, and hospitalization. Of note, that study utilized equivalent filters for allele frequency (AF) and deleteriousness as our MIS-C analysis and included participants from the same ancestry groups (Butler-Laporte et al., 2022) (Table S5).
BTNL8 burden is specific to MIS-C
To explore the role of BTNL8 in other infectious diseases and to ascertain that the observed differential burden was not due to technical artifacts or methodological limitations, we applied burdenMC directly to two other sequencing cohorts: an in-house WES data from 502 pediatric patients with severe bacterial infections from the EUCLIDS consortium (Martinón-Torres et al., 2018), and 189 patients with COVID-19 from the COVID Human Genetic Effort (COVID-HGE) consortium (Table S1) (Casanova et al., 2020). In these datasets, burdenMC found no evidence of BTNL8 association with bacterial disease or COVID-19. We then estimated the effect size of the BTNL8 variant burden across these three cohorts. The results demonstrate that individuals with rare predicted-deleterious variants in BTNL8 had a fourfold increase in the odds of presenting with MIS-C (odds ratio [OR] = 4.2, 95% confidence interval [CI]: 3.5–5.3) (Fig. 1 D). On the other hand, BTNL8 variants did not appear to increase the odds of COVID-19 (OR = 0.8, 95% CI: 0.7–1.1) or severe bacterial infection (OR = 1.1, 95% CI: 0.8–1.4).
Population genetics of BTNL8 variation
As a whole, gnomAD reports BTNL8 to be moderately intolerant to pLoF variants (LOEUF = 0.8) and missense variants (o/e = 0.83, 90% CI: 0.78–0.9) (https://gnomad.broadinstitute.org/). To estimate selective constraint at the subgenic level, we calculated the missense tolerance ratio (MTR) across BTNL8 using data from gnomAD v2.1.1 (Silk et al., 2019). As a local measure of observed versus expected missense variation, MTR can highlight protein regions under negative selection. Although much of BTNL8 is not predicted to deviate from neutrality (MTR = 1), there appear to be two regions under selective pressure (MTR < 0.7) across multiple ancestral groups (Fig. S2 A), both of which overlap with the intracellular B30.2 domain (Fig. S2 B). Although the precise function of the BTNL8 B30.2 domain is unresolved, such domains are common in other proteins implicated in innate immunity, wherein they mediate protein–protein interactions (Chae et al., 2006, D’Cruz et al., 2013). In this regard, the sole known function of BTNL8, which is to regulate colonic Vγ4+γδ T cells, totally relies on its interaction with BTNL3 (Di Marco Barros et al., 2016; Melandri et al., 2018). Additionally, the B30.2 domain of BTN3A1, a related regulator of human Vγ9+γδ T cells, binds low molecular phosphorylated lipids which reflect a cell’s infected and/or metabolic state (Sandstrom et al., 2014; Paysan-Lafosse et al., 2023). The genomic region encompassing BTNL8 and BTNL3 also contains a polymorphic copy number variant (CNV) that leads to a BTNL8*3 fusion (Aigner et al., 2013). This variant is common across ancestral groups, with an overall AF of 26% on gnomAD, and has been previously associated with a penetrating form (B3) of Crohn’s disease (Dart et al., 2023). To assess the frequency of the CNV in our cohort, we performed PCR-based CNV calling for a subset of our cohort for which there was DNA available (n = 115). This was undertaken using the previously reported BTNL8*3 CNV genotyping PCR (Dart et al., 2023). We did not observe a statistically significant overrepresentation of the CNV in our cohort (Table S6; AF = 22.0% in MIS-C versus 21.5% in genetic ancestry-matched controls). This may in part be due to the lack of statistical power, as relatively large sample sizes (>1,000 patients) would be required to detect differences in such common variants (Fig. S2 C).
Protein domain burden testing
To further elucidate the effect of rare variation on functional coding elements, we also applied burdenMC to the subgenic regions corresponding to protein domains. Given the sensitivity/specificity trade-off of computational pathogenicity predictions (Sun and Yu, 2019), no CADD score filter was applied in this analysis. To that end, we selected the top 58 genes with suggestive rare variant burden (P < 0.001) and mapped 77 well annotated domains from the InterPro database (Paysan-Lafosse et al., 2023). This analysis identified domain B30.2 of BTNL8 as the top hit (P = 1.4 × 10−5), with an allele count of five, representing three rare variants (AF < 1%, CADD > 0) versus none expected (Table S7). By contrast, there was no statistically significant enrichment for rare variants in the BTNL8 IgC and IgV ectodomains, for which no function has yet been established. Similarly, none of the domains mapped to the 12 other genes assessed carry a significant rare variant burden (Table S7).
Validation of B30.2 BTNL8 rare variant burden in independent MIS-C cohort
We set out to validate our BTNL8 findings in an independent MIS-C cohort obtained from the COVID-HGE consortium. This cohort comprises 300 children with MIS-C that match the ancestral composition of our cohort. As a comparator control group, we acquired raw sequencing data from the ICR1000 UK exome project, comprising 1,000 control samples from the British 1958 Birth Cohort (Ruark et al., 2015). Using permutation testing, we demonstrated that the rare variant burden in the B30.2 domain of BTNL8 was also observed in the external COVID-HGE dataset with an allele count of 8 in 300 cases (representing five variants across seven individuals; no homozygotes, no pLOF) versus 7 in 1,000 controls (P = 7 × 10−4), thus replicating our finding (Table S8).
Identification of variants for functional validation
To explore the potential role(s) of BTNL8, we expanded the scope of our analysis to include additional MIS-C patients from external cohorts (Table 2). Following the same filtering strategy focusing on rare variants (AF < 1% & CADD > 0), we identified 21 variants in 46 patients from COVID-HGE (n = 690), 7 of which overlap with the variants detected in our cohort. Including the 11 variants (9 patients) in our cohort, this brings the total to 25 different variants in 55 unrelated MIS-C patients (Fig. 2, A–C and Table 2). In our MIS-C cohort, all individuals were heterozygous for the BTNL8 variants, except for one patient with three variants (p.S6G-R162W-S176F). In the COVID-HGE MIS-C cohort, two individuals were homozygous for p.P299L and four individuals had more than one variant (p.S6G-R162Q; p.R162Q-S176F; p.P299L-A475T; p.R354H-Y446C) (Table 2). Of the 25 variants identified, seven map to the B30.2 domain. By comparison, in a control COVID-HGE-COVID19 cohort (n = 189), we observed only six BTNL8 variants, five of which were also present in MIS-C (Table 2). As these have not been previously assessed functionally in the literature, we performed in silico and in vitro investigations for 26 BTNL8 variants in total.
Structural modeling of BTNL8/BTNL3
The structure of the BTNL8/BTNL3 complex has not been resolved, but a robust model has been proposed based on experimental data elucidating the function of the heterodimer (Melandri et al., 2018). In this model, the BTNL8–BTNL3 interface lies on the intracellular B30.2 domain which is shared by both proteins, while the TCR interaction of the complex is mediated by BTNL3’s extracellular IgV domain (Fig. 3 A). We assessed the potential effects of rare BTNL8 variants on protein function through structural modeling of the BTNL8/BTNL3 dimer. Specifically, AlphaFold (Jumper et al., 2021) was used to generate structure predictions of the BTNL8/BTNL3 complex, and the results were visualized with Mol* (Sehnal et al., 2021) These predictions recapitulate aspects of the proposed heterodimer model while highlighting key structural features of the complex (Fig. 3). Residue interaction network (RIN) analysis of the B30.2 domain demonstrates how hydrogen bonds dominate and define the overall structure, while two disulfide bridges stabilize the complex (Fig. 3 C). Importantly, the results establish the increased connectivity of residues near the BTNL8–BTNL3 interface but also highlight additional interactions distal to the interface. Based on this analysis, five out of seven MIS-C variants in B30.2 (p.H311N, p.Y342C, p.R354H, p.Y446C, p.P456S) exhibit a high degree of residue–residue interaction.
Protein expression of BTNL8 variants
Given the co-dependence of BTNL8 and BTNL3 surface expression, changes in the cell surface expression of BTNL8 and BTNL3 were assessed following co-transfection of BTNL8 and BTNL3 into 293T cells which ordinarily express neither BTNL8 nor BTNL3. Of the 26 variants tested, all but two variants in the IgV and IgC domains (p.L101Q and p.R215X) were expressed comparably to the WT reference allele (Fig. 4). Leucine in position 101 is highly conserved across BTN and BTNL proteins in humans and mice, and hence p.L101Q may result in improper protein folding of the variable domain, consequently impacting surface expression levels. The nonsense p.R215X mutant displayed total loss of surface expression as expected given that the premature stop codon occurs before the transmembrane domain. Only two out of seven variants in the B30.2 domain (p.H311N and p.S335C; Fig. S3) showed comparable expression levels to that of the reference WT allele, whereas variants p.P299L, p.Y342C, p.R354H, p.Y446C, and p.P456S exhibited reduced surface expression. In conclusion, of the 26 variants tested, 6 variants showed impaired cell surface expression.
Assessing functional impact of BTNL8 variants
We next assessed the capacity of the 26 variants to engage a Vγ4+ TCR, based on the established assays of TCR downregulation and CD69 upregulation, the latter reflecting the induction of TCR-dependent signaling (Fig. 4 and Fig. S3) (Melandri et al., 2018). Thus, Vγ4Vδ1-expressing Jurkat cells were co-incubated with 293T cells transduced to express the WT reference allele of BTNL8 or variants thereof together with a WT reference allele of BTNL3 to facilitate BTNL8 surface expression. All six variants with impaired surface expression also exhibited reduced activity with p.R215X showing complete loss of function (Fig. 4). Despite seemingly normal surface expression levels, p.S232L and p.R354H also showed impaired capacity to induce CD69 upregulation and TCR downregulation (Fig. 4). Variants p.299L, p.Y446C, and p.P456S each exhibited normal T cell activation with regards to percentage of CD69+ Vγ4Vδ1 cells induced but were strikingly impaired in the capacity to downregulate the TCR (Fig. 4). Of the eight variants with impaired function, most were located within the B30.2 domain (p.P299L, p.H311N, p.Y342C, p.R354H, p.Y446C, and p.P456S), highlighting its importance. In summary, we have identified 8 variants in 18 MIS-C patients with impaired BTNL8 expression and/or function among a total of 835 MIS-C patients (2.2%) This evidence suggests that in these patients, there is likely to be an impaired capacity of BTNL8/BTNL3 heterodimer to engage γδ T cells, potentially creating homeostatic imbalance in the gut, a major site of MIS-C pathology.
BTNL8 expression in whole blood
BTNL8 is very evidently expressed exclusively by intestinal epithelial cells and also reportedly by neutrophils (https://gtexportal.org). BTNL3 is predominantly expressed in intestinal tissues and exhibits very low (arguably negligible) expression in blood. The role of BTNL8 in the gut has been extensively studied (Vantourout et al., 2018; Dart et al., 2023; Melandri et al., 2018; Willcox et al., 2019), while its role in neutrophils remains wholly unelucidated. Intestinal tissue from acute MIS-C patients was unavailable, so intestinal expression could not be assessed in our cohort. However, whole blood expression could be ascertained from existing samples, so we opted to interrogate BTNL8 and BTNL3 expression in an external RNA sequencing (RNA-seq) control cohort. This dataset comprises MIS-C patients (35 overlapping with our WES), pediatric healthy controls, and pediatric febrile controls (Table S1) and examines expression across three timepoints (acute illness, posttreatment, and convalescence). When compared with healthy controls, MIS-C patients did not exhibit a statistically significant difference in BTNL8 expression (Fig. 5 A). BTNL8 levels were significantly higher in MIS-C compared with viral febrile controls during acute illness. Furthermore, when broken down by time point within MIS-C, BTNL8 expression exhibited a trend toward decreased expression in acute illness, which recovered upon convalescence although this remains non-significant (Fig. 5 A). To determine whether this trend is driven by neutrophilia or lymphocytopenia, which are hallmarks of MIS-C (Whittaker et al., 2020; Rowley et al., 2020), we imputed relative leukocyte fractions from our RNA-seq dataset using CibersortX (Newman et al., 2019). In agreement with the literature, our MIS-C patients exhibit high neutrophil fractions in the acute phase compared with convalescence (Fig. S4 A). However, even after accounting for these differences, our BTNL8 expression findings remain consistent (Fig. S4 B). We also examined BTNL3 in the same RNA-seq cohort, which exhibited minimal expression across phenotypes and time points (Fig. 5 B and Fig. S4 C). In an orthogonal proteomic dataset, plasma BTNL8 levels did not differ across disease groups (Fig. 5 C). A significant increase in BTNL3 plasma protein expression (P < 0.05) was observed in MIS-C patients (n = 79, including 3 of our exome-sequenced cohort; p.S6G, p.H311N; p.S6G-R162Q-S176F), although the detectible levels were so small (Fig. 5 D), that any biological significance is both unresolved and unlikely (Fig. 5, C and D).
BTNL8 expression upon stimulation of whole blood
Having obtained evidence that BTNL8 expression could be detected in whole blood, we sought to evaluate changes in BTNL8 expression that might follow exposure to SARS-CoV-2 antigens. Whole blood from an external MIS-C cohort was stimulated in SARS-CoV-2 QuantiFERON collection tubes using CD4+ specific SARS-CoV-2 receptor binding domain peptides (Ag1), CD4+/CD8+ specific SARS-CoV-2 peptides (Ag2), or mitogen. Expression was assessed by RNA-seq. Following QuantiFERON stimulation, MIS-C patients had elevated IFNG transcripts, encoding IFNγ, suggesting that patients’ cells were able to mount an immunological response to SARS-CoV-2 antigens and mitogen (Fig. S4 D). Upon convalescence, IFNG expression increased ∼10-fold compared with acute MIS-C, to a level that was similar to healthy controls. This could be an indicator of T cell exhaustion in the acute MIS-C samples, a known diagnostic feature of MIS-C (Shankar-Hari et al., 2023; Hoste et al., 2022). Although BTNL8 expression was again detected, none of the stimuli induced a significant difference in expression levels although a downward trend was noted (these data include the patient with hypomorphic p.L101Q) (Fig. S4 E).
Potential role of intestinal immunity and BTNL8 in MIS-C
The significant association of BTNL8 variant alleles with MIS-C might offer a link to enteropathy which is a common symptom of the disease (Table 1). Interestingly, 100% of MIS-C patients with BTNL8 variants in our cohort exhibited GI symptoms (Table 1). Among the many proposed triggers of MIS-C is the persistence of SARS-CoV-2 or SARS-CoV-2-related antigen in the gut leading to increased gut inflammation and intestinal permeability resulting possibly in antigenemia and systemic inflammation (Yonker et al., 2021). To evaluate the level of intestinal damage in MIS-C patients, in the absence of fecal samples, we assessed plasma-based intestinal permeability biomarkers including zonulin, calprotectin, LBP, and FABP2 (Kalla et al., 2016; Vreugdenhil et al., 2011; Seethaler et al., 2021; Riva et al., 2020) in MIS-C patients and controls (Fig. 5 E and Fig. S5). Increased plasma zonulin was observed in acute MIS-C and KD patients whereas increased calprotectin and LBP were found in acute MIS-C patients compared with all groups (Fig. 5 E and Fig. S5) (Yonker et al., 2021). MIS-C patients with BTNL8 variants (n = 3; p.S6G, p.H311N; p.S6G-R162Q-S176F) appear to have plasma zonulin levels within the higher end of the range compared with other MIS-C patients. An increase in these biomarkers is consistent with compromised intestinal integrity. Increased FAPB2 is normally associated with intestinal permeability; however, we saw a significant decrease (Fig. S5 D), which has been reported previously in MIS-C patients with gastroenteritis (Josyabhatla et al., 2021). In addition, FABP2 is a small intestine–specific marker whereas BTNL8 is mainly associated with colonic epithelial cells potentially explaining these observations (Di Marco Barros et al., 2016; Dart et al., 2023). Moreover, previous studies have found that in adult COVID-19 patients’ expression of plasma FABP2 is especially low in patients with the highest levels of inflammation and gut permeability. Assante et al., in fact, suggested in COVID-19 patients reduced plasma FABP2 is a marker of SARS-CoV-2 infected enterocytes (Assante et al., 2022). We were unable to test pre-MIS-C or convalescent samples to assess baseline levels. However, given that baseline uninflamed gut biopsies from BTNL8*3 CNV IBD patients showed normal intestinal architecture (Dart et al., 2023), we suspected BTNL8 may not impact the baseline state but affect the resolution of inflammation once triggered. In this context, patients with variants in BTNL8 may present with unresolved inflammation due to increased intestinal permeability and gut dysbiosis upon infection leading to systemic inflammation resulting in MIS-C (Fig. 6).
Discussion
By applying a refined method of gene burden analysis, burdenMC, we have identified enrichment of rare genetic variants in BTNL8 among patients with MIS-C. The burdenMC method allowed confident exome-wide enrichment analysis across a cohort of diverse ancestries. BTNL8 variants were enriched exclusively in MIS-C when compared with genetic ancestry-matched healthy controls. We found rare predicted-deleterious variants in 8.3% of our MIS-C cohort (n = 144). BTNL8 was found to be specifically associated with MIS-C and not COVID-19 or invasive bacterial diseases, contributing a fourfold increased odds of presenting with MIS-C symptoms. Notably, the BTNL8 burden signal is driven largely by the European and Hispanic sub-cohorts, suggesting that the excess risk of BTNL8-related MIS-C is borne disproportionately by individuals of those ancestries. Furthermore, a protein domain burden analysis confirmed that rare variants in the B30.2 domain of BTNL8 were enriched in our MIS-C cohort, and in an additional COVID-HGE-MIS-C cohort. We tested all 25 rare BTNL8 variants from both MIS-C cohorts and found 8 to be functionally impaired accounting for 18 MIS-C patients affecting 2.3% of all MIS-C patients (n = 835). Notably, 6 of 8 functionally impaired variants were located within the B30.2 domain, which in other contexts has been shown to mediate protein-protein interactions and/or sensing of intracellular metabolites (Yuan et al., 2023; Willcox et al., 2019).
The mechanism by which these variants might promote MIS-C remains unclear. As in many inflammatory diseases, including COVID-19, MIS-C has shown dysregulation of neutrophils which reportedly express BTNL8. However, neither cell surface expression nor function has yet been established for BTNL8 in neutrophils, and we had no success in identifying changes in BTNL8 expression in whole blood from MIS-C patients effectively stimulated with SARS-CoV-2 antigens. In addition, neutrophil BTNL8 expression has only been reported in public expression datasets (https://gtexportal.org), with no studies focused on isolated neutrophils; moreover, we did not have access to patients’ neutrophils. However, our whole blood RNA-seq data suggest that changes in neutrophil counts during acute disease did not significantly impact BTNL8 expression (Fig. S4). Hence there is no unequivocal evidence for a role for BTNL8 in neutrophils in our data, but this does not exclude some role for it in the development of disease; in sum, the issue warrants further investigation. In another study, BTNL8 was found to be upregulated in regulatory T cells (Tregs) and a possible marker of regulatory macrophage-induced interleukin-10-expressing-induced Tregs (Riquelme et al., 2018), which suggests another possible role for BTNL8 in the blood. Specifically, Tregs of MIS-C patients, which we were not able to assess in this study, may be an area of further investigation.
The other reported site of BTNL8 expression is the human gut epithelium where its co-expression with BTNL3 critically regulates intraepithelial Vγ4+ T cells that are implicated in maintaining and/or restoring the integrity of the gut barrier. Indeed, a common hypomorphic variant allele fusing BTNL8 and BTNL3 is associated with disease-modifying exacerbation of Crohn’s disease, consequent to dysregulation of gut Vγ4+ cells (Dart et al., 2023). Evidence for the importance of BTNL8 in the development of enteropathy is highlighted not only by this study suggesting a role for BTNL8 in the resolution of gut inflammation but furthermore in a recent study, where BTNL8 was used as a marker for enteropathy in immunodeficiency, polyendocrinopathy, and enteropathy X-linked syndrome (IPEX) patients (Vazquez et al., 2022). Unfortunately, intestinal tissue from acute MIS-C patients was unavailable, thus limiting our investigations. However, the prospect that the reduction in the surface expression and/or function of BTNL8 may result in hyperinflammation of the gut and increased intestinal permeability is completely consistent with increased gut permeability documented in most of our MIS-C patients, and invariably in those with variant BTNL8 alleles. Additionally, gut barrier dysregulation may promote antigenemia and thus systemic inflammation. This is in turn is consistent with a proposed model for MIS-C in which there is a persistent reservoir of virus in the gut, demonstrated by viral RNA in stool in other studies of MIS-C (Yonker et al., 2021). The delay in onset of MIS-C following initial SARS-CoV-2 exposure as well as the persistence of virus in the intestinal milieu may reflect a breakdown in local immunological tolerance that takes time to accrue and become symptomatic. Local immunologic tolerance would be predicted to be affected by barrier integrity, consistent with which mice lacking γδ T cells show increased αβ T cell–mediated inflammation, both spontaneously and following insults (Born et al., 1999; Markle et al., 2013; Hayday and Tigelaar, 2003; Yuan et al., 2023).
Several issues remain unresolved. First, our functional assays for BTNL8 variants are limited by our knowledge, being focused on specific aspects of Vγ4 TCR engagement that may therefore underestimate the frequency of hypomorphs among the variants. Second, although there may be many explanations, our failure to find enrichment of rare predicted-deleterious BTNL3 variants in our cohort might also point to BTNL3-independent functions of BTNL8 that we have not assayed. Third, the striking enrichment of BTNL8 variants in the B30.2 domain highlights the importance of better understanding its function, particularly in relation to gut epithelial cell regulation.
Finally, BTNL8 variation has not previously been associated with an infectious disease. Given the importance of BTNL8 in regulating the severity of Crohn’s disease (Dart et al., 2023), and given animal models in which γδT cell deficiencies have been associated with disease severity rather than disease incidence, it would be of interest to further investigate the role of this pathway in pediatric IBD and other pediatric inflammatory disorders such as KD where tissue repair may be an essential limit on symptoms. Interestingly, we have observed increased zonulin levels in KD patients who have been shown to share a host immune response with MIS-C (Tsoukas and Yeung, 2022), possibly suggesting commonalities of pathogenesis (Ghosh et al., 2022; Tsoukas and Yeung, 2022). Beyond this, our data confirm that genetic susceptibility may be a contributing factor in the development of MIS-C. Understanding the genetic basis of MIS-C may not only help elucidate its pathogenesis but also provide insights in understanding and/or managing other similar hyperinflammatory syndromes.
Materials and methods
Human experimental guidelines approval statement
This research included patients recruited from multiple countries and was conducted with written informed consent from the residing country of the patient and in accordance with the local regulation and institutional review board approvals. These include the following: the Medical Research Guidelines (https://www.ukri.org/councils/mrc/) and the National Health Service Research Ethics Committee (https://www.hra.nhs.uk/about-us/committees-and-services/res-and-recs/), Investigating the Genetics of Infection Study REC17/LO/153; DIAMONDS: Diagnosis and Management of Febrile Illness using RNA Personalized Molecular Signature Diagnosis REC20/HRA/1714; PERFORM: Personalised Risk assessment in Febrile illness to Optimize Real-life Management across the European Union REC16/LO/1684; EUCLIDS: the genetic basis of meningococcal and other life-threatening bacterial infections of childhood REC11/LO/1982; the Medical Ethical Committee of the Amsterdam UMC NL41023.018.12; the French Ethics Committee; the French National Agency for Medicine and Health Product Safety; the Institut National de las Santé et de la Recherche Medicale in Paris, France (protocol no. C10-13); the Rockefeller University Institutional Review Board in New York, NY, USA (protocol no. JCA-0700); the University of California at San Diego Institutional Review Board (protocol no. 140220 and 170790); the University of Hong Kong/Queen Mary Hospital (protocol no. UW 20-509); and University of São Paulo (protocol no 2020/05548-8).
Patients
MIS-C patients (n = 154) were enrolled via DIAMONDS (Diagnosis and Management of Febrile Illness using RNA Personalised Molecular Signature Diagnosis) (https://diamonds2020.eu) (n = 84); Amsterdam UMC, Netherlands (n = 13) and KD Research Center at University California, San Diego, La Jolla, CA, USA (n = 57). All ethical approvals necessary are in place for access to patient clinical data and biological samples. Clinical features of the cohort are described in Table 1, and all MIS-C patients (≤19 years old) met the World Health Organization diagnostic criteria for MIS-C (WHO, 2020) and were enrolled prior to December 2021. The diagnostic criteria as defined by the World Health Organization are as follows: Children and adolescents 0–19 years of age with fever >3 days, and two of the following: rash or bilateral non-purulent conjunctivitis or mucocutaneous inflammation signs; hypotension or shock; features of myocardial dysfunction, pericarditis, valvulitis, or coronary abnormalities; evidence of coagulopathy; and acute GI problems (diarrhea, vomiting, or abdominal pain). In addition to elevated markers of inflammation, no other obvious microbial cause of inflammation and evidence of COVID-19 or likely contact. External cohorts used for validation are summarized in Table S1. These include exome-sequenced MIS-C (n = 690) and SARS-CoV-2 infected children (n = 189) from the international COVID-HGE cohort (https://www.covidhge.com/) used for gene burden analysis; a previously published cohort of MIS-C patients (n = 38), healthy (n = 134), and febrile controls (n = 326) used for transcriptomic analysis recruited through DIAMONDS or prepandemic though PERFORM (Personalised Risk Assessment in Febrile Illness to Optimize Real-like Management across the European Union) (https://www.diamonds2020.eu/our-research-history/perform/) or EUCLIDS (European Union Childhood Life-Threatening Infectious disease Study (https://www.diamonds2020.eu/our-research-history/euclids/) (Jackson et al., 2023).
WES
Genomic DNA was isolated from blood, quantified, and quality-checked before carrying out WES (Novogene Ltd.). Exome targets were captured using Agilent SureSelect Human All Exon v6, and sequenced on Illumina NovaSeq yielding 150 bp paired-end reads. The mean on-target coverage was >50×, with 95% of target bases being covered at least 10×.
WES processing
Sequencing reads were aligned to the reference genome (hs37d5) using BWA-MEM (Li and Durbin, 2009; Li, 2013, Preprint). Raw-mapped reads were postprocessed to account for duplicates and systematic errors using Picard tools (https://broadinstitute.github.io/picard/). Per-sample variant calling was then performed using GATK (Van der Auwera et al., 2013; Poplin et al., 2018, Preprint), followed by joint genotyping across the entire cohort according to best practices. This process generated raw SNP and indel calls that were further refined and annotated using the Ensembl Variant Effect Predictor (McLaren et al., 2016). Further customized variant annotation was performed to facilitate downstream comparisons. Finally, the genotypes underwent cohort-level quality control with Peddy (Pedersen and Quinlan, 2017) to identify confounders such as cryptic relatedness and sample contamination. Genetic screening was performed to identify rare nonsynonymous variants previously reported to be involved in primary immunodeficiency (Bousfiha et al., 2022), severe COVID-19 (Zhang et al., 2020), MIS-C (Lee et al., 2023), KD, and familial hemophagocytic lymphohistiocytosis (Constantin et al., 2023) (Table S2). Medium to high-impact variants including missense, nonsense, insertion/deletion with a population AF of <0.01 were considered. Consequent to ambiguous genetically predicted ancestry assignment, nine patients were removed prior to burden testing.
Empirical burden testing
WES is currently most informative for relatively large cohorts of ethnically homogeneous patients, who are expected to carry mutations in a small set of genes along the same biological pathway. Such cohorts are inherently underpowered to detect common variants with small effects but can be leveraged to extract knowledge from aggregated lower-frequency variants. Techniques that aggregate variants to identify genes and pathways overburdened with low-frequency mutations can help unravel the genetic etiology of uncommon disease but currently face several limitations, including, lack of matched control samples, and the requirement for ancestral homogeneity. To address these limitations, we have developed a novel computational framework, burdenMC, to extract knowledge from small, and ethnically heterogeneous patient sequencing cohorts. Our approach has been designed to be as inclusive and flexible as possible, without sacrificing statistical power.
burdenMC
Our framework generates empirical estimates for the burden of rare variants in patient sequencing cohorts. To achieve this, we take advantage of existing large-scale population sequencing databases that capture the breadth of human genomic variation across different populations. The main contributing database is gnomAD (Gudmundsson et al., 2022), which contains whole genome and whole exome sequences of >140,000 individuals from around the globe. burdenMC is a Monte Carlo sampling method that iteratively simulates control datasets from gnomAD using only the summary statistics that are publicly available.
The process starts by determining the genetic ancestry of the individuals comprising the test cohort. To leverage the largest available training dataset, burdenMC utilizes the global ancestry inference pipeline devised by gnomAD (Karczewski et al., 2020). This entails projecting the test sequencing data onto gnomAD’s precomputed principal components and running a Random Forest classifier to assign each individual to one of five continental ancestry groups (EUR, AFR, SAS, EAS, AMR). The sizes of each ancestry group are denoted as NEUR, NAFR, NSAS, NEAS, and NAMR. Next, burdenMC extracts raw variant-level allele count data (ACEUR, ACAFR, ACSAS, ACEAS, and ACAMR) from gnomAD across the entire genome, along with the total number of gnomAD alleles in called genotypes (ANEUR, ANAFR, ANSAS, ANEAS, and ANAMR). Using the hypergeometric distribution, burdenMC performs Monte Carlo resampling for each allelic variant to simulate control groups matching the test cohort. This was achieved through iterative generation of hypergeometric random variates from Hypergeometric(ANpop,ACpop,Npop), where . To capture the extreme values of the allelic distribution, the Monte Carlo simulation was iterated 106 times, thus generating 106 control cohorts of size Npop.
For each of these control cohorts, burdenMC aggregates the number of sampled variants (at the exon, gene, pathway, or user-defined level) to obtain the empirical probability distribution of genomic burden. Finally, we calculated the variant burden in the test cohort (BPOP) at an equivalent aggregation level and by comparing to the empirical distribution we estimate P values as Pr(X≥BPOP), i.e., the probability of randomly observing a burden at least as large as BPOP (Fig. S1). To ensure comparability between the test and control sets, we performed extensive data harmonization, which includes a customized reciprocal filtering and pruning strategy.
Variant filtering strategies
burdenMC relies on publicly available genomic datasets to simulate control samples and therefore requires rigorous quality control at the variant level. To that end, we developed a comprehensive variant filtering strategy, combining population-based metrics as well as cohort-derived statistics. A key tenet of burdenMC filtering is reciprocity, meaning if a variant is flagged in controls, it will also be filtered in cases, and vice versa. Importantly, to avoid penalizing the sensitivity of burdenMC, some of our filters are implemented as annotations (“soft filters”), with informative flags attached to the burden result to aid interpretation.
Population-derived filters
As it comprises the largest publicly accessible database of genomic variants, gnomAD v2.1 is used as the basis for our population filters. Variants failing gnomAD’s own Random Forest filter are removed from downstream analyses. In addition, we filter variants located in genomic regions that are not well covered in the dataset (flag: gnomadDP0). This is achieved by inverting gnomAD’s documented exome calling regions. Given the database’s total sample size of 141,456, we also filtered variants for which <5,000 individuals had genotype calls (flag: ns5000). As burdenMC depends on summarized AF, the ns5000 filter aims to penalize sites at which rare variation may not be captured.
Genomic complexity filters
Certain genomic loci are challenging for both exome capture protocols and variant calling algorithms due to their low complexity. This manifests in multiple ways, most notably in the presence of repeat elements. To identify these regions, we modified the genome masking process originally devised for the Simons Genome Diversity Project (Mallick et al., 2016). Specifically, we used three criteria: the output of the symmetric DUST algorithm (score ≥ 28), which identifies tandem trinucleotide repeats (Morgulis et al., 2006); long homopolymer runs (≥7 bp); DNA satellites identified by RepeatMasker (as reported by the UCSC genome annotation database) (Smit et al., 2015). The resulting regions were merged and extended by 10 bp on either side to generate the final low-complexity region filter (flag: lcr).
Cohort-derived filters
We also implemented quality control filters that are calculated directly on the cohort being examined. We followed GATK best practices by filtering variants with potential strand bias, using the StrandOddsRatio metric (SOR > 3). We also apply a lenient filter to identify sites with excess heterozygosity (ExcessHet > 100). Similarly, we flagged variants that appear to violate Hardy-Weinberg assumptions (P value <0.001). These metrics are designed to highlight technical artifacts or cryptic consanguinity and are most informative for larger cohorts (>100 unrelated individuals).
Despite using SOR-based filtering, strand bias remained an issue particularly for genomic sites of relatively low coverage. To address this, we followed gnomAD practice by filtering variants where all individuals failed one of the following two criteria: skewed minor allele balance (<20% or >80%); low quality genotype (read depth <10 and genotype quality <20). This filter captures sites for which the only individuals that don’t exhibit strand bias are the ones with less confident genotype calls (flag: AC0). Finally, we excluded variants with >1% of the genotypes missing.
Output annotations
We also designed some annotations for the burdenMC results to highlight unusual scenarios that might require follow-up. When the burden signal is driven by only one variant, burdenMC becomes a proxy for a single-variant association test, which is not its intended purpose. The result may still be relevant to the phenotype, but further investigation is required (flag: var1). In another scenario, the burden signal may be driven by variants that appear to be common in the cohort of interest but are expected to be rare in the population. This may reflect a true over-representation of a causal variant, but it also captures dubious variants that have not triggered any of the previously described filters. Therefore, for variants that exhibit a pronounced allele frequency disparity between cases and controls (AF > 5% versus AF < 1%), we flag the corresponding burden result to aid interpretation (flag: af5).
Linkage disequilibrium (LD) pruning
Given the breadth of genetic variation that is represented in large databases such as gnomAD, statistical methods that examine multiple genomic sites simultaneously need to account for the co-occurrence of (neighboring) variants. Especially for approaches that rely solely on summary statistics, treating each gnomAD variant as independent will result in inflated aggregate counts in the control group, thus reducing the sensitivity to detect true mutational burden. To address this challenge, we have implemented a pruning step to identify and filter variants that rely solely on summary statistics, treating each gnomAD variant as independent will result in inflated aggregate counts in the control group, thus reducing the sensitivity to detect true mutational burden. To address this challenge, we have implemented a pruning step to identify and filter variants that are in LD.To achieve this, we calculated exhaustive pairwise LD in 10-kb genomic windows using raw data of variant co-occurrence generated by gnomAD. This was performed separately per ancestral group to account for differences in LD structure. We then annotated variants exhibiting high levels of LD (r2 ≥ 0.95) for downstream pruning (flags: ld_eur, ld_afr, ld_amr, ld_sas, ld_eas).
Since LD patterns can be variable between populations for the same locus, pruning was not performed at the variant level, but selectively for ancestry-specific summary statistics. This process requires a memory-efficient representation of genomic variants to allow for random access of the entire gnomAD database. To that end, we employed the reversible numerical encoding formulated by VariantKey (Asuni and Wilder, 2019, Preprint) to create an index structure that could be queried for pruning. Using this strategy, 84,943 variants were pruned for at least one ancestral group, representing ∼1.36% of all gnomAD variants (global AF < 5%). Finally, an additional round of LD pruning is performed directly on the specific cohort being analyzed using raw sequencing data. This is designed to identify linkage patterns (r2 > 0.95) for ultra-rare and novel variants that would be absent from the databases and may thus inflate the aggregate counts in the case group, leading to type I errors.
Effect size calculation
To quantify the effect size of aggregated variant burden, we adapted a strategy that was originally developed for the analysis of UK Biobank exome sequencing data (Backman et al., 2021; Bycroft et al., 2018).This is achieved by collapsing rare deleterious variants across the gene into a single marker. This marker is scored with 0 for individuals carrying no variants in the gene of interest, 1 for individuals carrying only heterozygous variants, and 2 for individuals carrying at least one homozygous variant. For cohorts with individual-level sequence data available, the score can be directly calculated. For comparator groups, such as gnomAD, which only provide population-level data, we utilized the Monte Carlo simulations generated by burdenMC to obtain conservative estimates of the collapsed variant score. The scores are then used as predictors in a logistic regression framework with disease status as the binary phenotype.
Transient transfection of BTNL8 and flow cytometry
Patient variants of BTNL8 were introduced into pCSIGPW encoding HA-BTNL8 (Vantourout et al., 2018) by site-directed mutagenesis quikchange II PCR (Agilent) and sequence confirmed. 293T human embryonic kidney cells were transfected with fivefold dilutions (4–500 ng) of HA-tagged BTNL8 and FLAG-tagged BTNL3 using PEI (3:1 PEI:DNA ratio). Total amount of transfected DNA was kept constant by supplementing BTNL plasmids with empty vector (EV) to 1 µg per 250,000 cells. 48 h posttransfection, cells were either stained with PE anti-FLAG (clone L5; Biolegend) and APC anti-HA (clone 16B12; Biolegend) to assess cell surface expression, or co-cultured for 5 h at 37°C, 5% CO2 with J76 cells transduced with the Vγ4Vδ1 TCR clone hu17, as previously described (Melandri et al., 2018). Co-cultured cells were then stained with PE anti-CD45 (clone HI30; Biolegend) to separate J76 and 293T cells; BV421 anti-CD3ε (clone OKT3; Biolegend) to monitor TCR downregulation; and APC anti-CD69 (clone FN50; Biolegend). To determine the surface expression of FLAG-BTNL3 and HA-BTNL8, specific geometric mean fluorescence intensities (gMFI) were calculated by subtracting the gMFI values of cells transfected with EV control and stained with anti-FLAG and anti-HA. TCR downregulation was calculated as = 100 − ([CD3ε gMFI of J76 cells co-cultured with BTNL-transfected cells]/[CD3ε gMFI of J76 cells co-cultured with EV-transfected cells]) * 100. CD69 upregulation was calculated as = (%CD69+ J76 cells co-cultured with BTNL-transfected cells)/(%CD69+ J76 cells co-cultured with EV-transfected cells).
QuantiFERON assay and RNA-seq
An external MIS-C cohort was recruited via DIAMONDS (https://diamonds2020.eu) (QuantiFERON RNA-seq cohort; Table S1) (Shankar-Hari et al., 2023). The cohort for transcriptomic analysis included MIS-C (n = 21) patients with a confident viral infection phenotype (n = 3) and pediatric healthy controls (n = 10). Venous blood was collected from patients and stimulated for 16–24 h using SARS-CoV-2 QuantiFERON tubes (Qiagen) and subsequently stored in a PAXgene blood RNA tube. Total RNA was extracted using PAXgene Blood miRNA kit (Qiagen) followed by additional DNAse treatment using RNA clean & concentrator kit (Zymo Research). Bulk RNA-seq was performed using NovaSeq PE150 sequencing platform by Novogene Ltd.; generating 60 M read per sample.
RNA-seq analysis
Prior to data analysis, genes with zero counts across all samples were removed from all datasets. PCA was performed to identify and remove outliers. Counts were normalized using DESeq2 (Love et al., 2014), and genes with zero counts across all samples post normalization were removed. Analysis of a published whole blood RNA-seq cohort (whole blood transcriptomics cohort; Table S1) was additionally performed (Jackson et al., 2023). Statistical analysis was conducted using the Mann–Whitney U test (P < 0.05).
Protein structure modeling
To assess the effects of genetic variation on protein function, we developed a structural modeling framework based on AlphaFold (Jumper et al., 2021). First, structure predictions are generated using AlphaFold-multimer (Evans et al., 2022, Preprint) and the MMseq2-based homology search heuristic (Mirdita et al., 2022). The resulting structural models are ranked based on their template modeling scores (TM-scores) and the highest-ranking models are selected for downstream analyses and visualization. Subsequently, to identify key amino acids in the predicted structure we performed RIN analysis, using established approaches (Clementel et al., 2022). This technique employs graph theory, with amino acids represented as nodes and residue interactions represented as edges in the network. The interaction edges are defined by 3D geometry and physico-chemical properties and thus capture higher order relationships between amino acids.
Proteomic analysis
Plasma levels of BTNL3, BTNL8, and zonulin were compared on a subset of patients analyzed on the SomaScan 7k platform (SomaLogic Inc.), a multiplexed assay that measures 7,000 known protein targets using modified aptamers (slow off-rate modified aptamers, SOMAmers), which included pediatric patients with MIS-C (n = 79), KD (n = 38), definite viral infections (DV; n = 43), and healthy controls (n = 24), recruited from the PreVAIL, DIAMONDS, and PERFORM studies (SomaScan cohort; Table S1) (Yeoh et al., 2024; Wang et al., 2023). Statistical analysis was conducted using Kruskal–Wallis and Mann–Whitney test (P < 0.05).
Online supplemental material
Fig. S1 shows a schematic representation of our novel genetic burden testing. Fig. S2 shows the genetic variation in BTNL8. Fig. S3 shows example flow cytometry plots for BTNL8 and BTNL3 transfected 293T cells. Fig. S4 shows BTNL8 gene expression in whole blood and SARS-CoV02 antigen stimulated whole blood. Fig. S5 shows abundance of serum proteins used as markers for intestinal integrity. Table S1 shows the breakdown of external cohorts described and the analyses. Table S2 shows rare non-synonymous variants in genes previously implicated in primary immunodeficiencies identified in MIS-C patients. Table S3 shows genetic variants identified in MIS-C cohort in OAS-RNAseL pathway. Table S4 shows genes significantly enriched in combined gene burden testing. Table S5 shows COVID-19 host genetics initiative meta-analysis gene burden results for BTNL8 which is adapted from supplementary material of Butler-Laporte et al. (2022). Table S6 shows CNV frequency in MIS-C cohort compared to gnomAD. Table S7 shows rare variant burden analysis at the protein domain level. Table S8 shows domain-level BTNL8 rare variant burden in COVID-HGE cohort.
Acknowledgments
We thank Luigi D. Notarangelo, Helen Su, Ginni Khurana, Venizelos Papayannopoulos, and Robin Dart for helpful discussion on the manuscript.
This work was funded by the following: V. Sancho-Shimizu is funded by the UK Research and Innovation Future Leaders Fellowship (MR/S032304/1), H2020 UNDINE HORIZON-HLTH-2021-DISEASE-04 project 101057100, NIHR Imperial Biomedical Research Centre (PA0873), Community Jameel Imperial College COVID-19 Excellence Fund. J.A. Herberg is funded by Rosetrees Trust COVID-19 fund. M. Levin has recieved funding from the European Union’s Seventh Framework Programme under EC-GA no. 279185 (EUCLIDS), and the European Union’s Horizon 2020 Grant Agreement no. 848196 the Diagnosis and Management of Febrile Illness using RNA Personalised Molecular Signature Diagnosis Study (DIAMONDS). J.C. Burns received funding from the National Institutes of Health (NIH) and National Institute for Child Heath and Development R33HD105590 for the PreVAIL grant. A.C. Hayday acknowledges support from the Francis Crick Institute, which receives its core funding from Cancer Research UK (CRUK) (FC001003), the UK Medical Research Council (FC001003), and the Wellcome Trust (FC001003). F. Rieux-Laucat acknowledges funding from ANR-18-RHUS-0010 and ANR-flash Covid19 ‘‘AIROCovid’’ the Fondation pour la recherche Medicale (FRM: EQU202103012670). M. Noursadeghi is supported by the Wellcome Trust (207511/Z/17/Z) and NIHR Biomedical Research Centre at University College London Hospitals. Additional support for A. Pujol was provided by the Fundació La Marató de TV3 (202131-32-33), ACCI20–759 CIBERER, Marató TV3 COVID 2021–31-33, the Horizon 2020 program under grant no. 824110 (EasiGenomics grant no. COVID-19/PID12342), AGAUR 2021SGR00899 and the CERCA Program/Generalitat de Catalunya. This study makes use of the ICR1000 UK exome dataset generated by Professor Nazneen Rahman’s Team in the Division of Genetics & Epidemiology at The Institute of Cancer Research, London. This work made use of data and samples generated by the 1958 Birth Cohort (NCDS), which is managed by the Centre for Longitudinal Studies at the UCL Institute of Education, funded by the Economic and Social Research Council (grant number ES/M001660/1). Access to these resources was enabled via the Wellcome Trust & Medical Research Council (MRC): 58FORWARDS grant (108439/Z/15/Z) (The 1958 Birth Cohort: Fostering new Opportunities for Research via Wider Access to Reliable Data and Samples). Before 2015 biomedical resources were maintained under the Wellcome Trust and MRC 58READIE Project (grant numbers WT095219MA and G1001799). Data governance was provided by the METADAC data access committee, funded by Economic and Social Research Council, Wellcome, and MRC (2015–2018: grant number MR/N01104X/1; 2018–2020: grant number ES/S008349/1. The Laboratory of Human Genetics of Infectious Diseases is supported by the Howard Hughes Medical Institute, the Rockefeller University, the St. Giles Foundation, the NIH (R01AI163029 and R21AI160576), the National Center for Advancing Translational Sciences, NIH Clinical and Translational Science Award program (UL1TR001866), the Fisher Center for Alzheimer’s Research Foundation, the Meyer Foundation, the JPB Foundation, the French National Research Agency (ANR) under the “Investments for the Future” program (ANR-10-IAHU-01), the Integrative Biology of Emerging Infectious Diseases Laboratory of Excellence (ANR-10-LABX-62-IBEID), the French Foundation for Medical Research (FRM) (EQU201903007798), the ANR GenMISC (ANR-21-COVR-039), the ANRS-COV05, ANR GENVIR (ANR-20-CE93-003) and ANR AABIFNCOV (ANR-20-CO11-0001) projects, the ANR-RHU program (ANR-21-RHUS-08), the European Union’s Horizon 2020 research and innovation program under grant agreement no. 824110 (EASI-genomics), the HORIZON-HLTH-2021-DISEASE-04 program under grant agreement 101057100 (UNDINE), the ANR-RHU Program ANR-21-RHUS-08 (COVIFERON), the Square Foundation, Grandir - Fonds de solidarité pour l’enfance, the Fondation du Souffle, the SCOR Corporate Foundation for Science, the Battersea & Bowery Advisory Group; William E. Ford, General Atlantic’s Chairman and Chief Executive Officer; Gabriel Caillaux, General Atlantic’s Co-President, Managing Director and Head of business in EMEA, and the General Atlantic Foundation; the French Ministry of Higher Education, Research, and Innovation (MESRI-COVID-19), Institut National de la Santé et de la Recherche Médicale, Imagine Institute, and Université Paris Cité. Open Access funding provided by Imperial College London.
Author contributions: E. Bellos: Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Software, Visualization, Writing - original draft, Writing - review & editing, D. Santillo: Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Resources, Validation, Visualization, Writing - original draft, Writing - review & editing, P. Vantourout: Investigation, Methodology, Resources, Visualization, Writing - review & editing, H.R. Jackson: Formal analysis, Methodology, Software, Writing - review & editing, A. Duret: Investigation, Writing - review & editing, H. Hearn: Data curation, Formal analysis, Software, Y. Seeleuthner: Resources, Software, E. Talouarn: Data curation, S. Hodeib: Investigation, H. Patel: Resources, O. Powell: Investigation, S. Yeoh: Formal analysis, Visualization, S. Mustafa: Data curation, Investigation, Project administration, D. Habgood-Coote: Data curation, S. Nichols: Methodology, L. Estramiana Elorrieta: Resources, G. D’Souza: Validation, V.J. Wright: Project administration, Supervision, Writing - review & editing, D. Estrada-Rivadeneyra: Investigation, A.H. Tremoulet: Resources, Writing - original draft, Writing - review & editing, K.B. Dummer: Resources, Writing - review & editing, S.A. Netea: Resources, Writing - review & editing, A. Condino-Neto: Resources, Writing - review & editing, Y.L. Lau: Funding acquisition, Investigation, Resources, E. Núñez Cuadros: Resources, J. Toubiana: Conceptualization, Investigation, Resources, Writing - review & editing, M. Holanda Pena: Conceptualization, Data curation, Formal analysis, Funding acquisition, Investigation, Methodology, Project administration, Resources, Software, Supervision, Validation, Visualization, Writing - original draft, Writing - review & editing, F. Rieux-Laucat: Resources, C.-E. Luyt: Investigation, Writing - review & editing, F. Haerynck: Resources, J.L. Mege: Resources, S. Chakravorty: Data curation, Investigation, Resources, Writing - review & editing, E. Haddad: Resources, Writing - review & editing, M.-P. Morin: Resources, Writing - review & editing, Ö. Metin Akcan: Data curation, S. Keles: Data curation, Validation, M. Emiroglu: Data curation, Writing - review & editing, G. Alkan: Data curation, Validation, S.K. Tüter Öz: Data curation, S. Elmas Bozdemir: Resources, G. Morelle: Resources, A. Volokha: Resources, Writing - review & editing, Y. Kendir-Demirkol: Investigation, Methodology, B. Sözeri: Investigation, Visualization, T. Coskuner: Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Project administration, Resources, Software, Supervision, Validation, Visualization, Writing - original draft, Writing - review & editing, A. Yahsi: Data curation, B. Gulhan: Writing - review & editing, S. Kanik-Yuksek: Resources, G.I. Bayhan: Conceptualization, Data curation, Formal analysis, Investigation, A. Ozkaya-Parlakay: Conceptualization, Data curation, Supervision, Writing - review & editing, O. Yesilbas: Data curation, Investigation, N. Hatipoglu: Resources, T. Ozcelik: Resources, A. Belot: Resources, Writing - review & editing, E. Chopin: Resources, V. Barlogis: Resources, E. Sevketoglu: Investigation, E. Menentoglu: Investigation, Z.G. Gayretli Aydin: Resources, M. Bloomfield: Resources, Writing - review & editing, S.A. AlKhater: Resources, Writing - review & editing, C. Cyrus: Investigation, Resources, Y. Stepanovskiy: Resources, A. Bondarenko: Conceptualization, Investigation, Resources, Writing - review & editing, F.N. Öz: Investigation, M. Polat: Investigation, J.: Resources, J. Lebl: Investigation, Resources, Validation, Writing - review & editing, A. Geraldo: Investigation, E. Jouanguy: Resources, M.J. Carter: Conceptualization, Funding acquisition, Project administration, Resources, Supervision, Writing - review & editing, P. Wellman: Writing - review & editing, M. Peters: Investigation, Resources, Writing - review & editing, R. Pérez de Diego: Conceptualization, L.A. Edwards: Validation, Writing - review & editing, C. Chiu: Data curation, Investigation, Resources, Writing - review & editing, M. Noursadeghi: Investigation, A. Bolze: Writing - review & editing, C. Shimizu: Resources, M. Kaforou: Data curation, Formal analysis, Funding acquisition, Resources, Supervision, Writing - review & editing, M.S. Hamilton: Data curation, Formal analysis, Investigation, Resources, Writing - review & editing, J.A. Herberg: Funding acquisition, Supervision, Writing - review & editing, E.G. Schmitt: Resources, A. Rodriguez-Palmero: Resources, A. Pujol: Data curation, Funding acquisition, Investigation, Resources, Supervision, Writing - review & editing, J. Kim: Data curation, Formal analysis, Software, Writing - review & editing, A. Cobat: Formal analysis, Resources, Writing - review & editing, L. Abel: Formal analysis, Writing - review & editing, S.-Y. Zhang: Funding acquisition, Investigation, Resources, Writing - review & editing, J.-L. Casanova: Funding acquisition, Project administration, Resources, Writing - review & editing, T.W. Kuijpers: Conceptualization, Investigation, Resources, Writing - review & editing, J.C. Burns: Resources, Writing - review & editing, M. Levin: Conceptualization, A.C. Hayday: Project administration, Resources, Supervision, Writing - review & editing, V. Sancho-Shimizu: Conceptualization, Funding acquisition, Project administration, Supervision, Writing - original draft, Writing - review & editing.
References
COVID-19 Human Genetic Effort, DIAMONDS, and EUCLIDS members are listed at the end of the PDF.
Author notes
E. Bellos and D. Santillo contributed equally to this paper.
Disclosures: A.H. Tremoulet reported non-financial support from Janssen Pharmaceuticals outside the submitted work. C.-E. Luyt reported personal fees from Advanzpharma, grants from Merck, and non-financial support from Pfizer outside the submitted work. A.C. Hayday reported grants from Takeda Pharmaceuticals, and personal fees from ImmunoQure AG, Prokarium, and TransImmune AG outside the submitted work; in addition, A.C. Hayday had a patent to US20210246187A1 pending (Takeda). No other disclosures were reported.