Richter syndrome (RS) derives from the rare transformation of chronic lymphocytic leukemia (CLL) into an aggressive lymphoma, most commonly of the diffuse large B cell lymphoma (DLBCL) type. The molecular pathogenesis of RS is only partially understood. By combining whole-exome sequencing and copy-number analysis of 9 CLL-RS pairs and of an extended panel of 43 RS cases, we show that this aggressive disease typically arises from the predominant CLL clone by acquiring an average of ∼20 genetic lesions/case. RS lesions are heterogeneous in terms of load and spectrum among patients, and include those involved in CLL progression and chemorefractoriness (TP53 disruption and NOTCH1 activation) as well as some not previously implicated in CLL or RS pathogenesis. In particular, disruption of the CDKN2A/B cell cycle regulator is associated with ∼30% of RS cases. Finally, we report that the genomic landscape of RS is significantly different from that of de novo DLBCL, suggesting that they represent distinct disease entities. These results provide insights into RS pathogenesis, and identify dysregulated pathways of potential diagnostic and therapeutic relevance.
Richter syndrome (RS) is defined as the development of a lymphoma, most commonly of the diffuse large B cell lymphoma (DLBCL) type, during the clinical progression of B cell chronic lymphocytic leukemia (CLL; Swerdlow, 2008; Rossi and Gaidano, 2009). This event occurs in the natural history of CLL with a cumulative incidence of ∼10% (Rossi and Gaidano, 2009). Analysis of immunoglobulin (Ig) gene rearrangements showed that RS derives from the same B cell that originated the CLL clone. Despite being classified histologically as DLBCL, clonally-related RS is characterized by a much poorer outcome, as most of the patients do not respond to standard therapy and succumb rapidly to the disease (Tsimberidou et al., 2008; Rossi et al., 2011b).
The molecular pathogenesis of RS is only partially understood. Inactivation of the TP53 tumor suppressor gene is observed in ∼60% of patients and represents an independent predictor of overall survival after RS transformation (Rossi et al., 2011b). The second most frequent genetic lesions are represented by NOTCH1 and MYC activating events, each occurring in ∼30% of cases (Fabbri et al., 2011; Rossi et al., 2011b). Similar to other examples of evolution of mature B cell–derived tumors from indolent to aggressive disease, MYC-activating events often coexist with TP53 disruption in RS (Yano et al., 1992; Davies et al., 2007), whereas they distribute in a mutually exclusive fashion with NOTCH1 mutations (Fabbri et al., 2011), consistent with the notion that NOTCH1 is a direct transcriptional activator of MYC expression (Palomero et al., 2006). These genetic abnormalities are often acquired at transformation, suggesting their importance for the establishment and maintenance of the phenotype characteristic of RS (Fabbri et al., 2011; Rossi et al., 2011b).
The known RS-associated genetic lesions have been identified based on candidate gene approaches, do not account for all cases, and most likely represent only a fraction of those associated with RS pathogenesis. Conversely, unbiased, genome-wide methods aimed at the characterization of the entire spectrum of genetic lesions that are acquired by RS can be useful to decipher the clonal evolution pattern occurring in the CLL-RS transition process, to provide further insights into its pathogenesis, and to identify possible differences between RS-associated DLBCL and de novo DLBCL (Morin et al., 2011; Pasqualucci et al., 2011; Lohr et al., 2012; Zhang et al., 2013). Toward this end, we have adopted an integrated approach based on next generation whole-exome sequencing (WES) and genome-wide high-density single-nucleotide polymorphism (SNP) analysis to investigate the RS coding genome.
We report that most RS cases derive from the same clone as CLL cells through a linear evolution pattern involving the acquisition of novel genetic lesions. These lesions are heterogeneous in number and spectrum among RS cases, and include genes not previously associated with RS transformation. Finally, our results demonstrate that RS represents a genetically distinct entity from de novo DLBCL.
RESULTS
Most RS arise through a linear evolution pattern from the CLL predominant clone
Transformation of CLL to RS can occur as a linear process, in which RS is a direct descendant of the initial major CLL clone, or by branched evolution from a common earlier precursor cell that acquired distinct genetic lesions to become a CLL or an RS. To decipher the clonal evolution pattern of CLL to RS, we performed next-generation WES, and copy number (CN) and fluorescence in situ hybridization (FISH) analysis for the detection of translocations known to be recurrent in RS and/or de novo DLBCL (i.e., MYC, BCL6, and BCL2 translocations) in a set of nine paired and clonally related CLL-RS cases. In three cases, we also included the specimen corresponding to the terminal leukemic phase of RS (RSII), which represents an extremely rare and aggressive phenotype of the disease, characterized by severe cytopenia and leukemic dissemination of large B cells within the peripheral blood, and invariably associated with patients’ death (Table S1; Rossi and Gaidano, 2009). Additionally, an independent set of six paired CLL-RS cases with only CN data available was included in the analysis (total, n = 15 pairs).
To reconstruct the evolutionary history of CLL transformation, we assessed separately the load of genetic lesions shared between the two phases of the disease, those present in CLL and absent in RS, and those present in RS but not in CLL (Fig. 1). This analysis allowed us to discriminate between a linear pattern of evolution, in which all CLL lesions are maintained in the RS specimen together with additional RS-specific abnormalities, and a branched pattern of evolution, in which CLL- and RS-specific lesions are present in addition to the set of shared abnormalities. Due to the unavailability of paired normal DNAs, mutations present in both CLL and RS were counted only for genes known to be mutated in CLL (i.e., NOTCH1, SF3B1, TP53, ATM, and MYD88). Genetic lesions that were only observed in the CLL phase because of CN losses or copy neutral-LOH in RS were not computed in the analysis.
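The three-way partition described above (shared, CLL-only, and RS-only lesions) can be sketched with set operations. This is only an illustrative sketch, not the pipeline used in the study: the function name, the lesion labels, and the rule that any CLL-only lesion implies a branched pattern are assumptions made here. It also assumes the caller has already excluded lesions that appear CLL-specific only because of CN loss or copy-neutral LOH in RS, and it does not model clonality (a lesion confined to a minor CLL subclone would need separate handling).

```python
def classify_evolution(cll_lesions, rs_lesions):
    """Partition the lesions of one CLL-RS pair and name the evolution pattern.

    Both arguments are iterables of hashable lesion labels (hypothetical here).
    """
    cll, rs = set(cll_lesions), set(rs_lesions)
    shared = cll & rs      # present in both phases
    cll_only = cll - rs    # present in CLL, absent in RS
    rs_only = rs - cll     # acquired at (or selected during) transformation

    # Linear: every CLL lesion is maintained in RS.
    # Branched: at least one CLL lesion is missing from RS.
    pattern = "linear" if not cll_only else "branched"
    return {
        "pattern": pattern,
        "shared": shared,
        "cll_only": cll_only,
        "rs_only": rs_only,
    }


# Hypothetical lesion labels, for illustration only.
linear_pair = classify_evolution({"lesion_A", "lesion_B"},
                                 {"lesion_A", "lesion_B", "lesion_C"})
branched_pair = classify_evolution({"lesion_A", "lesion_B"},
                                   {"lesion_A", "lesion_C"})
print(linear_pair["pattern"], branched_pair["pattern"])
```

Keeping the three sets in the result, rather than only the label, makes it easy to report the load of acquired (RS-only) lesions alongside the pattern.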
In the majority of RS patients (n = 12/15), all the abnormalities identified in the CLL phase of the disease were maintained in the RS phase, which, in turn, acquired a variable number of genetic lesions. This scenario is consistent with a linear pattern of evolution from CLL to RS, as observed for other tumor progression states (Walter et al., 2012; Aparicio and Caldas, 2013) and for a subset of progressed CLLs (Schuh et al., 2012; Landau et al., 2013; Ouillette et al., 2013; Fig. 1; Table S2 and Table S3). In one patient (case 7), a subclonal 13q14 deletion detected by FISH in 9% of the nuclei of CLL cells (Table S1) was not found in the paired RS, suggesting its presence in a minor clone that was not selected for subsequent transformation, and consistent with the notion that absence of del13q14 is associated with a higher risk of transformation (Rossi et al., 2008).
In the remaining three patients (20%), a subset of the lesions identified in the CLL clone was not detectable in the RS specimen. In particular, case 5 harbored one point mutation in the CLL phase that was absent both in the corresponding RS sample and in the final leukemic phase; in case 4, four point mutations were present in CLL at diagnosis and in the terminal leukemic phase, but not in the RS specimen (Fig. 1 and Table S2); in the third patient, characterized only by CN analysis (case 25), ∼35% of CLL lesions (n = 8/23) were not maintained in the RS diagnostic sample (Table S3). In all three cases, a variable number of genetic lesions were uniquely present in RS cells. These scenarios are consistent with a branching pattern of evolution, in which a putative precursor carrying the genetic lesions common to CLL and RS has then evolved toward CLL and RS through the acquisition of distinct and independent events.
To address the evolutionary pattern followed in the progression of RS from nodal to leukemic disease, we next focused on the set of lesions present either in the RS or in the RSII phase. Notably, whereas case 2 progressed through a linear evolution pattern, in both case 4 and case 5 the RS and its terminal leukemic phase acquired a totally or partially distinct set of genetic lesions, suggesting their progression through a branching pattern of clonal evolution (Figs. 1–3). In case 4, biallelic TP53 disruption was acquired in both phases of RS, but this event clearly occurred in different tumor subclones, as documented by the different size of the deletion in 17p and by the distinct TP53 mutation observed in RS and RSII (R213X and R248Q, respectively; see Fig. 3, a and b). In case 5, multiple genetic lesions ultimately leading to NOTCH signaling activation were acquired during CLL transformation and subsequent RS progression (Fig. 3, c and d). In particular, a NOTCH1 PEST-truncating mutation (P2514fs) was present in the CLL phase and in all following RS biopsies; a homozygous frameshift deletion in the SPEN gene (P1356fs), a putative negative regulator of NOTCH signaling targeted by inactivating mutations in splenic marginal zone lymphoma (Kuroda et al., 2003; Saito et al., 2003; Rossi et al., 2012c), was acquired at RS diagnosis and maintained in the leukemic phase of the disease; and during the progression from nodal localization (RS) to peripheral blood invasion (RSII), a second NOTCH1 mutation targeting the heterodimerization domain of the protein was acquired (V1721M; Fig. 3, c and d). The detection of PEST and heterodimerization domain mutations in the same patient is of relevance, as mutations in these two distinct domains have been shown to act synergistically in promoting signaling activation (Weng et al., 2004).
Overall, these patterns of somatic alterations are consistent with a predominantly linear evolutionary model, in which RS evolves through the addition of novel genetic lesions to those already present in the predominant CLL clone. In contrast, the blood localization of the disease seems to be often the result of a branching pattern of evolution, consistent with the selection and expansion of an ancestral precursor clone that was present in the patients at an earlier stage than the time of the initial diagnostic RS biopsy (Fig. 2).
RS acquires a heterogeneous number and spectrum of genetic lesions
To assess the load and the spectrum of genetic lesions acquired in the transition of CLL to RS, we next focused on the set of genetic aberrations present exclusively in the RS coding genome. WES analysis in the 9 CLL-RS pairs revealed the presence of 241 nonsilent mutations specific for RS, either at diagnosis (n = 165, range 0–118 events/case) or in the terminal leukemic phase of the disease (n = 76, range 0–36 events/case; Table S2; Fig. 1 c). A subset of these mutations (n = 13/241, 5.4%) was already detectable in a small fraction of the CLL sequencing reads (≤20%), suggesting their presence at the subclonal level early during CLL development, and subsequent selection and outgrowth during transformation. Analogously, CN analysis of the same cases, and of the 6 pairs with only CN data available, revealed the presence of 138 CNAs specific for RS at diagnosis (n = 95, range 0–29/case) or for the terminal leukemic phase (n = 43, range 14–29; Table S3; Fig. 1 c).
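As a rough illustration of how an RS-specific mutation might be flagged as already present at the subclonal level in CLL, the sketch below computes the fraction of CLL reads carrying the variant and compares it with the ≤20% bound quoted above. The function names and the `min_variant_reads` cutoff are hypothetical; the text does not state how many supporting reads were required.

```python
SUBCLONAL_MAX_FRACTION = 0.20  # upper bound quoted in the text (<=20% of CLL reads)


def variant_read_fraction(variant_reads, total_reads):
    """Fraction of sequencing reads at a position that carry the variant."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return variant_reads / total_reads


def preexisting_subclonal(variant_reads, total_reads, min_variant_reads=1):
    """True if the variant is detectable in CLL reads, but only in a small fraction.

    min_variant_reads is an assumed cutoff, not a value taken from the study.
    """
    if variant_reads < min_variant_reads:
        return False  # not detectable in the CLL sample at all
    return variant_read_fraction(variant_reads, total_reads) <= SUBCLONAL_MAX_FRACTION


# Hypothetical read counts for three RS-specific mutations in the paired CLL sample.
print(preexisting_subclonal(5, 100))   # low-level support in CLL
print(preexisting_subclonal(0, 100))   # no support in CLL
print(preexisting_subclonal(30, 100))  # above the subclonal bound
```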
The combination of WES, CN, and FISH data obtained in the 9 pairs for which the complete set of analysis was available showed that the RS genome acquires a mean of ∼22 genetic lesions/case, with a great variability across individual samples (range 0–130/case; Fig. 1 c). The only case (8-RS) in which we did not detect any RS-specific genetic lesion may have acquired genetic/epigenetic aberrations not assessable with our integrated approach (e.g., translocations, mutations in noncoding regulatory portions of the genome, or aberrant methylation; Küppers and Dalla-Favera, 2001; Kulis et al., 2012; Huang et al., 2013). The lack of acquisition of lesions in this patient was not caused by low tumor representation, as documented by the presence of a set of abnormalities, including a TP53 mutation, which were shared by the CLL and RS phase (Fig. 1 a). The load of alterations acquired by the terminal leukemic phase of the disease was higher compared with the one observed in the CLL-RS transition, with ∼40 acquired lesions per case on average (Fig. 1 c).
Although the number of cases studied is limited, no significant difference in the load of acquired genetic lesions was observed between IGHV mutated or unmutated RS or in cases with different clinical stages of the disease (Fig. 1; Table S1).
A detailed list of genes altered in RS in the discovery panel is reported in Tables S2 and S3, with notes on their possible functional significance.
Overall, these data show that the genetic lesions acquired in the transition of CLL to RS are quite heterogeneous in terms of load and spectrum across samples.
Recurrent targets of CN changes in RS: frequent loss of TP53 and CDKN2A/B
One way to determine the relevance of newly identified genetic lesions is to assess their recurrence in the disease. Toward this end, we first extended the CN analysis to a larger cohort of RS (n = 42 cases). Consistent with the results obtained in the discovery panel, this analysis confirmed the high degree of heterogeneity of the RS genome, with on average ∼12.5 CNAs per case and a range of 0–55 aberrations per sample, mostly represented by CN losses, which accounted for ∼60% of the identified lesions (Fig. 4 a).
We identified 107 focal minimal common regions (MCRs) of aberration (84 of loss and 23 of gain) encompassing 1–3 protein-coding genes, which most likely represent the target of the lesion (Fig. 4 b). The relevance of the identified lesions was confirmed through GISTIC, an algorithm based on the amplitude and frequency of occurrence of CN changes (Beroukhim et al., 2007; Fig. 4 d). To address the specificity of these newly identified recurrent regions of CN aberration for RS, we compared their frequency with that reported in a cohort of 353 newly diagnosed and previously untreated CLL cases (Edelmann et al., 2012), focusing on genes altered in ≥10% of RS cases as well as on known recurrent cytogenetic abnormalities (Edelmann et al., 2012; Fig. 4, b and c).
Consistent with previous reports (Rossi et al., 2011b), the most frequent aberration significantly enriched in RS was represented by del17p, detected in ∼40% of RS cases and in the majority of the cases involving the TP53 tumor suppressor gene (n = 18/20). Notably, the overall number of genetic lesions was significantly higher in TP53-disrupted cases compared with TP53 wild-type cases, consistent with an increased genomic instability caused by this genetic abnormality (P < 0.001; Fig. 4 e; Ouillette et al., 2010). Moreover, four TP53 disrupted cases displayed complex intrachromosomal rearrangements caused by alternating gains and losses of genetic material, frequently accounting for >10 switches per chromosome, possibly suggestive of chromothripsis (unpublished data). A preferential association of this phenomenon with TP53 lesions has been reported in CLL (Edelmann et al., 2012), as well as in other tumor types such as medulloblastoma (Rausch et al., 2012).
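The ">10 switches per chromosome" criterion can be made concrete with a small sketch that counts changes of CN state between consecutive segments of one chromosome. This is an assumption-laden illustration: the state labels, the function names, and the choice to count every change between adjacent segments are introduced here and are not taken from the study.

```python
def count_cn_switches(segment_states):
    """Count changes of CN state between consecutive segments.

    segment_states: CN states ('gain', 'loss', 'neutral') of the segments of a
    single chromosome, ordered by genomic position.
    """
    return sum(1 for prev, cur in zip(segment_states, segment_states[1:])
               if prev != cur)


def has_complex_rearrangement(segment_states, min_switches=10):
    """True if the chromosome shows more than min_switches CN state changes."""
    return count_cn_switches(segment_states) > min_switches


# Hypothetical chromosome with 12 alternating segments (11 switches).
alternating = ["loss", "gain"] * 6
print(count_cn_switches(alternating), has_complex_rearrangement(alternating))
```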
Our analysis identified del9p21 as a frequent genetic aberration commonly detectable in RS (∼30% of cases) and shown to be specifically acquired at transformation in all cases in which the CLL phase was available (n = 3/3; Fig. 5, a and b). Consistently, del9p21 was never observed in unselected CLL cases (Edelmann et al., 2012). In seven out of the 15 affected cases (46.7%), the loss of genetic material encompassed <10 genes, always including the CDKN2A/B locus. Disruption of CDKN2A/B was biallelic in over one third of cases (n = 6/15); in 3 of them, this was a result of reduplication of 1 allele carrying a hemizygous CDKN2A/B deletion, as assessed by LOH analysis, whereas one additional case harbored a heterozygous deletion with a CDKN2A splice site mutation in the second allele (Fig. 5 c and Table S2). One patient (16-RS) harbored a focal CN loss encompassing only the CDKN2B gene, which encodes for p15INK4B, a regulator of cell cycle arrest that is potently induced by TGF-β signaling (Hannon and Beach, 1994). Collectively, these results suggest that the CDKN2A/B locus is the specific target of the 9p21 deletions, which is consistent with observations in other tumor types (Sherr, 2004).
CDKN2A encodes both p16INK4A, a negative regulator of the cell cycle (Serrano et al., 1993), and ARF, an inhibitor of MDM2, which is the negative regulator of p53 (Zhang et al., 1998). We thus sought to analyze the distribution of aberrations involving this locus with other lesions known to affect key regulators of cell cycle control and DNA repair (Fig. 5 d). This analysis revealed that CDKN2A losses, TP53 disruption, and MYC-activating events often coexist. Interestingly, all these lesions were mutually exclusive with trisomy 12, another recurrent alteration observed in ∼30% of RS cases (P < 0.05). In addition, TP53 disruption was significantly (P < 0.05) associated with del13q14 and with RB1 disruption (Fig. 5 e).
Several additional CN changes occurred at higher frequency in RS compared with CLL, including trisomy 12, losses of the 7q31.31-36.3, 8p and 14q23.2-q32.33 loci, and high level amplifications affecting the long arm of chromosomes 8, 11, 13, and 18. The putative targets of these lesions included genes previously linked to cancer, such as POT1, a gene involved in telomeres protection, which was reported to harbor mutations in ∼3% of CLL patients and also found to be mutated in 2 of our 27 RS (Table S2; Baumann and Cech, 2001; Quesada et al., 2012; Landau et al., 2013; Ramsay et al., 2013); TRAF3, a negative regulator of noncanonical NF-κB that is frequently inactivated in multiple myeloma (Annunziata et al., 2007; Keats et al., 2007); and the oncogenes MYC, ETS1, MIR17HG, and BCL2 (Vaux et al., 1988; Bories et al., 1995; He et al., 2005; Xiao et al., 2008). No point mutations were found in the aforementioned genes.
Among the other CNAs that are recurrent in CLL patients, del11q and gains of the 2p16.1-2p15 locus occurred in RS at a frequency comparable to that of unselected CLLs (14 and 8% vs. 27 and 7%, respectively; Fig. 4 c; Edelmann et al., 2012). In contrast, the frequency of del13q14 was significantly lower in RS compared to CLL cases at diagnosis (∼28 vs. 61%; P < 0.001), consistent with the observation that its absence is an independent risk factor for RS development and that this lesion, known to be an early event in CLL onset, was never acquired at RS (Edelmann et al., 2012; Rossi et al., 2008).
Collectively, these results confirm the high degree of heterogeneity of the RS genome, point to relevant genes and loci not previously linked to RS, and provide a large array of novel candidates for mutational analysis.
Recurrent targets of point mutations in RS
To identify recurrent targets of point mutations in RS, we performed WES analysis in a “screening panel” of 18 RS cases. Due to the unavailability of paired normal DNAs, the mutation frequency was only assessed on those genes/mutations for which previous information of functional relevance was available, including: (a) genes identified by WES in the discovery panel (n = 231); (b) genes located in the focal MCR of an aberration with a frequency ≥10% (n = 54); and (c) genes mutated in de novo DLBCL and CLL, as identified in recently published genome-wide studies (Fabbri et al., 2011; Morin et al., 2011; Pasqualucci et al., 2011; Puente et al., 2011; Wang et al., 2011; Lohr et al., 2012; Quesada et al., 2012; Landau et al., 2013) and listed in the Cancer Gene Census or for which mutations have been implicated functionally in cancer (n = 215). Using these criteria, we interrogated a total number of 481 genes. Fig. 6 shows the prevalence (including discovery and screening panels) of RS cases with mutations in genes altered with a frequency >10% (n = 23).
To address the specificity of the newly identified recurrent mutations for RS, we compared their frequency with the one previously reported in whole-exome/genome studies performed in CLL patients (n = 205; Fabbri et al., 2011; Puente et al., 2011; Quesada et al., 2012; Wang et al., 2011). Indeed, with the exception of three genes previously linked to CLL (i.e., MYD88, SF3B1, and ATM; Stankovic et al., 1999; Wang et al., 2011; Puente et al., 2011; Quesada et al., 2012), 20/23 mutated genes were altered at a significantly higher frequency in RS (p-values in Fig. 6) compared with CLL. This finding is consistent with the notion that MYD88 mutations seem to arise early in the natural history of CLL (Landau et al., 2013) and with the reported enrichment of SF3B1 mutations, which often coexist with ATM disruption (Wang et al., 2011) in CLL cases refractory to fludarabine-based therapy compared with CLLs at diagnosis and RS patients (Rossi et al., 2011a).
Genes affected in >10% of RS patients were representative of a variety of biological programs, including DNA repair (TP53 and ATM; Stankovic et al., 1999; Zenz et al., 2010), NOTCH signaling pathway (NOTCH1 and SPEN; Fabbri et al., 2011; Puente et al., 2011; Rossi et al., 2012c; Saito et al., 2003), transcription and chromatin remodeling (ZFHX3, ARID1A, ARID1B, MLL2, and IRX5; Sun et al., 2005; Grasso et al., 2012; Wu and Roberts, 2013), B cell development/activation (IRF8, PIM1, and MYD88; Domen et al., 1993; Wang and Morse, 2009; Rawlings et al., 2012), PI3K pathway (PREX2; Berger et al., 2012; Fine et al., 2009), cytoskeleton/extracellular matrix organization and adhesion (PCLO, COL27A1, COL2A1, SVEP1, and SYNE2), ubiquitin proteolysis (USP8, USP34; Berlin et al., 2010; Lui et al., 2011), and splicing regulation (SF3B1; Golas et al., 2003; Fig. 6). A detailed description of the identified mutations is reported in Table S2, with remarks on their possible functional consequence. Overall, these data point to specific oncogenic biological programs and pathways involved in the pathogenesis of the disease, as described in the following section.
Cellular programs commonly altered in RS
To organize the spectrum of genetic lesions in cellular pathways relevant for RS pathogenesis, we performed pathways/gene functional clustering analysis using an integrated approach based on the combination of publicly available tools (i.e., GSEA and DAVID 2008; Mootha et al., 2003; Subramanian et al., 2005; Huang et al., 2009a,b) and manual annotation of the genes altered in RS, including (a) RS-WES mutated genes; (b) genes in focal MCR of aberration with a frequency ≥10%; and (c) “de novo DLBCL/CLL CGC genes” mutated in at least 2 RS cases (total n = 304 genes).
All of the tools used reported proliferation, apoptosis, and cell cycle regulation among the top statistically significant categories, suggesting that deregulation of these processes is the driving force of this highly aggressive tumor that combines in a single disease the effects of chemoresistance and rapid disease kinetics. Based on our integrated analysis, the four genes most commonly altered in RS were TP53, NOTCH1, MYC, and CDKN2A/B, which are well-characterized regulators of tumor suppression, cell proliferation, and cell cycle control, whose lesions overall accounted for ∼90% of RS cases (Figs. 6–8). Additional programs and pathways that emerged as significantly altered in RS included the MAPK, Wnt, NF-κB, and TGF-β pathways, RNA processing/degradation, and protein translation processes.
Collectively, these results suggest that the transformation of CLL to RS involves a heterogeneous set of pathways in different cases, with those involving apoptosis and cell proliferation representing the most consistent ones.
The coding genome of RS is distinct from the one of de novo DLBCL
Although most RS cases are histologically classified as DLBCL, the response to treatment is significantly poorer in RS than in de novo DLBCL, even when considering the less curable activated B cell–like subtype (Coiffier et al., 2002; Tsimberidou et al., 2006; Tsimberidou et al., 2008), suggesting a distinct underlying pathogenesis. To obtain a comparative assessment of the genetic landscape of these two diseases, we compared the frequency of the most prevalent genetic lesions in 27 RS cases, characterized for WES and CN data (see above), and in 71 newly diagnosed and previously untreated DLBCL cases, with available CN and targeted resequencing data of genes commonly mutated in DLBCL (Pasqualucci et al., 2011). We separately compared the frequency of genetic lesions in RS versus the two main subtypes of DLBCL, i.e., activated B cell–like (ABC, n = 40) and germinal center B cell like (GCB, n = 31) DLBCL (Fig. 8). For genes for which we lacked mutational analysis data, we used a recently published WES-dataset of 49 primary DLBCL cases with no information regarding ABC versus GCB subtype (Lohr et al., 2012).
Although almost all RS cases display a nongerminal center phenotype (Hans et al., 2004), the genetics of RS significantly differed from that of de novo ABC-DLBCL, with the exception of CDKN2A/B losses and point mutations affecting MYD88, which were, overall, observed at comparable frequency. In particular, several lesions typically associated with ABC-DLBCL were rare or absent in RS. Significant differences included biallelic inactivation of the TNFAIP3/A20 and PRDM1/BLIMP1 tumor suppressor genes and BCL6 translocations, which were never observed in RS, whereas they are associated with ∼30, ∼22, and 37.5% of ABC-DLBCL, respectively (P < 0.01; Fig. 8). Additional lesions typically associated with ABC-DLBCL, including mutations of PIM1, CD79B, and CARD11, as well as BCL2-CN gains, were relatively rare in RS (11.1 vs. 17.5%, 7.4 vs. 25%, 7.4 vs. 12.5%, and 22.2 vs. 42.5%, respectively).
Among the lesions typically associated with GCB-DLBCL, MYC-activating events presented comparable frequencies in the two diseases (Fig. 8), whereas mutations involving CREBBP and EZH2, which were observed in 22.6 and 9.7% of GCB-DLBCL, respectively, never occurred in RS. The difference was statistically significant (P < 0.05) only for CREBBP mutations, most likely due to the lower frequency of EZH2 mutations in GCB-DLBCL. BCL2 translocations, present in ∼30% of GCB-DLBCL, were never observed in RS, as previously reported (Rossi et al., 2011b; P < 0.01).
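To illustrate why the same "never observed in RS" result can be significant for one gene and not for another, the sketch below runs a two-sided Fisher's exact test on 2×2 tables. Several things here are assumptions: the section does not name the statistical test, the counts of 7/31 and 3/31 GCB-DLBCL cases are back-calculated from the reported 22.6 and 9.7%, and all 27 RS cases are assumed to have been evaluable for both genes.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same margins
    that are no more likely than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        # Probability that row 1 contains x of the col1 "mutated" cases.
        return comb(row1, x) * comb(row2, col1 - x) / total

    p_observed = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    return sum(p for p in (prob(x) for x in range(lo, hi + 1))
               if p <= p_observed * (1 + 1e-9))


# Assumed denominators: 27 RS cases and 31 GCB-DLBCL cases.
RS_CASES, GCB_CASES = 27, 31

# Rows: RS, GCB-DLBCL; columns: mutated, not mutated.
p_crebbp = fisher_exact_two_sided(0, RS_CASES, 7, GCB_CASES - 7)
p_ezh2 = fisher_exact_two_sided(0, RS_CASES, 3, GCB_CASES - 3)
print(f"CREBBP: P = {p_crebbp:.3f}; EZH2: P = {p_ezh2:.3f}")
```

Under these assumptions the CREBBP comparison falls below 0.05 while the EZH2 comparison does not, which matches the explanation that the lower frequency of EZH2 mutations in GCB-DLBCL limits the power of the comparison.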
Finally, among the lesions equally represented in both DLBCL subtypes, some occurred with similar frequency in the two diseases (i.e., MIR17HG gain, 8p loss, and deletion of PTEN). B2M biallelic disruption and gains of chromosome 7 occurred less frequently in RS compared with DLBCL (0 vs. 16.9%, and 7.4 vs. 38%, respectively; P < 0.05). The frequency of MLL2 and PCLO mutations was also lower in RS compared with DLBCL (15 vs. 35% and 11.1 vs. 22.5%), but this difference did not reach statistical significance (Fig. 8). Strikingly, the two most frequent genetic lesions that characterize the RS genome, i.e., TP53 biallelic disruption and NOTCH1 mutations, were significantly enriched in RS compared with both subtypes of DLBCL (P < 0.01), thus confirming their specific relevance in the pathogenesis of the disease (Fig. 8).
Altogether, these data underline the fact that, despite the morphological and phenotypic similarities between RS and de novo DLBCL, the molecular profile of these two conditions is largely different, suggesting that they represent distinct disease entities.
DISCUSSION
This study was aimed at elucidating relevant questions regarding RS pathogenesis, including its pattern of clonal evolution from CLL, the genetic determinants of the transformation event, and the pathogenetic relationship between RS and classical non-CLL associated de novo DLBCL. Although these questions have been previously addressed using mostly candidate gene approaches, we have used unbiased whole-exome and CN analysis for a comprehensive definition of these issues. The obtained results provide insights into RS pathogenesis and have diagnostic and therapeutic implications.
The clonal evolution of any given tumor type occurs either through a linear model, in which the predominant clone acquires novel genetic lesions leading to progression, or a branching model, in which a common ancestor evolves to the initial tumor and to the progressed stage through distinct genetic pathways (Aparicio and Caldas, 2013). Our results show that the CLL-RS transition occurs through a linear model of evolution in most cases, where RS represents the final stage of evolution of the original founder CLL clone. This result is consistent with early observations based on the analysis of intraclonal diversification of IGHV genes in paired CLL-RS (Rossi et al., 2012b). In a minor fraction of cases, RS seems to be the result of the branching evolution of CLL and RS from an earlier common precursor cell, perhaps consistent with the observation that early hematopoietic stem cells from CLL patients can acquire propensity to generate clonal B cells (Kikushige et al., 2011). Both linear and branching patterns of clonal evolution have been described in CLL clinical progression (Schuh et al., 2012; Landau et al., 2013; Ouillette et al., 2013), as well as in the transformation of follicular lymphoma (FL) to DLBCL (Carlotti et al., 2009; and unpublished data), although in the latter case a branching pattern of evolution is predominant. This is also the case in the rare progression of RS from a nodal to a leukemic disease, at least in two out of the three cases analyzed in this study. The terminal leukemic phase of RS might be the result of the selection and outgrowth of a subclone, perhaps driven by the highly cytotoxic therapeutic regimens used for RS treatment (Rossi and Gaidano, 2009).
Moreover, our data reveal that no single lesion or combination of genetic lesions or cellular pathways seems to be responsible for CLL transformation to RS, with various RS cases differing in both the number and the type of aberrations. Nonetheless, the fact that ∼90% of the cases display combinations of lesions of TP53, NOTCH1, MYC, and CDKN2A/B suggests that RS transformation does not involve primarily specific B cell signaling pathways, but rather general regulators of tumor suppression, cell cycle control and cell proliferation. Whereas previous reports have already identified TP53 inactivation, and the mutually exclusive NOTCH1 and MYC-activating events as common in RS pathogenesis (Fabbri et al., 2011; Rossi et al., 2011b), our results identify loss of the CDKN2A/B locus as present in ∼30% of RS cases and often co-occurring with the two other most frequent lesions observed in RS, namely TP53 disruption and NOTCH1 mutations. The individual and combined role of these lesions can be studied further in appropriate mouse models. Overall, we note that RS transformation appears to involve pathways that are distinct from those accompanying CLL progression and/or acquisition of chemoresistance, the latter being recurrently associated with lesions of the SF3B1 gene, encoding a component of the spliceosome, and of the BIRC3 gene, encoding a negative regulator of the NF-κB pathway, although largely lacking MYC activation and CDKN2A/B disruption that are common in RS (Wang et al., 2011; Quesada et al., 2012; Rossi et al., 2011a,b).
The results herein also indicate that the DLBCL disease typical of RS is pathogenetically distinct from both forms (GCB and ABC) of de novo DLBCL (Morin et al., 2011; Pasqualucci et al., 2011). The most relevant differences involve the common activation of NOTCH and inactivation of TP53 pathways, which are virtually absent or rare in de novo DLBCL, respectively. Furthermore, RS-DLBCL lacks lesions common in all DLBCL types, such as inactivation of the acetyltransferase genes CREBBP/EP300 and of the B2M gene, as well as those common in ABC-DLBCL (e.g., translocations of BCL6 and loss of PRDM1/BLIMP1 and TNFAIP3/A20) or GCB-DLBCL (e.g., translocations of BCL2; Challa-Malladi et al., 2011; Lohr et al., 2012; Morin et al., 2011; Pasqualucci et al., 2011). These differences indicate that DLBCL transformed from CLL and classical DLBCL represent distinct disease entities.
Finally, these findings have diagnostic and therapeutic implications. The higher frequency of specific lesions in RS compared with CLL suggests that their early detection can be tested for its prognostic significance in predicting CLL evolution toward RS. Moreover, the specific presence (e.g., NOTCH1 mutations) or absence (e.g., inactivation of CREBBP/EP300 and translocations of BCL2) of defined lesions in RS implies that the identification of targeted therapies will require approaches distinct from those currently being tested for DLBCL. In particular, two of the most common lesions occurring in RS, namely NOTCH1 mutations and CDKN2A/B deletions, represent established therapeutic targets with some drugs being already available for testing in single or combinatorial regimens (Monti et al., 2012; Samon et al., 2012; Wu et al., 2010).
MATERIALS AND METHODS
Primary cases.
The discovery panel comprised 21 samples from 9 paired CLL-RS cases (9 CLL-RS pairs, plus 3 paired terminal leukemic phases of RS). Diagnosis of CLL was based on IWCLL-NCI Working Group criteria (Hallek et al., 2008) and confirmed by a flow cytometry score >3, whereas diagnosis of RS was based on the histology of lymph node or extranodal tissue excisional biopsies. After institutional pathological review, all RS cases were classified as DLBCL according to the World Health Organization Classification of Tumors of the Hematopoietic and Lymphoid Tissues (Swerdlow, 2008). The clinical and biological features of the discovery cases are summarized in Table S1.
The screening panel used for additional CN analysis was composed of a multi-institutional cohort of 42 clonally related RS, 17 of which were also characterized by WES. An additional RS case with no CN data available was included in the WES-sequencing panel.
The cases included in this study were provided by the Division of Hematology, Department of Translational Medicine, Amedeo Avogadro University of Eastern Piedmont, Novara, Italy and by the Oncology Institute of Southern Switzerland (IOSI), Bellinzona, Switzerland. The study was approved by the Institutional Review Board of Columbia University and IOSI and by the Ethical Committee of the Azienda Ospedaliera Maggiore della Carità di Novara, Amedeo Avogadro University of Eastern Piedmont.
DNA extraction.
High molecular-weight genomic DNA was extracted from samples according to standard procedures, quantitated by the Quant-iT PicoGreen reagent (Invitrogen) and verified for integrity by gel electrophoresis. Clonality of tumor samples was established by amplification of the rearranged Ig genes. The percentage of tumor cells in CLL and RSII samples was estimated by cytofluorometry analysis of CD5+/CD19+ cells in the peripheral blood, and that in RS specimens by immunohistochemistry analysis of CD20+/CD79a+ cells in the corresponding RS section. In all phases, the percentage of tumor cells was then corrected based on the inferred CN value observed at the clonally rearranged Ig loci (i.e., the segment of intrachromosomal deletional recombination observed at the 14q32.33, 2p11.2, and 22q11.2 loci; Bergsagel and Kuehl, 2013). All analyzed cases had at least 35% of tumor cell content in the specimen used for WES and CN analyses (Table S1).
FISH and IGHV mutation analysis.
The molecular characterization of RS cases, including FISH analysis and assessment of IGHV mutational status, was performed as previously described (Fabbri et al., 2011; Rossi et al., 2011b). In particular, the following probes were used for FISH analysis of the CLL-RS discovery pairs: (1) LSI13, LSID13S319, CEP12, LSIp53, LSIATM, LSI IGH/BCL2, LSI BCL6, LSI IGH/c-MYC/CEP8, c-MYC break-apart, LSI N-MYC (Abbott); (2) BCL3 split signal (Dako); (3) 6q21/α-satellite (Kreatech Biotechnology); and (4) bacterial artificial chromosome clones 373L24-rel and 440P05-BCL11A.
Whole-exome capture and massively parallel sequencing, sequence mapping, and identification of tumor-specific variants.
Purified high molecular weight genomic DNA (∼1.5 µg) was enriched in protein-coding sequences using the SureSelect 50 Mb All Exon kit for the discovery cases and the SureSelect Human All Exon v4–51Mb for the screening cases (Agilent Technologies), following standard protocols. The resulting target-enriched pool was amplified and subjected to paired-end sequencing (2 × 100 bp) on the HiSeq2000 System. Exome capture and sequencing procedures were performed at St. Jude Children’s Research Hospital, Fasteris, and Centrillion Biosciences. The mean depth of coverage for aligned reads was 83X (range, 34–158X), with ∼75% of the target sequence being covered by at least 30 reads (range, 48–89%). Based on a comparison with the heterozygous SNP call rate obtained in the same cases by analysis of Affymetrix SNP 6.0 array data (available for 20/21 samples), the sensitivity for the identification of heterozygous somatic mutations was estimated to be ∼97% (range, 94–99%). After filtering for duplicate reads (defined as reads with identical start and orientation), sequencing reads were mapped to the reference genome hg19 assembly (GRCh37) using the Burrows-Wheeler Aligner (BWA) alignment tool version 0.5.9 (Li and Durbin, 2010). Sequence variants (i.e., nucleotide substitutions and small insertions/deletions) were obtained using the SAVI (Statistical Algorithm for Variant Identification) algorithm independently for each sample (Tiacci et al., 2011). In brief, we constructed empirical priors for the distribution of variant frequencies in each sample independently and obtained high-credibility intervals (posterior probability ≥1 − 10⁻⁵) for corresponding changes in frequency between any of the analyzed samples. Variants were considered absent when observed with a frequency between 0 and 2%, and present when showing a frequency ≥3%.
Given a mean depth of coverage for aligned reads >80X across samples, we chose 3% as the lower threshold for detecting variants, considering a 1% error rate for the technology and a binomial distribution for the errors. Candidate variants were then filtered for (a) systematic errors known to be associated with the Illumina sequencing technology, (b) variants observed in only one strand, and (c) variants mapping to multiple loci in the genome, which may reflect captured pseudogenes and regions of low complexity. To this end, each variant with a flanking 35-base context sequence around its genomic position was mapped to the hg19 reference using the BLAST algorithm, and only variant reads with “unique mappability” were retained, that is, we required the 71-base sequence to uniquely map to the reference genome with only one mismatch. Candidate tumor-specific nonsilent variants were then obtained by subtracting (a) known polymorphisms reported in the NCBI dbSNP database (Build 137); (b) variants present in any one of 105 exomes from unaffected individuals analyzed at our institution; and (c) silent mutations. In addition, splice site variants were kept only if located within 4 nt of a consensus splice site.
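The reasoning behind the 3% threshold can be illustrated with a short calculation. The sketch below is not the SAVI implementation; it only evaluates, under the stated assumptions of a 1% per-base error rate and binomially distributed errors, how likely it is that sequencing errors alone would produce enough reads at a single position to reach 3% at 80X depth. The function names are hypothetical, and the sketch pessimistically assumes that every error yields the same alternative allele, so the value it returns should be read as an upper bound.

```python
from math import ceil, comb

def binomial_upper_tail(n: int, k: int, p: float) -> float:
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

def error_only_probability(depth: int, threshold: float, error_rate: float) -> float:
    """Probability that errors alone reach the variant-frequency threshold.

    Assumption (illustrative only): each read is independently wrong with
    probability `error_rate`, and all errors support the same allele.
    """
    min_reads = ceil(threshold * depth)  # smallest read count at or above the threshold
    return binomial_upper_tail(depth, min_reads, error_rate)

# Values taken from the text: ~80X depth, 3% threshold, 1% error rate.
p_false = error_only_probability(depth=80, threshold=0.03, error_rate=0.01)
print(f"P(errors alone reach 3% at 80X) = {p_false:.4f}")
```

Raising the depth or the threshold in this sketch shows how quickly the error-only probability falls, which is the trade-off the threshold choice reflects. The strand and mappability filters described above then remove further systematic artifacts that a purely binomial model does not capture.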
As the percentage of tumor cells present in the analyzed specimens was not homogeneous across samples (Table S1), we corrected the variants’ frequencies by calculating the expected frequency based on the total depth of the variant, its observed frequency, and the tumor content of the sample, assuming a binomial distribution. Mutations were classified as clonal if the fraction of variant reads was >20% (upon correction for the percentage of tumor cells in the sample), and subclonal otherwise.
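As a rough illustration of the tumor-content correction, the sketch below rescales the observed variant read fraction by the tumor fraction of the sample and then applies the clonal cutoff. This is a simplification and an assumption on our part: it uses a plain point estimate rather than the binomial model described above, it reads the clonal cutoff as 20%, and the function names are hypothetical.

```python
def corrected_frequency(variant_reads: int, depth: int, tumor_fraction: float) -> float:
    """Rescale the observed variant read fraction by the sample's tumor content.

    Simplified point estimate (assumption): observed / tumor_fraction, capped at 1.
    """
    if depth <= 0 or not 0 < tumor_fraction <= 1:
        raise ValueError("depth must be positive and tumor_fraction in (0, 1]")
    observed = variant_reads / depth
    return min(1.0, observed / tumor_fraction)

def classify_clonality(variant_reads: int, depth: int, tumor_fraction: float,
                       clonal_cutoff: float = 0.20) -> str:
    """Label a variant 'clonal' if its corrected frequency exceeds the cutoff."""
    freq = corrected_frequency(variant_reads, depth, tumor_fraction)
    return "clonal" if freq > clonal_cutoff else "subclonal"

# Example: 15 variant reads out of 100 in a sample with 50% tumor cells
# corresponds to a corrected frequency of 0.30.
print(classify_clonality(15, 100, 0.5))
```

The same observed fraction can therefore fall on either side of the cutoff depending on tumor content, which is why uncorrected frequencies are not comparable across samples.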
The genomic positions of the candidate variants were converted to the hg18 coordinates using the LiftOver tool (UCSC Genome Browser) to be consistent with the annotation used to report CN aberrations.
To decipher the history of clonal evolution during CLL transformation to RS, CLL-only variants were defined as events clonally represented in CLL and absent or subclonal in RS and/or RSII, whereas RS-only variants included events that were either clonally represented in RS and absent/subclonal in CLL, or subclonal in RS and absent in CLL. The same criteria were applied to identify RSII-only variants. To establish whether the difference in representation of the variants in distinct disease phases was statistically significant, we calculated the probability of observing the variant’s depth, given the total depth at its position, its observed frequency in the paired sample, and the estimates for the samples’ tumor contents, using a binomial distribution. Variants with p-values >0.05 were discarded.
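One way to picture this binomial comparison between disease phases is sketched below. It is an interpretation under explicit assumptions, not the authors' code: the expected frequency in the second sample is taken to be the frequency observed in the paired sample rescaled by the ratio of tumor contents, and the p-value is the one-sided binomial tail in the direction of the observed deviation. All names are hypothetical.

```python
from math import comb

def binomial_pmf(n: int, k: int, p: float) -> float:
    """Return P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def phase_difference_pvalue(reads_b: int, depth_b: int, freq_a: float,
                            purity_a: float, purity_b: float) -> float:
    """P-value for the variant read count in sample B, given its frequency in sample A.

    Assumption: if the variant were equally represented in the tumor cells of
    both phases, its expected read fraction in B would be
    freq_a * purity_b / purity_a (capped at 1).
    """
    expected = min(1.0, freq_a * purity_b / purity_a)
    lower = sum(binomial_pmf(depth_b, i, expected) for i in range(0, reads_b + 1))
    upper = sum(binomial_pmf(depth_b, i, expected) for i in range(reads_b, depth_b + 1))
    return min(lower, upper)  # tail in the direction of the deviation

# A variant at 40% in the first phase but supported by only 2 of 100 reads in
# the second phase (equal tumor content) is a significant difference.
p = phase_difference_pvalue(reads_b=2, depth_b=100, freq_a=0.40,
                            purity_a=0.8, purity_b=0.8)
print(p < 0.05)
```

Under the rule stated above, a variant whose p-value from such a test exceeded 0.05 would be discarded, because the difference between phases could not be distinguished from sampling noise.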
Previously reported genetic lesions detected in the same cases by Sanger sequencing (TP53, NOTCH1, and SF3B1) were correctly identified by our approach (Table S1), with one exception: a 50-bp duplication affecting the PEST domain of NOTCH1 was missed by WES, consistent with the lower sensitivity of our mutation discovery algorithm in detecting large insertions and deletions. A set of candidate RS-specific nonsilent somatic mutations was subjected to validation by conventional Sanger-based resequencing analysis of PCR products obtained from tumor DNA using primers specific for the exon encompassing the variant, as previously described (Fabbri et al., 2011; Pasqualucci et al., 2011). The validation rate of the subset of identified point mutations tested by Sanger sequencing was ∼94% (n = 133/141).
Given the absence of paired normal DNAs, SAVI was used to interrogate a specific list of target genes (n = 481) in the WES-screening panel, as described in the section “Recurrent targets of point mutations in RS.”
CN analysis by high-density SNP array analysis.
Genome-wide DNA profiles of 50 RS samples (14 of them with paired CLL, including 8 out of 9 discovery RS cases), 3 RS-leukemic phases, and a series of normal samples (n = 7) were obtained from high molecular weight genomic DNA using the Affymetrix Genome-Wide Human SNP Array 6.0 (Affymetrix) and standard protocols from Affymetrix as previously described (Fabbri et al., 2011; Pasqualucci et al., 2011). Previously reported genetic lesions detected by FISH-based analysis (when present in >25% of nuclei) were all correctly identified by our approach (Table S1). Data analysis, including the identification of Minimal Common Regions (MCRs) of aberration and the Genomic Identification of Significant Targets in Cancer (GISTIC), was performed as previously described (Fabbri et al., 2011; Pasqualucci et al., 2011). In brief, we focused on focal MCRs encompassing ≤3 protein-coding genes and not determined by the overlap between large segments (i.e., >1 Mb; Pasqualucci et al., 2011). Recurrent focal MCRs are shown in Fig. 4 b and significant regions of loss and gain as assessed by GISTIC are reported in Fig. 4 d.
Identification of significant association and mutual exclusion between RS-associated genetic lesions.
To determine the independence of any two genetic events occurring in the RS cases, we counted the number of samples carrying both events and calculated a p-value according to the hypergeometric distribution. If the number was significantly larger than expected, the two events were considered comutated, whereas if the number was significantly smaller they were considered mutually exclusive. We used a p-value threshold of 0.05 to assess the significance of both mutual exclusion and comutation. Due to the limited number of samples, we did not apply multiple-testing correction in this analysis.
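The hypergeometric test described here can be sketched in a few lines. The code below is a minimal illustration with hypothetical function names and example counts that are not taken from the study: given the total number of cases, the number carrying each lesion, and the number carrying both, it returns the upper-tail p-value (evidence for comutation) and the lower-tail p-value (evidence for mutual exclusion).

```python
from math import comb

def hypergeom_pmf(n_total: int, n_a: int, n_b: int, k: int) -> float:
    """P(exactly k samples carry both events) if A and B are independent."""
    return comb(n_a, k) * comb(n_total - n_a, n_b - k) / comb(n_total, n_b)

def cooccurrence_pvalues(n_total: int, n_a: int, n_b: int, n_both: int):
    """Return (p_comutation, p_mutual_exclusion) for two events.

    p_comutation       = P(overlap >= n_both)
    p_mutual_exclusion = P(overlap <= n_both)
    """
    k_min = max(0, n_a + n_b - n_total)  # smallest possible overlap
    k_max = min(n_a, n_b)                # largest possible overlap
    if not k_min <= n_both <= k_max:
        raise ValueError("n_both is outside the range of possible overlaps")
    p_comut = sum(hypergeom_pmf(n_total, n_a, n_b, k) for k in range(n_both, k_max + 1))
    p_excl = sum(hypergeom_pmf(n_total, n_a, n_b, k) for k in range(k_min, n_both + 1))
    return p_comut, p_excl

# Illustrative counts only: 20 cases, 10 with lesion A, 10 with lesion B, none with both.
p_comut, p_excl = cooccurrence_pvalues(20, 10, 10, 0)
label = ("comutated" if p_comut < 0.05
         else "mutually exclusive" if p_excl < 0.05
         else "independent")
print(label)
```

Because no multiple-testing correction is applied, p-values from many pairwise comparisons of this kind are best treated as exploratory.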
Functional categories and pathways analyses of the mutated genes.
Genes found to be altered in RS were assigned to functional categories or to annotated pathways using the gene ontology GO database (http://www.geneontology.org/), the publicly available bioinformatic tool DAVID 2008 6.7 (Database for Annotation, Visualization and Integrated Discovery, http://david.abcc.ncifcrf.gov/), and the Molecular Signatures Database from the Broad Institute (MSigDBv4.0, http://www.broadinstitute.org/gsea/msigdb/index.jsp). In brief, the functional annotation chart tool of DAVID 2008 was used to identify enriched functional categories associated with RS-altered genes (including the “GOTERM_BP_ALL” and the “Pathways” terms); the Compute Overlap tool of the Molecular Signatures Database was used to examine how the gene set composed by RS-altered genes overlapped with gene sets belonging to the C2-Canonical pathways collection. DAVID 2008 categories and MSigDB overlaps were then selected to retain only those with a P value ≤0.05.
Accession nos.
Affymetrix SNP Array 6.0 data and WES data have been deposited in dbGaP under accession no. phs000364.v2.p1.
Online supplemental material.
Table S1 shows clinical and biological features of the 9 CLL-RS discovery cases; Table S2 reports mutations identified by WES in the discovery and screening panels; and Table S3 illustrates the segments (regions) of tumor-acquired CN alterations identified in the discovery panel and in six additional clonally related CLL-RS pairs.
Acknowledgments
We would like to thank V. Miljkovic and the Genomics Technologies Shared Resource of the Herbert Irving Comprehensive Cancer Center at Columbia University for hybridization of the Affymetrix SNP6.0 arrays; W. Lei and the St. Jude Department of Computational Biology for assistance with the whole-exome capture and sequencing procedure; G. Gaidano, D. Rossi (Division of Hematology, Department of Translational Medicine, Amedeo Avogadro University of Eastern Piedmont, Novara, Italy), and F. Bertoni (Lymphoma and Genomics Research Program, IOR Institute of Oncology Research, and Lymphoma Unit, IOSI Oncology Institute of Southern Switzerland, Bellinzona, Switzerland) for providing pathologically, clinically and molecularly characterized subject samples; and F. Bertoni for providing SNP6.0 CEL files for a subset of the RS cases.
R. Dalla-Favera was supported by National Institutes of Health (NIH) grant 1R01CA177319-01. R. Rabadan was supported by the Stewart Foundation, the Partnership for Cure, and NIH grant 1R01CA177319-01. M. Messina was supported by Associazione Italiana per la Ricerca sul Cancro (AIRC) Special Program Molecular Clinical Oncology, 5 x 1000, Milan, Italy. G. Fabbri is a Fellow of the American Italian Cancer Foundation.
The authors have no competing financial interests.
G. Fabbri, R. Dalla-Favera, and L. Pasqualucci designed the study, interpreted data, and wrote the manuscript. H. Khiabanian, J. Wang, C.G. Mulligan, and R. Rabadan developed bioinformatics tools and performed bioinformatic analysis. C.G. Mulligan contributed to the WES experiment in the discovery panel. A.B. Holmes and M. Messina contributed to the CN experiment and analysis.
References
Author notes
G. Fabbri and H. Khiabanian contributed equally to this paper.
R. Rabadan and R. Dalla-Favera contributed equally to this paper.