Examples of associations between human disease and defects in pre–messenger RNA splicing/alternative splicing are accumulating. Although many alterations are caused by mutations in splicing signals or regulatory sequence elements, recent studies have noted the disruptive impact of mutated generic spliceosome components and splicing regulatory proteins. This review highlights recent progress in our understanding of how the altered splicing function of RNA-binding proteins contributes to myelodysplastic syndromes, cancer, and neuropathologies.
Introduction
Nearly all protein-encoding human genes have multiple exons that are combined in alternative ways to produce distinct mRNAs, often in an organ-specific, tissue-specific, or cell type–specific manner. Although documenting the function of this vast collection of splice variants is a challenging endeavor, the regulated production of splice variants is required for important functions encompassing virtually all biological processes. The growing recognition of splicing and alternative splicing as critical contributors to gene expression was accompanied by many new examples of how splicing defects are associated with human disease. As several excellent reviews have reported on this expanding, and sometimes causal, relationship (Poulos et al., 2011; Singh and Cooper, 2012; Zhang and Manley, 2013; Cieply and Carstens, 2015; Nussbacher et al., 2015), the goal of this review is to highlight recent efforts in understanding how disease-associated mutations disrupt regulation of splicing. After an overview of basic concepts in splicing and splicing control, we discuss recently described defects in the control of splicing that suggest contributions to myelodysplastic syndromes (MDS), cancer, and neuropathologies.
Splicing and splicing control
Intron removal is performed by the spliceosome (Fig. 1 A), whose assembly starts with the recognition of the 5′ splice site (5′ss), the 3′ splice site (3′ss), and the branch site by U1 small nuclear RNP (snRNP), U2AF, and U2 snRNP, respectively. Along with the U4/U6.U5 tri-snRNP, >100 proteins are recruited to reconfigure the interactions between small nuclear RNAs, between small nuclear RNAs and the pre-mRNA, and to position nucleotides for two successive nucleophilic attacks that produce the ligated exons and the excised intron (Wahl et al., 2009; Matera and Wang, 2014). Fewer than 1,000 introns (i.e., ∼0.3%) are removed by the minor spliceosome, which uses distinct snRNPs (U11, U12, U4atac, and U6atac) but shares U5 and most proteins with the major spliceosome (Turunen et al., 2013).
Definition of intron borders often requires the collaboration of RNA-binding proteins (RBPs), such as serine/arginine-rich (SR) proteins and heterogeneous nuclear RNPs (hnRNPs), which interact with specific exonic or intronic sequence elements usually located in the vicinity of splice sites. As the combinatorial arrangement of these interactions helps or antagonizes the early steps of spliceosome assembly (Fu and Ares, 2014), one ambitious goal is to determine how cell-, tissue-, and disease-specific variations in the expression of these splicing regulators and their association near splice sites induce specific changes in alternative splicing (Barash et al., 2010; Zhang et al., 2010). This challenge is compounded by the fact that only a fraction of the >1,000 RBPs has been studied (Gerstberger et al., 2014) and that all RBPs have splice variants, usually of undetermined function. Moreover, the function of RBPs is often modulated by posttranslational modifications that occur in response to environmental insults and metabolic cues (Fu and Ares, 2014).
An extra layer of complexity to our view of splicing control is added when we consider that experimentally induced decreases in the levels of core spliceosomal components also affect splice site selection (Saltzman et al., 2011). Indeed, reducing the level of dozens of spliceosomal components, including SF3B1, U2AF, and tri-snRNP components, affects the production of splice variants involved in apoptosis and cell proliferation (Papasaikas et al., 2015). Although it remains unclear whether variation in the levels and activity of generic factors is used to control splicing decisions under normal conditions, deficiencies in tri-snRNP proteins or in proteins involved in snRNP biogenesis are now frequently associated with aberrant splicing in disease (e.g., PRPF proteins in retinitis pigmentosa [Tanackovic et al., 2011], the SMN protein in spinal muscular atrophy [SMA; Zhang et al., 2008], and SF3B1, SRSF2, and U2AF1 in MDS [see Spliceosomal proteins in MDS section]). How mutations in generic splicing factors confer gene- and cell type–specific effects is an intriguing question. The suboptimal features of some introns that dictate this sensitivity may normally be mitigated by the high concentration or activity of generic factors. Consistent with this view, repression of PRPF8 alters the splicing of introns with weak 5′ss (Wickramasinghe et al., 2015). Thus, deficiencies in the activity of generic spliceosome components may compromise the splicing of a subset of introns, contributing to the onset of disease.
As splicing decisions are usually made while the pre-mRNA is still being transcribed (Fig. 1 B), regulatory links with transcription and chromatin structure take place at several levels. First, spliceosome components and regulators are recruited to the transcription machinery (e.g., the C-terminal domain of RNA polymerase II) to facilitate their transfer onto the emerging nascent pre-mRNA (Bentley, 2014). Second, the speed of the elongating polymerase provides a kinetic window for the assembly of enhancer or repressor complexes that influence commitment between competing pairs of splice sites (Bentley, 2014; Naftelberg et al., 2015). Third, posttranslational modifications of histones and chromatin remodeling factors impact the speed of transcription as well as the recruitment of adapters that interact with splicing regulators (Luco et al., 2011; Lee and Rio, 2015; Naftelberg et al., 2015). Notably, histone modifications in specific chromatin regions can be triggered by Argonaute proteins bound to endogenous or exogenously provided small RNAs (Alló et al., 2009; Ameyar-Zazoua et al., 2012). Long noncoding RNAs (lncRNAs), whose expression varies in human diseases (e.g., MALAT1 in cancer), may also contribute to splicing control by interacting with splicing factors to regulate their availability, or by orchestrating local epigenetic modifications that impact the speed of transcription or the recruitment of adapters (Zhou et al., 2014a; Gonzalez et al., 2015).
Shooting the pre-messenger by disrupting splicing control elements
More than 200 human diseases, including progeria and some forms of breast cancer and cystic fibrosis, are caused by point mutations that affect pre-mRNA splicing by destroying or weakening splice sites, or activating cryptic ones (Wang et al., 2012), thereby producing mRNAs that encode defective proteins or that are targets for nonsense-mediated mRNA decay (NMD). Splicing defects can also lead to the cotranscriptional degradation of nascent pre-mRNAs (Davidson et al., 2012; Vaz-Drago et al., 2015). A splice site mutation in BRAF is associated with resistance to the anticancer agent vemurafenib, but inhibitors of the generic splicing factor SF3B1 decrease the production of the mutation-induced BRAF variant and inhibit drug-resistant cell proliferation (Poulikakos et al., 2011; Salton et al., 2015). Splice site variations can also have health-positive effects, as shown recently for a variant of LDLR that lowers non–high density lipoprotein cholesterol and protects against coronary artery disease (Gretarsdottir et al., 2015). In addition to mutations at splicing signals themselves, mutations that destroy silencer or enhancer elements form another important group of disease-causing alterations that impact alternative splicing (Sterne-Weiler and Sanford, 2014). Because more than half of the nucleotides in an exon may be part of splicing regulatory motifs (Chasin, 2007), synonymous exon mutations, as well as an undetermined number of intron mutations, may further contribute to splicing misregulation that leads to disease. A recent computational analysis relying on RNA sequencing data from normal and disease samples and using >650,000 single-nucleotide variations (SNVs) identified >10,000 intronic and 70,000 missense and synonymous exonic SNVs occurring in splicing regulatory motifs that linked potential splicing defects with disease (Xiong et al., 2015). 
Notably, the computational tool developed for this analysis predicted, with tantalizing accuracy, the impact of mutation on the direction and amplitude of splicing shifts associated with SMA and hereditary colorectal cancer and identified several intronic autism-associated SNVs with a high potential of splicing impact (Xiong et al., 2015).
Incapacitating the regulators
Disease-causing mutations in intronic or exonic control elements generally affect splicing by perturbing the binding of regulatory proteins that normally recognize them. The activity of the splicing regulators themselves can also be altered in disease. Changes in the nuclear level of regulators, including RBFOX2, hnRNP, and SR proteins, often occur in cancer (Venables et al., 2009; Zhang and Manley, 2013). Although these changes frequently produce splice variants that affect cell cycle control, apoptosis, cell motility, and invasion, the molecular mechanisms that lead to these alterations and to specific downstream events that promote cancer remain largely unclear (Zhang and Manley, 2013; Shilo et al., 2015). Another way to alter the activity of splicing regulators is through sequestration. This is the case in DM1 and DM2 myotonic dystrophies, where muscleblind-like (MBNL) proteins are recruited to mRNAs carrying expansions of CUG and CCUG repeats, respectively. This sequestration compromises MBNL binding to normal RNA targets, deregulates the expression of CUGBP1, and alters alternative splicing of hundreds of transcripts not only in muscle tissues but also in the brain (Poulos et al., 2011; Charizanis et al., 2012; Echeverria and Cooper, 2012; Batra et al., 2014; Goodwin et al., 2015). The formation of cytoplasmic aggregates, possibly also triggered by mRNAs carrying nucleotide expansions, is frequently associated with neuropathological diseases such as amyotrophic lateral sclerosis (ALS) and frontotemporal dementia (FTD; see Splicing control defects in neuropathological and muscle-related diseases section). In other instances, particularly in cancer, the localization and/or activity of splicing factors are misregulated by posttranslational modifications, e.g., phosphorylation (Naro and Sette, 2013). Mutation of generic spliceosomal components is also becoming a recurrent theme in disease, and recent advances in this area are mentioned throughout our review. 
Likewise, with the increasing awareness that splicing decisions are coordinated with transcription, and thus with processes that modify chromatin, splicing alterations provoked by disease-associated lncRNAs and chromatin-modifying enzymes are likely to become an emerging focus of inquiry.
Impairing splicing factors to promote cancer
Here, we highlight recent work that has focused on the role of spliceosomal components in MDS, a heterogeneous group of disorders that affect hematopoietic progenitor cells and the production of different types of blood cells. MDS often progress to fully malignant acute myeloid leukemia (AML) with the abnormal accumulation of hematopoietic precursors arrested at an early stage of differentiation. We then provide examples of network restructuring of alternative splicing regulators that have been more solidly associated with carcinogenesis or that may constitute new concepts that link splicing factors with the emergence and maintenance of cancer.
Spliceosomal proteins in MDS
Somatic heterozygous mutations in any of the spliceosomal proteins SF3B1, SRSF2, U2AF1, and the U2AF-related gene ZRSR2 occur in >50% of all MDS patients (Papaemmanuil et al., 2011; Quesada et al., 2011; Yoshida et al., 2011). No homozygous mutations have been described, and almost all mutations are missense, usually occurring at conserved positions. Bone marrow and cancer cells harboring these mutations display splicing abnormalities (Makishima et al., 2012; Przychodzen et al., 2013; Gentien et al., 2014). This may be a direct consequence of the specific mutations because the RNAi-mediated depletion of wild-type SF3B1, SRSF2, and U2AF1 in a variety of cell types or the expression of mutated proteins in nonhematopoietic and cancer cell lines also disrupts alternative splicing (Massiello et al., 2006; Pacheco et al., 2006; Graubert et al., 2011; Yoshida et al., 2011; Venables et al., 2013; Brooks et al., 2014; Shao et al., 2014; Dolatshad et al., 2015; Kfir et al., 2015; Komeno et al., 2015; Papasaikas et al., 2015). Here, we will summarize a set of studies that provide tantalizing insight into how mutated SRSF2, SF3B1, U2AF1, and ZRSR2 affect splicing programs and alter hematopoiesis in mice and MDS patients (Fig. 2).
SRSF2
Notably, telomerase-negative mice with short telomeres that induce a persistent DNA damage response (DDR) present hematopoietic defects that recapitulate the clinical features of human MDS (Colla et al., 2015). Moreover, this telomere deficiency is associated with a decrease in the level of splicing factors that are frequently mutated in MDS (e.g., SRSF2, U2AF2, SF3B2, and SF3A3). Progenitor cells with deficient telomeres produce defective transcripts encoding components involved in DNA repair and chromatin structure. One splicing change reduces the level of the DNA methyl transferase DNMT3a, whose frequent mutation in MDS patients contributes to rapid progression to AML (Walter et al., 2011; Colla et al., 2015). SRSF2 is a splicing regulator that contributes to both generic and alternative splicing (Long and Caceres, 2009). The impact of short telomeres on the expression of SRSF2 inspired Colla et al. (2015) to create SRSF2-haploinsufficient mice. Remarkably, these mice display impaired erythroid differentiation and express several of the defective alternative splicing events caused by telomere dysfunction. Further, they aberrantly splice transcripts encoding components involved in telomere maintenance, potentially providing a feedback loop that may elicit more splicing defects (Colla et al., 2015).
Importantly, the MDS-associated P95H mutation in SRSF2 shifts the affinity of SRSF2 to a subset of binding sites (Kim et al., 2015; Komeno et al., 2015; Zhang et al., 2015a), thus providing an explanation for the fact that the recapitulation of splicing defects observed in SRSF2-haploinsufficient mice is only partial (Colla et al., 2015). CD34+ hematopoietic stem cells from MDS patients with the P95H mutation have defects in the production of splice variants involved in telomere maintenance, DNA repair, and chromatin remodeling (Colla et al., 2015). Finally, murine bone marrow cells expressing the SRSF2-P95H mutant display features that are characteristic of MDS, including increased proliferation of progenitor cells and impaired differentiation (Kim et al., 2015). One of the P95H-mediated splicing alterations in mice reduces the expression of the histone methyl transferase EZH2, an outcome also occurring in human cells expressing mutant SRSF2. Strikingly, restoring expression of EZH2 in SRSF2 mutant mice partially rescues the hematopoietic defect (Kim et al., 2015). Overall, these studies provide strong evidence that MDS-associated mutations in SRSF2 affect the production of splice variants involved in chromatin structure that in turn elicit hematopoietic defects.
SF3B1
SF3B1 is a U2 snRNP–associated protein involved in branch point selection (Corrionero et al., 2011). The depletion of SF3B1 impairs the growth and the differentiation of myeloid cell lines (Dolatshad et al., 2015). SF3B1 haploinsufficiency in mice compromises the repopulating ability of hematopoietic stem cells, but is not sufficient to induce MDS (Visconte et al., 2012; Matsunawa et al., 2014; Wang et al., 2014a; Dolatshad et al., 2015). Decreasing the levels of SF3B1 in myeloid cell lines alters the alternative splicing of transcripts encoding components involved in apoptosis and cell cycle control. Interestingly, in bone marrow cells and progenitor bone marrow stem cells from SF3B1-mutated MDS patients, the expression and splicing of genes/transcripts associated with mitochondrial and heme-related functions are altered, providing a link with the abnormal iron homeostasis observed in MDS patients (Visconte et al., 2012; Dolatshad et al., 2015). Notably, iron homeostasis influences alternative splicing by modulating the activity of SRSF7 (Tejedor et al., 2015). Interestingly, it has been observed that SRSF7 is itself abnormally spliced in MDS patients carrying the SF3B1 mutation (Dolatshad et al., 2015). We speculate that this defective splicing may be responsible, at least in part, for the noted heme deficiency in MDS patients.
SF3B1 is also part of a complex with BCLAF1, U2AF, and PRPF8 that is recruited to chromatin-bound BRCA1 to stimulate the splicing of transcripts encoding factors involved in DNA repair and the DDR (Savage et al., 2014). In line with this finding, several DNA repair and DDR genes (e.g., ABL1, BIRC2, and NUMA1) produce aberrantly spliced transcripts in cells of patients with SF3B1 mutations (Dolatshad et al., 2015). Interestingly, one splicing alteration in these patient cells occurs in EZH1, a functional homologue of the histone methyl transferase EZH2, which is also defectively spliced in SRSF2-mutated cells and contributes to the MDS phenotype (Visconte et al., 2012; Dolatshad et al., 2015; Kim et al., 2015). Notably, the expression and alternative splicing of transcripts encoding RNA-processing factors, including PRPF8 and U2AF2, are also affected in SF3B1-mutated cells (Dolatshad et al., 2015). This observation is important because mutations in PRPF8 and U2AF2 are found in MDS patients (Boultwood et al., 2014). However, although PRPF8 mutations are associated with alternative splicing defects (Kurtovic-Kozaric et al., 2015), U2AF2 mutations appear neutral (Shao et al., 2014). Overall, these results suggest that SF3B1 mutations alter the splicing of transcripts involved in chromatin structure, DNA repair, and the DDR, thereby possibly providing an explanation for the accumulation of DNA damage in hematopoietic progenitor cells of MDS patients (Zhou et al., 2013). A function for SF3B1 in splice site selection has recently been associated with a specific interaction with histone marks that are enriched in exons (Kfir et al., 2015). Although SF3B1 mutations often occur in the C-terminal HEAT repeats involved in protein–protein interactions, it remains to be shown whether these mutations affect the recruitment of SF3B1 to chromatin. 
If they do, combining a mutated SF3B1 with chromatin modification defects may amplify splicing alterations, gradually leading to more detrimental hematopoietic deficiencies.
U2AF1
U2AF1 is the smaller of two proteins that make up the U2AF heterodimer implicated in generic 3′ss recognition. Although the U2AF1-S34F mutation elicits hematopoietic abnormalities in mice that compromise the repopulating ability of stem cells, it does not elicit MDS (Shirai et al., 2015). Many splicing defects in MDS patients with U2AF1 mutations occur in transcripts that encode components involved in cell cycle and splicing control (Przychodzen et al., 2013). Expression of mutated U2AF1 proteins in a human erythroleukemic cell line causes thousands of splicing alterations, including some in transcripts encoding components involved in DNA methylation (e.g., DNMT3B, also affected by mutations in SF3B1), DDR, and apoptosis (Ilagan et al., 2015). Different U2AF1 mutations alter its binding to 3′ss in different ways and lead to distinct yet overlapping splicing defects (Brooks et al., 2014; Shao et al., 2014; Ilagan et al., 2015). A meta-transcriptome analysis using samples from U2AF1-S34F mutant mice, AML patients with U2AF1 mutations, and primary bone marrow cells overexpressing U2AF1-S34F uncovered common splicing alterations in transcripts encoding splicing proteins and components that are mutated in MDS and AML, or that are involved in hematopoietic stem cell function. These observations provide strong support to the view that mutated U2AF1 elicits abnormal hematopoiesis (Shirai et al., 2015).
ZRSR2
ZRSR2 has been implicated in the splicing of introns that use the U12-dependent minor spliceosome in transcripts encoding cancer-relevant proteins such as PTEN, MAPK1, MAPK3, BRAF, and E2F2 (Madan et al., 2015). MDS-associated mutations in ZRSR2 are often inactivating, and depleting ZRSR2 reduces the growth and clonogenic potential of leukemia cell lines and alters the differentiation potential of human CD34+ bone marrow cells (Madan et al., 2015).
Overall, the studies mentioned above suggest that alternative splicing likely makes a crucial contribution to the clinical evolution of MDS (Fig. 2). Mutations in SRSF2, U2AF1, and SF3B1 may elicit a shared set of splicing alterations that trigger common hematopoietic defects and predispose stem cells to cancer development. The insight gained by studying the contribution of mutated splicing factors to MDS is likely to benefit our understanding of how mutations in splicing factors lead to cancer in general because mutations in SF3B1, U2AF1, and SRSF2 are also found in a variety of solid tumors (Kandoth et al., 2013; Scott and Rebel, 2013; Maguire et al., 2015). Although a recent compilation indicates that splicing factor genes are frequently mutated in different types of cancer (Sveen et al., 2015), a more extensive characterization of the functional impact of these mutations will be required to determine whether these alterations preferentially contribute to specific types of cancer.
An integrating hypothesis: Altered RBPs cause R loops and persistent DDR signaling that alter splicing
Although mutated SF3B1 and U2AF1 are expected to impact branch site/3′ss selection directly, we speculate that decreases in the level of splicing factors and changes in their RNA-binding specificity may also cause a second wave of alternative splicing changes through activation of the DDR (Fig. 3). This model is based on the observation that when core spliceosomal components are delocalized or when RNA-processing factors such as SRSF1 and RNPS1 are depleted, R loop formation occurs and triggers the DDR to impact alternative splicing (Li and Manley, 2005; Li et al., 2007; Domínguez-Sánchez et al., 2011; Tresini et al., 2015). If we are correct, drops in the level or changes in the activity of SRSF2, SF3B1, and U2AF1 may perturb alternative splicing through persistent R loop–mediated activation of the DDR (Fig. 3), a consequence that would be consistent with the noted accumulation of DNA damage in MDS progenitor cells (Zhou et al., 2013). DNA damage affects the expression, modification, and localization of several splicing regulatory proteins (Shkreta and Chabot, 2015). Likewise, DNA damage caused by deficient telomeres in mice alters the expression of splicing regulators (Colla et al., 2015). Importantly, alternative splicing defects in MDS patients and MDS mouse models affect variants involved in apoptosis, cell cycle control, DNA repair, splicing control, and chromatin structure (Fig. 3), precisely matching the functional categories of transcripts whose alternative splicing is affected by DNA-damaging agents (Shkreta and Chabot, 2015). As drops in the activity of splicing factors are frequently associated with human pathologies, this model may be applicable to a variety of diseases in addition to MDS, including myotonic dystrophies, retinitis pigmentosa, ALS, and FTD.
The epithelial–mesenchymal/mesenchymal–epithelial transition connection
Cancer metastasis involves cell migration and tissue invasion through reversible transitions from mesenchymal to epithelial cell types (mesenchymal–epithelial and epithelial–mesenchymal transition [MET and EMT, respectively]; Fig. 4; Yang and Weinberg, 2008). ESRPs and RBFOX2 control the alternative splicing of several transcripts encoding cell adhesion proteins involved in the epithelial or mesenchymal phenotypes (Shapiro et al., 2011; Venables et al., 2013; Braeutigam et al., 2014). A splice variant of the tyrosine kinase receptor RON that promotes cell migration and activates EMT is controlled by antagonistic interactions involving SRSF1 and hnRNP A1, A2, and H proteins (LeFave et al., 2011; Biamonti et al., 2014). Likewise, hnRNP M antagonizes ESRP in the splicing of the cell adhesion molecule CD44 and plays a key role in the metastatic behavior of breast cancer cells in mouse models (Xu et al., 2014). In contrast, RBM47 behaves as a suppressor of breast cancer progression and metastasis (Vanharanta et al., 2014). Consistent with their role in metastasis, the expression of hnRNP M and RBM47 is respectively high and low in aggressive human breast cancer (Vanharanta et al., 2014; Xu et al., 2014). LIN28A, the expression of which increases in the HER2 breast cancer subtype, interacts with hnRNP A1 to modulate the production of splice variants of ENAH that are associated with breast cancer metastasis (Di Modugno et al., 2007; Yang et al., 2015). In addition to the lncRNA MALAT1, which is implicated in metastasis, possibly by controlling alternative splicing (Tripathi et al., 2010; Gutschner et al., 2013), another lncRNA modifies chromatin to prevent the recruitment of a repressive chromatin-splicing adapter complex that normally enforces the mesenchymal-specific splicing of FGFR2 (Gonzalez et al., 2015). lncRNAs may act on opposite functional sides of the oncogenic pathway. 
On the one hand, the lncRNA INXS, which interacts with Sam68 to favor the production of the proapoptotic Bcl-xS splice variant, is down-regulated in tumors and its overexpression in mouse xenograft models elicits tumor regression (DeOcesano-Pereira et al., 2014). On the other hand, the lncRNA FAS-AS1 interacts with RBM5 to reduce expression of the prosurvival soluble FAS variant (Sehgal et al., 2014). Other lncRNAs that have been implicated in cancer include linc-p21, PANDA, TUG1, and Pint, but their impact on splicing and their contribution to cancer and metastasis are speculative and need to be investigated in more detail (Wang et al., 2014b; Zhang and Peng, 2015).
The MYC splicing connection
The overexpression of MYC contributes to malignant transformation and is associated with many cancers. Several studies have established a role for MYC in splicing control (Fig. 5). MYC contributes to cancer metabolism and tumor growth by increasing the levels of splicing regulators PTBP1, hnRNP A1, and hnRNP A2 that shift the production of pyruvate kinase from splice variant PKM1, which drives oxidative phosphorylation, to PKM2, which elicits aerobic glycolysis (Christofk et al., 2008; David et al., 2010). In glioblastoma, the up-regulation of hnRNP A1 promotes the splicing of a transcript encoding the MYC-interacting partner Max to generate ΔMax, producing a feed-forward loop that enhances MYC function and hnRNP A1 expression (Fig. 5; Babic et al., 2013). MYC also stimulates the expression of the SR protein SRSF1, which drives oncogenesis through alternative splicing of a network of transcripts encoding signaling molecules (e.g., RON and MKNK2) and transcription factors (e.g., BIN1; Das and Krainer, 2014). SRSF1 also elicits the production of variants, such as CASC4 with antiapoptotic function, as well as MDM2 and cyclin D1 variants with prooncogenic properties (Olshavsky et al., 2010; Anczuków et al., 2012, 2015; Comiskey et al., 2015). Positive feedback likely occurs because the SRSF1-mediated splice variant BIN1-12a no longer binds to MYC and lacks tumor suppressor activity (Ge et al., 1999; Karni et al., 2007). KRAS mutations that are frequently found in colorectal cancer activate the MAPK–extracellular signal-regulated kinase pathway to increase the level of the transcription factor ELK1 that in turn increases MYC with the expected impact on the production of PKM2 (Hollander, D., and Ast, G., personal communication). The activated MAPK–extracellular signal-regulated kinase pathway also stimulates the expression of Sam68, which increases the level of SRSF1 through alternative splicing (Matter et al., 2002; Valacca et al., 2010). 
Interestingly, the expression of SRSF1 is also stimulated by the anticancer drug gemcitabine, producing a splice variant of MKNK2 that phosphorylates eIF4E to promote cell growth and drug resistance (Adesso et al., 2013). Gemcitabine resistance is also provided by the expression of PKM2 through the increased production of PTBP1 (Calabretta et al., 2015).
Splicing factor addiction
The term “oncogene addiction” has been used in the cancer field to describe the increased dependence of cancers on oncogenes for growth and survival (Luo et al., 2009). Recent results suggest that cancer cells show an analogous heightened dependence on splicing factors. This relationship was established when it was noted that MYC-regulated genes and pathways provoke a general increase in pre-mRNA synthesis that imposes a strain on generic splicing (Hsu et al., 2015; Koh et al., 2015). The fact that MYC up-regulates enzymes that modify snRNP proteins in cancer cells is consistent with the high demand for spliceosome components (Koh et al., 2015). Nevertheless, MYC-driven cancer cells are more sensitive to depletions of spliceosome components such as U2AF1 and SF3B1 (Hsu et al., 2015). This splicing stress may also affect the production of functionally important splice variants because decreases in the level or activity of generic spliceosome components also affect alternative splicing. Other cancers may be similarly addicted to splicing factors. For example, PRPF6, a component of the tri-snRNP complex, is overexpressed in a subset of primary and metastatic colon cancers, and its depletion by RNAi in cell lines reduces cell growth and decreases the production of the oncogenic ZAK kinase splice variant (Adler et al., 2014). Likewise, expression of splicing regulator SRSF10 is increased in aggressive colon cancers. The siRNA-mediated depletion of SRSF10 decreases tumor formation in mice, an effect that is mediated, at least in part, by a drop in the production of the oncogenic splice variant of the splicing factor BCLAF1 (Zhou et al., 2014b). Thus, the overall stimulation in gene expression in cancer cells may increase their reliance on splicing factors, hence providing avenues to explore novel anticancer strategies.
Splicing control defects in neuropathological and muscle-related diseases
As in cancer, pathogenic mechanisms in neurological and muscle-associated diseases can be caused by mutations in genes that affect splicing of their pre-mRNAs, or by mutations that affect the expression and the activity of splicing factors that control splice site utilization. Excellent reviews have recently presented the prevalence of alternative splicing, the role of RBPs, and the functional diversity of splice variants in neuronal systems (Darnell, 2013; Raj and Blencowe, 2015). Here, we present recent advances that solidify the links between splicing control and neuronal and muscular pathologies (Fig. 6). Identifying functionally relevant variants and changes in the expression/activity of regulators remains challenging, particularly in neuropathologies. This is mainly a result of limited tissue availability and tissue heterogeneity, as well as difficulties in developing adequate animal models that recapitulate human phenotypes.
ALS and FTD
Although mutations in the splicing regulatory RBP TDP-43 are found in only a fraction of all cases of ALS and FTD, cytoplasmic inclusions and the nuclear depletion of TDP-43 are hallmarks of these diseases (Janssens and Van Broeckhoven, 2013; Scotter et al., 2015). Decreasing the expression of TDP-43 leads to neuronal defects in mice and affects the alternative splicing of transcripts encoding components important in neuronal development or implicated in neurological diseases (Polymenidou et al., 2011; Tollervey et al., 2011; Yang et al., 2014a). Splicing defects in ALS tissues occur in TDP-43 target transcripts (Arnold et al., 2013; Yang et al., 2014a). A recent study in mice indicates that a decrease in TDP-43 impairs splicing fidelity and leads to the aberrant inclusion of cryptic exons, an effect also seen in brain tissues from ALS-FTD patients (Ling et al., 2015). Similar to TDP-43, mutations and loss of nuclear function of FUS have been linked to alternative splicing changes in ALS, with a few pre-mRNA targets also regulated by TDP-43 (Lagier-Tourenne et al., 2012; Coady and Manley, 2015). Cytoplasmic aggregates of mutated FUS or TDP-43 often sequester other splicing proteins, and this may also contribute to alterations in splicing profiles. For example, the ability of FUS to interact with U1 snRNP is likely responsible for the U1 snRNP cytoplasmic mislocalization in FUS-mutated ALS patient fibroblasts (Yu and Reed, 2015; Yu et al., 2015). ALS-associated mutations in hnRNP A1/A2 proteins also cause cytoplasmic aggregation (Kim et al., 2013). In several ALS-FTD patients, GGGGCC repeat expansions in the C9ORF72 gene that promote G-quadruplex formation sequester splicing factors such as SRSF2 and hnRNP H, which in turn may promote extensive alternative splicing defects and neurodegeneration (Lee et al., 2013; Prudencio et al., 2015; Zhang et al., 2015b). 
Further studies should clarify whether the pathogenic impact of aggregates is strictly caused by loss of function or whether toxicity associated with aggregate formation also contributes to the clinical manifestation of ALS and FTD.
Alzheimer’s disease (AD) and Huntington’s disease (HD)
The deposition of oligomeric β-amyloid peptides and the formation of neurofibrillary tangles associated with the hyperphosphorylation of the microtubule-associated TAU protein have been implicated in AD (Ittner and Gotz, 2011). ApoE4 status is one of the strongest genetic risk factors, and it possibly affects both β-amyloid and neurofibrillary tangle pathologies. Many genes involved in these pathways, including ApoE4, sustain splicing mutations that have been linked to AD or present profiles of alternative splicing that are altered in AD tissues (Love et al., 2015). RNA sequencing data suggest considerable alternative splicing abnormalities in AD tissues, including in transcripts encoding presenilin-1 and clusterin (Bai et al., 2013). Several splicing factors whose expression is misregulated in AD have been identified, including RBFOX, SR, and hnRNP A1 proteins, whereas splicing components, such as the U1 snRNP, appear to be depleted from the nucleus to form cytoplasmic aggregates (Bai et al., 2013; Hales et al., 2014). Interestingly, a depletion of U1 snRNP components in HEK293 cells disrupts the expression of splice variants encoding the amyloid precursor protein and increases the level of a β-amyloid peptide (Bai et al., 2013). HD is caused by expanded CAG repeats in the HTT gene that promote missplicing of its transcripts (Sathasivam et al., 2013). The CAG repeats may also sequester splicing factors, eliciting alternative splicing defects in other transcripts (Mykowska et al., 2011). Like individuals suffering from FTD, HD subjects display an imbalance in the production of TAU variants that promote deposits. Human HD tissues and a mouse model of HD show alterations in the expression of SRSF6, which may modulate TAU splicing, leading to TAU variants with a greater propensity to form deposits (Yin et al., 2012; Fernández-Nogales et al., 2014).
Schizophrenia (SZ)
SZ is a complex neuronal disease promoting brain dysfunction. A variety of alternative splicing anomalies have been described in the brain or neuronal subtypes of SZ patients, including transcripts encoding a glutamate transporter (EAAT; O’Donovan et al., 2015) and microcephalin (MCPH1; Oldmeadow et al., 2014). A polymorphism associated with an increased risk of SZ occurs in the dopamine receptor gene DRD2 and affects the ability of the splicing regulator ZRANB2 to control alternative splicing of DRD2 transcripts (Cohen et al., 2015). The lncRNA gomafu, which is down-regulated in the gray matter from the superior temporal gyrus of SZ patients, is bound by the splicing regulators QKI and SRSF1 to control the alternative splicing of transcripts implicated in SZ (Barry et al., 2014). Other lncRNAs have been associated with neuronal stem cell differentiation and the control of alternative splicing through interaction with the neuronal splicing factor PTBP1 (Ramos et al., 2015). However, although changes in the expression of lncRNAs involved in epigenetic modifications have been linked to neuronal diseases, their contribution to alternative splicing control remains to be examined (Roberts et al., 2014).
Autism spectrum disorder (ASD)
Mutations in, or altered expression of, >100 genes have been linked to ASD (Devlin and Scherer, 2012; Corominas et al., 2014). The majority of these genes produce splice variants, and recurrent splicing defects in some of them have been noted in autistic individuals (Voineagu et al., 2011; Corominas et al., 2014). RBFOX proteins play a critical role in brain development and function (Gehman et al., 2011, 2012), and RBFOX1 haploinsufficiency has been implicated in a variety of neuropsychiatric disorders including ASD (Voineagu et al., 2011). In the mouse brain, the depletion of RBFOX proteins alters the alternative splicing of transcripts implicated in ASD (Weyn-Vanhentenryck et al., 2014). Identification of a clinically relevant set of splicing events remains challenging because RBFOX proteins affect other pathways in RNA processing and in transcription. Moreover, three highly related RBFOX proteins with partially overlapping functions are expressed in the brain. A recent study has identified a highly dynamic set of microexons (3–15 nucleotides in size) in transcripts of different neurofunctional categories that are misregulated in the brain of autistic individuals. Several neural microexons affect protein–protein interactions that are crucial for neural function, and many are controlled by the splicing regulator nSR100, whose expression is important for normal nervous system development (Quesnel-Vallières et al., 2015) and is reported to be reduced in autistic brain tissues (Irimia et al., 2014). Neural microexon splicing is also regulated by the PTBP1 and RBFOX proteins (Li et al., 2015) that are critical for normal neuronal function (Gehman et al., 2012; Licatalosi et al., 2012; Li et al., 2014). Because microexons have also been linked to SZ and epilepsy, it will be most revealing to characterize the molecular pathways that regulate their inclusion in these neurological disorders.
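As a purely illustrative aid, the size criterion used above for microexons (3–15 nucleotides) can be expressed as a small filter over exon coordinates. The sketch below is not taken from any of the cited studies: the `Exon` record, the gene labels, and the coordinates are hypothetical, and the reading-frame check is an added observation (an exon whose length is a multiple of three can be included or skipped without shifting the downstream reading frame).

```python
from dataclasses import dataclass

# Size range for microexons as stated in the text (3-15 nucleotides).
MICROEXON_MIN_NT = 3
MICROEXON_MAX_NT = 15


@dataclass(frozen=True)
class Exon:
    """Hypothetical exon record; coordinates are 0-based, end-exclusive."""
    gene: str
    start: int
    end: int

    @property
    def length(self) -> int:
        return self.end - self.start


def is_microexon(exon: Exon) -> bool:
    """True if the exon length falls within the 3-15 nt range."""
    return MICROEXON_MIN_NT <= exon.length <= MICROEXON_MAX_NT


def preserves_reading_frame(exon: Exon) -> bool:
    """True if including or skipping the exon leaves the frame unchanged."""
    return exon.length % 3 == 0


# Made-up coordinates, for illustration only.
exons = [
    Exon("geneA", 100, 109),  # 9 nt
    Exon("geneA", 500, 620),  # 120 nt
    Exon("geneB", 40, 54),    # 14 nt
]
microexons = [e for e in exons if is_microexon(e)]
frame_preserving = [e for e in microexons if preserves_reading_frame(e)]
```

In this toy input, two of the three exons fall within the microexon size range, and only the 9-nucleotide one preserves the reading frame.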
SMA
Mutations that reduce the level of SMN proteins, which are involved in snRNP biogenesis, cause SMA. Although multiple alternative splicing defects have been noted, it remains unclear which splicing abnormalities cause the human phenotypes (Ule et al., 2005; Zhang et al., 2008, 2013; Fogel et al., 2012; Highley et al., 2014). As the SMN protein deficiency can be rescued by stimulating exon 7 inclusion in the SMN2 pre-mRNA, efforts deployed to achieve this goal in mouse models have produced encouraging results using oligonucleotides that block the activity of an intron splicing silencer (Hua et al., 2015; Staropoli et al., 2015) or small molecules that stimulate exon 7 inclusion with apparent high specificity (Naryshkin et al., 2014; Palacino et al., 2015).
Heart disease
Mutations that truncate the sarcomeric protein titin cause dilated cardiomyopathy (Herman et al., 2012). A loss-of-function mutation in RBM20 affects the alternative splicing of titin, causing dilated cardiomyopathy (Guo et al., 2012). Hypoxic conditions associated with cardiac hypertrophy activate the expression of SF3B1, which in turn induces the production of a splice variant of ketohexokinase associated with contractile dysfunction (Mirtschink et al., 2015).
Advances and challenges in monitoring disease-associated changes in alternative splicing
It is now clear that cells derived from patients with a variety of diseases display splicing defects, with studies relating to cancer and neuropathologies being the most prevalent. These splicing alterations may generate recognizable signatures that can guide diagnostics and may lead to the identification of new therapeutic targets. This important cataloguing effort is now increasing through genome-wide studies that exploit affordable RNA sequencing technologies and access to sequence repositories. Bioinformatic resources designed to interrogate these data are also expanding and are becoming widely available (Tang et al., 2013; Sebestyén et al., 2015; Hollander, D., and Ast, G., personal communication).
The reliable identification of targets that support actionable therapeutic approaches is challenged by the fact that correlations are often derived from heterogeneous clinical samples. Moreover, although documentation of the functional impact of splice variants is accumulating (Kelemen et al., 2013; Pagliarini et al., 2015), the causal contribution of disease-associated splice variants to the disease remains unknown in most cases. The functional assessment of a continuously expanding list of splice variants is an experimentally daunting task, possibly explaining why recent studies have restricted their analysis to mRNA variants encoding proteins with known distinct activities or with premature stop codons that decrease protein production.
To understand the molecular mechanisms that lead to splicing alterations, it will be important to (a) assess the expression, posttranslational modifications, or mutations of splicing regulators and chromatin-modifying components; (b) profile the binding sites of the putative regulatory RBP on target pre-mRNAs in relevant tissue cells as was originally done for NOVA, whose inactivation causes paraneoplastic neurological disorders (Zhang et al., 2010); and (c) sequence the genome of diseased and normal tissues for each patient to identify somatic mutations that may contribute to splicing alterations. This comparison is especially relevant to cancer in which genomes are often intrinsically unstable. Moreover, in light of the model proposed earlier, defects in the activity or levels of splicing factors may lead to R loop–mediated mutations that may have a permanent impact on alternative splicing.
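The matched comparison of diseased and normal tissue described in (c) amounts, in its simplest form, to a set difference between two variant lists. The following is a minimal sketch under strong assumptions rather than a description of any published pipeline: variants are reduced to (chromosome, position, reference, alternate) tuples, all coordinates are invented, and the two-nucleotide window around an exon boundary is an arbitrary illustrative choice. Real analyses must also account for sequencing depth, sample purity, and calling errors.

```python
from typing import Dict, Iterable, List, NamedTuple, Set


class Variant(NamedTuple):
    """Hypothetical minimal variant record."""
    chrom: str
    pos: int
    ref: str
    alt: str


def somatic_candidates(diseased: Iterable[Variant],
                       normal: Iterable[Variant]) -> Set[Variant]:
    """Variants seen in diseased tissue but absent from the matched normal."""
    return set(diseased) - set(normal)


def near_exon_boundary(variant: Variant,
                       boundaries: Dict[str, List[int]],
                       window: int = 2) -> bool:
    """True if the variant lies within `window` nt of a listed boundary.

    `boundaries` maps a chromosome name to exon boundary positions;
    the default window of 2 nt is an assumption made for illustration.
    """
    return any(abs(variant.pos - b) <= window
               for b in boundaries.get(variant.chrom, []))


# Invented example data for one patient.
normal_calls = [Variant("chr1", 1000, "A", "G"), Variant("chr2", 500, "C", "T")]
diseased_calls = normal_calls + [
    Variant("chr1", 2001, "G", "A"),  # 1 nt from a boundary at 2000
    Variant("chr3", 750, "T", "C"),   # far from any listed boundary
]
exon_boundaries = {"chr1": [2000, 3500], "chr3": [100]}

somatic = somatic_candidates(diseased_calls, normal_calls)
splice_proximal = sorted(v for v in somatic
                         if near_exon_boundary(v, exon_boundaries))
```

Here the germline variants shared by both samples are discarded, and only the somatic candidate adjacent to an exon boundary is retained for follow-up.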
To accommodate the analyses of this vast quantity of data, robust computational methods are being developed to link the production of recurrent variants with changes in RBPs (Sebestyén et al., 2015). Alternatively, combining large-scale collections of molecular interaction datasets (protein–DNA, protein–RNA, and protein–protein) with cancer transcriptome datasets may reveal regulatory pathways relevant to cancer (Hollander, D., and Ast, G., personal communication). Putative connections can then be validated experimentally or confirmed, for example, by using The Cancer Genome Atlas. These emerging procedures underscore the usefulness of network-based approaches (Yang et al., 2014b) to capture molecular relationships across different regulatory layers that become compromised or that emerge during diseases.
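To make the idea of linking splice variants with changes in RBPs concrete, one simple (and by no means the only) approach is to correlate the expression of a candidate regulator with the inclusion level of an exon across samples. The sketch below is an assumption-laden illustration and not the method of the studies cited above: it uses the common percent-spliced-in ratio (inclusion reads over inclusion plus exclusion reads), a hand-written Pearson correlation, and entirely made-up read counts and expression values.

```python
from math import sqrt
from typing import List, Sequence, Tuple


def psi(inclusion_reads: int, exclusion_reads: int) -> float:
    """Fraction of reads supporting exon inclusion (percent spliced in)."""
    total = inclusion_reads + exclusion_reads
    if total == 0:
        raise ValueError("no reads cover this splicing event")
    return inclusion_reads / total


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation undefined for a constant series")
    return cov / (sd_x * sd_y)


# Hypothetical per-sample data: RBP expression (arbitrary units) and
# (inclusion, exclusion) read counts for one exon in the same samples.
rbp_expression: List[float] = [1.0, 2.0, 3.0, 4.0]
read_counts: List[Tuple[int, int]] = [(10, 90), (30, 70), (50, 50), (80, 20)]

psi_values = [psi(inc, exc) for inc, exc in read_counts]
r = pearson(rbp_expression, psi_values)
```

A strong correlation of this kind only flags a candidate relationship; as noted in the text, such putative connections still require experimental validation or confirmation in independent datasets.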
Acknowledgments
We thank Nancy Greenbaum and Raymund Wellinger for thoughtful comments on the manuscript. We thank Gil Ast and Dror Hollander for sharing results before publication. We apologize to colleagues whose work could not be cited because of space constraints.
We acknowledge support from the Canadian Institutes of Health Research. B. Chabot is the Pierre C. Fournier Research Chair in Functional Genomics at Université de Sherbrooke.
The authors declare no competing financial interests.
References
- AD: Alzheimer’s disease
- ALS: amyotrophic lateral sclerosis
- AML: acute myeloid leukemia
- ASD: autism spectrum disorder
- DDR: DNA damage response
- EMT: epithelial–mesenchymal transition
- FTD: frontotemporal dementia
- HD: Huntington’s disease
- hnRNP: heterogeneous nuclear RNP
- lncRNA: long noncoding RNA
- MDS: myelodysplastic syndromes
- RBP: RNA-binding protein
- SMA: spinal muscular atrophy
- snRNP: small nuclear RNP
- SNV: single-nucleotide variation
- SR: serine arginine
- SZ: schizophrenia