Subcellular localization of RNAs has gained attention in recent years as a prevalent phenomenon that influences numerous cellular processes. This is also evident for the large and relatively novel class of long noncoding RNAs (lncRNAs). Because lncRNAs are defined as RNA transcripts >200 nucleotides that do not encode protein, they are themselves the functional units, making their subcellular localization critical to their function. The discovery of tens of thousands of lncRNAs and the cumulative evidence implicating them in almost every cellular activity render assessment of their subcellular localization essential to fully understanding their biology. In this review, we summarize current knowledge of lncRNA subcellular localization, factors controlling their localization, emerging themes, including the role of lncRNA isoforms and the involvement of lncRNAs in phase separation bodies, and the implications of lncRNA localization for their function and for cellular behavior. We also discuss gaps in the current knowledge as well as opportunities that these provide for novel avenues of investigation.
Introduction
A large body of research over the last few decades has revealed that RNAs are multifaceted, versatile regulators of most cellular processes, contrary to the initial perception that they acted solely as intermediaries in the flow of information from DNA to protein. As the field advanced, subcellular localization emerged as a factor central to RNA function (Buxbaum et al., 2015). Indeed, building on studies of asymmetric RNA localization in ascidians, yeast, and Xenopus (Jeffery, 1984; Jeffery, 1989; Long et al., 1997; Melton, 1987; Pizzinga and Ashe, 2014; Pizzinga et al., 2019), a landmark study of RNA cellular distribution revealed that up to 70% of mRNAs in Drosophila embryos exhibit specific localization patterns, serving to nucleate key cellular machineries (Lécuyer et al., 2007).
In mammalian cells, RNA localization has been studied in greatest depth in highly polarized cells, such as neurons (Park et al., 2010; Park et al., 2014; Wu et al., 2016), in which it is now well established that selective mRNA transport, localized storage, and/or translation are essential for synaptic plasticity, axon branching, and growth (Fernandez-Moya et al., 2014; Hengst et al., 2006; Perry et al., 2016; Sambandan et al., 2017). MicroRNAs (miRNAs) can also be transported and processed, and can actively suppress mRNA targets, in response to synaptic stimulation (Corradi et al., 2020). Further, specific mRNAs also localize to cellular protrusions of migrating fibroblasts, where their active translation and subsequent silencing upon retraction control the dynamic process of cell migration (Mardakheh et al., 2015; Mili et al., 2008; Moissoglu et al., 2019; Wang et al., 2017a). Under homeostatic conditions, mRNAs and a distinct subset of miRNAs, as well as their precursor primary miRNAs (pri-miRNAs), localize at adherens junctions of well-differentiated epithelial cells (Kourtidis et al., 2017; Kourtidis et al., 2015). These miRNAs are processed by adherens junction–localized RNAi machinery to suppress a set of mRNAs involved in cell growth and protumorigenic signaling, thereby maintaining polarized epithelial cell homeostasis (Kourtidis et al., 2017; Kourtidis et al., 2015; Nair-Menon et al., 2020). Moreover, mRNAs and miRNAs also coexist in cytoplasmic membraneless p-bodies (PBs; also known as GW-bodies) and stress granules (SGs; Jakymiw et al., 2005; Jakymiw et al., 2007), with the latter mainly forming under conditions of stress to temporarily suppress translation and preserve mRNAs (Anderson and Kedersha, 2009; Leung et al., 2006). These compartments allow cells to rapidly engage and disengage mRNAs in translation, in response to stimuli (Hubstenberger et al., 2017).
These studies demonstrate that spatial subcellular RNA distribution is a broad phenomenon that occurs across cell types and species, under homeostatic, stimulated, or cellular stress conditions. Tight regulation of RNA localization controls localized protein expression, turnover, and subsequent signal regulation. Here, we discuss emerging research showing that an expanding class of RNAs, long noncoding RNAs (lncRNAs), can function in equally diverse and dynamic cellular processes. Further, we emphasize that their subcellular localization is critical to understanding lncRNA interaction partners, post- or cotranscriptional regulatory modifications, the external stimuli directly impacting lncRNA function, and the broad range of roles lncRNAs can play in cellular homeostasis.
lncRNAs
The development of RNA sequencing (RNA-seq) technologies and mapping of expressed transcripts revealed that while the human genome is pervasively transcribed, only a small fraction of RNAs (∼2%) code for proteins (ENCODE Project Consortium et al., 2007; Djebali et al., 2012; Kapranov et al., 2007). The majority of expressed transcripts do not encode protein, with those >200 nt in length being broadly classified as lncRNAs (Hangauer et al., 2013; Iyer et al., 2015; Managadze et al., 2013; St Laurent et al., 2015; see text box). In this review, we focus on these >200-nt-long RNA transcripts with described noncoding functions. lncRNAs are now recognized as playing crucial roles in numerous cellular processes, including the cell cycle (Kitagawa et al., 2013), differentiation (Ballarino et al., 2016; Brazão et al., 2016; Delás et al., 2017), and metabolism (Sirey et al., 2019; Sun and Wong, 2016), as well as in disease (Esteller, 2011; Wang et al., 2013; Yuan et al., 2014). Recent evidence also suggests that lncRNAs play a role in viral infection (Wang et al., 2020). An outpouring of lncRNA-centered research over the last decade has been fueled by the intriguing nature of these RNAs, which can function through diverse mechanisms of action. For example, lncRNAs can modulate transcription, epigenetic modifications, protein/RNA stability, translation, and posttranslational modifications by interacting with DNA (Arora et al., 2014; Clemson et al., 1996; Postepska-Igielska et al., 2015), RNAs (Grelet et al., 2017; Kleaveland et al., 2018; Zealy et al., 2018), and/or proteins (Ahn et al., 2018; Jiang et al., 2017; Yamazaki et al., 2018). lncRNAs were also recently shown to directly interact with signaling receptors (Schmidt et al., 2020). The ability of lncRNAs to interact with a range of molecular species underscores that understanding lncRNA localization and local interactions is key to predicting their function.
long noncoding RNAs (lncRNAs): RNA transcripts longer than 200 nucleotides that do not encode protein
circular RNAs (circRNAs): a subclass of long noncoding RNAs with covalently linked ends that are generated during splicing when a splice donor site joins an upstream splice acceptor site (“back-splicing”)
phase separation bodies: non-membranous subcellular structures that form through liquid–liquid phase separation of ribonucleoprotein complexes
nuclear speckles: nuclear, phase-separated bodies, enriched in and regulating pre-mRNA splicing factors
paraspeckles: nuclear, phase-separated bodies that regulate gene expression by sequestering RNAs and proteins
p-bodies (PBs): cytoplasmic, phase-separated bodies, constitutively present under homeostatic conditions, where untranslating mRNAs associate with the RNAi machinery, miRNAs, and the RNA decay machinery
stress granules (SGs): cytoplasmic, phase-separated bodies, composed of untranslating mRNAs that form in response to translational arrest triggered by stress stimuli
While lncRNAs do not encode protein, certain aspects of lncRNA biology parallel those of mRNAs. Like mRNAs, the majority of lncRNAs are transcribed by RNA polymerase II (Pol II) and are capped and polyadenylated (Derrien et al., 2012). Although it was originally purported that lncRNAs are unstable, this is true for only a minority of lncRNAs (Clark et al., 2012). Most lncRNAs are stabilized through polyadenylation (Beaulieu et al., 2012; Clark et al., 2012), while non-polyadenylated lncRNAs can be stabilized through secondary structures, such as triple-helical structures in their 3′ ends (Brown et al., 2012; Wilusz et al., 2012). Apart from a stabilizing role, these 3′ sequence features can facilitate efficient nuclear export (Wilusz et al., 2012). The vast majority of lncRNAs undergo extensive alternative splicing, dramatically increasing their potential number of isoforms (Deveson et al., 2018). It was recently reported that while lncRNA splicing efficiency, meaning the specific intron splicing frequency, appears to be lower than that of mRNAs (Zuckerman and Ulitsky, 2019), in agreement with previous reports (Melé et al., 2017; Tilgner et al., 2012), lncRNAs are substantially more alternatively spliced than mRNAs. The extensive alternative splicing of lncRNAs is an overlooked aspect of their biology with the potential to further diversify the functional outcome of an lncRNA via differential localization patterns, and it is discussed later in this review.
While lncRNA expression levels are typically lower than those of mRNAs (Mukherjee et al., 2017), they display stronger tissue-specific expression patterns, suggesting integral roles in cell type–specific processes (Cabili et al., 2015; Derrien et al., 2012; Djebali et al., 2012; Zuckerman and Ulitsky, 2019). Further, while mRNAs exhibit high sequence conservation among species, lncRNAs generally lack this conservation (Hezroni et al., 2015; Kutter et al., 2012; Necsulea et al., 2014; Ulitsky, 2016; Ulitsky et al., 2011). While this makes assessing lncRNA function more challenging, it may also provide insights into the roles that lncRNAs have evolved to play in different species. Still, there are subsets of lncRNAs that exhibit conservation at the sequence or genomic position level and may function similarly across species (Amaral et al., 2018; Hezroni et al., 2015; Necsulea et al., 2014; Ulitsky, 2016; Ulitsky et al., 2011). However, a recent intriguing study (Guo et al., 2020a) revealed that while a significant set of lncRNAs displayed sequence and/or positional conservation between human and mouse embryonic stem cells, these lncRNAs are processed differently, consequently localize to different subcellular compartments, and ultimately serve distinct functions in mouse versus human cells. This was in stark contrast to conserved mRNAs, which displayed similar localization patterns in both species. This work demonstrates that lncRNA sequence conservation does not always translate to conserved functional roles and that lncRNA processing and binding partners significantly impact subcellular distribution and function. Most importantly, this study further emphasizes a role for lncRNA localization in lncRNA function (Chen, 2016). Overall, some of the differences between mRNAs and lncRNAs reflect that unlike mRNAs, which need to be translated into proteins that carry out specific cellular functions, lncRNAs themselves are the functional unit. Therefore, like those of proteins, lncRNA functions in different subcellular compartments are directed by local molecular interactions, which must be finely tuned to maintain cellular homeostasis.
lncRNAs in the nucleus versus the cytoplasm
The first functionally characterized lncRNAs were primarily chromatin regulators. Although those studies established functional significance for an unappreciated class of RNAs, they also instilled the notion that lncRNAs are generally nuclear (Clemson et al., 1996; Khalil et al., 2009; Mondal et al., 2010; Tsai et al., 2010; Zhao et al., 2008). Indeed, an early microarray-based screen for nuclear-enriched polyadenylated RNAs uncovered three abundant lncRNA transcripts, namely XIST, NEAT1, and MALAT1 (Hutchinson et al., 2007). Although lncRNAs are overall more numerous in the nucleus (Cabili et al., 2015; Fazal et al., 2019; Kaewsapsak et al., 2017), recent studies indicate that the number of cytoplasmic lncRNAs is higher than previously thought (Benoit Bouvrette et al., 2018; Carlevaro-Fita et al., 2016; van Heesch et al., 2014), expanding the repertoire of distinct topologies in which lncRNAs can participate inside the cell (Fig. 1). Mechanisms that regulate nuclear or cytoplasmic localization of lncRNAs are extensively discussed later in this review.
Interestingly, although nuclear lncRNAs are overall more abundant, they are less stable than their cytoplasmic counterparts (Clark et al., 2012; Zuckerman and Ulitsky, 2019). It has been suggested that the instability of nuclear lncRNAs reflects their roles in regulating gene expression, facilitating dynamic fine tuning of their levels in response to stimuli, analogous to the turnover of transcription factors (Clark et al., 2012). Mechanistically, nuclear instability of lncRNAs can be regulated by PABPN1 through promoting polyA-polymerase–dependent hyperadenylation and subsequent decay of lncRNAs (Bresson et al., 2015).
Because lncRNA interactions influence their function, the roles of lncRNAs in the nucleus are expected to be different from those in the cytoplasm. In the nucleus, lncRNAs function to modulate transcriptional programs through chromatin interactions and remodeling (Kugel and Goodrich, 2012; Melé and Rinn, 2016; Saxena and Carninci, 2011) and establish spatial organization of the nuclear compartment via scaffolding (Clemson et al., 2009). In the cytoplasm, lncRNAs function to mediate signal transduction pathways, translational programs, and posttranscriptional control of gene expression. For example, lncRNAs can sequester miRNAs (Cesana et al., 2011; Du et al., 2016) and proteins (Lee et al., 2016) to regulate their activity and levels (Du et al., 2016; Grelet et al., 2017; Song et al., 2014), influence protein posttranslational modifications (Lin et al., 2016), or mediate mRNA translation and stability (Carrieri et al., 2012; Gong and Maquat, 2011; Yoon et al., 2012; Yuan et al., 2017). Notably, a recent study revealed that lncRNAs serve as scaffolds in the cytoplasm to nucleate complex networks of proteins functioning in tightly regulated signal transduction programs, such as the TLR-TRIF (Toll-like receptor/TIR-domain-containing adapter-inducing IFN-β) immune pathway (Aznaourova et al., 2020). The lncRNA PYCARD-AS1 provides an example of how the same lncRNA transcript functions differently in the nuclear versus the cytoplasmic compartment. PYCARD-AS1 is an antisense lncRNA to the proapoptotic gene PYCARD. In the nucleus, this lncRNA recruits DNMT1 and G9a to the PYCARD promoter to facilitate DNA methylation and H3K9me2 modification. Concomitantly, in the cytoplasm, PYCARD-AS1 interacts with PYCARD mRNA to inhibit ribosome assembly and PYCARD translation (Miao et al., 2019; Fig. 1). Therefore, the subcellular microenvironment enables distinct functions of the same lncRNA, by enabling interactions with different functional protein partners and targets of action.
Localization of lncRNAs to organelles and macromolecular structures
In addition to their overall distribution in the nucleus or the cytoplasm, studies have begun to interrogate the localization of lncRNAs to specific organelles and macromolecular structures. The distribution of RNAs to distinct compartments has largely been studied using fractionation-based methods, followed by RNA-seq and in situ hybridization–based microscopy. While providing valuable inventories, these approaches are limited to identifying lncRNAs in cell fractions that can be biochemically separated and by the number and design of sequence-specific imaging probes. New advances in imaging techniques have facilitated imaging of thousands of barcoded RNAs (Chen et al., 2015; Shah et al., 2016), while APEX-RIP, a method that combines engineered ascorbate peroxidase (APEX)–catalyzed proximity biotinylation of endogenous proteins (Rhee et al., 2013) and RNA immunoprecipitation (RIP), has allowed for specific, unbiased, a priori quantification of RNAs in compartments such as the nucleus, cytosol, mitochondria, and ER (Benhalevy et al., 2018; Kaewsapsak et al., 2017). Notably, the most recent studies using APEX-RIP have provided further, finer-scale mapping to the nuclear lamina, nucleolus, nuclear pore, ER membrane, and outer mitochondrial membrane (Fazal et al., 2019), as well as cell–cell interfaces (Benhalevy et al., 2018) and phase-separated bodies (Padrón et al., 2019). Although these inventories have been constructed, only a small portion of organelle-associated lncRNAs have been further validated or functionally characterized. Below we highlight lncRNAs with characterized functions in distinct subcellular compartments and discuss current gaps in knowledge.
Mitochondrial- and ER-localized lncRNAs
A groundbreaking study (Rackham et al., 2011) reported that noncoding RNAs, other than rRNAs and tRNAs, make up 15% of the human mitochondrial transcriptome. This and subsequent studies revealed that the majority of mitochondrially enriched lncRNAs play key roles in mitochondrial gene regulation (Mercer et al., 2011; Rackham et al., 2012; Rackham et al., 2011). Regulation of mitochondrially encoded lncRNAs was found to be highly cell- and tissue-specific and to be controlled by nuclear-encoded proteins (Rackham et al., 2011). In addition, lncRNAs transcribed from nuclear DNA have important roles in mitochondrial homeostasis (Leucci et al., 2016; Noh et al., 2016; Vendramin et al., 2018). For example, RMRP is the noncoding RNA component of the RNA processing endoribonuclease (RNase MRP) that is essential for the processing of preribosomal RNA in the nucleolus (Goldfarb and Cech, 2017). However, upon binding to Hu antigen R (HuR), RMRP is exported to the cytoplasm through CRM1 and targeted to the mitochondria, where it selectively localizes to the inner mitochondrial matrix and associates with the mitochondrial protein GRSF1 to maintain mitochondrial structure and mediate oxidative phosphorylation and mitochondrial DNA replication (Noh et al., 2016; Figs. 1 and 2). Additionally, a recent study revealed that differentiation of thermogenic adipocytes leads to cAMP-dependent transcriptional up-regulation of the lncRNA LINC00473 and to distinct mitochondrial interactions (Tran et al., 2020). While LINC00473 is localized in the nucleus at basal state, during differentiation, upon cAMP up-regulation, LINC00473 shuttles to the cytoplasm and localizes to the lipid droplet–mitochondria interface, where it exists in multimeric complexes with mitochondrial and lipid droplet proteins and regulates lipolysis and mitochondrial function (Tran et al., 2020; Fig. 1).
Together, these studies demonstrate that lncRNAs play a role in the dynamic interplay between the mitochondria and the nucleus.
Furthermore, APEX-RIP analysis using an APEX construct that is targeted to the ER lumen (Kaewsapsak et al., 2017) revealed enrichment of a small group of 28 lncRNAs. Although lncRNAs found in the ER were far less abundant than mRNAs, they represent a new and distinct family of ER-localized lncRNAs. Indeed, a recent study reported that although 97% of ER membrane–enriched transcripts were mRNAs, two lncRNAs, namely TUG1 and NORAD, were identified as highly abundant ER-enriched transcripts (Fazal et al., 2019), pointing toward potentially novel and unexplored functional roles, since TUG1 has been previously shown to function in the nucleus (Wang et al., 2017b), and NORAD’s role in sequestering Pumilio proteins is more broadly observed throughout the cytoplasm (Lee et al., 2016; Fig. 1). However, the role and association of lncRNAs with the ER are still largely unexplored and represent a distinct gap in our knowledge of lncRNA localization and function.
lncRNAs associated with ribosomes
Multiple studies have observed recruitment of lncRNAs to ribosomes and reported their association with the translational machinery (Carlevaro-Fita et al., 2016; van Heesch et al., 2014). Strikingly, 54% of expressed lncRNAs were reported to be in the cytoplasm, with the majority of those lncRNAs having most of their transcripts associated with polysomal fractions (Carlevaro-Fita et al., 2016). Although it is evident that ribosomal complexes are sites of distinct lncRNA localization in the cytoplasm, there remains much controversy over the functional significance of this association. A substantial number of annotated lncRNAs have recently been identified to encode small peptides (Anderson et al., 2015; Chen et al., 2020; Huang et al., 2017; Matsumoto et al., 2017; Nelson et al., 2016; Rossi et al., 2019; Stein et al., 2018; van Heesch et al., 2019); however, the view that lncRNA association with polysomal fractions is indicative of active translation (Ingolia et al., 2011; Ruiz-Orera et al., 2014) is not supported by other studies (Bánfai et al., 2012; Guttman et al., 2013). In fact, work in yeast and mammalian cells demonstrated that lncRNAs or unannotated RNAs are targeted to ribosomes for degradation by nonsense-mediated decay (NMD; Carlevaro-Fita et al., 2016; Chew et al., 2013; Smith et al., 2014). Interestingly, it was shown that ribosomal localization and NMD sensitivity of some of these transcripts are due to the presence of single short open reading frames at their 5′ ends (Smith et al., 2014). In another example, the well-studied GAS5 lncRNA has also been shown to be degraded through NMD when it is associated with ribosomes, owing to the presence of premature stop codons (Smith and Steitz, 1998; Tani et al., 2013). Therefore, current evidence regarding the exact functions of lncRNAs at ribosomes is not conclusive, and more studies are required to understand the significance and complexity of this localization more fully (Fig. 1).
lncRNAs in phase-separation bodies
Besides the traditionally known membrane-bound organelles, nonmembranous subcellular structures have recently gained recognition as distinct cellular compartments where specific sets of molecules concentrate to spatiotemporally regulate critical cellular mechanisms (Banani et al., 2017; Ivanov et al., 2019; Protter and Parker, 2016; Shin and Brangwynne, 2017). These structures form through the process of liquid–liquid phase separation (Courchaine et al., 2016; Weber and Brangwynne, 2012), resulting in well-defined borders while enabling a dynamic exchange of content with the cytoplasm or nucleoplasm (Brangwynne et al., 2009; Lee et al., 2013; Shin and Brangwynne, 2017). Generally, these phase condensates represent ribonucleoprotein (RNP) complexes where protein–protein, RNA–protein, and RNA–RNA interactions promote condensation (Banani et al., 2017; Fay et al., 2017; Protter and Parker, 2016; Shin and Brangwynne, 2017; Van Treeck et al., 2018; see text box).
Phase-separation bodies include nuclear speckles, paraspeckles, Cajal bodies, and nucleoli in the nucleus, as well as PBs and SGs in the cytoplasm. Neuronal granules are also used for synaptic transport and regulation. Each of these compartments serves distinct roles and therefore consists of specific RNA species. lncRNAs have roles in the organization and function of these phase-separation bodies. Specifically, some of the most well-characterized lncRNAs, such as MALAT1 and NEAT1, accumulate in nuclear speckles and paraspeckles, respectively (Clemson et al., 2009; Hutchinson et al., 2007; Sasaki et al., 2009; Sunwoo et al., 2009; Fig. 1). Nuclear speckles are phase-separated bodies that are enriched in and regulate pre-mRNA splicing factors (Spector and Lamond, 2011), while paraspeckles are subnuclear bodies that exist near speckles and also contain splice factors, but are largely thought to function in regulating gene expression via sequestering mRNAs and proteins (Chen and Carmichael, 2009; Hirose et al., 2014; Imamura et al., 2014; see text box). NEAT1 is essential for the formation and structural integrity of paraspeckles (Clemson et al., 2009; Sasaki et al., 2009; Sunwoo et al., 2009), which are dependent on specific RNA–RNA interactions with NEAT1 domains (Lin et al., 2018). NEAT1 and MALAT1 are large lncRNAs, and the length of these speckle-enriched lncRNAs aligns with their role in sequestering distinct RNA species, directly or indirectly, through capturing of RNA binding proteins (RBPs). MALAT1 is known to regulate pre-mRNA splicing and is specifically required for the recruitment of several serine- and arginine-rich splicing factors (SRSF1, 2, and 3) to nuclear speckles (Tripathi et al., 2010). 
Notably, a “bird’s nest” model for NEAT1 has been described, in which NEAT1 broadly interacts with many RBPs, specifically the non-POU domain containing octamer binding (NONO)/polypyrimidine tract-binding protein-associated splicing factor (PSF) heterodimer, to build an extensive scaffold (bird’s nest) structure and recruit pri-miRNAs and the microprocessor in close proximity; this positioning functions to enhance the efficiency of pri-miRNA processing in paraspeckles (Jiang et al., 2017; Fig. 1). Multiple RNA regions of NEAT1 were important to this recruitment, but a distinct “pseudo pri-miRNA” in the NEAT1 3′ end served an important role in attracting the microprocessor via its stem loop, further underscoring that lncRNA secondary structures help to define their potential roles, in addition to stability.
Although the presence of lncRNAs in cytoplasmic PBs and SGs has been demonstrated, there are still many questions regarding their roles and mechanisms of recruitment. In the cytoplasm, PBs are constitutively present under homeostatic conditions, where untranslating mRNAs associate with the RNAi machinery, miRNAs, and the RNA decay machinery (Hubstenberger et al., 2017; Schütz et al., 2017; Standart and Weil, 2018; Teixeira et al., 2005; see text box). Similarly, SGs are largely composed of untranslating mRNAs, yet SGs form in response to translational arrest triggered by stress stimuli (Aulas et al., 2017; Kedersha et al., 2005; Panas et al., 2016; Tauber and Parker, 2019; see text box). The lack of translation of these mRNAs, in addition to their length, has been determined to influence their recruitment to these bodies (Khong et al., 2017; Matheny et al., 2019). These observations regarding mRNAs raise questions of whether and how untranslated lncRNAs are recruited to these bodies. Studies profiling the transcriptome of PBs and SGs have identified the presence of lncRNAs. However, compared with mRNAs, the abundance of lncRNA species seems to be overall low (Khong et al., 2017; Namkoong et al., 2018). Although low abundance of lncRNAs could be interpreted as being due to their minimal functional significance in these bodies, a recent study (Pitchiaya et al., 2019) tracking the dynamics of individual lncRNAs with PB foci suggests that the transient nature of interactions of lncRNAs with the periphery of PBs may make it difficult to capture lncRNA PB/SG interactors with the biochemical isolation methods used by previous transcriptome inventories (Khong et al., 2017; Namkoong et al., 2018). 
That work also demonstrated that lncRNAs that bind known PB-enriched proteins, such as IGF2BP1 (THOR lncRNA) and HuR (ARlnc1 lncRNA), associate with PBs more frequently and less randomly than lncRNAs lacking such binding sites for protein partners (Hubstenberger et al., 2017; Pitchiaya et al., 2019; Fig. 1). When it comes to the role of lncRNAs in these cytoplasmic phase condensates, it has been hypothesized (Pitchiaya et al., 2019) that the transient interactions of lncRNAs with the PB periphery may function to deposit mRNAs to PBs for storage or remove mRNAs from PBs to be translated in the cytoplasm. Notably, one lncRNA found to be significantly enriched within SGs is NORAD (Khong et al., 2017; Namkoong et al., 2018; Fig. 1). While the function of NORAD in SGs has not been explored in more detail, it has been proposed that the propensity of NORAD to condense via RNA–RNA interactions may play a role in SG recruitment; a recent study reported that the ATP-dependent helicase eIF4A prevents NORAD SG recruitment via destabilizing RNA–RNA interactions, independently of translation (Tauber et al., 2020). Alternatively, it was suggested that NORAD SG recruitment was due to the presence of AU-rich elements in the transcript, which may facilitate enrichment via AU-rich element–binding proteins, such as TIA-1 or TIAR (Namkoong et al., 2018), which are essential for SG formation (Gilks et al., 2004). Together, these studies suggest a multifaceted role for lncRNAs as both structurally and functionally important players in phase-separation bodies.
lncRNAs at the cell periphery
Recently, the shear stress–induced lncRNA LASSIE was shown to be enriched in and stabilize endothelial adherens junctions through interaction with the cell–cell adhesion component PECAM-1 at areas of cell–cell contact (Stanicek et al., 2020; Fig. 1). Furthermore, the lncRNA lncMER52A, which is specifically expressed in hepatocellular carcinoma, was found to interact with and stabilize p120 catenin, a key member of adherens junctions, to promote Rho GTPase signaling (Wu et al., 2020). Although it is not clear whether this latter interaction occurs at cell–cell junctions (Wu et al., 2020), it further demonstrates the potential of lncRNAs to localize at areas of cell–cell contact. These findings, in addition to the discovery of RNA complexes and large sets of mRNAs and miRNAs interacting and colocalizing with cadherin complexes at epithelial cadherin junctions (Kourtidis et al., 2017; Kourtidis et al., 2015; Nair-Menon et al., 2020), as well as of mRNAs interacting with the gap junction component Connexin 43 in HEK293 cells (Benhalevy et al., 2018), support the notion that the cell periphery may also be an lncRNA-hosting subcellular compartment.
In addition, the discovery of lncRNAs as cargo in exosomes revealed potential roles of lncRNAs outside of the cell (Ahadi et al., 2016; Berrondo et al., 2016; Dong et al., 2016; Gezer et al., 2014; Huang et al., 2013; Li et al., 2015). Indeed, a growing body of evidence suggests that lncRNAs are actively sorted into exosomes, based on observations that levels of lncRNAs found in exosomes do not correlate with the levels measured in their parental cells (Gezer et al., 2014; Kogure et al., 2013; Koldemir et al., 2017) and that distinct lncRNAs are enriched in exosomes compared with their donor cells (Chen et al., 2016). Strikingly, exosomal lncRNA signatures are distinct, depending on whether exosomes are secreted from the apical or basolateral surface of polarized colon epithelial cells (Chen et al., 2016). Functionally, exosomes potentially mediate intercellular transfer of MALAT1 (Zhang et al., 2018) and H19 lncRNAs (Iempridee, 2017), which leads to increased cancer cell growth and transformation (Fig. 1). Additionally, packaging of the TERRA lncRNA into exosomes stimulates innate immunity (Wang et al., 2015; Fig. 1). The exact mechanisms of lncRNA-specific loading into exosomes and acceptor cell uptake are currently elusive. It is possible that determinants similar to those known to direct mRNA and miRNA exosomal loading, such as sequence domains (Bolukbasi et al., 2012) and association with the RNAi machinery and specific RBPs (Cha et al., 2015; Melo et al., 2014), may direct lncRNA sorting as well. However, the current lack of knowledge regarding active lncRNA loading into exosomes offers opportunities for further investigation.
Mechanisms regulating lncRNA localization
The instructions for the localization of lncRNAs are generally thought to be encoded in their sequence. The presence or absence of certain motifs can facilitate interactions with certain RBPs or chromatin and direct either cytoplasmic export or nuclear retention (Guttman and Rinn, 2012; Mercer and Mattick, 2013). A concept that has gained traction in the field postulates that for long RNAs (lncRNAs and mRNAs), nuclear export is the default pathway and certain sequence elements are required for nuclear retention (Palazzo and Lee, 2018). For example, several nuclear lncRNAs contain nuclear retention elements, and when these sequence domains were deleted or mutated, the lncRNAs were exported to the cytoplasm (Lubelsky and Ulitsky, 2018; Miyagawa et al., 2012; Zhang et al., 2014). Further, fusing these sequence elements to reporters or documented cytoplasmic RNAs was sufficient to sequester them in the nucleus (Lubelsky and Ulitsky, 2018; Shukla et al., 2018; Zhang et al., 2014). Independently, several of these studies demonstrated that the degree of lncRNA nuclear localization increases in a dose-dependent manner, in response to the number of nuclear retention elements in an lncRNA sequence (Carlevaro-Fita et al., 2019; Lubelsky and Ulitsky, 2018).
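The dose-dependence described above rests on counting how many retention elements a transcript carries. As a minimal illustrative sketch only, and not the method used in any of the cited studies, the following Python snippet counts occurrences of a sequence motif in a set of transcripts and ranks them by that count. The motif `CCUCC`, the transcript names, and the sequences are hypothetical placeholders chosen for illustration; they are not the retention elements reported in the literature.

```python
import re

# Hypothetical placeholder motif (NOT a published nuclear retention element).
PLACEHOLDER_MOTIF = "CCUCC"


def count_motif(sequence: str, motif_pattern: str) -> int:
    """Count occurrences of a motif, including overlapping ones.

    The sequence is upper-cased and any T is converted to U so that
    DNA-style input is treated as RNA.
    """
    rna = sequence.upper().replace("T", "U")
    # A zero-width lookahead lets the regex engine report overlapping matches.
    return sum(1 for _ in re.finditer(f"(?={motif_pattern})", rna))


def rank_by_motif_count(transcripts: dict, motif_pattern: str) -> list:
    """Return (name, count) pairs sorted from most to fewest motif copies."""
    counts = {name: count_motif(seq, motif_pattern)
              for name, seq in transcripts.items()}
    return sorted(counts.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Toy sequences, invented purely for demonstration.
    toy_transcripts = {
        "toy_lnc_A": "CCUCCAAACCUCC",
        "toy_lnc_B": "AAAAGGGG",
        "toy_lnc_C": "GGCCTCCGG",
    }
    for name, n in rank_by_motif_count(toy_transcripts, PLACEHOLDER_MOTIF):
        print(name, n)
```

In practice, such counts would only be a starting point: as discussed in this review, retention also depends on RNA structure and on the availability of trans-acting binding partners, neither of which a simple motif count captures.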
Several types of sequence motifs are recognized to promote nuclear retention. These motifs have been identified in more focused efforts, such as exploring the localization signals of individual lncRNAs and evaluating retention domains of interest (Carlevaro-Fita et al., 2019; Miyagawa et al., 2012; Zhang et al., 2014), as well as with large-scale sequencing screens (Lubelsky and Ulitsky, 2018; Shukla et al., 2018; Yin et al., 2020). Examples of nuclear retention elements include a 15-nt C-rich motif (Shukla et al., 2018), a distinct pentamer motif in BORG (Zhang et al., 2014), and a larger, highly structured 156-bp repeat in FIRRE, which is bound by heterogeneous nuclear ribonucleoprotein U, facilitating the interaction of FIRRE with chromatin (Hacisuleyman et al., 2014; Fig. 2). Some lncRNAs have nuclear retention elements repurposed from transposable elements, known as repeat insertion domains of lncRNA (RIDLs; Carlevaro-Fita et al., 2019; Chillón and Pyle, 2016; Johnson and Guigó, 2014; Lubelsky and Ulitsky, 2018; Nguyen et al., 2020; Shukla et al., 2018). For example, a 42-nt motif derived from an Alu repeat that contains C-rich elements facilitates nuclear retention in a subset of lncRNAs through binding to heterogeneous nuclear ribonucleoprotein K (Lubelsky and Ulitsky, 2018; Fig. 2). It is important to note that this heterogeneous nuclear ribonucleoprotein K–mediated nuclear enrichment was more active in some cell types than in others (Lubelsky and Ulitsky, 2018; Zuckerman and Ulitsky, 2019). This cell type–related specificity highlights the fact that sequence alone does not determine whether an lncRNA will be localized to a distinct compartment; localization is also influenced by the availability and expression levels of other binding partners that act as trans-regulators to help mediate localization.
Along the lines of trans-regulator expression in distinct cell types, it has been recently reported that conserved lncRNAs in human embryonic stem cells are spliced more often than their mouse counterparts, resulting in cytoplasmic export (Guo et al., 2020a). In mouse embryonic stem cells, these conserved lncRNAs were preferentially retained in the nucleus as a result of splicing suppression by the trans factor PPIE, a member of the spliceosome-associated peptidyl-prolyl cis-trans isomerase family (Schiene-Fischer, 2015), whose expression levels decrease from mouse to primates.
Together, these studies suggest that the presence of certain motifs in an lncRNA does not directly correlate with nuclear retention. Multiple motifs and interaction partners may cooperate or compete to regulate nuclear retention, whereas the secondary structure and posttranscriptional modifications of an lncRNA can be important to allow access to these motifs. With this idea in mind, a recent study evaluated the ability of computational models that incorporate multiple factors (i.e., splice efficiency, gene architecture, chromatin marks, and sequence elements) to predict nuclear or cytoplasmic enrichment of lncRNAs and protein coding genes (Zuckerman and Ulitsky, 2019). By testing these models on RNA-seq data in nine cell lines from the ENCODE project, that study found that a combination of these factors was able to predict only 15–30% of the variance in localization for lncRNAs, emphasizing that lncRNA localization is complex and directed by still-unknown factors. This work did provide insights on surprising new localization mechanisms, distinct from those influencing mRNAs. For example, it is well established that efficient splicing strongly enhances mRNA export efficiency (Hocine et al., 2010; Valencia et al., 2008). Generally, lncRNAs are inefficiently spliced, partially due to weaker binding of the critical splice factor U2AF65, their smaller polypyrimidine tracts, the greater distance of the branch point to the 3′ splice site, or their overall lack of conserved splice sites (Melé et al., 2017; Schlackow et al., 2017; Tilgner et al., 2012; Zuckerman and Ulitsky, 2019). Inefficient splicing of lncRNAs could indeed result in lower export efficiency and nuclear enrichment (Guo et al., 2020b; Zuckerman and Ulitsky, 2019; Fig. 2).
However, while inefficient splicing was the major predictor of nuclear retention for protein coding genes, Pol II pausing and chromatin marks played a larger role in predicting lncRNA localization (Zuckerman and Ulitsky, 2019). Pol II promoter–proximal pausing was associated with increased nuclear export, independently of expression level. It was suggested that this pausing could allow for increased association with export factors. Notably, this was less predictive of lncRNAs and protein coding genes with highly conserved splice sites, and therefore the Pol II pausing mechanism was proposed as an alternative export mechanism for lncRNAs with poor splice site conservation. It was also noted that H3K27 acetylation and H3K4 di-/trimethylation positively correlated with cytoplasmic enrichment and could be related to increased Pol II dwelling, highlighting additional roles for epigenetic or posttranscriptional modifications to lncRNAs (Zuckerman and Ulitsky, 2019). Interestingly, lncRNAs seem to use nuclear export mechanisms similar to those of mRNAs, such as the TREX, TREX-2, and NXF1 complexes; however, a large proportion of lncRNAs are specifically dependent on NXF1, particularly inefficiently spliced lncRNA transcripts (Zuckerman et al., 2020; Fig. 2).
In contrast to other lncRNAs, circular RNAs (circRNAs), a class of lncRNAs with covalently linked ends, are distinctly enriched in and predominantly found in the cytoplasm (Jeck et al., 2013; Salzman et al., 2012; Wilusz, 2018). circRNAs are formed when pre-mRNAs are “back-spliced,” which occurs when a splice donor site joins an upstream splice acceptor (Wilusz, 2018; see text box). It was demonstrated that the UAP56 (DDX39B) helicase is responsible for the nuclear export of relatively long circRNAs (>800 nt), whereas the URH49 (DDX39A) helicase is responsible for nuclear export of short circRNAs (Huang et al., 2018; Fig. 2). In addition to being exported, their abundance in the cytoplasm can be attributed to their natural resistance to exonuclease activity (Salzman et al., 2012; Wilusz, 2018).
Intriguingly, the rate-limiting step for mRNA transport is thought to be access to and release from the nuclear pore complex (Grünwald and Singer, 2010; Ma et al., 2013), where m6A modification is reported as a fast-track signal for mRNAs (Fazal et al., 2019; Fustin et al., 2013; Roundtree et al., 2017; Zheng et al., 2013). Recent findings demonstrate that the m6A modification can also increase circRNA export to the cytoplasm (Chen et al., 2019). However, it remains unclear whether localization of lncRNAs is regulated by m6A modification. In a recent study, overexpression of the methyltransferase METTL3 increased nuclear localization of the lncRNA RP11 (Wu et al., 2019), suggesting that m6A modification may exert different effects on the distribution of mRNAs and lncRNAs. Further studies are required to fully elucidate these potential differences.
Additional important questions remain in regard to the targeting of lncRNAs to organelles in the cytoplasm. A study that found a significant relationship between GC-rich RIDLs and cytoplasmic enrichment also reported that ribosome-associated lncRNAs enriched in polysomal fractions are distinctly capped, have longer 5′ UTRs, and are depleted of repetitive elements, which could antagonize polysome binding through interactions with other protein complexes (Carlevaro-Fita et al., 2019; Fig. 2). While nuclear-encoded lncRNAs have been identified in the mitochondria, the sequence features that facilitate mitochondrial import are unexplored. Some evidence describing the import of certain mRNAs with stem loop structures (Wang et al., 2010) suggests that interactions with polynucleotide phosphorylase (PNPASE), a protein enriched in the intermembrane space (Chen et al., 2006), may play a role in transport of similar lncRNAs across the membrane. Furthermore, studies showing that mitochondrially encoded lncRNAs are found both in the cytoplasm (Mercer et al., 2011) and in the nucleus (Landerer et al., 2011) open additional questions about whether mitochondrial export or retention is the default pathway for lncRNAs generated from the mitochondrial genome. They also pose the question of how these lncRNAs are shuttled from the mitochondria into the nucleus. A recent study linked aberrant shuttling of lncRNAs from the nucleus to mitochondria (MALAT1) and conversely from the mitochondria to the nucleus (lncCytB) as a means to enhance cancer cell fitness (Zhao et al., 2019). However, the evidence for anterograde shuttling of MALAT1 to the mitochondria and retrograde shuttling of lncCytB from the mitochondria to the nucleus is not conclusive and has been challenged by previous findings showing that MALAT1 does not shuttle between the nucleus and the cytoplasm (Miyagawa et al., 2012). 
Taken together, these studies emphasize the need to obtain a better picture of the factors and rules directing lncRNA localization under homeostatic conditions and in disease. Overall, there is still much to be understood about the set of directions that regulate lncRNA localization, and it is likely that subsets of lncRNAs are governed by distinct instructions.
Considering lncRNA isoforms when assessing localization and function
As discussed, lncRNAs undergo extensive alternative splicing (Deveson et al., 2018; Melé et al., 2017; Tilgner et al., 2012; Zuckerman and Ulitsky, 2019). As a result, the vast majority of lncRNAs appear in multiple isoforms, considerably complicating our understanding of their localization and function (Fig. 3). A number of studies have set out to unravel this critical and mostly overlooked aspect of lncRNA biology. For example, differential localization of transcript isoforms using APEX-RIP followed by sequencing has been reported (Fazal et al., 2019), which supports the overall notion that different isoforms of the same gene may display alternative subcellular localization, depending on specific exonic regions that are retained or lost in each isoform. Indeed, transcripts containing transposable element–derived sequences are more enriched in the nucleus, compared with isoforms derived from the same gene locus (Carlevaro-Fita et al., 2019). Additional studies have globally mapped the contribution of alternatively spliced variants in lncRNA localization (Zuckerman and Ulitsky, 2019). However, studies that address localization and function of specific lncRNA isoforms are still sparse. For example, it is recognized that the well-studied GAS5 lncRNA undergoes extensive alternative splicing (Goustin et al., 2019; Zuckerman and Ulitsky, 2019). Independent studies have also shown that GAS5 localizes both in the cytoplasm and in the nucleus, serving distinct functions (Meng et al., 2020; Renganathan et al., 2014; Yu et al., 2015). However, it is still unclear which of the numerous GAS5 alternatively spliced isoforms contributes to each localization pattern and function (Fig. 3 A), which blurs our knowledge and makes it hard to draw conclusions and make informed predictions about the function of GAS5. Nevertheless, lncRNA isoform-specific studies are gaining traction, shedding light on lncRNA localization and function.
For example, the lncRNA FAST (FOXD3-AS1) is positionally conserved in its sequence, but not conserved in its splicing and localization. The FAST isoform expressed in human embryonic stem cells (hESCs), namely hFAST, localizes to the cytoplasm (Guo et al., 2020a; Fig. 3 B). This enables its binding to the E3 ubiquitin ligase β-transducin repeats-containing protein (β-TrCP), blocking interaction of β-TrCP with phosphorylated β-catenin, effectively preventing its degradation and activating Wnt signaling. In contrast, in mouse ESCs (mESCs), the mouse isoform mFast is not spliced as efficiently, due to high expression of the splicing factor PPIE that occurs specifically in mESCs, but not in hESCs. As a result, mFast is retained in the nucleus and is not required for pluripotency, unlike hFAST (Guo et al., 2020a). Adding to the complexity, distinct lncRNA isoforms may localize in the same subcellular compartment but serve different functions, such as cytoplasmic isoforms of lncRNA-PXN-AS1 that differentially affect translation of PXN (Yuan et al., 2017; Fig. 3 C).
In addition to alternative splicing, differential processing of lncRNA transcripts seems to contribute to distinctive subcellular localizations and functions. For example, the CCAT1 locus generates two distinct isoforms, with the short isoform (CCAT-S) being potentially produced from processing of the 3′ end of the long isoform (CCAT-L; Xiang et al., 2014). However, the additional sequence that CCAT-L possesses in its second exon allows it to accumulate in the nucleus and promote MYC expression in cis in colorectal cancer, whereas the CCAT-S isoform is mainly cytoplasmic (Kam et al., 2014; Xiang et al., 2014; Fig. 3 D). Furthermore, the extensively studied NEAT1 lncRNA also undergoes differential processing, due to alternative transcription termination sites, resulting in two isoforms, NEAT1_1 and NEAT1_2. NEAT1_1 is a shorter, polyadenylated, and more abundant isoform, whereas NEAT1_2 expression is cell type specific (Isobe et al., 2020; Li et al., 2017). Interestingly, although localization and function at paraspeckles of both isoforms can be redundant, NEAT1_1 also localizes outside paraspeckles, in nuclear foci termed “microspeckles,” implying paraspeckle-independent functions (Isobe et al., 2020; Li et al., 2017; Fig. 3 E). Therefore, alternative lncRNA processing may instruct both cell type–specific expression and alternative localization and function.
The above examples, although still limited, taken together with the fact that the vast majority of lncRNAs undergo extensive alternative splicing and posttranscriptional processing, further underscore that we have only seen the tip of the iceberg in lncRNA biology. The existence of multiple isoforms for each lncRNA, together with their potential to be differentially localized within the cell and the distinct functional significance of their subcellular localization, suggests that the actual contribution of lncRNAs to numerous cellular functions has been vastly underappreciated. In fact, until recently, the majority of studies have overlooked the role of isoforms as a critical aspect of lncRNA biology, which calls for lncRNA studies to routinely specify isoforms as a key element in assessing lncRNA function.
Conclusion
Cumulatively, current evidence on lncRNA localization, together with existing gaps in the knowledge, can be summarized in two key take-home messages: (a) subcellular localization of lncRNAs is an additional essential layer of complexity that must be taken into account to fully understand the roles of lncRNAs in any cellular function; and (b) we are still scratching the surface on this aspect of lncRNA biology. Beyond the obvious basic science questions regarding lncRNA function, determining localization patterns of lncRNAs is key for another major reason: it may dictate the choice of the proper approach to manipulate them, either for basic research or for clinical purposes. For example, it is well documented that subcellular localization of lncRNAs is crucial to determine whether to target them using either RNAi or antisense oligonucleotides (Lennox and Behlke, 2016). Antisense oligonucleotides have proven to be more effective in targeting nuclear-localized lncRNAs, whereas RNAi-based methodologies are more appropriate to target cytoplasmic ones (Lennox and Behlke, 2016). In addition, the advent of lncRNA-based therapeutics (Adams et al., 2017; Zucchelli et al., 2015) makes it even more critical to fully understand the biology of each of these lncRNAs, including their subcellular localization, since this would affect both targeting efficacy and the potential for adverse side effects. These are challenges that need to be addressed in the years to come. However, the above also open new avenues of investigation in lncRNA biology that can significantly enrich our knowledge on numerous cellular functions and diseases and provide new opportunities for the development of specific and potent novel therapeutic approaches.
The current pace of progress in the still relatively young field of lncRNA biology and the increased attention on the importance of the subcellular localization of these molecules in numerous biological functions lay the groundwork for more exciting discoveries.
Acknowledgments
Figures 1 and 2 were created using BioRender.com. We apologize to all our colleagues whose work could not be cited due to content and space limitations.
A. Kourtidis was supported by National Institutes of Health grants P20 GM130457-01A1, R21 CA246233-01A1, and P30 DK123704-01; M.C. Bridges was supported by National Institutes of Health training grants TL1 TR001451 and UL1 TR001450.
The authors declare no competing financial interests.
Author contributions: M.C. Bridges and A. Kourtidis conceived and wrote the manuscript; A.C. Daulagala conceived and designed figures. All authors read, edited, and approved the manuscript.