Vertebrate striated muscle is assumed to owe its remarkable order to the molecular ruler functions of the giant modular signaling proteins, titin and nebulin. It was believed that these two proteins represented unique results of protein evolution in vertebrate muscle. In this paper we report the identification of a third giant protein from vertebrate muscle, obscurin, encoded on chromosome 1q42. Obscurin is ∼800 kD and is expressed specifically in skeletal and cardiac muscle. The complete cDNA sequence of obscurin reveals a modular architecture, consisting of >67 intracellular immunoglobulin (Ig)- or fibronectin-3–like domains with multiple splice variants. A large region of obscurin shows a modular architecture of tandem Ig domains reminiscent of the elastic region of titin. The COOH-terminal region of obscurin interacts via two specific Ig-like domains with the NH2-terminal Z-disk region of titin. Both proteins coassemble during myofibrillogenesis. During the progression of myofibrillogenesis, all obscurin epitopes become detectable at the M band. The presence of a calmodulin-binding IQ motif and a Rho guanine nucleotide exchange factor domain in the COOH-terminal region suggests that obscurin is involved in Ca2+/calmodulin, as well as G protein–coupled, signal transduction in the sarcomere.
Sarcomeres, the smallest contractile units of striated muscles, are assembled from thousands of protein subunits into the largest and most regular macromolecular complex known. Sarcomeres are assembled during the embryonic differentiation of heart and skeletal muscle, but also on a continuous basis during the physiological turnover of muscle. New sarcomeres are also formed at a high rate in hypertrophying muscle: either as a result of exercise, increased pressure and volume load of the heart, or pathological or hormonal stimulation. The mechanisms which cooperate to regulate muscle-specific gene transcription are only beginning to emerge (Chien, 2000). It remains largely unclear how signaling at the molecular level within the sarcomere and the control of assembly are coordinated. Therefore, identifying and characterizing key elements of sarcomeric signal transduction and their roles in the control of myofibrillogenesis are essential to elucidate basic mechanisms of the cell biology of muscle, leading to a molecular understanding of associated diseases.
The process of myofibril assembly requires both spatial and temporal coordination of protein interactions with high precision (Gautel et al., 1999). To achieve this long-range coordination, two giant modular proteins, acting as molecular scaffolds or blueprints, are found in vertebrate muscle. Titin (Wang et al., 1979), also known as connectin (Maruyama, 1976), and nebulin provide specific attachment sites for other proteins and thus specify their sarcomeric positions (Trinick, 1996; Trinick and Tskhovrebova, 1999). Recently, it was shown that the deletion of titin leads to a total loss of myofibril assembly despite the persisting expression of other sarcomeric proteins (Van der Ven et al., 2000). Apart from binding sites for other sarcomeric proteins, these giant proteins contain potential signaling domains: a COOH-terminal Src homology 3 (SH3) domain in nebulin (Labeit and Kolmerer, 1995a), and multiple phosphorylation sites and a COOH-terminal catalytic protein kinase domain in titin implicated in myofibril assembly (Mayans et al., 1998). These domains suggest that the molecular scaffold proteins of the myofibril receive and propagate signals from various pathways.
Nematodes contain two large muscle proteins, encoded by the unc-22 and unc-89 genes in Caenorhabditis elegans. The unc-22 product is twitchin, which is localized along the myosin filament and shows homology to titin (Benian et al., 1989). Unc-89 has been implicated in the assembly of the sarcomeric M band (Benian et al., 1996); a mammalian homologue of unc-89 has not been identified to date. Titin, twitchin, and unc-89 are all at least partly associated with the myosin filament. These proteins share a similar molecular architecture, being largely composed of 100-residue domains of the intracellular Ig superfamily, and also contain domains involved in signal transduction (Benian et al., 1989, 1996; Labeit et al., 1992; Heierhorst et al., 1994). Titin and twitchin contain a myosin light chain kinase–like protein kinase domain which has been implicated in the control of myofibril formation in titin (Mayans et al., 1998), whereas unc-89 contains a G protein–activating GDP/GTP exchange factor domain (GEF domain; Benian et al., 1996).
Among the giant proteins, the complex modular architecture of titin is probably the one best understood at the functional level, and may therefore serve as a paradigm for the analysis of other large modular proteins. Apart from Ig-like domains, titin contains unique sequences which are involved in signal transduction or interactions with other sarcomeric proteins (Gautel et al., 1999). Specific titin domains interact with myomesin, myosin, myosin-binding protein C, α-actinin, and telethonin along the distance from M-line to Z-disk (Gautel et al., 1999; Gregorio et al., 1999; Trinick and Tskhovrebova, 1999; Sanger et al., 2000). These interactions specify the sarcomeric localization of these other protein components and define the length of myosin filaments or the thickness of the Z-disk (Gautel et al., 1999). In the I-band, titin is composed of Ig domains arranged in tandem and a unique sequence, the so-called PEVK region (Labeit and Kolmerer, 1995b). These two secondary structure elements act as serial springs and are responsible for the passive elasticity of muscle (Gautel and Goulding, 1996; Linke et al., 1996), with the tandem Ig stretch being largely responsible for extension at low forces and resisting stretch at higher forces (Trinick, 1996).
Several pleiotropic signaling pathways control the formation of new sarcomeres in differentiating or hypertrophying muscle, including the Src and p38–mitogen-activated kinases, the phosphatidyl inositol 3 (PI3)-kinase pathways, and the Rho-family of small G-proteins (Thorburn et al., 1997; Wang et al., 1998; Cuenda and Cohen, 1999; Zetser et al., 1999; Puri et al., 2000; Wu et al., 2000). Surprisingly, it is as yet unclear whether these pathways directly link extracellular signals with the transcriptional machinery, or whether they may actually communicate with, and be regulated by, components of the sarcomere.
We report here the identification and initial characterization of a novel giant protein which we term obscurin. Obscurin is a modular protein of ∼800 kD which contains a GTPase nucleotide exchange factor (GEF) domain, is localized in the myofibril, and therefore joins the family of giant sarcomeric signaling proteins. Thus, obscurin provides a possible direct link between the sarcomere and the G-protein regulated pathways which control the formation of new myofibrils.
An obscurin cDNA sequence contains an ∼20-kb ORF
We identified obscurin as a novel titin-interacting protein in a yeast two-hybrid screen (see below). Searches of the EMBL and GenBank databases with this sequence retrieved no identical cDNA sequence, suggesting the identification of a novel protein. We have named this protein obscurin, after the adjective obscure, defined in the New Oxford Dictionary as meaning: (a) difficult to see or make out, (b) not well known, or (c) not easily understood. All three meanings are appropriate to obscurin, which has not been identified previously and has proven difficult to characterize because of its complexity, large size, and relatively low abundance.
Starting with the 750 bp of obscurin cDNA sequence from positive yeast two-hybrid clones, a lambda phage cardiac cDNA library was screened in order to extend this sequence in both directions. Multiple rounds of screening yielded a panel of 18 cDNA clones which can be assembled into a 20.4-kb contig (Fig. 1 A). This cDNA encodes an ORF of 19,860 bp. The ORF is preceded by an in-frame stop codon 27 bp upstream of a putative start methionine. The sequence preceding this methionine codon fits the Kozak consensus sequence found around the start methionine of eukaryotic genes (Kozak, 1989). The first domain starts nine amino acids after the putative start methionine. At the 3′ end of the ORF there is an in-frame stop codon followed by 500 bp of predicted 3′ untranslated sequence which contains stop codons in all reading frames. No consensus polyadenylation signal was identified, although the 3′ untranslated region shows perfect homology to several entries from the EST database which were cloned using oligo dT primers. The ORF is predicted to encode a 720-kD protein of 6,620 amino acids. One shorter splice variant was sequenced (termed obscurin-a1). Attempts to further characterize alternative splice variants by reverse transcriptase PCR failed because of the low abundance of the obscurin message and the extreme homology over most of the tandem Ig regions (see below).
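The relationship between the ORF length, the residue count, and the predicted mass quoted above can be checked with a short calculation. The following is a minimal sketch, not the method used for the prediction: it assumes an average residue mass of about 110 Da, a common textbook approximation that does not appear in the text, and the function names are hypothetical.

```python
# Back-of-the-envelope check of the ORF arithmetic quoted in the text.
ORF_LENGTH_BP = 19_860          # ORF length stated in the text
AVG_RESIDUE_MASS_DA = 110.0     # ASSUMPTION: textbook average residue mass


def orf_to_residues(orf_bp: int) -> int:
    """Convert an ORF length in base pairs to a codon (residue) count."""
    if orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of three")
    return orf_bp // 3


def estimate_mass_kd(n_residues: int,
                     avg_mass_da: float = AVG_RESIDUE_MASS_DA) -> float:
    """Approximate protein mass in kD from the residue count."""
    return n_residues * avg_mass_da / 1000.0


residues = orf_to_residues(ORF_LENGTH_BP)
mass_kd = estimate_mass_kd(residues)
print(f"{residues} residues, roughly {mass_kd:.0f} kD")
```

Under this assumption the estimate lands close to the 720-kD figure predicted from the cDNA sequence; a composition-based calculation would be needed for an exact value.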
Obscurin is a modular protein built from adhesion modules and signaling domains
The deduced amino acid sequence from the cDNA reveals a highly modular architecture (Fig. 1 B). The NH2-terminal region consists of 49 Ig and 2 fibronectin (Fn)3 domains, arranged mostly in tandem. This repetitive region is followed by a complex region, consisting of four further Ig domains separated by nonmodular sequences. Just NH2-terminal to the first of these Ig domains is an IQ motif. After this region there is an SH3 domain next to tandem dbl homology (DH) and pleckstrin homology (PH) domains. The COOH-terminal end of the protein consists of two tandem Ig domains followed by a nonmodular region of 417 amino acids containing several copies of a consensus phosphorylation motif (SPXR) for ERK kinases similar to that found in the NH2-terminal region of titin (Gautel et al., 1996a). Database searches with the nonmodular interdomain sequences show no clear homology to known protein domains apart from those described. In our domain nomenclature we label both Ig and Fn3 domains in the longest cardiac isoform consecutively from NH2 to COOH terminus with the prefix Ob (i.e., Ob1, Ob2…Ob57). This 57-domain splice form is called obscurin. Further splice variants are called obscurin-a1 and so forth, and their additional domains, Oba1, etc.
The obscurin gene structure
Using our cDNA sequences as a guide, we analyzed the draft human genomic sequence (Lander et al., 2001) and identified 10 contiguous sequences from chromosome 1q42 which cover most of the obscurin gene, designated OBSCN. After correct orientation and ordering of these contigs, the assembled sequence reveals the genomic organization of the obscurin gene with nine gaps of undefined length (Fig. 1 C). This partial sequence contains exons corresponding to all of our cDNA sequence with the exception of three regions corresponding to the obscurin domains Ob3–Ob4, Ob11, and Ob18. For the most part the tandem Ig and Fn3 domains are encoded one domain per exon, similarly to the titin gene (Kolmerer et al., 1996). The signaling domains, nonmodular sequences, and Ig domains at the COOH terminus, however, do not follow this pattern. Putative exons encoding ten additional Ig domains not represented in our cDNA sequence were identified within the tandem Ig region of the molecule. This provides evidence for further splicing in this region, which is supported by additional partly sequenced cDNA clones from this region (not shown). The incomplete obscurin gene available to date contains at least 68 Ig/Fn3 domains (precluding a final domain nomenclature for the moment), which are likely to be expressed as numerous splice variants.
A multiple sequence alignment and phylogenetic tree of the Ig domains (Fig. 2, A and B) reveal no obvious repeat or super-repeat pattern, which is in contrast to the arrangement of Ig domains in titin (Labeit et al., 1992; Gautel, 1996) or the helical repeats in nebulin (Labeit and Kolmerer, 1995a; Wang et al., 1996; Fig. 2, A and B). The Ig domains between the two Fn3 domains are all 88–92 amino acids in length and arranged with no linker sequence between them (except for short linkers between Ob2 and Ob3, and between Ob24 and Ob25). In this region two clusters of extremely homologous domains can be identified (Fig. 2 B). The domains in these two clusters (Ob36–42 and Ob9–18; purple and green in Fig. 2 B) are between 72 and 90% identical to each other at the protein level. At the DNA level there are as few as 20 bases which differ between DNA sequences encoding adjacent domains. Titin I-band domains show a compact β-barrel domain (Improta et al., 1996), partly due to the short length of the loops linking adjacent β-strands F and G. In the central obscurin domains, the length of this loop is equally short. The Ig domains at the NH2 and COOH terminus of the molecule (1–3 and 53–57) cluster together in the phylogenetic tree (Fig. 2 B), but are less homologous to each other than the tandem Ig domains in the center of the molecule. They are more similar to the terminal domains of titin, e.g., M5 (Pfuhl and Pastore, 1995), and vary more in size (e.g., Ig 55 contains 100 amino acids). It is likely that they formed part of a smaller ancestral molecule which grew much larger by rapid duplication of the central tandem domains. One splice variant, obscurin-a1, was sequenced in which an additional Ig domain (Oba1) replaces Ob10–16 (Fig. 1 B).
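Identity figures such as the 72–90% quoted above are computed from pairwise comparisons of aligned domain sequences. The sketch below shows one common convention, which counts matches over aligned positions where neither sequence has a gap. The text does not state which convention was used, and the two short sequences are hypothetical placeholders, not obscurin domains.

```python
# Pairwise percent identity between two pre-aligned sequences.
# Convention ASSUMED here: positions with a gap in either sequence are ignored.

def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Return percent identity over gap-free aligned positions."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # skip columns containing a gap
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        raise ValueError("no gap-free positions to compare")
    return 100.0 * matches / compared


# Hypothetical aligned fragments, for illustration only.
domain_x = "PVKFTEGLRNEEA-TEGT"
domain_y = "PVRFTEGLKNEEAVTEGA"
print(f"{percent_identity(domain_x, domain_y):.1f}% identical")
```

Other conventions divide by the full alignment length or by the length of the shorter sequence, which gives lower values when gaps are present.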
The nonmodular insertion NH2-terminal to Ig52 contains an IQ motif, so called for the conserved sequence Ile-Gln found in these α-helical peptides. This motif is a recognized binding motif for calmodulin or calmodulin-like proteins, such as myosin light chains (Rhoads and Friedberg, 1997). The IQ motif in obscurin shows highest homology to similar motifs in the neuronal proteins neuromodulin and neurogranin (Slemmon et al., 1996).
Further signaling domains follow from amino acid 5,601 onwards. This region contains an SH3 domain which is 43% identical to the SH3 domain of the C. elegans protein UNC-89. However, it is not very similar to other SH3 domains, including those of the muscle Z-disk proteins nebulin or ArgBP2 (Wang et al., 1997). Adjacent to the SH3 domain is a DH domain, also known as a RhoGEF domain. As in all DH domain–containing proteins, a PH domain follows immediately thereafter. The obscurin DH and PH domains are most homologous (∼25% identity) to the similarly arranged domains in dbl, Vav, trio, kalirin, and unc-89. The obscurin DH domain contains a proline-rich sequence not found in other DH domains.
Obscurin is a giant protein expressed in cardiac and skeletal muscle
To monitor the expression of obscurin protein and gain estimates of the size distributions of the polypeptide, the rabbit polyclonal antisera α-Ob19–20, α-Ob48–49, α-Ob-DH and α-Ob51–52 were raised (Fig. 1). α-Ob48–49, α-Ob51–52 and α-Ob-DH were affinity purified and used to detect obscurin on Western blots from low porosity SDS polyacrylamide gels. Using α-Ob48–49 and α-Ob-DH to probe blots of human Vastus lateralis muscle, a very high molecular weight protein was detected (Fig. 3). This protein was seen to migrate slightly slower than the visible nebulin band. A band of similar molecular weight was detected on blots of cardiac tissue (Fig. 3). The blots were also probed with anti-titin antibodies. Although titin can be detected, the obscurin band does not react with several anti-titin antibodies (S54-4 and CH11, Whiting et al., 1989; Gautel et al., 1996b). α-Ob48–49 and α-Ob-DH react with neither nebulin nor titin. Nebulin has a molecular weight of ∼700–900 kD (Labeit and Kolmerer, 1995a; Wang et al., 1996) and thus obscurin is expected to be of a similar or slightly larger size. This is in agreement with the molecular weight of at least 720 kD predicted for obscurin from the cDNA sequence. A band of similar size is also detected using the α-Ob-DH antibody (Fig. 3). On Coomassie-stained low porosity gels with normal loading (20–40 μg total protein) of adult muscle, there is no appreciable protein in the region between nebulin and titin (Fig. 3), suggesting that obscurin is expressed at much lower levels than either of these proteins. Estimations by densitometric analysis of double-probed Western blots of adult skeletal muscle suggest that the ratio of nebulin to obscurin is at least 10:1.
All our obscurin cDNAs were obtained either from a human skeletal muscle cDNA library or from a human cardiac lambda phage library. On multiple tissue Northern blots, the obscurin message was barely detectable, probably due to low abundance and difficulties in blotting such a large mRNA. Using dot blots of total RNA, an obscurin probe hybridized specifically to RNA from cardiac and skeletal muscle (not shown). The EST databases contain entries corresponding to COOH-terminal regions of obscurin. Most of these entries are derived from cardiac or skeletal muscle mRNA. Together these data suggest that obscurin is an ∼700–800 kD protein expressed in striated muscle.
Z-disk titin interacts with obscurin by homotypic binding to two specific obscurin Ig domains
Obscurin was identified in a systematic search for proteins interacting with the peripheral Z-disk region of titin, using the bait Z7-Z10 to screen a skeletal muscle cDNA library in the two-hybrid system. The bait is ultrastructurally located at the comb-like transition region of the peripheral Z-disk (Gautel et al., 1996a; Yajima et al., 1996). The yeast two-hybrid screen yielded over 200 HIS3 and β-galactosidase positive clones from several complexities of the library. Nine of these were sequenced and all were found to encode Ig domains 48 and 49 of obscurin followed by either a truncated (e.g., clone no. 27) or complete (e.g., clone no. 25) Fn3 domain (Ob50; Fig. 4 A).
The obscurin binding site was mapped on titin by testing for interaction with titin Z7-Z8 and Z9-Z10. The interaction with obscurin was mapped to Z9-Z10 (Fig. 4 A). The individual titin Ig domains Z9 or Z10 show only a very weak interaction with the same obscurin clone, whereas the individual obscurin Ig domains Ob48 and Ob49 do not interact with titin Z9-Z10 (not shown). The two-hybrid interactions were confirmed in an in vitro assay using the recombinant fragments titin Z9-Z10 and obscurin Ob48–49. Titin Z9-Z10 fragment was bound via a 6xHistidine tag to a Ni–NTA matrix. The untagged obscurin fragment is specifically retained on the column only when mixed with Z9-Z10 (Fig. 4 B), confirming the interaction of both proteins. Isothermal titration calorimetry indicated reproducible Kd values of 100–200 nM (not shown).
When an obscurin construct encompassing Ob48–51 was transfected into neonatal rat cardiomyocytes, the transfected protein was found to colocalize at the Z-disk with the titin Z-disk epitope T12 (Fig. 4 C). This supports the notion that the obscurin Ig domains Ob48–49, via their interaction with titin Z9-Z10, constitute a functional Z-disk targeting signal in obscurin.
Calcium-insensitive binding of calmodulin to the IQ motif
Ligands for the obscurin IQ domain were identified by a yeast two-hybrid screen using domains 51–52 as bait. The IQ domain is immediately NH2-terminal to Ob52 and is preceded by ∼160 amino acids of nonmodular sequence. Six of seven clones from this screen were found to encode full-length calmodulin. To verify this interaction and to test its Ca2+ sensitivity, binding of recombinant Ob51–52 was assayed on calmodulin-Sepharose beads in the presence and absence of Ca2+. The Ob51–52 fragment was found to bind to calmodulin beads in a Ca2+-independent way (Fig. 5). Control obscurin fragments show no calmodulin binding (not shown).
Assembly of obscurin into nascent sarcomeres
The in vitro interaction of obscurin and titin near the Z-disk raises the question of the relative order of their incorporation, and the final localization of both proteins in the myofibril. We therefore investigated their appearance in cultured cardiomyocytes and in developing hearts, using a panel of four anti-obscurin antibodies against different epitopes along the molecule (Fig. 1, Table I).
In neonatal rat cardiomyocytes, endogenous obscurin was found to localize at the M-band as detected by all four antibodies. Occasional weak Z-disk staining was observed only with α-Ob48–49, which binds to the titin binding site (Fig. 6). Since myofibrillogenesis in vivo is not always faithfully reflected in detail in cultured cardiac myocytes (Ehler et al., 1999), we used the same antibodies on whole-mount preparations of embryonic chicken, mouse and rat hearts spanning early to late stages of heart development.
The earliest epitope to show sarcomeric localization is that of the titin binding domains, Ob48–49. In agreement with the transfection data and in vitro binding results, α-Ob48–49 stains Z-disks in early chicken embryos up to about the 10-somite stage (Fig. 7). At these early stages, epitopes close to Ob48–49 in the obscurin cDNA (Fig. 1 B), α-Ob-DH and α-Ob51–52 are not detectable, or later on weakly expressed and diffusely localized. At about S10, noticeable staining of the M-band is observed with α-Ob48–49, which increases further with development and which is concomitant with a loss of Z-disk staining. In parallel with this shift in epitope localization, the epitopes COOH- and NH2-terminal to Ob48–49 become localized to the sarcomere and are detected at the M-band. Similar observations were made in the myofibrils of rodent hearts, since α-Ob48–49 shows both weak Z-disk as well as strong M-band staining (Fig. 7 D) in E9.5 mouse hearts, which is shortly after the onset of beating. In the fully matured myofibrils of E14.5 rat hearts, adult rat and mouse hearts, or neonatal rat cardiomyocytes, all obscurin antibodies investigated label the M-band (Figs. 6 and 7, Table I). The epitope of α-Ob19–20 remains undetectable until after birth, suggesting that these domains may not be expressed in the early embryonic isoforms. These data demonstrate that obscurin is a sarcomeric protein, which is transiently detected at the Z-disk and whose GDP/GTP exchange factor domain is localized at the M-band of mature myofibrils.
We describe here the identification of a new giant muscle protein, obscurin, which is part of the sarcomeric cytoskeleton. Obscurin is a new member of the intracellular Ig superfamily, and shows similarity to both titin and unc-89 in its modular architecture. Yet it is different from both proteins and contains additional domains previously unknown in vertebrate giant muscle proteins. It is therefore likely to perform a biological role distinct from the known giant muscle proteins.
The obscurin gene
Obscurin is encoded by an ∼150-kb gene, OBSCN, on chromosome 1q42 with a structure reminiscent in part of that of the titin gene in that individual Ig and Fn3 domains are encoded by separate exons. The additional exons identified in the partial genomic sequence encode at least 10 additional Ig domains not present in the two cardiac isoforms described here (Fig. 1 C). This suggests that even further splice variants are likely to be expressed from this gene. Three of the gaps in the partial gene sequence are localized within the highly homologous tandem Ig domains, highlighting the value of our cDNA approach to complete the genomic sequence based on the draft human genome. A detailed analysis of the promoter and splice sites will be presented elsewhere.
The Ig domains: homology to titin and implications for sarcomeric function
Most of the obscurin primary structure is constructed from intracellular Ig domains. The Ig domains are for the most part joined without any linker sequences or insertions. This arrangement of domains is most homologous to the tandem Ig domains in the I-band region of titin. This suggests that the titin and obscurin tandem Ig domains might fulfil analogous functions in the sarcomere. The titin tandem Ig domains have been shown to be an extensible chain that is stretched at low forces and resists stretch at high forces (Gautel and Goulding, 1996; Gautel et al., 1996b; Linke et al., 1996). The isoforms obscurin and obscurin-a1 we describe here contain 52–57 Ig domains, each with a predicted dimension of ∼4 nm. Obscurin is therefore expected to have a length of at least 208 nm. This is too short to span whole sarcomeres or even A-bands but the protein could link the M-band peripheries to other cytoskeletal structures, which would also explain its low abundance. Whether obscurin participates in cross-linking thick filaments in the M-band or anchors M-band peripheries to other cytoskeletal structures remains to be elucidated.
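The length estimate above follows from multiplying the Ig domain count by the predicted per-domain dimension. A minimal sketch of that arithmetic is given below. The A-band length of about 1,600 nm used for comparison is an assumed typical value for vertebrate muscle and is not taken from the text, and the function name is hypothetical.

```python
# Rough end-to-end length estimate for a chain of tandem Ig domains.
DOMAIN_SPAN_NM = 4.0        # per-domain dimension quoted in the text
A_BAND_LENGTH_NM = 1600.0   # ASSUMPTION: typical vertebrate A-band length


def chain_length_nm(n_domains: int, span_nm: float = DOMAIN_SPAN_NM) -> float:
    """Length of n tandem domains laid end to end, without linkers."""
    if n_domains < 0:
        raise ValueError("domain count must be non-negative")
    return n_domains * span_nm


# Domain counts quoted for the two isoforms described in the text.
lengths = {n: chain_length_nm(n) for n in (52, 57)}
for n, length in lengths.items():
    fraction = length / A_BAND_LENGTH_NM
    print(f"{n} domains: ~{length:.0f} nm "
          f"({fraction:.0%} of the assumed A-band length)")
```

The estimate ignores the nonmodular sequences and signaling domains, so it should be read as a lower bound on the extended length.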
It is therefore plausible that obscurin acts as a flexible linker between titin and other cellular structures at its ends. The NH2- and COOH-terminal domains show less conservation between them, arguing for functional diversity but also for a rapid expansion of the central tandem Ig-region by evolutionary recent duplication events. This expansion of tandem Ig copies may be facilitated by the arrangement of the Ig domains on individual exons (Fig. 1 C). Interestingly, the arrangement of tandem Ig domains shows no discernible super-repeat patterns similar to the I-band region of titin (Gautel, 1996).
In titin, the second function of the Ig domains is the interaction with other sarcomeric proteins. Several of these interactions are mediated via a pair of adjacent titin Ig domains, including the interactions with the Z-disk protein telethonin (Mues et al., 1998), the A-band protein myosin-binding protein-C (Freiburg and Gautel, 1996) and the M-band protein myomesin (Obermann et al., 1997). Obscurin binds to two titin Ig domains, Z9 and Z10, both of which are necessary for this interaction. Similarly, both obscurin Ig domains Ob48 and Ob49 are required for this homotypic interaction. The Ig domains at the NH2 and COOH termini differ in sequence consensus, and it is therefore likely that they will also be involved in ligand interactions rather than in strictly mechanical functions. However, transfections with several other obscurin fragments did not identify further specific sarcomeric localization signals. Similarly, extensive two-hybrid screening covering the NH2 and COOH termini failed to identify further ligands. The understanding of the functions of obscurin will therefore require the identification of further protein interactions by other means. The two Fn3 domains show homology to similar domains in titin and other sarcomeric proteins. It is interesting to note that all sarcomeric proteins containing Fn3 domains known to date are associated with myosin (titin, twitchin, myomesin and M-protein); in titin, Fn3 domains are found exclusively in the thick filament region (Bennett and Gautel, 1996). Obscurin, as a transiently Z-disk associated protein, may be the first exception to this pattern; however, the function of the two single Fn3 domains might be important for the association with mature M-bands and remains to be further elucidated.
The GEF domain
DH domains (also called Rho-GEF domains) catalyze the exchange of GDP for GTP in small G-proteins of the Rho-family (Hart et al., 1991), thereby activating G-protein–regulated signaling cascades (Cerione and Zheng, 1996). These can lead to a multitude of cellular responses, ranging from cytoskeletal reorganization with remodelling of the actin cytoskeleton to transcriptional regulation and cell cycle control (Ridley, 1999; Bishop and Hall, 2000). DH domains are invariably linked to a COOH-terminal PH domain. PH domains are found also as independent domains in membrane-associated proteins where they interact with phosphatidylinositol-phosphates (PIPs) (Musacchio et al., 1993). The DH-linked PH domains form a separate group, not all of which are predicted to bind PIPs. Structural analysis of the unc-89 PH domain suggests it to be of a “nonliganding” type based on its negative electrostatic potential (Blomberg et al., 1999). Therefore, it is possible that the PH domains in the giant muscle proteins unc-89 and obscurin do not play a role in phospholipid signaling. This is supported by the sarcomeric, rather than membrane, localization of these PH domains in muscle. However, it is not possible to rule out PIP binding for the obscurin PH domain based solely on its limited homology to UNC-89, especially since their isoelectric points are very different (6.53 for unc-89 and 10.33 for obscurin).
The invariant presence of a PH domain adjacent to DH domains suggests that the PH domain plays a vital role in DH domain function. This is probably best characterized for the Rho, Cdc-42, and Rac1 GEF-protein Vav. In Vav, the GEF activity of the DH domain is dually autoinhibited by an NH2-terminal helix blocking Rho-access to the DH domain by an inhibitory tyrosine, and by the PH domain protecting this closed conformation. Upon phosphorylation of PH-bound PIP2 by PI3-kinase, the autoinhibitory tyrosine becomes accessible for phosphorylation by Src-like kinases and autoinhibition is fully relieved (Aghazadeh et al., 2000). The sequence of obscurin suggests that the same mechanism is unlikely to regulate its GEF activity. Indeed, the recent crystal structures of the Sos DHPH domain and of Rac1 in complex with the DHPH domain of Tiam suggest that the structural arrangement and regulatory mechanism of each DHPH domain may be different (Worthylake et al., 2000). Interestingly, the obscurin DH domain contains a specific polyproline stretch that might serve as an intramolecular ligand for the SH3 domain, which may therefore also play a regulatory role. However, efforts to elucidate the activation mechanism of the obscurin DH domain have been frustrated by its lack of apparent activity upon transfection in cultured cells and its insoluble expression in any systems tried to date. Clearly, understanding the cellular functions of obscurin's GEF domain will require the future combination of a wide range of methods, including genetic approaches.
The IQ motif
IQ motifs are Ca2+-dependent as well as Ca2+-independent calmodulin binding motifs (Rhoads and Friedberg, 1997) in proteins as diverse as Ca-insensitive protein kinase A, utrophin, dynein, and several G-protein regulating proteins like Ras-GRF. In Ras-GRF, Ca2+-mediated binding of calmodulin was shown to activate the nucleotide exchange activity of the Cdc25 Ras-GRF domain (Farnsworth et al., 1995). Our genetic and in vitro binding assays with the obscurin fragment Ob51–52 demonstrate that the obscurin IQ motif is a functional motif that interacts in a Ca2+-independent manner with calmodulin. Though calmodulin binding is Ca2+-independent it is plausible that the bound calmodulin may act as a Ca2+ sensor which could confer Ca2+ sensitivity to some aspect of obscurin function. In Ras-GRF, the IQ motif is found in the immediate vicinity of the catalytic Cdc25 domain. In obscurin, the IQ motif is separated from the DH domain by a further four Ig domains, suggesting that CaM binding to obscurin modulates activities other than guanine nucleotide exchange. In neuromodulin and neurogranin, CaM binding is abrogated by PKC phosphorylation (Chakravarthy et al., 1999); the phosphorylation site SFR in these proteins is changed to AFK in obscurin, making a similar mechanism unlikely.
Obscurin expression and sarcomeric integration
Our analysis of the incorporation of obscurin into the sarcomere, using four independent antibodies, reveals a complex picture. Given its structure of tandem Ig domains, a localization in the I-band like titin would have been most plausible. Ob48–49 interacts with titin Z9-Z10 in transfections and binding assays. In agreement with this, the α-Ob48–49 epitope is localized to the Z-disk in chicken embryos up to about the 10-somite stage. However, at later stages in mature myofibrils, both in vivo (embryonic heart whole mounts) and in cultured cells, α-Ob48–49 colocalizes in the M-band with the other obscurin epitopes which were initially diffuse or absent. This suggests that the obscurin–titin interaction may be transient and developmentally regulated. Failure to detect some obscurin epitopes, like ObDH or Ob19–20, at early stages or their change in localization at later stages could also be due to epitope masking, although we feel this possibility is less likely as all antibodies show similar behavior. Alternatively, it is possible that the regions of obscurin which are responsible for targeting it to the M-band are either spliced out, leading to the sequential appearance of different obscurin isoforms, or they are switched off, e.g., by phosphorylation at early developmental stages. Identification of such targeting signals will be a key to understanding this putative Z-disk to M-band translocation. A sequential appearance, and integration, of different epitopes is also observed for titin. In titin, the M-band region becomes organized after the incorporation of the Z-disk anchoring NH2-terminal region, and transiently colocalizes with the NH2-terminal region (Fürst et al., 1988; Van der Loop et al., 1996; Ehler et al., 1999; Van der Ven et al., 1999). Some titin M-band epitopes are apparently masked at this stage (Van der Ven et al., 1997).
The LIM protein DRAL is found both in the M-band and the Z-disk in heart muscle, suggesting that certain signaling molecules can communicate with both compartments (Scholl et al., 2000). Recently, a putative titin kinase regulator was also found at the Z-disk despite its interaction with M-band titin (Centner et al., 2001), and a substrate for titin kinase, telethonin, is also found at the Z-disk (Mayans et al., 1998). Similarly, it seems that obscurin isoforms can occupy different sarcomeric binding sites in a developmentally regulated way, and that therefore the signaling functions of obscurin during early stages of muscle development may be different from mature muscle. The possible obscurin analogue unc-89 is also found at the M-band, but its developmental integration is unknown (Benian et al., 1996).
Analysis of the distribution of obscurin in early embryonic hearts compared with later stages demonstrates that Z-disk association of the protein is mainly observed in cells where the actin cytoskeleton is not yet regularly aligned in parallel myofibrils, but rather along the cell membranes and in an irregular, criss-cross pattern (Fig. 7, A–D). M-band localization is observed in cells where myofibrils begin to be arranged in parallel bundles in the cell centre (Fig. 7 E). In mature cardiomyocytes, myofibrils are arranged in parallel and obscurin is found at the M-band (Figs. 6 and 8). This coincidence of the redistribution of obscurin epitopes with remodelling of the actin cytoskeleton suggests that obscurin may be involved in this structural transition.
The mechanisms that control the rearrangement of the actin cytoskeleton during myofibrillogenesis are to date unknown. It is interesting to note that DH domains in other cytoskeletal proteins, like trio (Debant et al., 1996), activate GTPases of the Rho family, which are involved in actin cytoskeletal reorganization, like the formation of stress fibres by Rho and membrane ruffling induced by Rac. It is tempting to speculate that the SH3-DH-PH triad in obscurin could be involved in signaling by Rho-like GTPases necessary for the cytoskeletal rearrangement during myofibrillogenesis. However, transfection of the obscurin SH3-DH-PH motif into neonatal rat cardiomyocytes, or into C2C12 myoblasts, did not lead to apparent changes in the actin cytoskeleton or to perturbations of sarcomere assembly (not shown). This suggests that the obscurin GEF domain is tightly regulated and that full activity may require specific activation steps within the correct cellular context.
Recent evidence suggests that Rho-like GTPases also play crucial roles in the control of muscle gene transcription during development and hypertrophy (Finkel, 1999; Chien, 2000; Clerk and Sugden, 2000). Unlike in nonmuscle cells, Rho activation in cardiac muscle seems to act predominantly on the regulation of gene expression rather than on actin morphology (Thorburn et al., 1997). The GEF proteins responsible for this muscle-specific signaling pathway are unknown. Our identification of a Rho-GEF in the vertebrate sarcomere is the first evidence that direct communication between the sarcomere and the G-protein pathways leading to new sarcomere formation may be possible.
Materials and methods
Cloning of cDNA constructs
The cDNA constructs for yeast two-hybrid analysis, transfection, and protein expression were amplified by PCR. For titin, total human cardiac cDNA (CLONTECH Laboratories, Inc.) was used as the template and primer design was based on the human cardiac titin sequence (EMBL/GenBank/DDBJ accession no. X90568). Domain nomenclature for titin is as described in Labeit and Kolmerer (1995b). All cloning procedures followed standard protocols (Ausubel et al., 1995). The identity of the derived constructs was verified by restriction digest and in some cases by DNA sequencing.
The constructs used (with the amino acid residues in brackets) were as follows: titin, Z7–Z10 (1,406–1,885), Z7–Z8 (1,406–1,604), and Z9–Z10 (1,657–1,885); obscurin, Ob19–Ob20 (1,711–1,897), Ob48–Ob49 (4,334–4,521), Ob48–Ob51 (4,334–4,714), Ob51–Ob52 (4,619–4,991), and ObDH (5,678–5,886).
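As a quick consistency check on the construct boundaries listed above, the residue ranges can be converted into fragment lengths and rough molecular masses. The following is a minimal sketch only: the ranges are taken from the list above and treated as inclusive, while the average residue mass of 110 Da is a generic rule-of-thumb assumption and the function names are hypothetical, not part of the original methods.

```python
# Residue ranges (first, last) copied from the construct list; both ends inclusive.
CONSTRUCTS = {
    "titin Z7-Z10": (1406, 1885),
    "titin Z7-Z8": (1406, 1604),
    "titin Z9-Z10": (1657, 1885),
    "obscurin Ob19-Ob20": (1711, 1897),
    "obscurin Ob48-Ob49": (4334, 4521),
    "obscurin Ob48-Ob51": (4334, 4714),
    "obscurin Ob51-Ob52": (4619, 4991),
    "obscurin ObDH": (5678, 5886),
}

# Assumption: rough average mass per amino acid residue (not a value from the paper).
AVG_RESIDUE_MASS_DA = 110.0


def fragment_length(first, last):
    """Number of residues in an inclusive range first..last."""
    return last - first + 1


def approx_mass_kd(first, last):
    """Rough mass estimate in kD from the residue count alone."""
    return fragment_length(first, last) * AVG_RESIDUE_MASS_DA / 1000.0


def summarize(constructs=CONSTRUCTS):
    """Map each construct name to (residue count, approximate mass in kD)."""
    return {
        name: (fragment_length(*rng), round(approx_mass_kd(*rng), 1))
        for name, rng in constructs.items()
    }


if __name__ == "__main__":
    for name, (n_res, kd) in summarize().items():
        print(f"{name}: {n_res} residues, ~{kd} kD")
```

The estimate ignores affinity tags and the actual amino acid composition, so it is only useful for judging where a fragment should run on a gel, not as a substitute for a sequence-based mass calculation.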
Yeast two-hybrid screening and analysis
Titin Z7–Z10 or obscurin Ob51–Ob52 constructs were cloned into a modified pLexA vector for screening. Two-hybrid screening and analysis were performed as described previously (Young et al., 1998).
λ-phage library screening and sequencing
A human cardiac muscle λ-Zap II cDNA library (936208; Stratagene) was screened with obscurin cDNA probes and cDNA clones were isolated using standard protocols (Ausubel et al., 1995). Inserts were sequenced from in vivo excised pBluescript SK plasmids manually and in some cases by SeqLab laboratories. In some cases, nested deletions were generated using Exonuclease III/Mung bean nuclease (Stratagene). Sequence data were compiled and analyzed using the Wisconsin Genetics Computer Group (GCG) package (Devereux et al., 1984). Database searches were performed at the National Center for Biotechnology Information (www.ncbi.nlm.nih.gov) using the BLAST search service (Altschul et al., 1997). Multiple sequence alignments were prepared using Pileup (GCG package) and ClustalW (Higgins et al., 1996). The complete obscurin cDNA sequence has been deposited in EMBL/GenBank/DDBJ under accession no. AJ002535.
Genomic sequence analysis
A search of the obscurin cDNA sequence against the draft human genome sequence was carried out using the ENSEMBL server (www.ensembl.org). Three clones were found to contain parts of the obscurin gene (EMBL/GenBank/DDBJ accession nos. AL359510, AL353593, and AC026657). Contigs from these clones were ordered and oriented according to the cDNA sequence. Combining information from these clones allowed some gaps to be filled and gave an overall gene structure consisting of 10 contigs and 9 gaps. Sequence analysis was done using the GCG package of sequence analysis programs (Devereux et al., 1984). Annotation of the genomic sequence was done using the Artemis software (The Sanger Centre). The sequences have been deposited in EMBL/GenBank/DDBJ under accession nos. AJ314896, AJ314898, AJ314900, AJ314901, AJ314903, AJ314904, AJ314905, AJ314906, AJ314907, and AJ314908.
Protein expression and purification
Titin Z9–Z10 and obscurin Ob19–Ob20, Ob48–Ob49, Ob51–Ob52, and ObDH fragments were expressed with an NH2-terminal 6-histidine tag and a tobacco etch virus protease cleavage site in the Escherichia coli strain BL21[DE3] (Young et al., 1998). Phasing of the domains was done in analogy to titin Ig domains (Politou et al., 1994). Initial purification was carried out on Ni-NTA agarose columns (QIAGEN). Further purification was carried out by ion exchange chromatography on a MonoQ or S-Sepharose column (Amersham Pharmacia Biotech). The ObDH construct was expressed in insoluble form, then solubilized and purified on Ni-NTA agarose in the presence of 6 M urea. The purified protein was then dialyzed against 20 mM Tris-HCl, pH 8, 1 mM DTT, 1 mM EDTA. For immunization and binding experiments, histidine tags were cleaved off with tobacco etch virus protease.
Ni column binding assay
50 μg of each of the protein fragments titin Z9–Z10 and obscurin Ob48–Ob49 were mixed in a total volume of 400 μl binding buffer (20 mM K2HPO4/KH2PO4, pH 8, 50 mM NaCl, 5 mM imidazole, 0.02% Triton X-100, 20 mM β-mercaptoethanol), and loaded on a minicolumn packed with 50 μl Ni-NTA agarose (QIAGEN). After washing with 2 × 100 μl binding buffer, bound proteins were eluted in 100 μl of elution buffer (200 mM imidazole, 20 mM β-mercaptoethanol). The flow-through, wash, and eluate fractions were collected and analyzed on 15% SDS-PAGE gels (Laemmli, 1970). As a control, the same procedure was followed for the obscurin fragment Ob48–Ob49 alone.
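The protein amounts in this assay can be translated into approximate concentrations in the binding reaction. The sketch below is illustrative only: the 50 μg and 400 μl values come from the protocol above and the residue counts (229 for Z9–Z10, 188 for Ob48–Ob49) follow from the construct boundaries given under cloning, whereas the 110 Da average residue mass is an assumed rule of thumb, any retained histidine tag is ignored, and the function names are hypothetical.

```python
# Assumption: rough average mass per residue; tags and composition are ignored.
AVG_RESIDUE_MASS_DA = 110.0


def mass_conc_mg_per_ml(mass_ug, volume_ul):
    """Mass concentration; ug/ul is numerically equal to mg/ml."""
    return mass_ug / volume_ul


def molar_conc_um(mass_ug, n_residues, volume_ul):
    """Approximate molar concentration in uM from mass, length, and volume."""
    mass_g_per_mol = n_residues * AVG_RESIDUE_MASS_DA
    nmol = mass_ug * 1000.0 / mass_g_per_mol  # ug / (g/mol) = umol; x1000 = nmol
    return nmol / volume_ul * 1000.0          # nmol/ul = mM; x1000 = uM


if __name__ == "__main__":
    volume_ul = 400.0
    for name, n_res in (("titin Z9-Z10", 229), ("obscurin Ob48-Ob49", 188)):
        mg_ml = mass_conc_mg_per_ml(50.0, volume_ul)
        um = molar_conc_um(50.0, n_res, volume_ul)
        print(f"{name}: {mg_ml:.3f} mg/ml, ~{um:.1f} uM")
```

Under these assumptions both fragments are present at low micromolar concentrations and in roughly equimolar amounts, which is the regime in which co-elution from the Ni column is being assessed.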
Calmodulin binding assay
The obscurin fragment Ob51–52 was incubated with 50 μl of either calmodulin-coupled Sepharose beads (Amersham Pharmacia Biotech) or beads coupled to an unrelated protein. After 30 min of incubation, the beads were washed with 3 × 100 μl assay buffer (20 mM HEPES, pH 7, 50 mM NaCl, 1 mM DTT with either 1 mM CaCl2 or 1 mM EDTA). Bound protein was eluted from the beads with either 5 mM EDTA or 6 M urea and analyzed by SDS-PAGE.
Polyclonal rabbit sera against recombinant proteins were raised using a standard immunization protocol. The obscurin fragments used as antigens were Ob19–20, Ob48–49, Ob51–52, and ObDH. The antisera were affinity-purified essentially as described (Harlow and Lane, 1988). For Western blotting, the antibodies NSH3 (raised against the SH3 domain of nebulin, EMBL/GenBank/DDBJ accession no. X83957; residues 6,604–6,669) and S53-4 (which specifically binds to an epitope of I-band titin in the skeletal N2-A isoform; Gautel et al., 1996b) were used. For immunofluorescence, the T12 monoclonal antibody, which recognizes a Z-disk epitope of titin (Fürst et al., 1988), or the antibody B4 against myomesin (Grove et al., 1984) was used.
Human vastus lateralis muscle was a gift from Professor Allenberg (Heidelberg University, Heidelberg, Germany). Protein samples were prepared by homogenizing the tissue under liquid nitrogen and then solubilizing in Laemmli sample buffer at 90°C by vortexing. Agarose-reinforced 3 or 4% polyacrylamide SDS gels were prepared according to Tatsumi and Hattori (1995) and electrophoresis of protein samples was carried out according to Laemmli (1970). Proteins were blotted onto nitrocellulose membranes for 2–3 h at 0.8 mA/cm2 in a SemiPhor blotting apparatus (Hoefer Scientific Instruments) and major bands on the blots were detected by Ponceau S staining. One lane was cut from each gel before blotting and analyzed by Coomassie staining. Probing of blots with various antibodies followed standard procedures (Harlow and Lane, 1988) using ECL (Amersham Pharmacia Biotech). Stripping of blots for subsequent reprobing was achieved by washing for 30 min in SDS-PAGE running buffer at 95°C. Quantitation of band density from double-labeled blots was carried out using the NIH Image software.
Whole mount preparations
Hearts were dissected from 2-d-old chicken embryos (staged by counting the somites) or from embryo day (E)9.5 mouse and E14.5 rat embryos, and fixed for 1–1.5 h in 4% PFA/PBS. After brief washing in PBS, the hearts were treated with 1 mg/ml hyaluronidase (Sigma-Aldrich) for 30 min (45 min for the rat). After washing with PBS, the hearts were permeabilized with 0.2% Triton X-100 in PBS for 30 min (45 min for the rat). Nonspecific binding sites were blocked by incubation with 5% normal (preimmune) goat serum in 1% BSA/TBS for at least 30 min (45 min for the rat). Incubations with primary and secondary antibodies at their appropriate dilutions were performed overnight at 4°C on a rocking table. After the antibody incubations, the specimens were washed in an excess of PBT (0.002% Triton X-100 in PBS) for at least 6 h with buffer changes every 30 min. After the immunostaining, the specimens were mounted on slides in mounting medium (0.1 M Tris-HCl, pH 9.5, glycerol, 3:7) containing 50 mg/ml n-propyl gallate as an antifading reagent (Ehler et al., 1999).
An inverted microscope (DM IRB/E; Leica) equipped with a true confocal scanner (TCS NT; Leica), a 63×/1.4 oil immersion objective (PL APO; Leica), and an argon/krypton mixed gas laser was used for the recording of confocal data sets. Image processing was performed on a Silicon Graphics workstation using “Imaris” (Bitplane AG), a three-dimensional multichannel image processing software specialized for confocal microscopy images.
Cell culture and transfection
Neonatal rat cardiomyocytes were isolated from day 3 Wistar rats as described previously (Sen et al., 1988; Komiyama et al., 1996). The cells were plated on collagen-coated 35-mm dishes and grown in M199 medium, 5% fetal bovine serum, 5% horse serum, 10 μM cytosine-arabinoside, and 10 μM phenylephrine for 3 d at 37°C. Cells were fixed in 2% paraformaldehyde and processed for immunofluorescence microscopy following standard methods. For transfections, obscurin fragments were cloned into a modified pCMV-5 vector (Andersson et al., 1989) with an NH2-terminal T7-tag® sequence (MTGGQQMGR). Plasmid DNA was transfected using a modified CaPO4 protocol (Komiyama et al., 1996). After transfection, cells were cultured for another 48 h before fixing and staining. The transfected obscurin fragments were detected with the mouse monoclonal anti-T7 tag antibody (Novagen). Counterstaining was carried out using titin T12 or myomesin B4 monoclonal antibodies, or rhodamine-phalloidin for F-actin.
We are grateful to Nathalie Bleimling and Evelyne Perriard for superb technical assistance, to Raymond Kelly for assistance with isolation of cDNA clones, and to Roger Goody for critical reading of this manuscript and many helpful discussions. We are most grateful to Jean-Claude Perriard and to the late Matti Saraste for generous support.
This work was supported by the Deutsche Forschungsgemeinschaft, grant Ga405/3-6.
Paul Young's present address is Department of Neurobiology, Duke University Medical Center, Durham, NC 27710.
Abbreviations used in this paper: DH, dbl homology; E, embryo day (gestation days); Fn, fibronectin; GEF, guanine nucleotide exchange factor; PH, pleckstrin homology; PI3, phosphatidyl inositol 3; PiP, phosphatidylinositol-phosphates; SH3, Src homology 3.