Basement membranes contain several proteoglycans, and those bearing heparan sulfate glycosaminoglycans such as perlecan and agrin usually predominate. Most mammalian basement membranes also contain chondroitin sulfate, and a core protein, bamacan, has been partially characterized. We have now obtained cDNA clones encoding the entire bamacan core protein of Mr = 138 kD, which reveal a five domain, head-rod-tail configuration. The head and tail are potentially globular, while the central large rod probably forms coiled-coil structures, with one large central and several very short interruptions. This molecular architecture is novel for an extracellular matrix molecule, but it resembles that of a group of intracellular proteins, including some proposed to stabilize the mitotic chromosome scaffold. We have previously proposed a similar stabilizing role for bamacan in the basement membrane matrix. The protein sequence has low overall homology, apart from very small NH2- and COOH-terminal motifs.
At the junctions between the distal globular domains and the coiled-coil regions lie glycosylation sites, with up to three N-linked oligosaccharides and probably three chondroitin chains. Three other Ser-Gly dipeptides are unfavorable for substitution. Fusion protein antibodies stained basement membranes in a pattern commensurate with bamacan, and they also Western blotted bamacan core protein from rat L2 cell cultures. The antibodies could also specifically immunoprecipitate an in vitro transcription/translation product from a full-length bamacan cDNA. The unusual structure of this proteoglycan is indicative of specific functional roles in basement membrane physiology, commensurate with its distinct expression in development and changes in disease models.
Basement membranes are thin sheetlike structures that form a highly specialized part of the extracellular matrix, located in immediate proximity to the surfaces of epithelial, endothelial, muscle, and peripheral nerve cells. They are important in cell migration, differentiation, and growth, during in vivo development, remodeling, and regeneration. In addition, they may have specialized functions, e.g., renal glomerular basement membrane and neuromuscular junction (for reviews see 35, 49, 53). Basement membranes contain proteins from several distinct families, some of which are characteristic for these structures, such as type IV collagens, laminins, heparan sulfate proteoglycans (HSPGs), and entactin/nidogen (7, 35, 37, 49, 53). The only proteoglycan core proteins from basement membranes that have been sequenced to date are perlecan and agrin (7, 37, 51), and the major glycosaminoglycan type in both cases is heparan sulfate. However, we have also provided evidence for the presence of chondroitin sulfate glycosaminoglycan in basement membranes. Chondroitin 6-sulfate glycosaminoglycan was demonstrated first with an mAb in immunohistochemical studies (8), and later a chondroitin sulfate proteoglycan (CSPG) with a core protein of 140–150 kD was purified from rat Reichert's membranes (21). Subsequently, mAbs against the core protein of a basement membrane CSPG (BM-CSPG) revealed that nearly all basement membranes in the adult rat contain BM-CSPG (31, 32, 33). However, in adult rat kidney, BM-CSPG was distinctly localized to mesangium and Bowman's capsule basement membrane, but absent from the glomerular basement membrane (33). Immunological evidence also indicated that BM-CSPG was unrelated to perlecan and appeared to be a distinct basement membrane component (11). Developmental studies also revealed differences in distribution, and perhaps expression, between perlecan and BM-CSPG (7, 9, 33, 35, 45).
In several organ systems, BM-CSPG is expressed late in organogenesis, suggesting a role in basement membrane stability. Consistent with this hypothesis, changes in its distribution are also seen in models of disease where basement membrane stability is compromised (7, 13, 34). This is in contrast with other components, consistent with distinct biological function and regulation of this proteoglycan.
In this report the complete cDNA-derived primary structure of the BM-CSPG core protein is described. This multidomain core protein, which has an estimated molecular mass of 138 kD, contains two domains predicted to form coiled-coil structures, not seen in other proteoglycans. Moreover, it is unlike any other extracellular matrix molecule so far described. It does, however, have structural features in common with proteins described in diverse phyla with examples from yeast, nematodes, and vertebrates. In some cases these proteins have roles in stabilizing the chromosomal scaffold at mitosis (SMC proteins; 18). We have adopted the name of bamacan (basement membrane-associated chondroitin proteoglycan) based on the localization and glycosaminoglycan modification of this molecule (11).
Materials And Methods
All reagents were of molecular biology grade and obtained from Sigma Chemical Co. (St. Louis, MO) unless otherwise specified. Radionucleotides [32P]dCTP (3,000 Ci/mmol) and [35S]dATP (1,000 Ci/mmol) were obtained from Amersham Corp. (Arlington Heights, IL). Sepharose and DEAE-Sephacel were from Pharmacia Fine Chemicals (Piscataway, NJ).
Antibodies and cDNA Probes
Monoclonal antibody 2D6 against rat BM-CSPG (32) and polyclonal antibody R63 raised against mouse chondroitin/dermatan sulfate proteoglycans from the Engelbreth-Holm-Swarm tumor matrix have been characterized previously (11). Polyclonal antibodies R665 and R664 were raised against core protein fusion proteins FP4a and FP15a, respectively, generated from fragments (F) of the 5′ regions of cDNA clones 4a and 15a of the core protein in pGEX (Pharmacia Biotech; see Fig. 1). The cDNA probe F15a, corresponding to the 5′ region of clone 15a, was also used in cDNA library screening and Northern analysis.
Library Screening and Identification of Clones
A rat yolk sac carcinoma (L2) cDNA library, random and oligo(dT) primed in Uni-ZAP™ XR Vector (Stratagene, La Jolla, CA), was screened with antibody R63 and by hybridization with appropriate cDNA fragments. Library immunoscreening was according to the Pico Blue protocol. Briefly, 1 × 10^6 plaques were grown in 150-mm plates in LB agar medium at 42°C for 3.5 h before incubation of the plates at 37°C for 3.5 h with nitrocellulose filters (Schleicher & Schuell, Keene, NH) pretreated with 1 mM isopropylthio-β-d-galactoside. Filters were probed with primary antibody R63, which had been preadsorbed with Escherichia coli bacterial lysate and Uni-ZAP™ XR phages, according to published protocols (2), followed by alkaline phosphatase–conjugated goat anti–rabbit IgG (affinity purified 1:3,000 dilution) and color development according to the manufacturer's instructions (Bio Rad Laboratories, Richmond, CA). Bluescript SK− plasmids were obtained from rescreened and purified positive plaques by in vivo excision according to the Stratagene protocol.
To extend sequences obtained from immunoscreening in the 5′ direction, fragment F15a (see Fig. 1) from one of the primary clones (15a) was used as a cDNA probe to rescreen the cDNA library. One million plaques were grown and transferred to nitrocellulose filters. These were treated with 50% formamide, 5× Denhardt's solution, 5× SSC, 0.1% SDS, and 100 μg/ml denatured salmon sperm DNA and hybridized (overnight, 42°C) under the same conditions with 1 × 10^6 cpm/ml F15a probe prelabeled with 32P by random primer extension (Promega, Madison, WI). Filters were washed twice in 2× SSC, 0.1% SDS at room temperature for 20 min each, and then twice in 0.2× SSC, 0.1% SDS at 50°C for 20 min before air drying and exposure to x-ray film (X-Omat AR; Eastman Kodak Co., Rochester, NY).
DNA Sequencing and Computer Analysis
Plasmids were sequenced by a modified dideoxynucleotide chain termination method with Sequenase 2.0 (United States Biochemical Corp., Cleveland, OH) using either polylinker primers SK and T7, or synthetic oligonucleotides. Alignment of nucleotide sequences and comparisons were initially analyzed using DNAstar (DNASTAR, Inc., Madison, WI). Databank searching was carried out using the FASTA program in the GCG package (Genetics Computer Group, Madison, WI). Predictions for coiled-coil structure were generated by STRIPE for Macintosh version 1.2 (29).
mRNA Isolation and Northern Analysis
Total RNA was prepared from confluent cultures of L2 cells and rat tissues by the Ultraspec II total RNA isolation kit (BIOTEX, Houston, TX) according to manufacturer's methods. For Northern analysis, 30 μg of total RNA was electrophoresed on a 0.7% agarose formaldehyde gel, transferred to nitrocellulose filters, and probed with random primed cDNA probe F15a, followed by autoradiography. In rat tissue blots, the same filters were hybridized with a cyclophilin cDNA probe for control purposes.
Genomic Southern Blot Analysis
Rat genomic DNA was obtained from Clontech (Palo Alto, CA). 1 μg of DNA was digested with either EcoRI, BamHI, BglI, BglII, PstI, or EcoRV, electrophoresed in a 0.7% agarose gel, and transferred to a Hybond N+ membrane. The membranes were hybridized with a 32P-labeled probe that contained the full-length cDNA sequence of bamacan, a 584-bp segment of the 5′ end of the T2 clone, or a 1.4-kb 3′ fragment of clone 12b. Prehybridization and hybridization were done at 42°C in the presence of 50% formamide, 5× SSC, 5× Denhardt's solution, 1% SDS, and 100 μg/ml salmon sperm DNA, followed by washes and autoradiography as described above.
Fusion Proteins, Antisera, and In Vitro Transcription/Translation
The cDNA fragments F4a or F15a (see Fig. 1) were excised from clone 4a or 15a by digestion with EcoRI and XhoI, and then cloned directly into pGEX 5X1 or 5X2, respectively (Pharmacia Biotech). The resulting constructs encoded fusion proteins of Mr 60K (FP4a) and 55K (FP15a). These constructs were transformed into E. coli XL1-Blue and fusion proteins induced by growth in the presence of isopropylthio-β-d-galactoside. FP4a was insoluble and purified by sonication in PBS, centrifugation to remove soluble bacterial elements, resuspension of the pellet with 8 M urea, and dialysis to 10 mM Tris-HCl, pH 7.4. This resulted in a partially soluble fusion protein, which was affinity purified on glutathione-agarose beads. FP15a was soluble, directly affinity purified after bacterial lysis by sonication, eluted with 10 mM glutathione, and dialyzed to PBS. Rabbits were immunized with FP4a or FP15a to produce R665 and R664, respectively. R664 antibodies were affinity purified on FP15a coupled to CNBr-activated Sepharose 4B (Pharmacia Fine Chemicals), eluted with 3.4 M MgCl2, and dialyzed against PBS. In some cases the antiserum was preadsorbed with glutathione-S-transferase.
The overlapping clones T2 and 15a (see Fig. 1) were used to construct full-length bamacan cDNA. The 5′ portion of T2 was excised by EcoRI and BsmI and inserted into clone 15a plasmids treated with the same enzymes. The full-length construct was then excised from pBluescript SK− plasmid and cloned into pcDNA3 vector (Invitrogen, San Diego, CA) using EcoRI and ApaI restriction sites.
In vitro transcription and translation were performed with a plasmid containing full-length cDNA of bamacan, using the TNT-coupled reticulocyte lysate system (Promega) according to the manufacturer's instructions. Briefly, 2 μg purified plasmid was transcribed and translated in the presence of [35S]methionine (translation grade, 110 mCi/ml; DuPont-New England Nuclear, Wilmington, DE). Products, including negative and positive (luciferase) controls, were resolved on 3–15% SDS-PAGE and fluorography. Some samples were immunoprecipitated with R664 polyclonal antibody, using the appropriate preimmune serum as a control (R664/1). Antibody–antigen complexes were precipitated with protein A–Sepharose 4B (Pharmacia Fine Chemicals), washed with PBS, and resolved on SDS-PAGE followed by fluorography.
A further fusion protein was expressed, encompassing the NH2-terminal portion of bamacan domain V, to ascertain whether one or both serine residues 1074 and 1081 could be glycanated. The forward primer 5′-ATGAGAATTCCCAGTTAACCTTCAAACA-3′ and reverse primer 5′-CTGAGTCGACTAAATGAGAGCAAGGGCTACCA-3′ were synthesized. The former included a 5′ EcoRI site, the latter a stop codon and a SalI cleavage site. In a PCR with the full-length bamacan cDNA as a template, a 320-bp product corresponding to amino acids 1,029–1,128 was synthesized. It was purified and reconstructed into the eukaryotic expression vector pRK5F10–protein A (54) at an EcoRI restriction site. Transient transfection of COS-7 cells (2 × 10^5 cells/well in 12-well plates) in DME with 2 μg/well plasmid was carried out by the DEAE-dextran method. Control plasmids included empty pRK5F10–protein A vector and pcDNA3 (Invitrogen). Cells were then labeled with 50 μCi/ml Trans 35S-label, a mixture of [35S]methionine and [35S]cysteine (>1,000 Ci/mmol, 51006; ICN Biomedicals, Inc., Irvine, CA) in DME containing 10% FBS. 1 ml of conditioned medium was mixed with 10 μl of 1 M Tris-HCl, pH 7.5, containing 5% Triton X-100 and 2% sodium azide, and clarified by centrifugation at 10,000 g for 5 min. 20 μl IgG-agarose beads (Sigma Chemical Co.) were added to the supernatant and mixed overnight at 4°C. The beads were extensively washed with 50 mM Tris-HCl, pH 8.0, 0.15 M NaCl, and 0.02% sodium azide, and some samples were further washed in chondroitinase or heparitinase buffers (11) containing 1 μg/ml pepstatin A, 1 μg/ml leupeptin, and 1 mM PMSF. These were digested with 10 mU chondroitinase ABC (EC 4.2.2.4) or 20 mU heparinase III (EC 4.2.2.8; Seikagaku America, Ijamsville, MD) for 3 h or overnight at 37°C, respectively. Controls were incubated without enzyme under identical conditions. Samples were treated with 2% SDS sample buffer for 5–15% gradient SDS-PAGE and autoradiography as described previously (11).
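The primer design described above can be checked with a short script. This is only an illustrative sketch: the primer strings are copied from the text, the recognition sequences GAATTC (EcoRI) and GTCGAC (SalI) are the standard ones for those enzymes, and the helper name `site_positions` is hypothetical.

```python
# Primer sequences as given in the text (5'->3'), written without spaces.
FORWARD = "ATGAGAATTCCCAGTTAACCTTCAAACA"
REVERSE = "CTGAGTCGACTAAATGAGAGCAAGGGCTACCA"

# Standard recognition sequences of the two enzymes named in the text.
SITES = {"EcoRI": "GAATTC", "SalI": "GTCGAC"}


def site_positions(primer, site):
    """Return the 0-based start index of every occurrence of site in primer."""
    return [i for i in range(len(primer) - len(site) + 1)
            if primer[i:i + len(site)] == site]


ecori_in_forward = site_positions(FORWARD, SITES["EcoRI"])
sali_in_reverse = site_positions(REVERSE, SITES["SalI"])

# Coding length of residues 1,029-1,128 inclusive, before primer tails.
coding_bp = (1128 - 1029 + 1) * 3

print("EcoRI in forward primer at:", ecori_in_forward)
print("SalI in reverse primer at:", sali_in_reverse)
print("Coding segment length (bp):", coding_bp)
```

Each primer carries its restriction site a few bases in from the 5′ end. The 100 encoded residues account for 300 bp, so the remainder of the reported 320-bp product would come from the non-templated primer tails.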
Other transiently transfected cultures were labeled with 50 μCi/ml d-[6-3H]glucosamine HCl (40 Ci/mmol; DuPont-New England Nuclear) in low glucose medium for 48 h, and the medium and cell layer were harvested separately for fusion protein purification as described above and previously (3). The beads were then subjected to β-elimination, followed by glycosaminoglycan purification on DEAE-Sephacel beads (Pharmacia Biotech). The radiolabeled glycosaminoglycans were then subjected to chondroitinase ABC or heparinase III digestions as before (3). Empty vector controls, as above, were also used.
Indirect immunofluorescence microscopy was performed as described previously (31). Sections of frozen rat kidney or paraffin-embedded human skin were incubated with affinity-purified R664 serum at 50 μg/ml, followed by affinity-purified FITC-conjugated goat anti–rabbit IgG F(ab′)2 fragments (Organon Teknika-Cappel, Malvern, PA). Sections were examined and photographed on an Optiphot epifluorescence microscope (Nikon Inc., Garden City, NY).
Proteoglycan Preparation, Electrophoresis, and Immunoblotting
Total proteoglycans from L2 yolk sac cell conditioned medium were prepared as described previously (11), eluted from DEAE-Sephacel with 4 M guanidine-HCl, precipitated in 70% ethanol at −70°C overnight, and resolved by 3–15% SDS-PAGE with or without enzyme treatment. Samples were digested overnight at 37°C in 15 μl heparinase buffer (0.1 M sodium acetate, 0.1 mM calcium acetate, 0.1% Tween 20, pH 7.0), containing 0.5–1 mU heparinase III and/or 1–2 mU chondroitinase ABC. 12 μl SDS-PAGE sample buffer containing 20 mM dithioerythritol was added before heating at 100°C for 3 min and electrophoresis. Resolved proteins were transferred to nitrocellulose membranes and incubated with primary antibodies, followed by goat anti–rabbit IgG conjugated to alkaline phosphatase as before (10, 11).
Results
Isolation of cDNA Clones
The L2 rat yolk sac carcinoma cDNA library was screened with antiserum R63, raised against murine basement membrane chondroitin/dermatan sulfate proteoglycans, which has activity against the core proteins of both BM-CSPG and perlecan (11). 12 positive clones were isolated, purified by rescreening, and sequenced. Three cDNA clones showed high homology (>90%) to mouse perlecan cDNA (36, 38). The remaining nine clones had overlapping sequences, which encoded a novel protein sequence and contained a long open reading frame and a stop codon followed by a 480-bp 3′ untranslated region terminating with a poly(A) tail. A 5′ fragment (F15a) from the cDNA clone 15a generated two additional positive clones when used to rescreen the same cDNA library. One, clone T2, contained both the 5′ untranslated region and 3′ regions that overlapped with all of the original clones (Fig. 1). Thus, 11 overlapping cDNA clones yielded a continuous complete cDNA sequence of this protein (Fig. 2). Bamacan cDNA sequence contains 4,143 bases, with a single long open reading frame of 1,191 amino acids, predicting a molecular mass of 138 kD (Fig. 2). The 3′ untranslated region of 480 bases extended to a poly(A) tail, preceded by a polyadenylation signal. Some clones, such as 12a, 13b, and 15b, had a shorter (375 bp) 3′ untranslated region, linked to a poly(A) tail. The initiator methionine lies in a nucleotide sequence homologous to Kozak consensus sequences (27) and could initiate translation when a complete cDNA was subjected to in vitro transcription/translation (see below).
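The lengths quoted above can be cross-checked with simple arithmetic. The sketch below assumes that the 4,143-base count and the 480-base 3′ untranslated region both exclude the poly(A) tail, and it uses a rule-of-thumb average residue mass of 110 Da. Neither assumption comes from the text, so the implied 5′ untranslated length and the rough mass are only indicative.

```python
# Values stated in the text.
CDNA_LENGTH_BP = 4143
ORF_CODONS = 1191          # amino acids in the open reading frame
UTR3_BP = 480              # longer 3' untranslated region

# Assumptions for illustration only (not from the text).
STOP_CODON_BP = 3
AVG_RESIDUE_MASS_DA = 110  # common rule-of-thumb average residue mass

coding_bp = ORF_CODONS * 3
# Whatever is left after the coding region, stop codon and 3' UTR
# would be the 5' untranslated region under the stated assumptions.
utr5_bp = CDNA_LENGTH_BP - coding_bp - STOP_CODON_BP - UTR3_BP
approx_mass_kd = ORF_CODONS * AVG_RESIDUE_MASS_DA / 1000

print("coding region (bp):", coding_bp)
print("implied 5' UTR (bp):", utr5_bp)
print("rule-of-thumb mass (kD):", round(approx_mass_kd, 1))
```

The rule-of-thumb estimate falls somewhat below the sequence-derived 138 kD, as expected for an approximation that ignores the actual amino acid composition.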
Domain Structure of the Proteoglycan Core Protein
The predicted core protein sequence can be divided into five domains (I–V; NH2 to COOH terminus). The first 165 amino acids comprise domain I, lacking cysteine and being largely hydrophilic, with many β turns predicted by Chou-Fasman rules (4), perhaps indicative of globular structure. One sequence NGSG at the NH2 terminus is potentially glycosylated but does not lie in an acidic context shown to be optimal for glycanation (54). Domain II, comprising 335 amino acids, has a very high potential for coiled-coil structure (Fig. 3) with one small interruption between amino acids 216 and 223, and a larger one between amino acids 349 and 394. Consistent with a coiled-coil structure, proline residues are absent, except at residue 349 marking the initiation of an interruption. Each break also contains a single tryptophan residue, which is otherwise absent from domain II. Adjacent to the first predicted interruption is one potential N-glycosylation site, followed 14 residues downstream by a Ser-Gly dipeptide, which is a possible glycanation site since it is surrounded by acidic residues (54).
Domain III lies between two coiled-coil regions and comprises 165 amino acids with no apparent glycosylation sites. Four cysteine residues are present in this domain: one pair approximately one third from the NH2 terminus of this domain, and the other pair close to its COOH terminus. Between the first and third cysteine residues are six proline residues, consistent with some flexibility in this region. There is a VTXG sequence at residues 552–555, which may be a cell adhesion site homologous to that of thrombospondin 1 (50).
Domain IV consists of a second predicted coiled-coil region (Fig. 3) of very similar size to domain II (364 amino acids), starting at residue 665 and terminating at residue 1,029. There are again two internal interruptions in the putative coiled-coil structure (residues 737–756 and 947–961), which are both small and contain a single proline residue in an otherwise proline-depleted sequence. The second break in the coiled-coil structure is also bounded on each side by a single cysteine residue. Two potential N-glycosylation sites are present close to the junction of domains IV and V. An LRE sequence is also present (at residues 850–852), and this has previously been shown to be a cell adhesion site in the laminin β2 chain (S-laminin; 20). The COOH-terminal domain V contains 162 amino acids and four Ser-Gly dipeptides that are possible sites of glycosaminoglycan substitution. The most NH2-terminal two Ser-Gly dipeptides, in particular, have associated acidic residues and are proximal to each other. The fourth is very close to the COOH terminus, while the third lies at the beginning of a putative helix-loop-helix structure.
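The five-domain layout can be summarized as a small table built from the domain lengths quoted in the text. The sketch below tiles the domains end to end from residue 1, which is an assumption made for illustration. The resulting boundaries may differ by a residue from those given in the text, which places the start of domain IV at residue 665.

```python
# Domain lengths (amino acids) as quoted in the text, NH2 to COOH terminus.
DOMAIN_LENGTHS = [("I", 165), ("II", 335), ("III", 165), ("IV", 364), ("V", 162)]
COILED_COIL_DOMAINS = {"II", "IV"}


def domain_bounds(lengths):
    """Tile domains end to end; return (name, first, last) with 1-based residues."""
    bounds, start = [], 1
    for name, n in lengths:
        bounds.append((name, start, start + n - 1))
        start += n
    return bounds


bounds = domain_bounds(DOMAIN_LENGTHS)
total = sum(n for _, n in DOMAIN_LENGTHS)
coiled = sum(n for name, n in DOMAIN_LENGTHS if name in COILED_COIL_DOMAINS)
coiled_fraction = coiled / total

for name, first, last in bounds:
    print(f"domain {name}: residues {first}-{last}")
print(f"total residues: {total}")
print(f"predicted coiled-coil fraction: {coiled_fraction:.1%}")
```

The five lengths sum to the full open reading frame, and domains II and IV together make up more than half of the sequence.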
Proteins with Homology to the Bamacan Core Protein
A GenBank database search revealed no other proteoglycan core proteins with this structure. However, several other proteins, from yeast (40, 44, 47), Caenorhabditis elegans (5), and vertebrates (Xenopus, chick, and human; 17, 42, 43), have overall homology (Fig. 4). Each is predicted to have a similar five domain structure, with two internal coiled-coil domains, as described above (for review see 18). However, the overall sequence homology of bamacan with these other proteins, even of vertebrate origin, is low (Fig. 4). Some of these proteins have been proposed to be involved in stabilization of mitotic chromosomes (SMC) and are therefore intracellular. Bamacan, however, is an extracellular, secreted proteoglycan (see below). Despite high homology between chick and amphibian proteins ScII and XCAP-E (67% amino acid identity), that between the chick and the bamacan sequence shown here is very low (<20%). The human SB1.8 gene (42) is homologous with yeast Smc1 protein since there is 30% amino acid identity throughout the five domains, with up to 50% identity in regions of domains I and V. However, the SB1.8 gene, despite being mammalian, has only a 17% identity to bamacan. This is indicative of different functional attributes and is underscored by dendrogram analysis, showing the distant relationship between bamacan and the SMC proteins including the SB1.8 gene product (Fig. 4).
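Percent identity figures such as those quoted above depend on how aligned positions are counted. The following minimal sketch shows one common convention, identical residues divided by aligned positions where neither sequence has a gap. The function name and the toy aligned sequences are hypothetical, and the text does not state which convention the alignment software used.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identical residues over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip gapped columns
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Toy pre-aligned sequences (hypothetical, for illustration only).
toy_a = "MKV-LSE"
toy_b = "MRVALTE"
toy_identity = percent_identity(toy_a, toy_b)
print(f"{toy_identity:.1f}% identity")
```

Other conventions divide by the length of the shorter sequence or by the full alignment length including gaps, which gives lower values for the same alignment.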
Two small regions of all these proteins (one each in domains I and V) have a conserved region with higher sequence homology to bamacan (Fig. 5). At the NH2 terminus in domain I, a sequence NGSGKSN is completely conserved, together with some additional flanking sequence. This structure is homologous to a P-loop and, perhaps, part of an NTP-binding site (GXXXXGKS), but other requirements are not met in this or some of the other related proteins (5, 18, 43).
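The P-loop consensus quoted above (GXXXXGKS) can be written as a regular expression. The sketch below is illustrative only. The conserved heptapeptide NGSGKSN is taken from the text, but the two upstream flanking residues in `EXAMPLE` are hypothetical placeholders, because the consensus needs a glycine upstream of the heptapeptide and the text does not give the flanking sequence.

```python
import re

# GXXXXGKS consensus from the text, with X as any residue.
WALKER_A = re.compile(r"G.{4}GKS")

CONSERVED = "NGSGKSN"        # completely conserved sequence (from the text)
EXAMPLE = "GP" + CONSERVED   # "GP" is a hypothetical upstream flank

match_alone = WALKER_A.search(CONSERVED)
match_with_flank = WALKER_A.search(EXAMPLE)

print("heptapeptide alone matches:", match_alone is not None)
print("with hypothetical flank matches:", match_with_flank is not None)
```

The heptapeptide supplies the GKS end of the consensus, so whether the full pattern is satisfied depends on the flanking residues.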
Domain V contains a 35–amino acid region that, in yeast Smc1 protein and possibly the other proteins, apparently contributes to a helix-loop-helix domain. However, none of these proteins has a preceding basic region, common in DNA-binding helix-loop-helix motifs (18). In bamacan, an acidic region containing potential glycosaminoglycan substitution sites lies upstream. Most striking among all these proteins is the conservation of the initial SGG, as well as the subsequent acidic, some hydrophobic, and two proline residues. Alone among the vertebrate sequences, bamacan possesses a single cysteine residue in the center of this motif, in place of tyrosine or phenylalanine. Also of note, the yeast Smc1 protein has a nuclear localization sequence within domain V that is not shared by the bamacan or other sequences.
Northern and Genomic DNA Analyses
When total RNA isolated from L2 cells was subjected to Northern analysis, one band, migrating faster than the ribosomal 28S, was detected when probed with a full-length, random-primed bamacan construct. The estimated size was 4.2 kb (Fig. 6 A), which corresponded to the size of the cDNA sequenced from the L2 cell library. mRNA of a very similar size was detected in Northern blots of numerous rat tissues (Fig. 6 B). All these tissues contain basement membranes and expressed bamacan mRNA in varying amounts. A Southern blot of rat genomic DNA cut with restriction enzymes for which no internal sites in the cDNA were present and probed with a 1.4-kb probe from the 3′ end of the cDNA revealed in each case two major bands (Fig. 7). A similar result was obtained in Southern blots probed with either full-length cDNA or the most 5′ region (584 bp). The results indicate that in all probability there is a single gene for bamacan.
Reactivity of Fusion Protein Antibodies
Two antisera were prepared. Antiserum R665 generated from FP4a fusion protein could detect the bamacan core protein in immunoblots but did not recognize the native protein in immunofluorescence microscopy of rat tissue sections. In contrast, affinity-purified R664 antiserum against FP15a fusion protein recognized both native and denatured protein. Both R665 and R664 antisera recognized one major band, representing the core protein of a CSPG in Western blots of L2 proteoglycan preparations (Fig. 8). This core protein had Mr of ∼190K and was seen only after chondroitinase ABC treatment, not after heparinase III treatment. Intact proteoglycans with chondroitin chains attached did not enter the resolving gel. These antisera also detected a smaller CSPG core protein of Mr ∼130K after chondroitinase ABC treatment, possibly resulting from degradation of the 190-kD protein (11). When the antiserum R63, which was used to perform primary screening of the cDNA expression library, was used as a positive control in immunoblotting, the same sizes of CSPG core protein were detected (11) in addition to perlecan. In contrast, the fusion protein antisera did not react with rat perlecan.
When sections of adult rat kidney and human skin were stained with affinity-purified R664, only basement membranes were labeled (Figs. 9 and 10). R664 specifically labeled the dermo-epidermal junction and the basement membranes of blood vessels and nerves. In kidney, tubular basement membranes were labeled, together with the mesangium and Bowman's capsule basement membranes of glomeruli, while the glomerular basement membrane was negative for this antigen (Fig. 10 B). The specific distributions were closely similar to those seen with the initial screening antibody R63 and identical to that of BM-CSPG (now termed bamacan), described earlier with monoclonal core protein-specific antibodies (31, 33).
In Vitro Transcription/Translation of Bamacan cDNA
A full-length construct, prepared from two overlapping clones, was subjected to in vitro transcription and translation in a TNT-coupled reticulocyte lysate system. The cDNA was efficiently transcribed and translated to reveal a single 140-kD polypeptide as anticipated from the predicted sequence (Fig. 11). The polypeptide was immunoprecipitated by antiserum R664 raised against domain II fusion protein, but not by the corresponding preimmune serum.
Chondroitin Sulfate Substitution of Bamacan Domain V
At residues 1074/5 and 1081/2 are Ser-Gly dipeptides potentially suitable for glycanation. A PCR product was generated, corresponding to the first 99 amino acids of domain V (residues 1,029–1,128), and ligated in-frame with DNA encoding the IgG-binding portion of protein A in the pRK5F10 vector. Transfection into COS-7 cells followed by metabolic labeling with 35S–amino acids gave rise to a product that was resolved as a broad smear on SDS-PAGE in untreated or heparinase III–treated samples, but as a discrete product after chondroitinase ABC (Fig. 12). This is consistent with galactosaminoglycan substitution of the fusion protein. Negative controls of pRK5F10 or pcDNA3 vector did not yield patterns consistent with proteoglycan synthesis.
In further experiments (not shown), the same transfected cells were grown in the presence of [3H]glucosamine. Only the vector containing the NH2-terminal region of domain V yielded a labeled product binding to IgG-agarose and sensitive to β-elimination and chondroitinase ABC. Heparinase III failed to degrade this metabolically labeled product. Overall, the results show that the Ser-Gly dipeptides at the NH2-terminal region of bamacan domain V are favorable for chondroitin/dermatan sulfate substitution.
Discussion
This report describes the cDNA cloning of a novel basement membrane proteoglycan core protein. While this CSPG was described some years ago (21, 31, 32), its relationship to other basement membrane components and proteoglycans was unclear. With this description of its cDNA sequence, we can now determine that it is totally distinct both from other known basement membrane components, including the large HSPGs perlecan and agrin, and from any other previously cloned proteoglycan core protein. It does not appear to be a member of any of the currently known extracellular matrix proteoglycan families, such as the aggregating proteoglycans containing aggrecan, versican, neurocan, and brevican (52), or the leucine-rich interstitial proteoglycans such as decorin, lumican, and biglycan (28). Whether this CSPG (bamacan) represents the prototypical form of a novel proteoglycan family remains to be seen, and it is still not clear whether this core protein represents the only basement membrane CSPG. One aspect of similarity of bamacan with neurocan in terms of glycosylation may be functionally relevant. In both cases, chondroitin chains are present near the junctions between both of the NH2- and COOH-terminal globular domains and the central, internal rod segment. In addition, the three potential N-glycosylation sites in bamacan are also in these regions (Fig. 13; reference 41).
The predicted mass of the core protein is ∼138 kD, close to previous estimates of 150–160 kD obtained from chondroitinase ABC–digested Reichert's membrane proteoglycans (21, 32), which would still bear any N- or O-linked oligosaccharides and glycosaminoglycan stem structures. Western blots of L2 and EHS tumor CSPGs revealed a core protein of ∼Mr = 190K after chondroitinase ABC treatment (11), perhaps indicative of more extensive N- or O-linked glycosylation. The potential for three N-linked oligosaccharides is predicted from the putative sequence.
The most striking feature of the core protein is that it contains two long regions (domains II and IV) predicted to form coiled-coil structures that comprise >50% of the protein sequence. These are flanked by NH2- and COOH-terminal noncoiled regions and interrupted by a central cysteine and proline-containing domain III (Fig. 13). This structure has not been seen in any proteoglycan core protein previously documented and is unlike any other basement membrane component. However, it has been demonstrated that laminin isoforms participate in extensive coiled-coil structures, forming heterotrimeric complexes stabilized by disulfide bonds (14). Currently, we do not have information regarding the nature of the coiled-coil interactions undertaken by bamacan core protein, but presumably it may form homo- or heteropolymeric complexes. One feature shared with laminin β2 chain is the presence of an LRE sequence in domain IV. This has been shown to be a cell attachment site unique to this laminin chain, but whether the site in bamacan can fulfill the same function is unknown. Certainly the structural context is distinct, and the surrounding sequences are dissimilar (20). The only other recognizable potential cell attachment site is a VTXG motif in domain III, shown previously to be active in thrombospondin 1 (50). There are no RGD-containing sequences in this core protein.
Database searching reveals that, while there are no extracellular proteins with very high homology to bamacan, related molecules have been identified in diverse organisms from yeast to vertebrates. Some are nuclear proteins proposed to take part in stabilizing chromosome scaffolds at mitosis (SMC proteins; 18). Neither fusion protein antibodies nor the original R63 antibody used in library screening had the capacity to stain mitotic figures in colcemid-treated fibroblasts, where some SMC proteins such as avian ScII become localized (17, 40, 43, 44, 47). Antibodies against the chick ScII protein did stain these structures (Woods, A., R.R. Wu, and J.R. Couchman, unpublished observations). In overall terms, the sequence homology between bamacan core protein and the SMC proteins is low (Figs. 4 and 5). However, a striking similarity in predicted protein structure is apparent, each having the same five domain (head-rod-tail) pattern, with each domain comparable with its homologues in terms of size and predicted secondary structure. Each protein, therefore, like bamacan, has two similarly sized domains (II and IV) predicted to form coiled-coil structures. However, while the homology between the other proteins from distant species can be high (Fig. 4), that with bamacan is low, even with the human SB1.8 protein. This is indicative of distinct function and supported by the dendrogram showing a wide separation between bamacan and vertebrate SMC proteins (Fig. 4). In contrast, our preliminary sequence data for human bamacan cDNA indicate >95% identity with the rat core protein, further suggesting wide functional and evolutionary separation of SMC proteins and the proteoglycan, yet highly conserved bamacan domains, including the coiled-coil regions. A recent review of SMC proteins suggested three separate subfamilies, based on sequence and structural data (18). We propose that bamacan belongs to a fourth distantly related but structurally homologous subfamily, having distinct roles and interactions.
Bamacan has many distinct features not shared with the other homologous proteins. Its core protein has three potential N-glycosylation sites and six Ser-Gly dipeptides. However, not all of these Ser-Gly motifs are in an acidic environment (54), which indicates that not all of them may be substituted with chondroitin sulfate chains. Two are within motifs shared by all members of this group, but the three most favorable for substitution (in domains II and V; Fig. 13) are not shared with the other proteins. Indeed, two closely spaced SG dipeptides in domain V within the sequence VEGSQSQDEGEGSGESERGSGSQSSVP lie in a region totally lacking homology with all other related proteins, and it is perhaps an inserted region substituted with chondroitin sulfate chains. This region was expressed in COS-7 cells as a fusion protein with the Ig-binding domain of protein A. It was recovered from supernatants in a glycanated form, sensitive to chondroitinase ABC but not heparinase III (Fig. 12). This further confirms the proteoglycan nature of bamacan. Perlecan has three closely spaced SG dipeptides at its NH2 terminus, again lacking in homology with other molecules, seemingly indicative of a requirement for unique protein sequences that bear glycosaminoglycans (6, 22, 37). The sequence GEGSGE found within this unique region of bamacan domain V is also present in the glycosaminoglycan attachment (V1) region of the aggregating CSPG PG-M (24). It is also present in the hybrid transmembrane proteoglycan, syndecan 1 (30, 46), where again it is believed to be substituted with chondroitin sulfate (26). The other potential chondroitin sulfate substitution site in syndecan 1 with the sequence ETSGE is, remarkably, also present in the bamacan core protein, at residues 247–251 at the NH2-terminal region of domain II, and is favorable for glycosaminoglycan substitution. However, despite these very small sequence identities, bamacan is not at all similar to syndecan 1 in any other respect.
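As an illustration only, the Ser-Gly dipeptides in the domain V fragment quoted above can be located with a short script. This is a minimal sketch and not the analysis used in this study: the function name `ser_gly_sites`, the four-residue flanking window, and the simple count of Asp/Glu residues are hypothetical choices made for the example, and they stand in only loosely for the acidic-environment criterion cited above (54).

```python
import re

# Domain V fragment exactly as quoted in the text.
FRAGMENT = "VEGSQSQDEGEGSGESERGSGSQSSVP"

# Acidic residues: aspartate (D) and glutamate (E).
ACIDIC = set("DE")


def ser_gly_sites(seq, window=4):
    """List each Ser-Gly dipeptide with a crude measure of local acidity.

    Returns tuples of (serine_position, context, acidic_count), where
    serine_position is 1-based within seq, context is the dipeptide plus
    up to `window` residues on each side, and acidic_count is the number
    of D/E residues in those flanks. The window size is an arbitrary,
    illustrative assumption.
    """
    sites = []
    # The lookahead finds every "SG" start, including overlapping ones.
    for match in re.finditer(r"(?=SG)", seq):
        i = match.start()
        left = max(0, i - window)
        right = min(len(seq), i + 2 + window)
        flanks = seq[left:i] + seq[i + 2:right]
        acidic_count = sum(1 for residue in flanks if residue in ACIDIC)
        sites.append((i + 1, seq[left:right], acidic_count))
    return sites


if __name__ == "__main__":
    for position, context, acidic_count in ser_gly_sites(FRAGMENT):
        print(position, context, acidic_count)
    # The GEGSGE motif discussed in the text lies within this fragment.
    print("GEGSGE" in FRAGMENT)
```

Positions reported by the script are relative to the quoted fragment, not to the full-length core protein. A count of flanking acidic residues is a rough proxy at best and should not be read as a prediction of which sites carry chondroitin sulfate chains.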
The yeast Smc1 protein and others contain an incomplete dNTP-binding site, some elements of which were conserved in the proteoglycan core. One other site was similarly conserved, where part of domain V potentially forms a helix-loop-helix structure. This motif is commonly associated with DNA binding, which is unlikely in the case of the basement membrane proteoglycan. An upstream cationic region found in DNA-binding structures is absent in bamacan. Helix-loop-helix structures are also found in the matrix glycoprotein SPARC and proteoglycan testican (1, 19), where they form Ca++-binding EF-hand motifs also common to many intracellular proteins. The bamacan sequence, like those of the SMC proteins, does not appear to form EF-hand structures.
That this is, indeed, a basement membrane core protein was verified in several ways, in addition to verification of the suitability of domain V to be chondroitin sulfate substituted. Affinity-purified fusion protein antibodies recognized the appropriate CSPG core protein prepared from cultures of rat L2 yolk sac carcinoma cells, and stained rat and human tissue sections in a basement membrane–specific manner. Especially noteworthy was the absence of staining in rat glomerular capillary basement membranes, a hallmark of BM-CSPG distribution (31, 33). In all respects, the staining obtained with the fusion protein antibodies matched that described before with core protein–specific mAbs. This pattern is therefore distinct from the structurally related SMC proteins, which have an intracellular localization (18, 40). When a full-length bamacan core protein cDNA construct was subjected to in vitro transcription/translation, it yielded a product of Mr ∼140K, which could be immunoprecipitated by fusion protein antibodies.
Many proteins, in the course of evolution, have acquired novel functions associated with new locations. The leucine-rich repeat family of proteins, for example, includes interstitial matrix proteoglycans, as well as cytoplasmic, cell surface, or nuclear proteins (for review see 25). In each of these examples, the leucine-rich repeat structure is integral to the protein function, and all take part in protein–protein interactions. More than one structural motif is conserved between the SMC proteins and bamacan core protein, indicating that the entire molecule is required for structural and/or functional activity, possibly determining homo- or heteropolymerization (18). It has been proposed that, in Xenopus, head-to-tail heterodimers of XCAP-C and XCAP-E assemble, in which case the overall structure of the five domains may be required (17). This would also be consistent with the overall symmetry of the molecule, the two coiled-coil domains being of approximately equal size. However, as was recently pointed out, antiparallel association has not been described in proteins with long coiled-coil domains (18). In the case of yeast Smc2 protein, it has been shown that homotypic and heterotypic association with other SMC proteins may form (48). It will be interesting to determine the structural relationships of the bamacan core protein.
We have hypothesized previously that bamacan may act as a stabilizing factor for the basement membrane matrix (7). This was based on indirect evidence arising from its localization through development of such organs as skin and kidney (9, 33), and from the fact that, in rat models of polycystic kidney disease and streptozotocin-induced diabetes, the distribution of this basement membrane component undergoes marked changes (13, 34). In addition, we and others have previously shown that chondroitin sulfate is abnormally diminished or absent in basement membranes of the dermal-epidermal junction in patients with dystrophic forms of epidermolysis bullosa, as well as in some epidermal tumors (15, 39). Such conditions are known to be associated with basement membrane fragility. Analogous to the structural role proposed for the SMC proteins, the overall structure of this core protein, with its potentially rigid coiled-coil domains, is consistent with a role in basement membrane integrity. Domains I and V, in addition to potentially interacting with each other, may undergo heterotypic interactions to incorporate the proteoglycan in the matrix, while other interactions may involve the glycosaminoglycan chains. This proteoglycan core protein is quite distinct from perlecan, but it is nevertheless not the only basement membrane proteoglycan core protein that may bear galactosaminoglycan chains. Perlecan itself is usually an HSPG, but it can bear additional or alternate dermatan sulfate chains (11, 12, 16, 23). The significance of this variable glycanation is not understood but presumably impacts its interactions and functional attributes. Bearing in mind the quite distinctive structural features of bamacan core protein, it is likely that it has a quite distinct and separable function, which will be the subject of future research now that the core protein structure has been more fully elucidated.
We thank Ms. Wen Chen Peng for help with some constructs and transcription/translation experiments, and Dr. Anne Woods for critical help with the manuscript. We also thank Dr. W.C. Earnshaw (Johns Hopkins School of Medicine, Baltimore, MD) for helpful discussions and a gift of antibodies against the ScII protein, and Dr. J.D. Esko (University of Alabama at Birmingham) for the gift of the pRK5F10 vector and much helpful discussion. Grateful acknowledgment is also made to Dr. Kevin J. McCarthy (University of Alabama at Birmingham) for L2 cell cultures and the original suggestion for the name bamacan. We also thank Cynthia Webster for secretarial help in manuscript preparation.
This work was supported in part by National Institutes of Health grant AR36457 to J.R. Couchman and a fellowship from the Helen Keller Eye Research Foundation to R-R. Wu.
Address all correspondence to John R. Couchman, Department of Cell Biology, University of Alabama at Birmingham, 1670 University Boulevard, Volker Hall 201, University Station, Birmingham, AL 35294-0019. Tel.: (205) 934-2626. Fax: (205) 975-9956. e-mail: email@example.com