We have developed a new technique for proximity-dependent labeling of proteins in eukaryotic cells. Named BioID for proximity-dependent biotin identification, this approach is based on fusion of a promiscuous Escherichia coli biotin protein ligase to a targeting protein. BioID features proximity-dependent biotinylation of proteins that are near-neighbors of the fusion protein. Biotinylated proteins may be isolated by affinity capture and identified by mass spectrometry. We apply BioID to lamin-A (LaA), a well-characterized intermediate filament protein that is a constituent of the nuclear lamina, an important structural element of the nuclear envelope (NE). We identify multiple proteins that associate with and/or are proximate to LaA in vivo. The most abundant of these include known interactors of LaA that are localized to the NE, as well as a new NE-associated protein named SLAP75. Our results suggest BioID is a useful and generally applicable method to screen for both interacting and neighboring proteins in their native cellular environment.
The elucidation of protein–protein interactions represents a significant barrier to the understanding of complex biological processes. In recent years it has become increasingly clear that the functions of many proteins can only be fully understood in the context of networks of interactions. Furthermore, the description of such networks provides keys to our understanding of disease processes (for an example see Sang et al., 2011). Biochemical and genetic techniques, including affinity-capture complex purification and yeast two-hybrid strategies have provided powerful tools in the search for new molecular associations. However, these methods also display fundamental limitations. For high-throughput genetic approaches, protein interactions are commonly assessed in a cellular environment different to that in which they would normally occur, often lacking the proper machinery for post-translational modifications and the normal complement of associated binding partners, including molecular chaperones. This can lead to incomplete or erroneous datasets. Biochemical approaches suffer loss of candidates through protein insolubility and transient or weak interactions. As a consequence of these limitations many proteins remain refractory to conventional methods used to screen for protein interactions. These issues are more relevant than ever, as we collectively look to the daunting task of unraveling the protein “interactome”.
Here we describe an approach to screen for proximate proteins in a relatively natural cellular environment. We took as our guide the DamID method devised by van Steensel and Henikoff (2000) to detect DNA–protein interactions. DamID takes advantage of the prokaryotic Dam methylase, which is fused to a potential DNA-binding protein. When expressed in eukaryotic cells, the fusion protein will uniquely methylate DNA sequences with which it comes into contact, thereby leaving a chemical trace of its interactions. Our method to identify neighboring and potentially interacting proteins is based on the use of a promiscuous prokaryotic biotin protein ligase. Analogous to DamID, the biotin ligase is fused to a protein of interest, and then introduced into mammalian (or other) cells where it will biotinylate vicinal proteins upon supplementation of the culture medium with biotin. Biotinylated proteins can then be selectively isolated and identified by conventional methods, most notably mass spectrometry. We have applied this strategy, which we call BioID, to identify candidate proteins that are proximate to and/or interact with human lamin A (LaA), a well-characterized component of the nuclear envelope (NE), a specialized extension of the endoplasmic reticulum that surrounds the nuclear contents during interphase.
LaA is an intermediate filament protein and member of the A-type lamin family that is encoded by the LMNA gene (Gerace and Huber, 2012). Together with B-type lamins, the A-type lamins are constituents of the nuclear lamina, a filamentous protein meshwork that is intimately associated with the inner nuclear membrane (INM), the membranous portion of the NE that faces the interior of the nucleus. This association is mediated, at least in part, by multiple interactions with integral INM proteins. In addition, nuclear pore complexes (NPCs), large multi-protein channels that span the nuclear membranes and which mediate nucleocytoplasmic trafficking of macromolecules, are anchored to the nuclear lamina (Aaronson and Blobel, 1975; Dwyer and Blobel, 1976). Although the bulk of the A- and B-type lamins are localized to the nuclear lamina, a nucleoplasmic population is thought to function in various aspects of nuclear metabolism, including transcription and replication (Moir et al., 2000; Goldman et al., 2002).
In mammalian somatic cells, the nuclear lamina is roughly 15–20 nm thick and is considered to represent an important structural element of the NE (Gerace and Huber, 2012). Indeed, the role of the nuclear lamina as a determinant of both NE and global nuclear architecture has been highlighted by findings that mutations in the LMNA gene are linked to multiple human diseases including muscular dystrophy, lipodystrophy, and premature aging syndromes (Worman et al., 2009; Worman, 2012). Many of these disorders, known as laminopathies, are associated with perturbations in nuclear and NE organization that are often gross. To better understand the etiology of the laminopathies, much effort has been focused on identifying lamin-interacting proteins. However, both A- and B-type lamins are highly insoluble and consequently it has proven extremely difficult to define their molecular associations using conventional approaches. For these reasons we felt that LaA represented an ideal candidate with which to evaluate the utility of BioID as a general proximity-based approach to screen for potential protein–protein interactions. At the same time BioID introduces a new strategy with which to further explore LaA function.
Our development and use of BioID to identify LaA-proximal proteins has revealed a number of abundant candidates among which are known interactors of LaA. These include integral proteins of the INM as well as NPC components. Less abundant candidates fall into functional categories that include transcription, chromatin regulation, RNA processing, and DNA repair. An uncharacterized protein was also among the more prominent candidates revealed by LaA BioID. We demonstrate that this protein, which we have named SLAP75, is a novel constituent of the NE that appears to be expressed in a cell type–specific fashion. Taken together, these findings demonstrate that BioID is an effective method to screen for proximate and interacting proteins. This relatively simple and rapid technique has broad applicability to monitor protein behavior in live cells, providing a number of advantages over existing methods.
We sought to generate a method for labeling proteins in a proximity-dependent manner in mammalian cells. With the DamID method as a guide, we envisioned a system based on the fusion of a protein of interest to an enzyme that could selectively modify vicinal proteins in vivo (Fig. 1 a). There are two requirements for such a system. The first and most obvious is that the fusion protein must be targeted appropriately when expressed in cells. The second is that the modification itself must facilitate isolation of the specifically labeled proteins. Because it is relatively uncommon in vivo and amenable to selective isolation, biotinylation was the most obvious modification on which to focus.
BirA is a 35-kD DNA-binding biotin protein ligase in Escherichia coli that regulates the biotinylation of a subunit of acetyl-CoA carboxylase and acts as a transcriptional repressor for the biotin biosynthetic operon (Chapman-Smith and Cronan, 1999). BirA has been harnessed for experimental applications, including use in eukaryotic cells. The BirA acceptor-peptide system takes advantage of the extreme specificity of BirA in biotinylating its substrate peptide (Beckett et al., 1999). With this system, a minimal recognition sequence, a biotin acceptor tag (BAT), is fused to a protein of interest and coexpressed with BirA. This leads to the biotinylation of the BAT sequence, permitting one-step high-affinity (Kd = 10⁻¹⁴ M; Green, 1963) avidin/streptavidin-mediated purification of the tagged protein. Because biotinylation is a rare modification that in mammalian cells is restricted primarily to a few carboxylases (Chapman-Smith and Cronan, 1999), BAT-independent binding is minimal. Biotinylation by BirA is a two-step process. The first of these combines biotin and ATP to form biotinoyl-5′-AMP (bioAMP; Lane et al., 1964). This activated biotin is held within the BirA active site until it reacts with a specific lysine residue of the BAT sequence in the second step. For our purposes, the problem with BirA lies with its stringent selectivity for its endogenous substrate. What we desired was a far more promiscuous biotin ligase. This requirement led us to certain BirA mutants that prematurely release the highly reactive yet labile bioAMP (Kwon and Beckett, 2000; Streaker and Beckett, 2006). One such BirA mutant (R118G, hereafter called BirA*), which is defective in both self-association and DNA binding (Kwon et al., 2000), displays an affinity for bioAMP two orders of magnitude less than that of the wild-type enzyme (BirA-WT; Kwon and Beckett, 2000). In E. coli, BirA* expression results in promiscuous protein biotinylation because free bioAMP will readily react with primary amines. More significantly, however, it has been demonstrated in vitro that BirA* will promiscuously biotinylate proteins in a proximity-dependent fashion (Choi-Rhee et al., 2004; Cronan, 2005).
We explored the possibility that BirA* would promiscuously biotinylate proteins in live mammalian cells. To this end we generated myc epitope-tagged, humanized BirA-WT and BirA* for transient expression in HeLa cells. Western blot analysis using streptavidin-HRP revealed modest levels of biotinylated proteins with BirA* as compared with BirA-WT (Fig. 2 a). Addition of 50 µM biotin to tissue culture medium, however, results in a massive stimulation of promiscuous biotinylation by BirA* but not BirA-WT (Fig. 2 a). By fluorescence microscopy the distribution of biotinylated proteins appears similar to that of myc-BirA* itself, which is predominantly nuclear with a subpopulation found in the cytoplasm (Fig. 2 b). These results indicate that BirA* promiscuously biotinylates proteins in mammalian cells. Furthermore, the level of that biotinylation is primarily regulated by the concentration of available free biotin. In conventional tissue culture media formulations fetal calf serum is the source of biotin. Our results indicate that the concentrations of biotin in standard complete media are insufficient to fuel significant biotinylation by BirA*. This has also been demonstrated for BAT biotinylation by BirA-WT (Nesbeth et al., 2006; Kulman et al., 2007), suggesting that it is not a BirA*-specific phenomenon.
Application of BioID to the nuclear lamina
We next wished to determine whether BirA* could be used as a tool to identify vicinal proteins in vivo. To this end we fused myc-BirA* to the N terminus of human LaA, a well-characterized constituent of the nuclear lamina. During interphase LaA has a relatively restricted distribution within the cell. It is detected for the most part at the NE with a subpopulation found throughout the nucleoplasm (Goldman et al., 2002). To provide consistent and controllable expression levels, we generated HEK293 cells that stably and inducibly express myc-BirA*LaA (Fig. 1 b). In these cells, myc-BirA*LaA localizes predominantly to the nuclear envelope, similar to both endogenous LaA and LaA harboring an N-terminal GFP tag, a modification that does not appear to alter the function of LaA (Broers et al., 1999; Shumaker et al., 2006). Biotinylation of endogenous proteins in cells expressing myc-BirA*LaA, either in the presence or absence of exogenous biotin, was monitored on Western blots probed with streptavidin-HRP. As is the case with myc-BirA* alone, the presence of 50 µM biotin in the culture medium strongly stimulates biotinylation of a wide range of endogenous proteins, in addition to myc-BirA*LaA itself (Fig. 3 a). Microscopy using fluorescent streptavidin reveals that the bulk of these biotinylated proteins reside at the NE and colocalize with myc-BirA*LaA (Fig. 3 b). The implication is that proteins in the vicinity of myc-BirA*LaA are preferentially biotinylated. It should be noted that not only does the intracellular localization of these biotinylated proteins differ between myc-BirA* (predominantly nucleoplasmic) and myc-BirA*LaA (predominantly at the NE), but their electrophoretic mobilities and hence identities also differ as revealed by Western blot analysis (Figs. 2 a and 3 a). These results suggest that BirA* can be targeted to a specific cellular location and will biotinylate endogenous proteins in a proximity-dependent manner.
Temporal regulation of BioID
The requirement for exogenous biotin suggested to us a means to modulate BirA* activity. To explore this further, HEK293 cells expressing myc-BirA*LaA were analyzed by Western blot at various times after addition of 50 µM biotin to their culture medium (Fig. 3 c). Levels of biotinylated proteins increase in parallel with the duration of biotin exposure. This effect reaches saturation within 6 to 24 h with no obvious increase observed at later time points. A similar increase in biotinylation can also be observed by fluorescence microscopy (Fig. 3 d). Both methods reveal a time-dependent accumulation of biotinylated proteins, the majority of which appear to be endogenous and which evidently colocalize with myc-BirA*LaA. These studies indicate that by controlling access to biotin we can temporally regulate biotinylation by BirA*. This opens up the future possibility of performing pulse–chase type experiments using this technique.
Identification of vicinal proteins with BioID-LaA
We next set out to test our hypothesis that proteins biotinylated by myc-BirA*LaA should be enriched with known interactors of LaA as well as with near neighbors within the nuclear lamina and INM, and to a lesser extent within the nucleoplasm. To accomplish this we induced myc-BirA*LaA expression in HEK293 cells with doxycycline in the presence of 50 µM biotin for 24 h and then lysed the cells under stringent denaturing conditions using an SDS-containing buffer (Fig. 1 b). Parental HEK293 cells, processed in parallel, were used as controls. For these experiments 4.0 × 10⁷ cells (four confluent 10-cm dishes) were analyzed. Biotinylated proteins were captured with streptavidin immobilized on paramagnetic beads, rigorously washed, and bound proteins analyzed by mass spectrometry. Proteins unique to the BioID-LaA (myc-BirA*LaA) pull-down (Table S1), and not detected with identical pull-downs from control cells (Table S2), were categorized based on localization and function (Fig. 3 e). The relative abundance of the identified proteins within each category is given as a percentage of the total. The bulk of the proteins identified by BioID-LaA are known NE components, including a number of INM proteins. The most abundant of these are the β and γ isoforms of lamina-associated polypeptide 2 (LAP2, TMPO) and lamina-associated polypeptide 1 (LAP1, TOR1AIP). LAP1 has a documented association with LaA (Foisner and Gerace, 1993), as does LAP2α, a soluble LAP2 isoform that also appeared prominently in the dataset (Dechat et al., 2000). Two other INM proteins, emerin (EMD; Lee et al., 2001) and MAN1 (LEMD3; Mansharamani and Wilson, 2005) identified by BioID-LaA are also known to interact with LaA. An additional INM protein detected in our screen was SAMP1 (TMEM201). Also known as NET5, SAMP1 was originally identified in a proteomic analysis of rat liver nuclear membrane proteins (Schirmer et al., 2003). 
A recent study (Gudise et al., 2011) suggests that SAMP1 is part of a protein network that includes A-type lamins and LINC complexes. The latter are evolutionarily conserved protein assemblies that span the NE and couple nucleoskeletal and cytoskeletal structures (Burke and Roux, 2009).
Twelve proteins associated with nucleocytoplasmic transport were detected by BioID-LaA. The three most prominent of these, Nup153, Nup50, and ELYS, have been localized to the nucleoplasmic face of NPCs where they would be situated in the vicinity of the nuclear lamina (Sukegawa and Blobel, 1993; Guan et al., 2000; Walther et al., 2001; Rasala et al., 2008). At least one of these, Nup153, has previously been shown to interact directly with LaA (Al-Haboubi et al., 2011). A fourth NPC protein, Tpr, which is itself associated with Nup153 (Hase and Cordes, 2003; Krull et al., 2004), also appeared in the BioID screen. The detection of these NPC proteins by BioID-LaA is consistent with an NPC anchorage function for the nuclear lamina.
Several additional classes of proteins were represented among the BioID-LaA candidates, albeit at lower levels. These included proteins associated with DNA repair, transcription, chromatin regulation, and RNA-processing. Proteins considered to be components of a nucleoskeleton were also detected, the most abundant of which was filamin A (FLNA; Castano et al., 2010).
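The filtering and categorization described in this section (retain proteins detected only in the BioID-LaA pull-down, then express each category as a percentage of the total) can be sketched as follows. This is a minimal illustration, not the analysis pipeline used here: the function names are hypothetical, the abundance measure is assumed to be a per-protein count such as spectral counts, and the numbers are placeholders that do not come from Table S1 or Table S2.

```python
from collections import defaultdict


def unique_to_bait(bait_hits, control_hits):
    """Keep proteins found in the bait pull-down but absent from the control."""
    return {p: n for p, n in bait_hits.items() if p not in control_hits}


def category_percentages(hits, categories):
    """Sum an abundance measure per category and express it as % of the total."""
    totals = defaultdict(float)
    for protein, abundance in hits.items():
        totals[categories.get(protein, "unclassified")] += abundance
    grand_total = sum(totals.values())
    return {cat: 100.0 * n / grand_total for cat, n in totals.items()}


# Placeholder abundances (hypothetical, for illustration only).
bait = {"LAP2": 40, "LAP1": 20, "emerin": 15, "Nup153": 15,
        "filamin A": 10, "endogenous_carboxylase": 30}
control = {"endogenous_carboxylase": 28}

# Category assignments follow the groupings named in the text.
categories = {
    "LAP2": "nuclear envelope",
    "LAP1": "nuclear envelope",
    "emerin": "nuclear envelope",
    "Nup153": "nucleocytoplasmic transport",
    "filamin A": "nucleoskeleton",
}

specific = unique_to_bait(bait, control)
percentages = category_percentages(specific, categories)
# percentages == {"nuclear envelope": 75.0,
#                 "nucleocytoplasmic transport": 15.0,
#                 "nucleoskeleton": 10.0}
```

Note that a strict presence/absence filter discards any protein seen in the control, including endogenously biotinylated carboxylases, regardless of how strongly it is enriched in the bait sample.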
Identification of a novel NE constituent detected by BioID-LaA
An uncharacterized protein of 75 kD, FAM169A (KIAA0888), featured prominently in the BioID-LaA dataset. FAM169A has no predicted transmembrane domain and lacks any sequence motifs that might provide clues to its function. To test the possibility that FAM169A is a novel NE constituent we examined the localization of the endogenous protein in HEK293 cells by immunofluorescence microscopy. As shown in Fig. 4 a, FAM169A is concentrated at the NE. Differential permeabilization of HEK293 cells with digitonin versus Triton X-100 indicates that FAM169A resides on the nuclear face of the NE (Fig. S1). We also introduced human HA epitope–tagged FAM169A into HeLa cells, which do not appear to express this protein. Consistent with the findings in HEK293 cells, recombinant FAM169A, detected using the anti-FAM169A antibody (Fig. S2), localizes predominantly to the NE (Fig. 4 b), although in both cell types we could always observe what appeared to be a nucleoplasmic population. In neither cell line was there any obvious association with NPCs. Taken together, these findings indicate that FAM169A is a novel NE component that is enriched at the nuclear lamina or at the interface of the lamina and INM. We therefore propose to name this protein SLAP75 (for soluble lamina-associated protein of 75 kD). Besides SLAP75, only two other soluble proteins (other than the lamins themselves), barrier to autointegration factor (BAF; Segura-Totten et al., 2002) and germ cell-less (GCL; Holaska and Wilson, 2006), have been shown to accumulate at the nuclear lamina. Proteomic screens have identified scores of membrane proteins that are enriched at the nuclear periphery (Schirmer et al., 2003). Our identification of an entirely new peripheral membrane constituent of the NE highlights the use of BioID as a valuable complement to these earlier studies. 
Furthermore, it confirms the use of BioID as an effective proximity-based tool to screen for neighboring and potentially interacting proteins. With this in mind, Table S1 lists 10 other uncharacterized proteins, including UPF0428 protein CXorf56, UPF0414 transmembrane protein C20orf30, UPF0552 protein C15orf38, and uncharacterized protein C9orf78. We are currently in the process of determining whether any of these, like SLAP75, represent novel NE or LaA-associated proteins.
We have devised a simple and rapid technique, BioID, which provides a means of identifying neighboring and potentially interacting proteins in vivo. The method takes advantage of BirA*, a highly promiscuous form of the E. coli BirA biotin protein ligase. BirA* may be targeted to specific subcellular locations by fusion to a “bait” protein. Nearby proteins, biotinylated by BirA*, can then be recovered in a single step on streptavidin-coated beads and identified by mass spectrometry. The only requirement for BioID is the expression of a single fusion protein. Consequently, BioID should be applicable to map protein associations in essentially any accessible cell type, mammalian or otherwise.
There are currently two strategies that are widely used to detect protein interactions. The first of these involves the yeast two-hybrid (Y2H) system and takes advantage of the ability of hybrid transcription factor domains to functionally associate, thereby driving expression of reporter genes. The second strategy is based upon coimmunoprecipitation or pull-down, frequently involving expression of single- or double-tagged bait proteins. Immunoprecipitated proteins are then identified by mass spectrometry. A significant attribute of the Y2H approach is that because it is based on a cDNA library screen, it is more likely to detect weak interactions or interactions between low abundance proteins. Furthermore, it is the method of choice where the focus is on proteins that may only be expressed in rare cell types. On the other hand, it is contingent upon proteins or protein fragments maintaining their ability to fold correctly and to associate when removed from their normal cellular environment. By definition, these interactions must take place within yeast, often in subcellular regions unlike those that the proteins normally inhabit and without their normal complement of associated proteins and post-translational modifications. In many situations this may present a significant problem, especially when membrane proteins enter the equation. The other side of the coin is that incorrectly folded “bait” or “prey” proteins, while failing to interact with their cognate partners, may display other spurious interactions and hence give rise to false positives.
The pull-down approach has provided valuable data in a variety of systems. However, it has two limitations. The first of these, which indeed it shares with BioID, is the problem of scale when dealing with low abundance proteins. Simply put, such proteins may not be detected where it proves impractical to prepare or manipulate sufficient start material. The second limitation concerns solubility. Conditions required to solubilize many bait proteins may not be compatible with preserving interactions with partner proteins and vice versa. This becomes especially significant when considering weak interactions. In the case of lamin A, a highly insoluble protein, this has proved to be a serious stumbling block in the reliable identification of interacting proteins. Recently, Kubben et al. (2010) have introduced a work-around for this problem. They have used chemical cross-linking to stabilize lamin complexes before solubilization and pull-down. Significantly, this approach detected many of the same putative LaA interactors that we have now identified using BioID-LaA. Cross-linking certainly represents a valuable enhancement to the pull-down strategy. However, as an added variable it may in turn introduce additional artifacts such as aggregation.
We believe that BioID provides a useful complement to both of these more-established approaches in the characterization of potential protein–protein interactions and near-neighbor analyses. BioID uniquely combines two important attributes. The first of these is that it detects potential interactions in their normal cellular context. The second is that it sidesteps issues associated with bait or prey protein solubility. Because the key step of biotinylation occurs before solubilization it should detect both weak and transient interactions. Both of these features are highlighted in our BioID-LaA data where both soluble and membrane proteins were efficiently detected.
As with any method there are limitations that must be appreciated. BioID relies on the expression of an exogenous protein that is fused to BirA*, a protein slightly larger than GFP. Clearly, it is essential that the fusion protein displays the same targeting and assembly properties as the wild-type or endogenous molecule. This is an issue that must be addressed on a case-by-case basis. With respect to myc-BirA*LaA, the fusion protein appears to be targeted appropriately to the nuclear lamina where it shares the same solubility properties as both wild-type and GFP-tagged LaA. A more subtle issue may arise through biotinylation. Although we observed no evidence of a detrimental effect, our studies have used the addition of excess biotin to cell culture media to enhance the biotinylation of vicinal proteins. The covalent attachment of biotin to primary amines, predominantly lysines, leads to the loss of charge on these sites and at the same time could inhibit other secondary modifications. These effects might in turn alter the behaviors of both the fusion protein and neighboring proteins. The efficacy of BioID is obviously contingent upon the ability to biotinylate neighboring proteins, which is in turn dependent on the number and availability of primary amines in these proteins. Consequently, the abundance of the biotinylated proteins should not be used to indicate the strength or abundance of an association. Similarly, the absence of biotinylation does not rule out interaction or proximity. Most importantly, BioID-mediated biotinylation cannot be used to validate an actual protein interaction, but instead should be used as a screen to identify candidates that can be subsequently investigated systematically or in a hypothesis-based manner. 
Given the mechanism of BioID, biotinylated proteins can be placed into three categories: (i) direct interactions, either transient or stable; (ii) indirect interactions; or (iii) vicinal proteins that do not interact directly or indirectly. Given these limitations, BioID in its current guise should only be used as a screen for potential interactors or vicinal proteins.
Based on our mass spectrometry results we see clear evidence that BioID identifies well-characterized protein interactors of LaA, including a number of proteins detected by Kubben et al. (2010) using complex purification in combination with chemical cross-linking (LAP1, LAP2 isoforms, Emerin, and MAN1). It is clear that most (if not all) of the more abundant proteins identified with BioID-LaA, amounting to more than 50% of those detected, largely reside in close proximity to the INM. These could fall into the transient, indirect, or vicinal categories. Furthermore, BioID identified SLAP75, a previously uncharacterized protein that is clearly enriched at the nuclear lamina. Other NE proteins in the dataset are lamins B1 and B2 and lamin B receptor (LBR). The relatively low level of detection of B-type lamins could be a reflection of findings that A- and B-type lamins may be segregated into separate filament systems (Shimi et al., 2008). LBR is not known to interact with A-type lamins and its appearance could be simply a consequence of indirect interactions and/or proximity. However, it was detected by Kubben et al. (2010) using their approach of LaA affinity-capture combined with chemical cross-linking.
Also included in the list of identified proteins, albeit at reduced levels, are many nuclear proteins associated with DNA repair, transcription, chromatin regulation, and RNA processing. These proteins are not predominantly enriched at the NE, raising the question of how they were biotinylated by BioID-LaA. We propose that these represent a subpopulation of nuclear proteins that transiently associate with LaA at the NE and/or proteins that were biotinylated by nucleoplasmic BioID-LaA (Goldman et al., 2002). Several of these proteins have what could be described as a circumstantial connection to LaA and might therefore be part of a LaA interaction network. PARP1, MDC1, NUMA, and NONO were all detected as part of the BAF proteome (Montes de Oca et al., 2009), i.e., they associate either directly or indirectly with BAF, while BAF itself is known to associate directly with LaA (Holaska et al., 2003). Consequently, the detection of these proteins by BioID-LaA could be a reflection of these reciprocal associations. However, BAF itself was not picked up in the BioID-LaA screen, potentially due to its small size (89 residues) and/or due to limited association in these cells.
Mutations in LMNA and EMD, the gene encoding emerin, both give rise to Emery-Dreifuss muscular dystrophy (EDMD2, 3, and EDMD1, respectively; Bione et al., 1994; Bonne et al., 1999, 2000; Raffaele Di Barletta et al., 2000). LaA and emerin are known to interact (Lee et al., 2001); indeed, emerin was one of the more abundant proteins detected in the BioID-LaA screen. Defects in the genes encoding at least three other proteins, nesprin-1, nesprin-2, and FHL1 (four-and-a-half LIM protein 1) are also known to cause EDMD (EDMD4–6, respectively; Zhang et al., 2007; Gueneau et al., 2009). Both nesprin-1 and nesprin-2 are LINC complex and NE components. FHL1, although apparently nucleoplasmic and cytoplasmic (there are three splice isoforms), was detected by BioID-LaA. This raises the possibility that these proteins may constitute an interaction network that if disrupted gives rise to the common phenotype of EDMD (Simon and Wilson, 2011).
Some of the proteins identified by BioID-LaA are classified as either cytoplasmic or ER residents. The latter are all membrane proteins. It is possible that at least some of these could have access to the INM, although not concentrate there. Certainly there is precedent for this (Torrisi and Bonatti, 1985; Torrisi et al., 1987). Alternatively, these cytoplasmic and ER proteins might become biotinylated during mitosis when the NE breaks down and lamins are dispersed throughout the cytoplasm. We are currently investigating the application of BioID in synchronized cell populations that may shed light on these possibilities. It should be noted that ACE (angiotensin-converting enzyme), a type-I membrane protein synthesized in the ER, was likely identified due to nonspecific binding, as there are no available primary amines for biotinylation in its cytoplasmic domain.
Several cytoplasmically oriented NPC proteins, including Nup214 and Nup358, were found to be biotinylated. As with the cytoplasmic and ER proteins, it is possible that this biotinylation occurs during mitosis. It is also possible that this might occur during nuclear import of myc-BirA*LaA. However, the fact that import receptors were not detected in the screen places a question mark over this. On the other hand, the large size of these nucleoporins may have biased their identification by mass spectrometry. The significantly more abundantly represented nucleoplasmic NPC proteins such as Nup153, Nup50, ELYS, and TPR likely reflect the close association between the NPCs and the lamina (Daigle et al., 2001). Certainly, Nup153 has already been shown to interact with LaA (Al-Haboubi et al., 2011). These interactions could explain the altered distribution of NPCs observed in LMNA-deficient cells (Sullivan et al., 1999).
Probably the most important question that remains to be answered concerns the activity radius of BirA* because this will define the resolution of the BioID technique. At present we have no way of measuring how far on average bioAMP molecules will diffuse from the parent BirA* enzyme before reacting with a primary amine. However, based on our results here, we can make some rough estimates. We can conservatively estimate that 50% of the proteins detected by BioID-LaA predominantly reside in the INM, the nuclear lamina, or the nucleoplasmic face of NPCs. The nuclear lamina in mammalian somatic cells is generally agreed to be ∼15–20 nm thick (Aaronson and Blobel, 1975; Dwyer and Blobel, 1976) and is closely apposed to the INM (Gerace and Huber, 2012). Nup153 and Nup50 appear to be associated with the nuclear ring of NPCs, as does the N-terminal region of TPR (Guan et al., 2000; Krull et al., 2004). This would place these nucleoporins at about the level of the nuclear lamina. Taken together, these findings suggest that roughly 50% of detected proteins likely reside within ∼20–30 nm of the nearest LaA molecule, and could well be much closer. More accurate measurements are limited by the population of nucleoplasmic myc-BirA*LaA and the considerable mobility of most proteins over the 24-h labeling period.
A significant finding that lends additional credence to the utility of BioID is the observation that histones, which are lysine rich and highly abundant in the nucleus, constitute a disproportionately small fraction of the identified proteins. This indicates that BioID is not generating widespread biotinylation, but is more selectively labeling only those proteins in immediate proximity to the fusion protein. This could also be inferred from our fluorescence microscopy data where the streptavidin labeling is colocalized with the myc-BirA*LaA and restricted largely to the NE. It should also be noted that low levels of histones are reported to be biotinylated in vivo (Kuroishi et al., 2011). We detected biotinylated histones (H1.3/H1.4 and H1.0) in our control preparations at levels substantially lower than the four endogenously biotinylated mammalian carboxylases (Table S2).
In summary, we suggest that BioID provides a powerful new approach to probe protein interactions and proximity in a variety of cell types. It is a technique that should be accessible to a broad range of researchers comfortable working with conventional molecular and cell biology techniques, and it does not require specialized equipment other than the proteomic analysis that has become a commonly available service. We will continue to explore the advantages intrinsic to the BioID system. These include its application in various subcellular compartments and the ability to monitor interactions of proteins at different time points after their synthesis, or at different stages of the cell cycle, by regulating biotin availability and/or fusion protein expression. Although current studies are limited to mammalian cells, we predict that BioID has applications in cells from a wide variety of species as well as in model organisms.
Materials and methods
Humanized BirA (Mechold et al., 2005) was mutated to R118G by overlap extension PCR. Products for both the WT and R118G contain a 5′ SalI site and, at the 3′ end, an XhoI site, a stop codon, and an AflII site. These were digested with SalI and AflII and inserted C-terminal to a myc epitope in pcDNA3.1 that had been digested with XhoI and AflII. Human LaA was excised from pcDNA3.1 with XhoI and AflII and inserted in frame with myc-BirA* in pcDNA3.1 using the same restriction sites. The entire myc-BirA*LaA sequence was removed from pcDNA3.1 with NheI and AflII, blunted, and inserted into pRetroX.Tight.puro that had been digested with EcoRI and blunted. Clones were screened for proper directionality. pRetroX.Tight.puro is a puromycin-selectable mammalian expression vector that contains a Tet-On–based tetracycline-inducible promoter to regulate expression.
Cell culture and generation of stable cell lines
pRetroX Tet-ON Advanced HEK293 cells (Takara Bio Inc.) that stably express the doxycycline-regulated transactivator protein were transiently transfected with pRetroX-Tight.puro myc-BirA*LaA using Lipofectamine 2000 (Invitrogen; Roux et al., 2009). Selection with 0.5 µg/ml puromycin began 48 h after transfection. Upon colony formation, subclones were isolated and screened by immunofluorescence after induction with 1 µg/ml doxycycline for 24 h.
Cells were fixed with 3% paraformaldehyde/PBS and permeabilized in 0.4% Triton X-100/PBS (Roux et al., 2009). Differential permeabilization was performed after paraformaldehyde fixation with 0.001% digitonin at 4°C for 10 min (Crisp et al., 2006; Liu et al., 2007; Roux et al., 2009). Mouse anti-myc (1:10, 9E10; American Type Culture Collection) and streptavidin-568 (1:1,000; Invitrogen) were used to identify myc fusion proteins and biotinylated proteins, respectively. Other antibodies include rabbit anti-FAM169A/SLAP75 (1:200; Sigma-Aldrich), mouse anti-HA (1:200, 12CA5; Covance), mouse anti-Nup153 (1:2, SA1; Bodoor et al., 1999), and mouse anti-LaA (1:100, XB10; Horton et al., 1992). Proteins were visualized with goat anti–mouse, goat anti–rabbit, or streptavidin coupled to Alexa Fluor 488 or -568 (1:1,000; Invitrogen). DNA was detected with Hoechst dye 33258. Coverslips were mounted in 10% Mowiol 4-88. The majority of images were obtained at 25°C using either a Leica DMRB microscope (40x/1.00 PL FLUOTAR oil PH3 and 63x/1.32 HCL PL APO oil PH3 Leica objectives) running IPLab/IVision software, or an Applied Precision DeltaVision Core system based on an Olympus IX71 microscope equipped with a 60x NA 1.42 lens. Image acquisition and processing were accomplished using DeltaVision Resolve3D and Softworx 4.1.0 software. Both microscope systems were equipped with Photometrics CoolSnap HQ cameras. Some conventional epifluorescence images of HA-SLAP75–transfected HeLa cells were acquired using a Zeiss Axioimager.Z1 equipped with a 63x NA 1.4 lens and CoolSnap HQ camera.
Cells were lysed in Laemmli SDS-sample buffer, separated by SDS-PAGE and transferred to nitrocellulose (Liu et al., 2007). Immunoblotting was performed (Liu et al., 2007) with rabbit anti-myc (1:50,000; Abcam). Biotinylated proteins were detected similarly with the following modifications. Membranes were blocked in 2.5% bovine serum albumin in PBS with 0.4% Triton X-100 and incubated in the same buffer with HRP-conjugated streptavidin (1:40,000; Invitrogen).
Affinity capture of biotinylated proteins
Cells were incubated for 24 h in complete media supplemented with 1 µg/ml doxycycline and 50 µM biotin. After three PBS washes, cells (for small-scale analysis, <10⁷; for large-scale analysis, 4 × 10⁷) were lysed at 25°C in 1 ml lysis buffer (50 mM Tris, pH 7.4, 500 mM NaCl, 0.4% SDS, 5 mM EDTA, 1 mM DTT, and 1x Complete protease inhibitor [Roche]) and sonicated. Triton X-100 was added to 2% final concentration. After further sonication, an equal volume of 4°C 50 mM Tris (pH 7.4) was added before additional sonication (subsequent steps at 4°C) and centrifugation at 16,000 relative centrifugal force. Supernatants were incubated with 600 µl Dynabeads (MyOne Streptavidin C1; Invitrogen) overnight. Beads were collected and washed twice for 8 min at 25°C (all subsequent steps at 25°C) in 1 ml wash buffer 1 (2% SDS in dH2O). This was repeated once with wash buffer 2 (0.1% deoxycholate, 1% Triton X-100, 500 mM NaCl, 1 mM EDTA, and 50 mM Hepes, pH 7.5), once with wash buffer 3 (250 mM LiCl, 0.5% NP-40, 0.5% deoxycholate, 1 mM EDTA, and 10 mM Tris, pH 8.1), and twice with wash buffer 4 (50 mM Tris, pH 7.4, and 50 mM NaCl). Ten percent of the sample was reserved for Western blot analysis. Bound proteins were removed from the magnetic beads with 50 µl of Laemmli SDS-sample buffer saturated with biotin at 98°C. For the larger scale preparation, 90% of the sample to be analyzed by mass spectrometry was washed twice in 50 mM NH4HCO3.
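As a rough check on the binding conditions, the short Python sketch below estimates the buffer composition after the lysate is diluted with an equal volume of 50 mM Tris. It is an illustration only: it assumes that volumes are additive and that the Triton X-100 addition does not appreciably change the other concentrations. The function name and dictionary keys are hypothetical and are not part of the protocol.

```python
def dilute_with_equal_volume(sample, diluent):
    """Return concentrations after mixing sample and diluent 1:1.

    Assumes additive volumes; a component absent from one solution
    is treated as having concentration 0 in that solution.
    """
    components = set(sample) | set(diluent)
    return {
        name: (sample.get(name, 0.0) + diluent.get(name, 0.0)) / 2.0
        for name in components
    }


# Lysate just before dilution (values taken from the lysis buffer above,
# with Triton X-100 at its stated 2% final concentration).
lysate = {
    "Tris_mM": 50.0,
    "NaCl_mM": 500.0,
    "SDS_pct": 0.4,
    "EDTA_mM": 5.0,
    "DTT_mM": 1.0,
    "Triton_pct": 2.0,
}
diluent = {"Tris_mM": 50.0}  # 50 mM Tris, pH 7.4

final = dilute_with_equal_volume(lysate, diluent)
for name in sorted(final):
    print(f"{name}: {final[name]:g}")
```

Under these assumptions the streptavidin beads would encounter roughly 250 mM NaCl, 0.2% SDS, and 1% Triton X-100 in 50 mM Tris.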
Protein identification by mass spectrometry
For reduced-scale experiments, proteins eluted from the streptavidin beads by SDS-sample buffer were reduced, alkylated, and separated by 1D SDS-PAGE. Separated proteins were visualized by colloidal Coomassie blue staining. The whole gel lane was cut into 24 equal-sized gel bands, destained, and submitted to tryptic in-gel digestion, all using perforated microtiter plates (Proxeon) with exchange of solvents by low-speed centrifugation. Peptides were eluted into V-bottom polypropylene microtiter plates, freeze-dried, dissolved in 0.1% formic acid in water, and submitted to nano-flow HPLC coupled to a QTOF mass spectrometer (1260 nanoHPLC [Agilent Technologies] and QTOF 6554 with ChipCube [Agilent Technologies]). Separation of peptides was performed on a 150-mm × 75-µm C18 Reprosil column in a chip (Chip II; Agilent Technologies). The applied gradient was from 8% acetonitrile in water with 0.2% formic acid to 35% acetonitrile in water with 0.2% formic acid over 35 min. The mass spectrometer calibration was maintained by continuous submission of a calibrant solution and recalibration of the acquired spectra after the analytical run. The LC-MS/MS system was controlled by MassHunter Acquisition software (Agilent Technologies); 4 MS spectra per second and 3 MS/MS spectra per second were collected. Switching from MS to MS/MS was data dependent, with a threshold of 1,000 counts and a charge of 2–4 for the peptides. Raw data were converted into mzdata.xml using MassHunter Qualitative Analysis software (Agilent Technologies), and the database search was performed using MASCOT 3.2 (MatrixScience) and the human IPI database (version 3.65). Carboxymethylated Cys was set as a fixed modification; oxidized Met, deamidation of Asn and Gln, pyroGlu formation of the N terminus, and acetylation of the N terminus were set as variable modifications. The resulting .dat files were loaded into SCAFFOLD Q+ (Proteome Software).
The acceptance level for proteins was two identified peptides with minimum 95% probability each. Spectra of candidates were verified visually.
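The acceptance criterion can be illustrated with a minimal Python sketch. This is not the SCAFFOLD implementation; it only restates the rule of at least two peptides at ≥95% probability each. Counting distinct peptide sequences is our assumption, and the function name and the toy hits are hypothetical placeholders rather than data from this study.

```python
from collections import defaultdict


def accepted_proteins(peptide_hits, min_prob=0.95, min_peptides=2):
    """Return protein IDs supported by enough confident peptides.

    peptide_hits: iterable of (protein_id, peptide_sequence, probability).
    A peptide counts only if its probability is >= min_prob, and repeated
    identifications of the same sequence count once (an assumption).
    """
    confident = defaultdict(set)
    for protein_id, peptide, probability in peptide_hits:
        if probability >= min_prob:
            confident[protein_id].add(peptide)
    return sorted(
        protein_id
        for protein_id, peptides in confident.items()
        if len(peptides) >= min_peptides
    )


# Toy input with made-up identifiers, for illustration only.
toy_hits = [
    ("protein_A", "PEPTIDEA", 0.99),
    ("protein_A", "PEPTIDEB", 0.96),
    ("protein_B", "PEPTIDEC", 0.99),
    ("protein_B", "PEPTIDED", 0.80),  # below the 95% cutoff
    ("protein_C", "PEPTIDEE", 0.97),
    ("protein_C", "PEPTIDEE", 0.98),  # same sequence seen twice
]

accepted = accepted_proteins(toy_hits)
print(accepted)
```

In this toy input only protein_A passes: protein_B has a single confident peptide, and protein_C has one peptide sequence observed twice.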
For large-scale analysis, on-bead tryptic digests were analyzed by 1D LC/MS/MS by the Sanford-Burnham Proteomic Facility (La Jolla, CA). Tris(2-carboxyethyl)phosphine (TCEP) was added to 100 µl of bead suspension mix and proteins were reduced at 37°C for 30 min. Iodoacetamide was added (to 20 mM) and proteins were alkylated at 37°C for 40 min in the dark. Mass spectrometry grade trypsin (Promega) was added (∼1:50 ratio) for overnight digestion at 37°C. Magnetic beads were removed by centrifugation. Formic acid was added to the peptide solution (to 2%) before on-line analysis of peptides by high-resolution, high-accuracy LC-MS/MS, consisting of a Michrom HPLC, a 15-cm Michrom Magic C18 column, a low-flow ADVANCED Michrom MS source, and an LTQ-Orbitrap XL (Thermo Fisher Scientific). A 120-min gradient of 10–30% B (0.1% formic acid, 100% acetonitrile) was used to separate the peptides. The total LC time was 141 min. The LTQ-Orbitrap XL was set to scan precursors in the Orbitrap at a resolution of 60,000, followed by data-dependent MS/MS of the top four precursors. Raw LC-MS/MS data were submitted to Sorcerer Enterprise (Sage-N Research Inc.) for protein identification against the IPI human protein database, which contains semi-tryptic peptide sequences with the allowance of up to two missed cleavages and a precursor mass tolerance of 50.0 ppm. A molecular mass of 57 D was added to all cysteines to account for carboxyamidomethylation. Differential search included 16 D for methionine oxidation, and 226 D on the N terminus and lysine for biotinylation. Search results were sorted, filtered, statistically analyzed, and displayed using PeptideProphet and ProteinProphet (Institute for Systems Biology). The minimum trans-proteomic pipeline (TPP) probability score for proteins was set to 0.95, to ensure a TPP error rate of lower than 0.01.
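To make the search parameters concrete, the following Python sketch shows how a 50-ppm precursor tolerance and the stated mass additions translate into numbers. It is a simplified illustration, not the Sorcerer search engine: it uses the nominal mass shifts quoted above (57, 16, and 226 D) rather than exact monoisotopic values, and the function names and the example sequence are hypothetical.

```python
FIXED_CYS_SHIFT = 57.0   # carboxyamidomethylation, applied to every Cys
MET_OX_SHIFT = 16.0      # methionine oxidation (differential)
BIOTIN_SHIFT = 226.0     # biotinylation of Lys or the N terminus (differential)


def within_ppm(observed_mass, theoretical_mass, tol_ppm=50.0):
    """True if the observed mass lies within tol_ppm of the theoretical mass."""
    error_ppm = abs(observed_mass - theoretical_mass) / theoretical_mass * 1e6
    return error_ppm <= tol_ppm


def nominal_shift(sequence, oxidized_met=0, biotin_sites=0):
    """Total nominal mass added to a peptide by the modifications above.

    Every Cys in the sequence carries the fixed shift; the numbers of
    oxidized Met residues and biotinylated sites are supplied by the caller.
    """
    if oxidized_met > sequence.count("M"):
        raise ValueError("more oxidized Met than Met residues")
    if biotin_sites > sequence.count("K") + 1:  # +1 for the N terminus
        raise ValueError("more biotin sites than Lys residues plus N terminus")
    return (
        FIXED_CYS_SHIFT * sequence.count("C")
        + MET_OX_SHIFT * oxidized_met
        + BIOTIN_SHIFT * biotin_sites
    )


# At 1,000 D, a 50-ppm window corresponds to +/- 0.05 D.
print(within_ppm(1000.04, 1000.0))  # inside the window
print(within_ppm(1000.06, 1000.0))  # outside the window

# Hypothetical peptide with one Cys, one oxidized Met, one biotinylated Lys.
shift = nominal_shift("ACKM", oxidized_met=1, biotin_sites=1)
print(shift)
```

A real search engine would evaluate every allowed combination of differential modifications per peptide; the sketch only computes the shift for one combination chosen by the caller.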
Online supplemental material
Fig. S1 provides evidence that SLAP75 resides predominantly at the inner nuclear membrane. Fig. S2 provides evidence that anti-SLAP75 specifically detects exogenous but not endogenous SLAP75 in HeLa cells. Table S1 lists the identities and abundance of the proteins identified by BioID-LaA. Table S2 lists the identities and abundance of the proteins detected with control cells.
We would like to thank Wenhong Zhu for assistance with mass spectrometry. B. Burke and M. Raida were supported by the Singapore Biomedical Research Council and the Singapore Agency for Science, Technology and Research (A*STAR).
Funding for this project was provided by startup funds to K.J. Roux from Sanford Research/USD and from Bankhead-Coley Cancer Research Program/State of Florida Department of Health NIR award 08BN-05 to K.J. Roux.
Abbreviations used in this paper:
BAT, biotin acceptor tag
BioID, proximity-dependent biotin identification
INM, inner nuclear membrane
LBR, lamin B receptor
NPC, nuclear pore complex
SLAP75, soluble lamin-associated protein of 75 kD