Protein interactions are involved in all cellular processes. Their efficient and reliable characterization is therefore essential for understanding biological mechanisms. In this study, we show that combining bacterial artificial chromosome (BAC) TransgeneOmics with quantitative interaction proteomics, which we call quantitative BAC–green fluorescent protein interactomics (QUBIC), allows specific and highly sensitive detection of interactions using rapid, generic, and quantitative procedures with minimal material. We applied this approach to identify known and novel components of well-studied complexes such as the anaphase-promoting complex. Furthermore, we demonstrate second generation interaction proteomics by incorporating directed mutational transgene modification and drug perturbation into QUBIC. These methods identified domain/isoform-specific interactors of pericentrin- and phosphorylation-specific interactors of TACC3, which are necessary for its recruitment to mitotic spindles. The scalability, simplicity, cost effectiveness, and sensitivity of this method provide a basis for its general use in small-scale experiments and in mapping the human protein interactome.
One of the challenges in modern cell biology is how to reveal proteomic changes that underlie cellular perturbations, e.g., from gene mutation, RNAi, or chemical inhibition. Rapid identification of the members of protein complexes in a quantitative manner would facilitate these types of experiments. Affinity purification (AP) of proteins in combination with mass spectrometric detection of bound proteins (AP mass spectrometry [AP-MS]) identifies the components of protein complexes (Gingras et al., 2007; Köcher and Superti-Furga, 2007). AP-MS has already been the basis of large-scale interaction mapping in Saccharomyces cerevisiae (Gavin et al., 2006; Krogan et al., 2006). However, it has suffered from two principal problems. First, it is difficult to distinguish true interactors from background. Proteins binding nonspecifically to the antibodies or beads always copurify with the specific interactors. This either results in a high rate of false-positive interactions or it requires stringent purification, such as by tandem affinity tagging (Rigaut et al., 1999), often leading to loss of weak and transient binders. Second, although the prey proteins are expressed under native conditions, in tissue culture, the tagged bait protein is usually overexpressed from a cDNA under a general promoter, potentially compromising interaction data. For example, it would be very interesting to study how multiple protein complexes change with phenotypic perturbation, but such data would be difficult to interpret when not expressing the bait under endogenous control.
Bacterial artificial chromosome (BAC) recombineering (Zhang et al., 1998) is an alternative method to create the bait proteins needed for interaction proteomics. In this approach, a gene of interest in its genomic context is tagged with a construct containing, e.g., GFP (Kittler et al., 2005). The BAC transgene can then be stably transfected into mammalian cell lines of choice. This allows for expression of the tagged protein at endogenous levels and ensures cell type–specific processing and regulation. BAC TransgeneOmics has been streamlined and can be readily performed for large numbers of genes in parallel (Sarov et al., 2006; Poser et al., 2008). Furthermore, recombineering technologies allow for the precise manipulation of BAC transgenes. For example, sites of protein modification can be mutated, and functional consequences can then be carefully analyzed in their native context when the endogenous protein is selectively depleted (Bird and Hyman, 2008).
Quantitative interaction proteomics can efficiently discriminate between specific and background binders without resorting to stringent purification procedures (Blagoev et al., 2003; Ranish et al., 2003; Vermeulen et al., 2008). We reasoned that combining this approach with the BAC recombineering technology would overcome most of the limitations currently associated with protein interaction screens. This strategy would avoid artifacts associated with overexpression but without the need to generate specific antibodies. Furthermore, by using GFP as the affinity tag, it would directly combine sophisticated imaging possibilities with quantitative proteomics technology (Cheeseman and Desai, 2005; Trinkle-Mulcahy and Lamond, 2007; Poser et al., 2008). Using quantitative proteomics would efficiently discriminate against background binders while preserving weak interactions. We call this technique quantitative BAC-GFP interactomics (QUBIC). Accurate quantification can be achieved by stable isotope labeling by amino acids in cell culture (SILAC; Ong et al., 2002; Mann, 2006). However, QUBIC performs as efficiently in label-free format. We demonstrate the power of QUBIC in analyzing the changing nature of protein complexes and interactions by addressing the long-standing question in mitotic spindle assembly of how the spindle protein TACC3 is recruited to spindles through its phosphorylation. We identified clathrin as a phospho-dependent spindle-associated TACC3 interactor, thereby revealing a functional role of clathrin in mitosis.
QUBIC is a rapid and efficient method to map protein complexes
QUBIC builds on large-scale BAC TransgeneOmics and powerful imaging technologies to which it adds an equally powerful quantitative protein interaction screening capability (Fig. 1). To create a platform for large-scale interaction studies in mammalian cells, we systematically engineered the various steps with a view to minimizing cost, time, and material while maximizing reproducibility and generic applicability without compromising sensitivity. Early on, we found that single-step AP was sufficient to define specific interaction partners when coupled to SILAC-based quantitative proteomics performed with high resolution liquid chromatography (LC) tandem MS (LC-MS/MS) on a mass spectrometer instrument (LTQ Orbitrap). Small magnetic beads in combination with a flow-through column system gave the best results for bait sequence coverage by MS, detection of interaction partners, and robustness while keeping background proteins at acceptable levels (Fig. S1 A). The small beads provide a large surface to volume ratio and consequently favorable binding kinetics as well as short incubation times using precoupled monoclonal mouse anti-GFP antibody. We compared different ways to release bound interacting proteins, including specific enzymatic elution, unspecific elution with 8 M urea, and a newly developed, very efficient in-column digestion procedure with trypsin. We determined that specific protease cleavage between bait and GFP tag worked efficiently for a subset of baits but poorly for others. For example, when purifying the transcription/export (TREX) complex with THOC3 as bait, most of the complex components were not identified with specific protease cleavage (PreScission; GE Healthcare; Fig. S1 B). We assume that in this case, the cleavage site was shielded by the complex. In contrast, direct enzymatic digests of proteins in the column provided high and uniform elution efficiency and allowed direct analysis of eluted peptides without protein precipitation.
We optimized all steps of the procedure using a variety of GFP-tagged cell lines. The combination of small magnetic beads with elution by in-column protease digestion of proteins helped to keep the entire pull-down procedure short (2 h including cell lysis). True interaction partners could be distinguished from background binders present in the immunoprecipitations (IPs) by their quantitative ratios. This also allowed the use of low stringency wash conditions, helping to retain weak interaction partners. We optimized LC gradients and the instrument method on our high resolution mass spectrometers for optimal peptide identification and quantitation of interaction partners. Our protocol allows automated analysis of 10 pull-downs per day. We also developed bioinformatic analysis procedures for the statistical interpretation of the quantitative pull-down data on the basis of the publicly available MaxQuant package (Cox and Mann, 2008). We found that a 15-cm dish, corresponding to ∼10⁷ cells, provides sufficient material for QUBIC. This is at least a factor of 10 less than that commonly used in nonquantitative tandem AP (TAP)–MS.
Unraveling the interactors of the TREX complex using SILAC-QUBIC
We next applied these techniques to the characterization of the interaction network centered around the TREX complex (Reed and Cheng, 2005). Although mRNA export is similar in yeast and humans, the TREX complex is associated with the transcription apparatus in yeast and the splicing machinery in humans (Reed and Hurt, 2002; Strässer et al., 2002). In humans, the TREX complex consists of a core called the THO complex that is composed of six proteins (THOC1, THOC2, THOC3, THOC5, THOC6, and THOC7) and two adaptor proteins (Aly/THOC4 and Bat1/UAP56; Masuda et al., 2005). The human TREX complex was characterized only in 2005, and this required ectopic expression of several complex members, extensive purification, MS, and Western blotting (Masuda et al., 2005).
We reasoned that the QUBIC technology might be able to define the TREX complex and its interactions in a rapid and robust manner. We performed GFP pull-downs of its six core members (THOC1–3 and THOC5–7) and the coadaptor THOC4/Aly from stable cell lines created by BAC TransgeneOmics. Immunoprecipitating the TREX complex is especially challenging because its function involves association with mRNA, which in turn associates with numerous RNA-binding proteins. This problem was minimized by the nucleic acid digestion step in the QUBIC lysis procedure, which prevents coprecipitation of mRNA and associated background proteins. SILAC pull-downs were performed in forward and reverse format, providing biological replicates and separating binders and background by their ratios in two dimensions (Fig. 2, A and B; and Fig. S2). The entire complex-mapping experiment required 16 single LC-MS/MS runs corresponding to 1.5 d of measurement.
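The separation of binders and background by their ratios in two dimensions can be sketched in a few lines of Python. This is a minimal illustration rather than the analysis pipeline used in this study: it assumes that the bait-expressing cells carry the heavy label in the forward experiment and the light label in the reverse experiment, and the function name, protein names, ratios, and twofold cutoff are all hypothetical.

```python
import math

def classify_silac(ratios, min_log2=1.0):
    """Classify proteins from forward and reverse SILAC pull-downs.

    ratios maps protein -> (forward H/L ratio, reverse H/L ratio).
    Assumption: the bait is heavy in the forward run and light in the
    reverse run, so a specific binder has a high forward ratio and a
    low reverse ratio.
    """
    calls = {}
    for protein, (forward, reverse) in ratios.items():
        fwd = math.log2(forward)
        rev = -math.log2(reverse)  # invert so enrichment is positive in both runs
        if fwd >= min_log2 and rev >= min_log2:
            calls[protein] = "specific"
        elif abs(fwd) < min_log2 and abs(rev) < min_log2:
            calls[protein] = "background"
        else:
            calls[protein] = "inconsistent"  # e.g., unlabeled external contaminants
    return calls

# Hypothetical ratios for illustration only.
example = {
    "complex_member": (8.0, 0.125),
    "bead_binder": (1.1, 0.9),
    "external_contaminant": (0.1, 0.1),
}
calls = classify_silac(example)
```

In this sketch, a protein that is enriched in only one of the two label orientations is reported as inconsistent instead of being accepted, which mirrors the rationale for performing the label swap.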
All THOC core components specifically retrieved all other THOC core components (forward and reverse pull-down, P < 0.01), reliably defining the core complex (Fig. 2, A and B; and Fig. S2, A–D). GFP fluorescence microscopy was performed in parallel on the same cell lines, which verified nuclear localization with a characteristic speckled pattern.
Fig. 2 C shows a two-way hierarchical clustering by ratio of significant TREX interactors (P < 0.1 in forward and reverse, and a ratio >2 for one of the baits). The TREX complex clusters at the top of the matrix, and the core members are separated from the known adaptor proteins, Bat1, and ARS2 as a result of their somewhat lower ratios. ARS2 has been reported as a weak and substoichiometric interactor, easily lost during purification (Masuda et al., 2005). POLDIP3 is a protein of unknown function. Its similar pattern in the TREX pull-downs suggests that it is likewise a noncore TREX interactor. Aly/THOC4, another adaptor protein, was identified in our pull-downs but not with a statistically significant ratio. It is a highly abundant nuclear protein, often seen as background binder to beads, and is involved in many cellular processes, such as acting as a chaperone in the dimerization of transcription factors and mRNA processing and mRNA export from the nucleus (Virbasius et al., 1999; Reed and Cheng, 2005). The pull-down with Aly-GFP led to only moderate enrichment of Aly itself because it binds to control beads as well. Nevertheless, THOC2, -5, -6, and -7 were enriched in the Aly pull-down (Fig. 2 C). The strongest interaction was with THOC5, with which it functionally and physically interacts independently of the TREX complex (Fig. S2 E; Katahira et al., 2009).
Below the core and adaptor proteins, there is a cluster comprising the entire T complex (TRiC), a chaperone with a role in folding nascent, unfolded protein chains (Fig. 2 C). As the T complex is only pulled down with THOC3 and THOC6, we can exclude that it binds to the entire TREX complex. Instead, it is likely involved in correct folding of the two proteins before they are assembled into the TREX complex.
Lastly, we combined the results of all forward and reverse pull-downs into a single graph (Fig. 2 D). By grouping all forward and all reverse pull-downs on the individual components into two single experiments, specific interactors of the complex are enhanced, whereas background binders are diminished. Indeed, all proteins annotated as TREX adaptors and several proteins annotated as TREX-associated proteins are clearly distinguished from background in this virtual pull-down experiment. For example, BAT1, POLDIP3, and ARS2 associate more closely with the core TREX complex than in the individual pull-downs. Further demonstrating the usefulness of this analysis, DDX39 protein was revealed as a significant interactor, although it was not statistically significant in any single pull-downs. DDX39 is an RNA helicase, and through its interaction with THOC4 and Bat1, is an already known interactor of the TREX complex (Pryor et al., 2004).
SILAC and label-free QUBIC of the anaphase-promoting complex (APC)
Although SILAC quantification is accurate and reliable, this technique requires prior labeling of the cell line under study. Because the ratios between preys binding to bait and control are generally large, we investigated whether label-free quantitation could identify complex members with the same confidence. For this study, we used the APC and performed, in addition to SILAC forward and reverse pull-downs, three pull-downs of unlabeled cells with CDC23-GFP as bait. We compared the intensities of all proteins in these pull-downs with those in three control pull-downs, i.e., eluates from beads exposed to untransfected HeLa cell lysates. In contrast to a recently published method that uses spectral counting as a proxy for peptide abundance (Sowa et al., 2009), we integrated total signal from all peptides from our high resolution MS measurements using the MaxQuant platform (Cox and Mann, 2008; unpublished data). By far, the simplest and most robust method to assign statistical significance to pull-down results turned out to be a t test comparing the three IPs with the three controls. We accepted proteins based on a combination of this p-value and the observed fold change (Tusher et al., 2001). A newly developed software package (QUBICvalidator) calculates a significance curve, separating binders from background in the fold change versus p-value plane (Fig. 3 B). All detectable members of APC and the known adaptors CDC20 and FZR1 were clearly inside the accepted area with a false-positive rate <0.001.
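The logic of combining a t test on three IPs versus three controls with a fold change criterion can be illustrated as follows. This is a sketch under stated assumptions and not the QUBICvalidator implementation: the equal-variance t test on log2 intensities, the hyperbolic form of the cutoff curve, its parameters, and all intensities are hypothetical choices made for illustration. The p-value uses the closed-form Student t distribution for 4 degrees of freedom, which applies to a three versus three comparison.

```python
import math
from statistics import mean, variance

def two_sided_p_df4(t):
    """Two-sided p-value for Student's t with 4 degrees of freedom (closed form)."""
    scale = 1.0 + t * t / 4.0
    cdf = 0.5 + 0.375 * (abs(t) / math.sqrt(scale)) * (1.0 - t * t / (12.0 * scale))
    return max(2.0 * (1.0 - cdf), 1e-12)  # floor avoids log10(0) later

def pulldown_test(ip, control):
    """Return (log2 fold change, p-value) for 3 IP versus 3 control intensities."""
    if len(ip) != 3 or len(control) != 3:
        raise ValueError("this sketch expects exactly three replicates per group")
    a = [math.log2(x) for x in ip]
    b = [math.log2(x) for x in control]
    log2_fc = mean(a) - mean(b)
    pooled_var = (variance(a) + variance(b)) / 2.0  # valid for equal group sizes
    se = math.sqrt(pooled_var * (2.0 / 3.0))
    if se == 0.0:
        return log2_fc, 1.0  # no variance: treated conservatively in this sketch
    return log2_fc, two_sided_p_df4(log2_fc / se)

def is_binder(log2_fc, p, min_log2_fc=1.0, curvature=2.0):
    """Hypothetical hyperbolic cutoff in the fold change versus p-value plane."""
    if log2_fc <= min_log2_fc:
        return False
    return -math.log10(p) >= curvature / (log2_fc - min_log2_fc)

# Hypothetical summed peptide intensities for illustration only.
prey = pulldown_test([8.0e6, 1.0e7, 9.0e6], [1.0e5, 1.2e5, 9.0e4])
bead_binder = pulldown_test([1.1e6, 9.0e5, 1.0e6], [1.0e6, 1.05e6, 9.5e5])
```

With a curve of this shape, a protein with a modest fold change must reach a very small p-value to be accepted, whereas a strongly enriched protein is accepted at a less stringent p-value.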
In addition, we found FBXO5/EMI1, a reported interactor of APC and of these adaptor proteins (Miller et al., 2006). Interestingly, NEK2, a serine/threonine protein kinase involved in mitotic regulation, was also a significant interactor. NEK2 contains a KEN box through which it is targeted for destruction by the APC (Pfleger and Kirschner, 2000). We were intrigued by two novel and completely uncharacterized APC binders, both quantified with >100-fold ratios. C10orf104/ANAPC16 (11.7 kD) was detected with P = 1.4 × 10⁻⁵, and C11orf51 (14.3 kD) with P = 1.4 × 10⁻⁴. They may have escaped detection by gel-based methods because of their small size. One of the proteins, C10orf104/ANAPC16, was identified in parallel studies as a genuine member of the APC core complex (Hutchins et al., 2010; Kops et al., 2010). C11orf51 was also identified by SILAC-QUBIC when using double labeling with arginine and lysine combined with tryptic digestion (Fig. S3). Furthermore, when we GFP tagged C11orf51 at both the N and C terminus, it showed a similar localization pattern to CDC23 in interphase (Fig. 3 C).
QUBIC uncovers proteins mediating phosphorylation-dependent targeting of TACC3 to the mitotic spindle
We next used QUBIC to investigate an unsolved question in mitotic spindle assembly: how does the phosphorylation of TACC3 by aurora A kinase mediate TACC3 localization to spindles? Aurora A regulates several mitotic processes (Barr and Gergely, 2007). However, how phosphorylation of specific proteins by aurora A facilitates the progression through mitosis is largely unknown. One relatively well-characterized target of aurora A is the protein TACC3, a conserved protein that has established roles in mitosis and microtubule dynamics in a variety of organisms (for review see Peset and Vernos, 2008). TACC3 localizes to the mitotic spindle and interacts and shares functions with the microtubule polymerase ch-TOG/CKAP5 (Gergely et al., 2000, 2003; Cullen and Ohkura, 2001; Lee et al., 2001). TACC3 also interacts with aurora A, which phosphorylates TACC3 on specific serine residues (Giet et al., 2002; Kinoshita et al., 2005). This phosphorylation regulates localization of TACC3 to the mitotic spindle, as depletion of aurora A or mutation of aurora A phosphorylation sites in TACC3 results in TACC3 mislocalization in several systems (Giet et al., 2002; Bellanger and Gönczy, 2003; Srayko et al., 2003; Barros et al., 2005; Kinoshita et al., 2005). Furthermore, inhibition of aurora A activity with an aurora A–specific small molecule inhibitor, MLN8054 (Manfredi et al., 2007), also delocalizes TACC3 from spindles in human cells (LeRoy et al., 2007).
Despite the many studies on TACC3 and aurora A, it is still unknown how TACC3 is recruited to mitotic spindles and why phosphorylation by aurora A is required. To elucidate the molecular mechanisms responsible for aurora A–dependent TACC3 targeting to the spindle, we wished to identify the proteins that interact with TACC3 in mitosis and to determine which of these interactions was dependent on TACC3 phosphorylation. We initially performed QUBIC on a TACC3-GFP cell line to identify interacting proteins. To validate the function of the TACC3-GFP transgene, and to subsequently combine QUBIC with functional RNAi experiments, we first made an RNAi-resistant TACC3-GFP BAC construct by recombineering based mutation of the region targeted by a 21mer siRNA. This construct, in addition to an mCherry–α-tubulin–expressing construct, was stably transfected into U2OS cells. The functionality of the TACC3-GFP protein was verified by its correct localization to mitotic spindles and by the fact that it did not show any noticeable phenotype after RNAi of the endogenous TACC3 (Fig. 4 A). We refer to this line as TACC3WT.
Because aurora A phosphorylates TACC3 during mitosis (Kinoshita et al., 2005), we next immunoprecipitated TACC3 from mitotically arrested cells to identify interacting proteins. TACC3 itself is the most enriched protein in the pull-down (Fig. 4 B), and the known interactors aurora A and ch-TOG also had significant p-values (P < 0.01). Multiple novel interactors were also identified by QUBIC. Interestingly, these included three clathrin subunits, CLTA, CLTB, and CLTC, as well as PIK3C2A, which associates with clathrins and is involved in mitosis (Gaidarov et al., 2001; Didichenko et al., 2003). These results are consistent with the finding that clathrin concentrates at the spindle apparatus in mitosis and is involved in microtubule stabilization (Okamoto et al., 2000; Royle et al., 2005). The protein GTSE1 was also recovered as a significant TACC3-binding protein. GTSE1 has been reported to localize to interphase microtubules, but its known functions are related to p53 regulation (Utrera et al., 1998; Monte et al., 2004).
The rapid availability of BAC transgene cell lines allowed us to perform reverse IP experiments using CLTC, GTSE1, and PIK3C2A as baits. This analysis revealed that these proteins all interact with each other and bind to several proteins that were initially identified as TACC3 interaction partners, including ch-TOG, CLINT1, and SEC16A (Fig. 4, C and D; and Fig. S4 A). We clustered specific interaction partners according to their variability in the replicate experiments and the ratios between bait and control. This uncovered a putative novel complex consisting of clathrin heavy and light chain subunits CLTA, CLTB, and CLTC, as well as CLINT1, SEC13, SEC16, PICALM, GTSE1, PIK3C2A, and ch-TOG (Fig. 4 E). In addition to a different cluster containing TACC3-specific interactors (Fig. 4 E, green), we found several proteins that interact with CLTC, GTSE1, and PIK3C2A but not TACC3 (Fig. 4 E, blue). Many of the proteins in the latter cluster are known to be located in clathrin-coated vesicles. This cluster likely represents clathrin-associated proteins present in vesicles in mitotic cells (Fig. 4 F) that do not interact with the spindle-associated clathrin directly.
The BAC-GFP cell lines allowed us to analyze the mitotic localization of putative spindle-associated interactors by fluorescence microscopy. We found that the clathrin (CLTC) and GTSE1-GFP constructs indeed localize to mitotic spindles similar to TACC3 (Fig. 4 F), which is consistent with an interaction. We next sought to determine through QUBIC whether any of the TACC3 interactors would fail to bind TACC3 when it is not phosphorylated by aurora A. Such proteins would be candidates for targeting TACC3 to spindles. We inhibited aurora A phosphorylation of a GFP-tagged TACC3 construct through two complementary methods: treating wild-type (WT) TACC3-GFP cells with the aurora A inhibitor MLN8054 and generating point mutations in conserved aurora A sites in the TACC3-GFP protein. For the latter, we additionally engineered three point mutations into the siRNA-resistant TACC3WT construct in conserved serines previously shown in Xenopus laevis or human cells to be phosphorylated by aurora A (S34A, S552A, and S558A [TACC3AAA]; Kinoshita et al., 2005; LeRoy et al., 2007).
The TACC3WT construct was not associated with spindles after 5 h of treatment with 500 nM MLN8054, which is in agreement with previous results (Fig. 5 A, bottom; LeRoy et al., 2007). In a complementary approach, we analyzed our phosphosite-mutated TACC3AAA line. RNAi of endogenous TACC3 in the TACC3WT line had no effect on TACC3WT localization to the spindle, whereas RNAi of endogenous TACC3 in the TACC3AAA line resulted in the loss of TACC3AAA from the spindle, which is similar to MLN8054 treatment (Fig. 5 B). This is consistent with previous data that a TACC3 cDNA transgene mutated at S558A does not target to mitotic spindles (LeRoy et al., 2007). We additionally found that when TACC3AAA was the only version of TACC3 expressed in cells, we observed perturbations in spindle integrity and chromosome alignment (Fig. 5 B). Thus, both methods of inhibiting aurora A phosphorylation of TACC3 led to mislocalization of TACC3 from spindles and defects in spindle assembly.
We then used label-free QUBIC to investigate the underlying proteomics changes associated with these phosphorylation events. We compared interaction partners of TACC3WT with cells treated with aurora A kinase inhibitor or cells expressing the TACC3AAA phosphomutant. When aurora A activity was inhibited by MLN8054 treatment, GTSE1 and CLINT1 bound much less to TACC3, as did the three clathrin subunits (CLTA, CLTB, and CLTC; Fig. 5 C). PIK3C2A, ch-TOG, and SEC16A showed some reduced binding, although to a lesser extent, whereas other interactors exhibited no phospho-dependent binding. Comparing TACC3AAA with TACC3WT interactors confirmed a differential, phospho-dependent interaction of GTSE1 and the clathrin subunits (Fig. S4 D). Strikingly, all proteins that showed differential binding to TACC3 upon aurora A kinase inhibitor treatment belong to the aforementioned novel complex (Fig. 5 E), whereas proteins that did not change clustered separately as TACC3-specific interactors in the initial pull-down. This suggests that members of this putative spindle-associated complex may either recruit TACC3 to mitotic spindles after its phosphorylation by aurora A or otherwise require this phosphorylation for localization to spindles.
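A comparison of this kind can be sketched as a per-prey ratio between the two conditions. The following is a minimal illustration and not the procedure used in this study: it assumes that prey intensities are normalized to the bait intensity in each pull-down so that differences in pull-down efficiency cancel, and the function name, protein names, intensities, and twofold cutoff are hypothetical.

```python
import math

def differential_binding(untreated, treated, bait, max_log2=-1.0):
    """Return log2(treated/untreated) per prey after bait normalization,
    plus the sorted list of preys whose binding drops to max_log2 or below.

    untreated and treated map protein -> summed intensity and must both
    contain the bait. Normalizing to the bait is an assumption of this sketch.
    """
    changes = {}
    for prey in untreated:
        if prey == bait or prey not in treated:
            continue
        before = untreated[prey] / untreated[bait]
        after = treated[prey] / treated[bait]
        changes[prey] = math.log2(after / before)
    reduced = sorted(p for p, c in changes.items() if c <= max_log2)
    return changes, reduced

# Hypothetical intensities for illustration only.
untreated = {"BAIT": 1.0e9, "prey_A": 4.0e7, "prey_B": 2.0e7}
treated = {"BAIT": 5.0e8, "prey_A": 2.5e6, "prey_B": 1.0e7}
changes, reduced = differential_binding(untreated, treated, "BAIT")
```

In this toy example, prey_B falls twofold in raw intensity but is unchanged relative to the bait, so only prey_A is reported as showing reduced binding.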
To test whether clathrin or GTSE1 was required to localize TACC3 to spindles, we performed RNAi of CLTC or GTSE1 in TACC3WT cells that also stably expressed mCherry–α-tubulin. RNAi of CLTC but not GTSE1 delocalizes TACC3 from spindles (Fig. 6 D). Thus, clathrin but not GTSE1 targets TACC3 to mitotic spindles, which is likely dependent on the phosphorylation of TACC3.
To confirm and expand the spindle localization dependencies of these proteins, we additionally performed RNAi of TACC3, CLTC, and GTSE1 in CLTC-GFP and GTSE1-GFP cell lines (Fig. 6). We found that depletion of neither GTSE1 nor TACC3 resulted in mislocalization of CLTC-GFP from spindles, which is consistent with our hypothesis that clathrin recruits TACC3 to spindles and suggests that GTSE1 is recruited through clathrin as well. GTSE1 RNAi depleted protein levels to <10%, confirming the efficiency of the siRNA used (unpublished data). Conversely, RNAi of either TACC3 or CLTC displaced GTSE1 from spindles, suggesting that GTSE1 is recruited downstream of phospho-TACC3 to these spindles. These results support a mechanism in which clathrin is first recruited to spindles independently of aurora A. Aurora A phosphorylation of TACC3 then allows it to interact with clathrin and to localize to spindles. There, phospho-TACC3 also recruits GTSE1. For confirmation of this mechanism, we next analyzed the localization of these proteins after treatment with the aurora A inhibitor MLN8054. Consistent with the aforementioned hypothesis, inhibition of aurora A activity resulted in the mislocalization of TACC3-GFP (Fig. 4 A, bottom; LeRoy et al., 2007) and GTSE1-GFP from spindles but not of CLTC-GFP (Fig. 6).
Interaction and localization analysis of pericentrin isoforms
Pericentrin is a large (>350 kD) conserved protein that localizes to centrosomes and the pericentriolar material and is required for centrosome function (Doxsey et al., 1994). Mutations in the pericentrin gene (PCNT2), including stop, missense, and splice site mutations, are linked to the MOPD II and Seckel syndrome disorders, which are characterized by dwarfism and microcephaly (Griffith et al., 2008; Rauch et al., 2008). Our aim was to use QUBIC to identify potential differences in binding partners of two reported pericentrin splice isoforms, only one of which contains a C-terminal PACT domain that can localize to centrosomes (Gillingham and Munro, 2000). The PACT domain is lost in the truncated forms of pericentrin found in patients with MOPD II and Seckel syndrome (Griffith et al., 2008; Rauch et al., 2008), but it is still unclear how the PACT domain recruits pericentrin to centrosomes.
We inserted a GFP tag directly before the stop codon of the largest and most commonly investigated pericentrin splice isoform (frequently termed pericentrin B). We refer to this construct as pericentrinlong. We engineered an additional pericentrin BAC construct, which we call pericentrinshort, to express a protein-GFP construct in which the final 11+ exons (688 amino acids) of the PCNT2 gene, including the PACT domain, are removed so that the mRNA product should be the same as a previously reported potential pericentrin splice isoform (termed pericentrin A or Pc-250; see Materials and methods; Fig. 7 A; Flory and Davis, 2003). Live and fixed imaging of pericentrinlong cells showed a localization of pericentrin to centrosomes throughout the cell cycle with increased abundance in mitosis (Fig. 7 B, top; Fig. S5; and Video 1; Doxsey et al., 1994). In contrast, pericentrinshort localized to the cytoplasm in interphase, and as cells entered mitosis, it quickly accumulated at centrosomes, persisting through metaphase. The centrosomal signal then dropped off rapidly as cells completed mitosis (Fig. 7 B and Video 2). We confirmed these results using fixed analysis (Fig. 7 C, arrows). From these results, we conclude that centrosome localization in interphase depends on the C-terminal region of pericentrin that contains the PACT domain.
Previous results have shown that dynein–dynactin subunits bind to pericentrin. Triplicate pull-downs of both constructs, as well as of an untagged HeLa cell line, revealed common and distinct interaction partners by label-free QUBIC and showed that all identified dynein–dynactin subunits bound significantly more to pericentrinlong (Fig. 8). PCM-1, a pericentriolar protein known to bind pericentrin (Li et al., 2001), and Fam133A, an uncharacterized protein of 30 kD, also bound preferentially to the pericentrinlong construct (ratio of 3.9, P = 5.7 × 10⁻³; and ratio of 4.6, P = 1.4 × 10⁻³, respectively).
Interestingly, one centrosomal protein, CDK5RAP2/Cep215 (Graser et al., 2007; Fong et al., 2008; Haren et al., 2009), was significantly enriched in the pull-down of the short construct (5.7-fold; P = 1.1 × 10⁻³; Fig. 8, A and C). Although the centrosomal localization patterns of CDK5RAP2/Cep215 and pericentrin are already known to depend on each other (Haren et al., 2009), our QUBIC experiment was the first evidence of a protein–protein interaction between these two centrosome proteins. Enhanced binding to the short form was surprising because the long form should have all domains of the short form. To investigate possible further differences between the baits, we mapped all identified pericentrin peptides to both forms (Fig. 8 D). We identified 91 and 128 peptides from the pericentrinlong and pericentrinshort pull-downs, respectively. None of the peptides found in the pericentrinshort pull-down mapped to the C-terminal 688–amino acid region of pericentrin, confirming its absence from the expressed protein. Surprisingly, however, a second region of ∼500 amino acids, directly N terminal to this region, was well represented (25 peptides) in the short form but absent in the long form. This was unexpected, as the published predominant cDNA, which shares the C terminus with the pericentrinlong construct, contains this region. Analysis of the genomic DNA of these cells confirmed that the DNA encoding this region was present in both constructs. Therefore, we assume that the observed discrepancy is the result of cell type–specific splicing or processing events. The finding that pericentrinshort contains a region not found in pericentrinlong is the likely explanation of the preferential binding of CDK5RAP2/Cep215 to this construct.
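Mapping identified peptides to regions of a bait, as described above, amounts to locating each peptide in the protein sequence and counting hits per region. The following minimal sketch illustrates the idea with a toy sequence; the function name, sequence, peptides, and region boundaries are hypothetical and unrelated to the actual pericentrin sequence.

```python
def map_peptides(protein_sequence, peptides, regions):
    """Count identified peptides per named region of a protein.

    regions maps name -> (start, end) as 0-based, end-exclusive indices.
    A peptide is counted for a region if it lies entirely inside it.
    Only the first occurrence of each peptide is considered, and peptides
    not found in the sequence are ignored.
    """
    counts = {name: 0 for name in regions}
    for peptide in peptides:
        start = protein_sequence.find(peptide)
        if start < 0:
            continue
        end = start + len(peptide)
        for name, (region_start, region_end) in regions.items():
            if start >= region_start and end <= region_end:
                counts[name] += 1
    return counts

# Toy sequence and peptides for illustration only.
sequence = "AAAAKCCCCKDDDDKEEEEKFFFFKGGGGK"
regions = {"n_terminal": (0, 10), "middle": (10, 20), "c_terminal": (20, 30)}
long_counts = map_peptides(sequence, ["AAAAK", "FFFFK"], regions)
short_counts = map_peptides(sequence, ["AAAAK", "DDDDK", "EEEEK"], regions)
```

Comparing the per-region counts of two baits in this way exposes regions that are covered in one form but not in the other, which is the kind of discrepancy reported here for the two pericentrin constructs.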
Recent developments in functional genomics using procedures such as RNAi have revolutionized the study of phenotype by scaling up the rate at which these experiments can be performed in a genome-wide manner. However, follow-up techniques, which map the proteomic changes underlying these phenotypic changes, have lagged behind these studies. With QUBIC, we have developed an effective technology for studying cell biological questions in the area of protein interactions, which addresses these challenges. Our study shows that modern techniques in MS together with BAC-based recombineering and live cell imaging allow rapid and quantitative assessment of members of a protein complex and how they change in response to acute chemical or mutational perturbations.
The QUBIC procedure described in this study has several attractive features. Interactors are captured on nanometer-sized beads, leading to favorable kinetics and therefore short incubation times, increasing the interactor to background ratio. Elution from the beads is performed by direct in-column enzymatic digestion. Among different quantification methods, we found that label-free quantification of high resolution MS data using the MaxQuant algorithms provided the best separation of background and specific binders. High resolution MS is an integral part of the QUBIC procedure because it leads to accurate quantitation of bait pull-down against control pull-down. This efficiently distinguishes specific binders from background proteins, even when the latter are of much higher abundance. The QUBIC technology has been applied to hundreds of baits in different projects in our laboratory and has proven extremely robust without requiring case-specific optimization.
In Table I, we summarize different aspects of the three existing major AP-MS approaches, which are based on tagged cDNA with TAP purification (Sowa et al., 2009), tagged cDNA with single-step purification (Glatter et al., 2009), or purification of endogenous protein complexes using specific antibodies (Trinkle-Mulcahy and Lamond, 2007), and compare them with QUBIC. TAP has been the basis of some of the most successful work so far in yeast, but it clearly only works for very stable associations. QUBIC only requires a small fraction of the large amounts of input material required in TAP-tagging approaches. Furthermore, the combination of high yields with short purification times minimizes the risk of losing weak interactions compared with TAP procedures. The cDNA approach inevitably involves ectopic expression of the gene, which can lead to incorrect localization (and therefore inappropriate binding) and forced interactions that do not occur in vivo. For example, many cDNA baits are not naturally expressed at all in the system that is used to study interactions. The second strategy of using antibodies against endogenous proteins is theoretically the best way to define in vivo interactions. However, it is not scalable, and it completely depends on the specificity of the antibody.
QUBIC is the only approach that combines the advantages of endogenous gene processing and gene expression while still retaining scalability. Because it uses BAC-GFP technology, it already comes with several desirable features. These include a large reagent base, manipulation of the bait by BAC recombineering, access to large genes that are not contained in cDNA libraries (or that are corrupted in those libraries), and of course direct coupling to powerful microscopy methods such as 96-well–based live cell imaging. The major conceptual advance in QUBIC is the extension of methods that were possible only in yeast to the mammalian system.
In addition, QUBIC exemplifies how interaction proteomics can be used to rapidly study the proteomic changes underlying phenotypic perturbation. By inhibiting phosphorylation of TACC3 either by small molecule inhibition of its upstream kinase or by point mutation of conserved phosphorylation sites, we identified several proteins that preferentially bind aurora A–phosphorylated TACC3, representing a novel complex associated with spindles in mitosis. We identified one member of this complex, clathrin heavy chain (CLTC), as being required for the interaction of phosphorylated TACC3 with spindles. Clathrin targeting of TACC3 to spindles suggests that reported mitotic phenotypes associated with clathrin RNAi and the observed role of clathrin in microtubule stability (Royle et al., 2005) are caused by the mislocalization of TACC3.
We also show that different forms of the protein pericentrin interact with different subsets of centrosomal proteins, which may explain their divergent localization patterns. Additionally, we found that the predominant pericentrin isoform expressed in these cells differs from the published cDNA sequence. This result illustrates a major advantage of using BACs as transgenes in that they allow the cell to process the relevant splice isoforms rather than expressing a protein from an artificial cDNA construct. High resolution MS can then characterize the isoforms expressed as shown in this study.
These applications demonstrate that QUBIC provides a versatile platform to accommodate second generation functional interaction experiments. Importantly, the quantitative nature of QUBIC makes it readily compatible with chemical inhibition or RNAi depletion, although these techniques often do not achieve full penetrance.
Despite the broad capabilities and versatility of QUBIC, it can readily be performed by nonspecialist laboratories. For BAC TransgeneOmics, BACs can be ordered and processed, and stable cell lines can be generated according to published protocols (Zhang et al., 1998; Poser et al., 2008). All other steps similarly require only standard laboratory equipment or readily available reagents and only knowledge of common biochemical procedures. Costs per pull-down are very low. QUBIC does require access to high resolution MS equipment coupled to high performance LC. However, such equipment is increasingly accessible, and the MS analyses themselves are relatively standard. Data analysis can be performed using the freely available MaxQuant software suite. Thus, any laboratory can select genes of interest and perform QUBIC on them in a wide variety of formats.
To make it easy for the research community to perform QUBIC, we need to create the generic resources involved. This includes the genome-wide generation of BAC-based vectors consisting of the gene of interest fused 5′ or 3′ to the GFP-containing cassette. First, this set of DNA constructs should be available as a resource. Second, stable cell lines of at least one common model cell line should be generated with these constructs and be available to the community. We have already streamlined the BAC TransgeneOmics process (Sarov et al., 2006; Poser et al., 2008). Based on our experience and the fact that we have so far created hundreds of stable cell lines, we predict that scale up to the whole genome is entirely feasible.
Materials and methods
BACs containing the gene of interest were purchased from BACPAC Resources Center (for detailed information see Supplemental data). A LAP tag cassette (Poser et al., 2008) was recombined at the C terminus of all TREX components, CDC23, TACC3, CLTC, GTSE1, and PIK3C2A by Red E/T–based recombination (Zhang et al., 1998; Muyrers et al., 2001). Point mutations in TACC3 were introduced through recombineering using counter selection based on an RpsL-amp cassette (Guo et al., 2006; Bird and Hyman, 2008) as described in the Counter Selection BAC Modification kit (Genebridges). For the pericentrinlong construct, a GFP tag cassette was recombined at the C terminus of the PCNT2 gene, ending with the amino acid sequence QKIKQ. For the pericentrinshort construct, a GFP tag cassette was recombined into the coding region of the PCNT gene to directly follow the amino acid sequence QKTLSK, while simultaneously deleting all of the following exons until the 3′ UTR, so as to match the sequence in the 3′ end of GenBank accession no. AY179559.
Cell culture and cell lines for BAC transfection
U2OS, HeLa, and HeLa Kyoto cells were grown in DME containing 10% fetal bovine serum, 2 mM L-glutamine, 100 U/ml penicillin, and 100 mg/ml streptomycin at 37°C and 5% CO2. BAC constructs or an mCherry–α-tubulin plasmid were transfected into cells in 6-cm dishes with 20 µl Effectene (QIAGEN) following the manufacturer’s protocol, and stable line populations were selected on G418 (BACs) or puromycin. TACC3 constructs were used in U2OS cells, pericentrin constructs were used in HeLa cells, and CLTC, PIK3C2A, APC members, and TREX members were used in HeLa Kyoto cells. GTSE1 constructs for pull-downs were used in HeLa Kyoto cells, and for localization after RNAi and inhibitor treatment, were used in U2OS cells. For siRNA transfections, cells were added to prewarmed media, and transfection complexes containing 2.0 µl Oligofectamine (Invitrogen) and 80 pmol (TACC3 and control) or 40 pmol (GTSE1, CLTC, and control) siRNA added immediately afterward in a total volume of 500 µl. Media were changed after 6–8 h. Control (Silencer Negative Control #3), TACC3 (5′-GUUACCGGAAGAUCGUCUG-3′), GTSE1 (5′-CGGCCUCUGUCAAACAUCA-3′), and CLTC (5′-GGUUGCUCUUGUUACGGAU-3′) siRNAs were purchased from Applied Biosystems. For MLN8054 experiments, cells were treated for 5 h with 500 nM MLN8054.
The following antibodies were used for immunofluorescence: mouse anti–α-tubulin (DM1α; Sigma-Aldrich), rat anti–α-tubulin (AbD Serotec), rabbit anti-pericentrin (Abcam), mouse anti-GFP (Roche), and goat anti-GFP (Poser et al., 2008). Secondary antibodies used were donkey anti–mouse, –rabbit, or –rat conjugated to Alexa Fluor 488, 594, or 647 (Invitrogen).
Cells on coverslips were fixed with PFA (TREX and APC images) or −20°C methanol (pericentrin images). Cells were blocked with 0.2% fish skin gelatin (Sigma-Aldrich) in PBS. Cells were incubated with primary antibodies in 0.2% fish skin gelatin in PBS for 20 min at 37°C and washed, and the incubation and wash were repeated with secondary antibodies. Coverslips were mounted with ProLong gold with DAPI (Invitrogen) overnight and sealed.
Microscopy and image quantification
Images of TREX and APC components were acquired using MetaMorph software (version 188.8.131.52; MDS Analytical Technologies) on a microscope (Axioplan 2; Carl Zeiss, Inc.) with a 63× 1.40 NA oil differential interference contrast Plan Apochromat objective (Carl Zeiss, Inc.) and a camera (CA 742–95; Hamamatsu Photonics) at room temperature. All other fixed and live images were acquired using an imaging system (Deltavision RT; Applied Precision) with an inverted microscope (IX70/71; Olympus) equipped with a charge-coupled device camera (CoolSNAP HQ; Roper Industries). Fixed images were acquired in 0.2-µm serial z sections using a 100× 1.35 NA UPlanApo objective at room temperature. Live cell videos were acquired in 1.5-µm serial z sections at intervals of 3 (pericentrinlong) or 15 min (pericentrinshort) using a 60× 1.42 NA PlanApo N objective at 37°C. For live three-color still images of TACC3-GFP mCherry–α-tubulin lines, 100 ng/ml Hoechst 33342 was added to the media 1 h before imaging. All live cell still images were acquired in 0.5-µm serial z sections. For live cell imaging, cells were incubated in a CO2-independent medium (Invitrogen). Datasets were deconvolved using SoftWorx software (Applied Precision).
Cell culture for QUBIC experiments
For all pull-downs, ∼10^7 cells were used. Stably transfected HeLa and U2OS cells were cultured in media containing 400 µg/ml and 500 µg/ml geneticin (Invitrogen), respectively. For SILAC labeling, HeLa cells were cultured for 2 wk in DME (4.5 g/L glucose) without lysine and with methionine (Invitrogen) containing 49 mg/ml light (C12N14) or heavy (C13N15) lysine (Euriso-Top), 100 U/ml penicillin (Invitrogen), 100 mg/ml streptomycin (Invitrogen), and 10% fetal bovine serum dialyzed with a cut off of 10 kD (Invitrogen) at 37°C and 5% CO2. The WT cell line was treated the same as a control. Cells were harvested using trypsin, washed once with PBS, and the pellet was shock frozen in liquid nitrogen and stored at −80°C until used for IP.
Specific cell culture of TACC3 cells for QUBIC
For aurora A inhibitor experiments, triplicate experiments each using four 15-cm dishes of GFP-tagged TACC3 and two 15-cm dishes of U2OS control cells were seeded to 60% confluence and arrested in mitosis by adding 2 mM thymidine (Sigma-Aldrich) for 20 h. They were then washed with PBS, and fresh media were added. After 6 h, 100 ng/ml nocodazole was added, and after an additional 3 h, aurora A kinase inhibitor MLN8054 (provided by J. Ecsedy, Millennium Pharmaceuticals, Cambridge, MA) was added to two TACC3 dishes to a final concentration of 500 nM. 5 h later, all cells were harvested.
For TACC3 RNAi of cells before QUBIC analysis, 10^7 cells for each condition were resuspended in 8 ml media without antibiotics. Transfection complexes containing 1.8 nmol siRNA and 30 µl Oligofectamine were added to cells in a 50-ml tube. Cells were incubated for 6 h at 37°C with occasional agitation and plated. 77 h after transfection, nocodazole was added to cells for 22 h, at which point cells were harvested for analysis.
Cell pellets were thawed on ice and incubated for 30 min at room temperature in 1 ml lysis buffer containing 150 mM NaCl, 50 mM Tris, pH 7.5, 5% glycerol, 1% IGEPAL-CA-630, 1 mM MgCl2, 200 U benzonase (Merck), and EDTA-free complete protease inhibitor cocktail (Roche). When studying phospho-dependent interactions, phosphatase inhibitors (Roche) were added as well. Lysates were cleared by centrifugation at 4,000 g and 4°C for 15 min to remove remaining membrane and DNA, and the supernatant was incubated with 50 µl magnetic beads coupled to monoclonal mouse anti-GFP antibody (Miltenyi Biotec) for 15 min on ice. Because of the extremely small size of the beads (50 nm), they are nonsedimenting and show fast reaction kinetics. Magnetic columns were equilibrated using 250 µl lysis buffer. Cell lysates were added to the column after incubation and washed three times with 800 µl ice-cold wash buffer I containing 150 mM NaCl, 50 mM Tris, pH 7.5, 5% glycerol, and 0.05% IGEPAL-CA-630, and two times with 500 µl of wash buffer II containing 150 mM NaCl, 50 mM Tris, pH 7.5, and 5% glycerol. Purified proteins were predigested by adding 25 µl 2 M urea in 50 mM Tris, pH 7.5, 1 mM DTT, and 150 ng EndoLysC (Wako Chemicals USA, Inc.) for SILAC experiments or 150 ng trypsin (Promega) for label-free experiments. After in-column digestion for 30 min at room temperature, proteins were eluted by adding two times 50 µl 2 M urea in 50 mM Tris, pH 7.5, and 5 mM chloroacetamide. In SILAC experiments, heavy and light eluates of transgenic cell line and the corresponding WT cell line were combined immediately after elution from the columns. Proteins were digested overnight at room temperature. The digestion was stopped by adding 1 µl trifluoroacetic acid, and peptides of each experiment were split and purified on two C18 Stage Tips and stored at 4°C (Rappsilber et al., 2007).
Pull-downs can be performed manually on a hand magnet. In our laboratory, pull-downs were performed on the automated liquid-handling platform (Freedom EVO 200; Tecan) in a fully automated manner.
Peptides were eluted from C18 Stage Tips with 2 × 20 µl solvent B (80% acetonitrile and 0.5% acetic acid). Acetonitrile was evaporated in a speed vacuum centrifuge, thereby reducing the volume to 5 µl. 10 µl solvent containing 2% acetonitrile and 0.1% trifluoroacetic acid was added.
Peptides were separated on line to the mass spectrometer by using an easy nano-LC system (Proxeon Biosystems). 5 µl samples were loaded with a constant flow of 700 nl/min onto a 15-cm fused silica emitter with an inner diameter of 75 µm (IntelliFlow; Proxeon Biosystems) packed in house with RP ReproSil-Pur C18-AQ 3 µm resin (Dr. Maisch). Peptides were eluted with a segmented gradient of 2–60% (for trypsin digest) and 5–60% (for EndoLysC digest) solvent B over 105 min with a constant flow of 250 nl/min. The nano-LC system was coupled to a mass spectrometer (LTQ-Orbitrap; Thermo Fisher Scientific) via a nanoscale LC interface (Proxeon Biosystems). The spray voltage was set to 2.1 kV, and the temperature of the heated capillary was set to 180°C.
Survey full-scan MS spectra (m/z = 300–1,650) were acquired in the Orbitrap with a resolution of 60,000 at the theoretical m/z = 400 after accumulation of 1,000,000 ions in the Orbitrap. The most intense ions (up to 10) from the preview survey scan delivered by the Orbitrap were sequenced by collision-induced dissociation (collision energy 35%) in the LTQ after accumulation of 5,000 ions concurrently to full scan acquisition in the Orbitrap (TOP10 peptide sequencing). Maximal filling times were 1,000 ms for the full scans and 150 ms for the MS/MS. Precursor ion charge state screening was enabled, and all unassigned charge states as well as singly charged peptides were rejected. The dynamic exclusion list was restricted to a maximum of 500 entries with a maximum retention period of 90 s and a relative mass window of 5 ppm. Orbitrap measurements were performed with the lock mass option enabled for survey scans to improve mass accuracy (Olsen et al., 2005).
After processing raw files with the in house–developed software MaxQuant (version 184.108.40.206 or 220.127.116.11; Cox and Mann, 2008), data were searched against the human database concatenated with reversed copies of all sequences (Peng et al., 2003) and supplemented with frequently observed contaminants (porcine trypsin, Achromobacter lyticus lysyl endopeptidase, and human keratins) using MASCOT (version 2.2.0; Matrix Science). For the analysis of pericentrin experiments, the mouse pericentrin sequence was added to the database. Carbamidomethylation of cysteines was set as a fixed modification, and oxidation of methionine and N-terminal acetylation were set as variable modifications. Mass deviation of 0.5 D was set as maximum allowed for MS/MS peaks, and a maximum of two missed cleavages were allowed. Maximum false discovery rates (FDRs) were set to 0.01 both on peptide and protein levels. Minimum required peptide length was six amino acids.
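Searching a database concatenated with reversed sequences allows the FDR to be estimated from the number of reversed (decoy) hits that pass a given score threshold. The following Python sketch illustrates this general target-decoy idea under simplifying assumptions: it uses the plain ratio of decoy to target hits as the FDR estimate and ignores score ties. It is not the procedure implemented in MaxQuant or MASCOT, and the function name and toy scores are hypothetical.

```python
def score_cutoff_for_fdr(hits, max_fdr=0.01):
    """hits: iterable of (score, is_decoy) pairs.

    Walk down the list from the best score and return the lowest target
    score at which the estimated FDR (decoys / targets accepted so far)
    is still <= max_fdr. Returns None if no target hit qualifies.
    """
    ranked = sorted(hits, key=lambda h: h[0], reverse=True)
    targets = 0
    decoys = 0
    cutoff = None
    for score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
            continue
        targets += 1
        if decoys / targets <= max_fdr:
            cutoff = score
    return cutoff


# Hypothetical toy scores: 250 target hits and 3 decoy (reversed) hits.
toy_hits = (
    [(s, False) for s in range(101, 301)]  # 200 targets, scores 300..101
    + [(100, True)]                        # first decoy
    + [(s, False) for s in range(50, 100)] # 50 targets, scores 99..50
    + [(49, True), (48, True), (47, False)]
)

cutoff = score_cutoff_for_fdr(toy_hits, max_fdr=0.01)
print(cutoff)
```

In this toy example, all hits scoring at or above the returned cutoff comprise 250 targets and 1 decoy, whereas admitting the next target would bring the estimate above 0.01.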
Quantification of proteins in SILAC experiments was performed using MaxQuant (Cox and Mann, 2008). Methionine oxidations and acetylation of protein N termini were specified as variable modifications and carbamidomethylation as fixed modification. Maximum peptide charge was set to 6. SILAC settings were adjusted to doublets, and Lys0 and Lys8 were selected as light and heavy label, respectively. Peptide and protein FDRs were set to 0.01. The maximum PEP was set to 1, and six amino acids were required as minimum peptide length. Only proteins with at least two peptides (thereof one uniquely assignable to the respective protein group) were considered as reliably identified. Unique and razor peptides were considered for quantification with a minimum ratio count of 2. Forward and reverse experiments were analyzed together and specified as QUBICH and QUBICL in the experimentalDesign.txt. Ratios of the reverse experiment QUBICL were inverted. Specific interaction partners in SILAC experiments were determined by a combination of ratio and ratio significance calculated by MaxQuant. The p-value for the significance of enrichment had to be <0.01 in both the forward and reverse experiment. The provided R script QUBIC-SILAC.R was used to plot all identified proteins according to their ratios in the forward and reverse experiment and mark specific interaction partners (http://www.r-project.org).
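The forward/reverse selection logic can be sketched in Python as follows. This is an illustration only and does not reproduce the QUBIC-SILAC.R script. It assumes that ratios are heavy/light, that the bait cell line is heavy labeled in the forward experiment and light labeled in the reverse experiment, and that ratio significance p-values have already been calculated by MaxQuant. The additional requirement that the ratio be >1 in both orientations after inversion is our assumption about how "a combination of ratio and ratio significance" might be applied. All protein names and values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SilacProtein:
    name: str
    ratio_forward: float  # H/L ratio, bait assumed heavy labeled
    p_forward: float      # ratio significance, forward experiment
    ratio_reverse: float  # H/L ratio, bait assumed light labeled
    p_reverse: float      # ratio significance, reverse experiment


def specific_interactors(proteins, max_p=0.01):
    """Return names of proteins enriched with the bait in both label
    orientations and significant (p < max_p) in both experiments."""
    selected = []
    for prot in proteins:
        inverted_reverse = 1.0 / prot.ratio_reverse  # invert the label swap
        enriched = prot.ratio_forward > 1.0 and inverted_reverse > 1.0
        significant = prot.p_forward < max_p and prot.p_reverse < max_p
        if enriched and significant:
            selected.append(prot.name)
    return selected


# Hypothetical toy values, for illustration only.
toy_proteins = [
    SilacProtein("BAIT", 20.0, 1e-8, 0.05, 1e-7),
    SilacProtein("PREY_A", 6.0, 1e-4, 0.2, 5e-4),
    SilacProtein("BACKGROUND_1", 1.1, 0.6, 0.95, 0.7),
    SilacProtein("UNLABELED_CONTAMINANT", 0.1, 1e-3, 0.1, 1e-3),
    SilacProtein("ONE_SIDED", 4.0, 0.002, 0.8, 0.3),
]

selected = specific_interactors(toy_proteins)
print(selected)
```

The label swap is what removes the unlabeled contaminant in this example: it appears light in both experiments, so it is significant in each but enriched with the bait in only one orientation.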
Label-free quantification was performed with MaxQuant (see Supplemental data). Methionine oxidations and acetylation of protein N termini were specified as variable modifications and carbamidomethylation as fixed modification. Maximum peptide charge was set to 6. SILAC settings were set to singlets. Peptide and protein FDRs were set to 0.01. The maximum PEP was set to 1, and six amino acids were required as minimum peptide length. Only proteins with at least two peptides (thereof one uniquely assignable to the respective protein group) were considered as reliably identified. Label-free protein quantification was switched on, and unique and razor peptides were considered for quantification with a minimum ratio count of 1. Retention times were recalibrated based on the built-in nonlinear time-rescaling algorithm. MS/MS identifications were transferred between LC-MS/MS runs with the “Match between runs” option in which the maximal retention time window was set to 2 min. The quantification is based on the extracted ion current and is taking the whole three-dimensional isotope pattern into account. At least two quantitation events were required for a quantifiable protein. Every single experiment/raw file was annotated as a separate experiment in experimentalDesign.txt. Control experiments were named Control1, Control2, and Control3. Pull-downs were named with the specific bait name and the replicate number. Identification of specific interaction partners was determined using the MaxQuant-based program QUBICvalidator. The proteinGroups.txt file was loaded (Load – Generic), and a group file template, Groups.txt, was generated (Processing – Groups – Write group file template). Replicates were grouped using one unique name in Groups.txt. The file was then loaded into QUBICvalidator (Processing – Groups – Load groups). Subsequently, results were cleaned for reverse hits and contaminants (Processing – Filter – Filter category – Reverse = + and Contaminant = +). 
Positive intensity values were logarithmized (Processing – Transformation – LOG – Log2). Signals that were originally zero were imputed with random numbers from a normal distribution, whose mean and standard deviation were chosen to best simulate low abundance values below the noise level (Processing – Imputation – Replace missing values by normal distribution – Width = 0.3; Shift = 1.8). Significant interactors were determined by a volcano plot-based strategy, combining t test p-values with ratio information. The standard equal group variance t test was applied (Processing – Testing – Two groups). Significance lines in the volcano plot corresponding to a given FDR were determined by a permutation-based method (Tusher et al., 2001). The pull-down was selected as Group1 and the control as Group2. Threshold values (= FDR) were selected between 0.1 and 0.001 and SO values (= curve bend) between 0.5 and 2.0. The resulting table was then exported (Export – Tab separated). The second tab (Table S1 and Table S2) was then selected, and its values were saved under the same file name supplemented with “_sup” (e.g., Exp.txt → Exp_sup.txt). Results were then plotted using the open source statistical software R and the provided script QUBIC-LABELFREE.R. At the beginning of the script, Exp.txt and Exp_sup.txt have to be replaced with the actual file names. Dynamic experiments were plotted using the script QUBIC-LABELFREE_dynamic.R. Significant TREX and TACC3 interactors were clustered using Genesis (Sturn et al., 2002).
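The transformation, imputation, and testing steps can be illustrated with a minimal Python sketch using only the standard library. It is not the QUBICvalidator implementation. We assume here that Width and Shift are expressed in units of the standard deviation of the observed log2 intensities of each sample (i.e., a narrowed, down-shifted normal distribution), and that the curve-bend parameter enters as a constant s0 added to the denominator of the equal-variance t statistic, as in Tusher et al. (2001). The p-values and the permutation-based significance lines are omitted, and all names and intensities are hypothetical.

```python
import math
import random
import statistics


def log2_or_none(intensities):
    """Log2-transform positive intensities; zeros become None (missing)."""
    return [math.log2(x) if x > 0 else None for x in intensities]


def impute_missing(column, rng, width=0.3, shift=1.8):
    """Replace missing values in one sample with draws from a normal
    distribution with mean (mu - shift * sd) and s.d. (width * sd), where
    mu and sd describe the observed values (assumed interpretation)."""
    observed = [v for v in column if v is not None]
    mu = statistics.mean(observed)
    sd = statistics.stdev(observed)
    return [v if v is not None else rng.gauss(mu - shift * sd, width * sd)
            for v in column]


def modified_t(group1, group2, s0=1.0):
    """Return (log2 difference, equal-variance t statistic with s0 added
    to the standard error in the denominator)."""
    n1, n2 = len(group1), len(group2)
    diff = statistics.mean(group1) - statistics.mean(group2)
    pooled_var = ((n1 - 1) * statistics.variance(group1)
                  + (n2 - 1) * statistics.variance(group2)) / (n1 + n2 - 2)
    std_err = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return diff, diff / (std_err + s0)


# Hypothetical toy intensities: one list per replicate, one entry per protein.
names = ["BAIT", "PREY_A", "BACKGROUND_1", "BACKGROUND_2", "BACKGROUND_3"]
pulldown_raw = [[8e8, 4e7, 1.0e7, 2.1e6, 5.0e5],
                [9e8, 5e7, 1.2e7, 1.9e6, 4.5e5],
                [7e8, 3e7, 0.9e7, 2.0e6, 5.5e5]]
control_raw = [[0, 0, 1.1e7, 2.0e6, 5.2e5],
               [0, 0, 1.0e7, 2.2e6, 4.8e5],
               [0, 0, 1.05e7, 1.8e6, 5.0e5]]

rng = random.Random(42)  # fixed seed so the illustration is reproducible
pulldown = [impute_missing(log2_or_none(c), rng) for c in pulldown_raw]
control = [impute_missing(log2_or_none(c), rng) for c in control_raw]

results = {}
for i, name in enumerate(names):
    group1 = [col[i] for col in pulldown]  # pull-down = Group1
    group2 = [col[i] for col in control]   # control = Group2
    results[name] = modified_t(group1, group2, s0=1.0)

for name, (diff, stat) in results.items():
    print(f"{name}: log2 difference = {diff:.2f}, statistic = {stat:.2f}")
```

Because values that are absent from the control are imputed near the low end of the intensity distribution, the bait and its interactors obtain large positive log2 differences, whereas background proteins present in both groups stay close to zero.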
A detailed step by step protocol and the raw data and programs associated with this manuscript may be downloaded from https://proteomecommons.org/tranche, launching Tranche, choosing “Open By Hash”, and entering the following hash: iNYsECWFuN0KDV0Q8QoE3uXxRGuBiCo5+iwydOM7h29jlyPv+Xv4+1piRkFr+mcnsy+eErYIvmcRQf9ZU/l5lxQYNQYAAAAAAABFCA==
Online supplemental material
Fig. S1 shows development of the QUBIC technology. Fig. S2 shows additional SILAC pull-downs of the TREX complex components. Fig. S3 shows an additional SILAC pull-down of CDC23. Fig. S4 shows additional label-free pull-downs of TACC3. Fig. S5 shows that pericentrinlong GFP colocalizes with anti-pericentrin antibody throughout the cell cycle. Table S1 shows specific interaction partners of label-free pull-downs of TACC3, CLTC, GTSE1, and PIK3C2A. Table S2 shows links to the University of California, Santa Cruz genome browser for used BACs, BAC length, gene length, number, and name of additional genes. Video 1 shows that pericentrinlong localizes to centrosomes throughout mitosis and the cell cycle. Video 2 shows that pericentrinshort localizes to centrosomes in mitosis but not interphase. Supplemental data show step by step QUBIC protocol, QUBICvalidator (download at Tranche), and R scripts, including test datasets (download at Tranche).
We thank Maximiliane Hilger, Michiel Vermeulen, and Trisha Davis for critical reading of the manuscript and Jennifer Yen for help with TACC3 mutant characterization.
This work was supported by the German National Genome Research Network (From Disease Genes to Protein Pathways [DiGtoP] grant) and PROSPECTS, a seventh framework program of the European Research Directorate.
BAC, bacterial artificial chromosome
FDR, false discovery rate
QUBIC, quantitative BAC-GFP interactomics
SILAC, stable isotope labeling by amino acids in cell culture
N.C. Hubner and A.W. Bird contributed equally to this paper.