Cell biologists have been afforded extraordinary new opportunities for experimentation by the emergence of powerful technologies that allow the selective manipulation of gene expression. Currently, RNA interference is very much in the limelight; however, significant progress has also been made with two other approaches. Thus, antisense oligonucleotide technology is undergoing a resurgence as a result of improvements in the chemistry of these molecules, whereas designed transcription factors offer a powerful and increasingly convenient strategy for either up- or down-regulation of targeted genes. This mini-review will highlight some of the key features of these three approaches to gene regulation, as well as provide pragmatic guidance concerning their use in cell biological experimentation based on our direct experience with each of these technologies. The approaches discussed here are being intensely pursued in terms of possible therapeutic applications. However, we will restrict our comments primarily to the cell culture situation, only briefly alluding to fundamental differences between utilization in animals versus cells.
The ability of antisense oligonucleotides to suppress gene expression was discovered more than 25 yr ago (Zamecnik and Stephenson, 1978). For a decade or more thereafter, antisense was viewed as a promising tool for selective gene regulation in experimental and therapeutic situations. However, despite massive efforts, the therapeutic potential of antisense oligonucleotides has yet to be fully achieved, and their use as routine laboratory tools has encountered difficulties. The basis for these problems lies mainly with the chemistry of early “first generation” antisense compounds, which are now being superseded by newer second or third generation molecules with improved characteristics.
Basic mechanisms and the role of chemical modifications.
Antisense oligonucleotides base pair with mRNA and pre-mRNAs and can potentially interfere with several steps of RNA processing and message translation, including splicing, polyadenylation, export, stability, and protein translation (Sazani and Kole, 2003; Crooke, 2004). However, the two most powerful and widely used antisense strategies are the degradation of mRNA or pre-mRNA via RNaseH and the alteration of splicing via targeting aberrant splice junctions. These two strategies are based on very distinct oligonucleotide chemistries, as discussed below.
RNaseH recognizes DNA/RNA heteroduplexes and cleaves the RNA approximately midway between the 5′ and 3′ ends of the DNA oligonucleotide. This event takes place in the nucleus. Additional enzymatic processes rapidly degrade the cleaved RNA, whereas the DNA oligonucleotide can recycle and participate in further rounds of scission and degradation (Crooke, 2000). RNaseH requires a B-type heteroduplex and does not cleave RNA/RNA A-type duplexes (Fig. 1). Standard phosphodiester oligonucleotides as well as phosphorothioates, which are first generation chemically modified forms, effectively trigger RNaseH-mediated cleavage. Unfortunately, phosphodiester oligonucleotides are extremely unstable in cells, whereas phosphorothioates display reduced binding affinities for RNA targets as well as a number of other liabilities that are related to extensive nonspecific protein binding. Therefore, newer second and third generation chemistries have been devised to overcome these problems (Dean and Bennett, 2003; Kurreck, 2003; Fig. 2).
Modifications of the 2′ position of the ribose ring lead to RNA-like oligonucleotides, including 2′-fluoro, 2′-O-methyl, and 2′-O-methoxy-ethyl (2′-MOE) oligonucleotides. These compounds, especially the 2′-MOE versions, show significantly increased binding affinity, good nuclease stability, and reduced protein binding as compared with phosphorothioates. A recent addition to the RNA-like oligonucleotides is the locked nucleic acids (LNAs), where the ribose 2′-oxygen connects with the 4′-carbon; these molecules have extraordinarily high RNA binding affinities (Petersen and Wengel, 2003). A variety of other oligonucleotide chemistries have been developed, including sugar ring alterations such as anhydrohexitol nucleic acids (HNAs; Kang et al., 2004), as well as backbone modifications such as peptide nucleic acids, methyl phosphonates, and morpholino nucleic acids, all of which provide uncharged oligomers (Kurreck, 2003). Each of these novel oligonucleotides displays excellent binding affinity and resistance to degradation. However, none of the RNA-like oligonucleotides support significant RNaseH activity, nor do HNAs, peptide nucleic acids, methyl phosphonate, or morpholino backbones. Fortunately, there is a simple solution to this conundrum: the use of gapmer oligonucleotides. Gapmers contain a central sequence of five to eight phosphodiester or (more usually) phosphorothioate residues flanked by 5′ and 3′ sequences drawn from the RNA-like or backbone-modified chemistries. The gapmers can fully support RNaseH activity while retaining many of the desirable properties of the modified oligonucleotides, including high RNA binding, nuclease resistance, and low protein binding, thus leading to enhanced potency and reduced toxicity.
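The gapmer architecture described above can be made concrete with a small sketch. The function name `gapmer_layout`, the label strings, and the choice to split the wings as evenly as possible are illustrative assumptions and do not come from any cited design tool; the only constraint taken from the text is a central gap of five to eight RNaseH-competent residues.

```python
def gapmer_layout(length, gap=8, wing_chem="MOE"):
    """Return one chemistry label per residue for a hypothetical gapmer.

    The central `gap` residues are labeled "PS" (phosphorothioate DNA,
    able to support RNaseH); the 5' and 3' wings carry `wing_chem`.
    """
    if not 5 <= gap <= 8:
        raise ValueError("the text describes gaps of five to eight residues")
    if length < gap + 2:
        raise ValueError("oligonucleotide too short to carry two wings")
    left = (length - gap) // 2       # 5' wing length (assumed near-even split)
    right = length - gap - left      # 3' wing takes any odd residue
    return [wing_chem] * left + ["PS"] * gap + [wing_chem] * right


layout = gapmer_layout(20, gap=8, wing_chem="MOE")
print("-".join(layout))
```

Keeping the RNaseH-competent stretch short is the point of the design: only the central segment can recruit the nuclease, while the wings contribute affinity and nuclease resistance.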
The very lack of RNaseH stimulation displayed by many second/third generation homooligonucleotides makes them appropriate choices for an antisense strategy involving the alteration of RNA splicing. Several years ago, Sierakowska et al. (1996) demonstrated that antisense oligonucleotides could correct the splicing of an aberrantly spliced thalassemic β-globin message. This seminal observation opened the door to many experimental and therapeutic possibilities because ∼60% of human genes are alternatively spliced. Obviously, splice-site correction requires the use of antisense oligonucleotides that do not degrade RNA; thus, RNA-like and backbone-modified compounds have been very valuable in this context. A variety of chemically modified oligonucleotides have been used to correct the splicing of introduced or endogenous genes in cell cultures as well as in vivo (Sazani and Kole, 2003). Antisense manipulation of splice-site selection is unique in that either down- or up-regulation of the target mRNA can be pursued. For example, in the thalassemic case, it is increased expression of a normal β-globin message that is desired; up-regulation was also desired in a recent study that used LNA oligonucleotides to correct the splicing of the Duchenne muscular dystrophy gene (Aartsma-Rus et al., 2004).
Efficacy and selectivity.
In pharmacological terms, efficacy relates to the magnitude of the effect on the target, whereas selectivity relates to the uniqueness of the effect on target versus nontarget entities. It is obviously desirable to use antisense oligonucleotides with the highest efficacy and greatest selectivity possible so as to attain a maximal effect on the expression of the target gene with minimal off-target or other side effects. High efficacy is often associated with high potency; that is, with effects at low concentrations. The potency of an antisense oligonucleotide is determined by many factors. Some are innate to the oligonucleotide, and some are associated with the target gene and the cell type under study. High affinity base pairing is an important contribution to potency. However, the ability of an antisense oligonucleotide to bind to RNA in a cell also depends on the secondary structure of the RNA, as well as on protein binding that may block access of the oligonucleotide (Dean and Bennett, 2003; Vickers et al., 2003). It is also important to remember that antisense processes are enzymatically mediated; thus, variations in the relative abundance of RNaseH (or proteins involved in splicing) mean that an antisense compound may have variable effects in different cells. Finally, the efficiency of intracellular delivery is a key issue. Most delivery techniques for antisense (or small interfering RNA [siRNA]) oligonucleotides result in only a fraction of the cell population attaining useful intracellular levels of the oligonucleotide. In many cases, delivery is the key determinant of efficacy. Thus, in our experience, there is often a strong correlation between the fraction of the viable cell population that displays intracellular oligonucleotides and the degree of reduction of the target message and protein. Despite these various caveats, good progress has been made in creating antisense oligonucleotides that are quite potent and effective. 
For example, gapmers with MOE, HNA, or LNA 5′- and 3′-flanking sequences have been efficacious in cell cultures when used at low nanomolar levels (Alahari et al., 1998; Petersen and Wengel, 2003; Vickers et al., 2003; Kang et al., 2004).
Another key parameter is selectivity. One wishes to down-regulate the message and protein levels of the gene of interest and no other gene; however, most efforts fall far short of this goal. Approaches for evaluating the selectivity of antisense (or siRNA) oligonucleotides have been rapidly evolving. Previously, selectivity was often assumed if (a) mismatched or scrambled oligonucleotides were without effect on the target or (b) the oligonucleotide failed to affect common nontarget proteins (actin and tubulin). However, this limited and simplistic approach is inadequate, and a few recent studies have used DNA array technology as a more stringent means of evaluating specificity.
There are two aspects to the selectivity problem. The first relates to the highly interconnected nature of the cellular economy and represents an essentially unavoidable situation. Obviously, an agent that reduces the expression of a key regulatory protein will have an impact on many of its downstream effectors. A good example of this comes from a DNA array study of an antisense oligonucleotide that was directed against a subunit of protein kinase A; manipulation of this key regulatory protein affected numerous downstream genes (Cho et al., 2001). In contrast, an antisense molecule targeting the MDR1 gene, which has fewer downstream connections, resulted in a much smaller number of nontarget genes being affected (Astriab-Fisher et al., 2002b). It is clear, however, that antisense oligonucleotides cause effects on gene expression that are unrelated to their suppression of the target gene (Astriab-Fisher et al., 2002b; Eder et al., 2003).
A second, more controllable, aspect of selectivity relates to the inappropriate degradation of nontarget RNAs and other side effects. This issue can largely be addressed by proper oligonucleotide design (Crooke, 2000; Sczakiel, 2000). For example, it is obviously important to use a computer algorithm to make sure that the targeted sequence is unique to the gene of interest. Furthermore, the use of an oligonucleotide that can sustain RNaseH activity along its entire length is problematic because mRNAs that have only partial matches can recruit the nuclease, leading to off-target degradation (Lebedeva and Stein, 2001). This issue is best addressed by the use of gapmers that contain only a short segment capable of supporting RNaseH. Inappropriate side effects can also occur independently of RNA degradation. As mentioned, oligonucleotides rich in phosphorothioate residues bind strongly to many cell proteins and can potentially interfere with their function (Lebedeva and Stein, 2001). In addition, some single-stranded oligonucleotides can form three-dimensional stem–loop structures that act as aptamers; that is, they can bind to protein receptors in a manner similar to drug molecules, thus causing biological effects (Potti et al., 2004). Finally, oligonucleotides containing CpG motifs can bind to Toll-like receptors, leading to the inappropriate activation of innate immune responses (Krieg, 2002).
Another issue that relates to both potency and selectivity involves the mode of delivery of the oligonucleotide (Astriab-Fisher et al., 2002a). Interestingly, there is a striking and unexplained difference between cell cultures, in which the use of a delivery agent is required to attain antisense effects, and the situation in animals, in which effects can be achieved with “free” oligonucleotides (Juliano and Yoo, 2000). In cell cultures, various commercially available polycationic lipids or polymer preparations can be used for transfection (Thierry et al., 2003). Other approaches, such as conjugates of oligonucleotides with cell-penetrating peptides, have also been effective (Astriab-Fisher et al., 2002a; Morris et al., 2004; Moulton et al., 2004). It is important to note that most of the delivery technologies commonly used, including cationic lipids, cationic polymers, and electroporation, are inherently toxic to cells (Juliano and Yoo, 2000) and that transfection reagent effects are sometimes manifested as nonspecific changes in gene expression (Omidi et al., 2003).
Antisense in the cell biologist's toolkit.
The following is a possible approach to incorporate antisense technology into your gene regulation toolkit. First, decide on the type of oligonucleotide you will use; if you wish to knock down an mRNA, then a phosphorothioate gapmer is a good place to start. The 5′ and 3′ “wings” flanking the gap could use any of several chemistries, but 2′-O-methyl gapmers are readily available commercially and often work well (Table S1). Next, you will need to design a number of oligonucleotides that are complementary to your target, essentially doing an “RNA walk.” Because only ∼1/10 of oligonucleotides are effective (Lebedeva and Stein, 2001), choosing ∼20 to test is reasonable. Most investigators use 15–20-mer oligos, although there are arguments that shorter oligonucleotides are actually more specific. Software such as Oligo 6 can enhance antisense design by finding sequences that have duplex melting temperatures well above 37°C and that are not self-complementary. The oligonucleotides will then need to be screened for effectiveness. One simple approach is to use a commercial in vitro transcription kit to generate target mRNA, and then add the oligonucleotides to be tested and Escherichia coli RNaseH. Degradation of messages can be quantitated by Northern blot or by real-time PCR. This approach is still somewhat artificial, and perhaps the surest approach is to transfect the oligonucleotides into the cell type of interest, and then measure the extent of target mRNA knockdown. A reasonable goal would be to attain an 80–90% reduction of target messages by using concentrations of oligonucleotides in the 10–100 nM range. Significant reduction of mRNA levels usually takes place within 1 d after transfection. 
Although the investigator's goal may be to attain a biological effect and/or to knock down a protein, it is vital to also test antisense oligonucleotides for their actions at the message level because, as discussed above, nonantisense effects can also occur at the protein and overall cell function levels. Once a promising antisense sequence is chosen, it is important to perform a number of controls. These include testing the effects of scrambled, mismatched, or irrelevant sequences on the target message as well as evaluating the effect of the selected antisense on nontarget mRNAs. The most stringent way to do this is to use DNA arrays; however, at minimum, several nontarget messages should be evaluated.
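The “RNA walk” design step in this section can be sketched in a few lines of Python. This is a rough illustration and not the algorithm used by Oligo 6 or any other package: the melting temperature uses a crude empirical formula for unmodified DNA duplexes (it ignores salt, concentration, and all second or third generation chemistries), the 45°C cutoff is an assumed stand-in for “well above 37°C,” and the function names are hypothetical.

```python
DNA_COMP = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """DNA antisense strand (5'->3') for an RNA target segment."""
    pairs = {"A": "T", "U": "A", "G": "C", "C": "G"}
    return "".join(pairs[b] for b in reversed(rna.upper()))


def rough_tm(oligo):
    """Crude Tm estimate (deg C) for oligos longer than ~13 nt; DNA/DNA only."""
    gc = oligo.count("G") + oligo.count("C")
    return 64.9 + 41.0 * (gc - 16.4) / len(oligo)


def is_self_complementary(oligo, stem=6):
    """True if any `stem`-long window can pair with some part of the same oligo."""
    rc = "".join(DNA_COMP[b] for b in reversed(oligo))
    return any(oligo[i:i + stem] in rc for i in range(len(oligo) - stem + 1))


def rna_walk(target, length=20, n_candidates=20, min_tm=45.0):
    """Tile evenly spaced antisense candidates along the target mRNA."""
    target = target.upper().replace("T", "U")
    last_start = len(target) - length
    if last_start < 0:
        raise ValueError("target is shorter than the requested oligo length")
    step = max(1, last_start // max(1, n_candidates - 1))
    candidates = []
    for start in range(0, last_start + 1, step):
        oligo = reverse_complement(target[start:start + length])
        if rough_tm(oligo) >= min_tm and not is_self_complementary(oligo):
            candidates.append((start, oligo))
    return candidates[:n_candidates]
```

Candidates that pass such in silico filters still have to be screened experimentally, because target secondary structure and protein occupancy are not captured here.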
The explosive growth of studies on RNA interference has provided a fascinating picture of how RNA molecules can participate in multiple endogenous gene regulatory processes. Starting with work in plants, fungi, and lower animals but with increasing emphasis on mammalian systems, investigators have found innate RNA-mediated mechanisms that regulate mRNA stability, message translation, and chromatin organization (Mello and Conte, 2004). Furthermore, exogenously introduced long double-stranded RNA (dsRNA) is an effective tool for gene silencing in a variety of lower organisms. However, in mammals, long dsRNAs elicit highly toxic responses that are related to the effects of viral infection and interferon production (Williams, 1997). To avoid this, Elbashir et al. (2001) initiated the use of siRNAs composed of 19-mer duplexes with 5′ phosphates and 2 base 3′ overhangs on each strand, which selectively degrade targeted mRNAs upon introduction into cells.
The action of interfering dsRNA in mammals usually involves two enzymatic steps (Fig. 1). First, Dicer, an RNase III–type enzyme, cleaves dsRNA to 21–23-mer siRNA segments. Then, RNA-induced silencing complex (RISC) unwinds the RNA duplex, pairs one strand with a complementary region in a cognate mRNA, and initiates cleavage at a site 10 nucleotides upstream of the 5′ end of the siRNA strand (Hannon, 2002; Paddison and Hannon, 2002; Dorsett and Tuschl, 2004). This process takes place in the cytoplasm. In mammals, the Argonaute 2 protein seems to be the key component of the RISC complex responsible for mRNA cleavage (Liu et al., 2004). Short, chemically synthesized siRNAs in the 19–22 mer range do not require the Dicer step and can enter the RISC machinery directly. It should be noted that either strand of an RNA duplex can potentially be loaded onto the RISC complex, but the composition of the oligonucleotide can affect the choice of strands. Thus, to attain selective degradation of a particular mRNA target, the duplex should favor loading of the antisense strand component by having relatively weak base pairing at its 5′ end (Khvorova et al., 2003; Schwarz et al., 2003; Meister and Tuschl, 2004). In addition to mRNA cleavage, RISC complexes can also regulate expression at the translational level; the discovery of a large number of micro-RNAs (miRNAs) as endogenous regulatory components has provided insights into these events (Bartel, 2004). RISC–miRNA complexes inhibit translation by interacting with sites in the 3′ regulatory regions of mRNA. Although siRNA-mediated message cleavage requires a perfect or near perfect sequence match, miRNA action requires a lesser degree of complementarity (Bartel, 2004; Meister and Tuschl, 2004) and, thus, is a possible source of off-target effects at the protein level. 
Although there has been extensive work performed on the intracellular processing of antisense oligonucleotides (Juliano and Yoo, 2000; Thierry et al., 2003), this is just beginning for siRNAs. However, at least one study suggests an interesting correlation between subcellular localization and effect (Chiu et al., 2004).
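The duplex geometry described above can be illustrated with a short sketch. It assumes a 19-nucleotide core chosen within the mRNA, adds the same 2-base 3′ overhang to both strands (the default `"UU"` is a placeholder, not a recommendation), and places the cut between the target nucleotides paired to positions 10 and 11 of the antisense strand counted from its 5′ end, which is one reading of the “10 nucleotides” stated in the text. The function names are hypothetical.

```python
RNA_COMP = {"A": "U", "U": "A", "G": "C", "C": "G"}


def revcomp_rna(seq):
    """Reverse complement of an RNA sequence, written 5'->3'."""
    return "".join(RNA_COMP[b] for b in reversed(seq))


def build_sirna(mrna, core_start, core_len=19, overhang="UU"):
    """Return (sense, antisense, cut_index) for a core site in `mrna`.

    `cut_index` is the 0-based mRNA index at which the message is split:
    mrna[:cut_index] is the 5' fragment, mrna[cut_index:] the 3' fragment.
    """
    mrna = mrna.upper().replace("T", "U")
    core = mrna[core_start:core_start + core_len]
    if len(core) != core_len:
        raise ValueError("core runs past the end of the mRNA")
    sense = core + overhang                   # passenger strand, 5'->3'
    antisense = revcomp_rna(core) + overhang  # guide strand, 5'->3'
    # Guide position p (1-based from its 5' end) pairs with core index
    # core_len - p, so positions 10 and 11 straddle the index below.
    cut_index = core_start + core_len - 10
    return sense, antisense, cut_index


mrna = "GGG" + "AUGGCUAGCUAGGAUCCAA" + "CCC"  # made-up sequence for illustration
sense, antisense, cut = build_sirna(mrna, core_start=3)
print(sense, antisense, cut)
```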
Exogenous siRNAs can be provided as synthesized oligonucleotides or expressed from plasmid or viral vectors (Paddison and Hannon, 2003; Hannon and Rossi, 2004). In the latter case, precursor molecules are usually expressed as short hairpin RNAs (shRNAs) containing loops of 4–8 nucleotides and stems of 19–30 nucleotides; these are then cleaved by Dicer to form functional siRNAs. Although a variety of vector designs are possible, in most cases pol III–dependent promoters from mouse U6 small nuclear RNA or human RNaseP are used, and the shRNA is terminated by a series of Ts that comprise a stop signal. Inducible forms of pol III siRNA vectors have also been reported (van de Wetering et al., 2003; Hosono et al., 2004). Many vectors have been used to express shRNAs; these include lentiviruses (Sumimoto et al., 2005), adenoviruses (Hosono et al., 2004), adeno-associated viruses (Xia et al., 2004; Xu et al., 2005), retroviruses (Yang et al., 2003), and transposons (Heggestad et al., 2004). In some cases, pol II–driven vectors have been used, including ones with tissue specificity, but it is not clear that these vectors are as potent as their pol III counterparts (Song et al., 2004).
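As a minimal sketch of the hairpin layout just described, the following function assembles the DNA insert for an shRNA cassette as sense stem, loop, antisense stem, and a run of Ts. The stem and loop length limits come from the text; the default loop sequence `"TTCG"` and the default of five terminator Ts are placeholders chosen for illustration, and cloning overhangs or promoter-specific requirements are not modeled.

```python
def shrna_insert(stem_sense, loop="TTCG", n_terminator_t=5):
    """Return the sense-strand DNA insert encoding an shRNA (5'->3')."""
    stem_sense = stem_sense.upper().replace("U", "T")
    if not 19 <= len(stem_sense) <= 30:
        raise ValueError("stem should be 19-30 nucleotides")
    if not 4 <= len(loop) <= 8:
        raise ValueError("loop should be 4-8 nucleotides")
    comp = {"A": "T", "T": "A", "G": "C", "C": "G"}
    # The antisense arm is the reverse complement of the sense arm, so the
    # transcript folds back on itself to form the stem.
    stem_antisense = "".join(comp[b] for b in reversed(stem_sense))
    return stem_sense + loop + stem_antisense + "T" * n_terminator_t


insert = shrna_insert("GCTAGCTAGGATCCAAGTC")  # made-up 19-nt stem
print(insert)
```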
The fact that shRNA can be deployed in a vector context opens opportunities to create large libraries of shRNAs for use in screening and genome-wide studies (Hannon and Rossi, 2004). There have been a number of approaches to this, including the coordinated development of large human and mouse siRNA libraries that are bar coded for convenient identification (Paddison et al., 2004). Library shRNA strategies have been used to identify new elements of the p53 pathway, the NFkB pathway, and PI-3-kinase signaling (Berns et al., 2004; Hsieh et al., 2004; Zheng et al., 2004). New approaches to the rapid generation of shRNA libraries from cDNAs have also been developed, such as the restriction enzyme–generated siRNA system (Sen et al., 2004) and others (Luo et al., 2004; Shirane et al., 2004).
siRNA design and chemical modifications.
Currently, there is a great deal of interest in identifying and designing highly effective siRNAs, including the use of chemical modifications to improve stability or potency. Current (imperfect) knowledge of design has been embedded in various computer algorithms that examine the target mRNA and predict effective siRNA sequences. Some of the design elements include G/C content, lack of internal repeats, low duplex stability at the 5′ antisense terminus (as discussed above), and BLAST searching to ensure uniqueness (Reynolds et al., 2004). Various commercial suppliers of siRNA have such software on their websites, but excellent nonproprietary tools are also beginning to appear (Cui et al., 2004). An important unresolved issue in siRNA design is the degree to which RNA folding and protein binding influence the effectiveness of siRNAs (Dorsett and Tuschl, 2004). Some studies have suggested that siRNA action is relatively independent of such factors, whereas others have indicated that siRNA, like antisense, is highly influenced by RNA structure (Kretschmer-Kazemi Far and Sczakiel, 2003; Vickers et al., 2003).
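The design elements listed above lend themselves to simple sequence filters. The sketch below is not the algorithm of Reynolds et al. (2004) or of any vendor: the G/C window of 30–52% is an assumed threshold, “internal repeats” is interpreted as inverted repeats that could fold the strand on itself, and low stability at the 5′ antisense terminus is approximated by counting A/U pairs in the four terminal base pairs at each end of the duplex. A BLAST-style uniqueness search would still be needed separately.

```python
RNA_COMP = {"A": "U", "U": "A", "G": "C", "C": "G"}


def gc_fraction(seq):
    return (seq.count("G") + seq.count("C")) / len(seq)


def has_inverted_repeat(seq, k=5):
    """True if a k-mer and its reverse complement occur without overlapping."""
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        rc = "".join(RNA_COMP[b] for b in reversed(kmer))
        if seq.find(rc, i + k) != -1:
            return True
    return False


def favors_antisense_loading(core, window=4):
    """Crude asymmetry check on the sense-strand core (5'->3').

    The 5' end of the antisense strand pairs with the 3' end of the core,
    so that end should be richer in A/U than the 5' end of the core.
    """
    au = lambda s: s.count("A") + s.count("U")
    return au(core[-window:]) > au(core[:window])


def passes_filters(core, gc_range=(0.30, 0.52), k=5):
    core = core.upper().replace("T", "U")
    low, high = gc_range
    return (low <= gc_fraction(core) <= high
            and not has_inverted_repeat(core, k)
            and favors_antisense_loading(core))
```

Sequences that pass are candidates only; as the text notes, the influence of RNA folding and protein binding on siRNA efficacy remains unresolved.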
siRNAs are tolerant of a considerable degree of chemical modification (Harborth et al., 2003), although some alterations cause a loss of activity (Chiu and Rana, 2003). In general, sense strands and 3′ regions are more amenable to modification, whereas the 5′ region on the antisense strand and the central region are more sensitive (Chiu and Rana, 2003; Czauderna et al., 2003; Hall et al., 2004). The importance of chemical modifications is exemplified by a recent study of siRNA effects in animals. In the study, partial phosphorothioate backbone, 2′-O-methyl sugar modifications, and 3′ cholesterol-protected sense strands were used, resulting in significant protection against the 3′ exonuclease activity that ordinarily rapidly degrades siRNAs in serum (Soutschek et al., 2004).
Efficacy and selectivity.
RNA interference in lower organisms is incredibly potent, with the introduction of a few molecules of dsRNA leading to virtually complete gene silencing (Sijen et al., 2001). However, in mammals, siRNA is considerably less robust. Although there have been reports of siRNA effects on mammalian cells at picomolar concentrations (Hannon and Rossi, 2004; Hassani et al., 2005), this seems exceptional, and most studies find significant target knockdown in the 10–100 nM range (Attwell et al., 2003; Mitra et al., 2004; Xu et al., 2004). Similarly, although some studies have reported virtual ablation of endogenous target messages, particularly when viral vectors were used (Chen et al., 2004), it is more common to observe a 40–90% knockdown of messages and proteins when siRNA oligonucleotides are transfected (Hannon and Rossi, 2004). One very appealing feature of siRNA technology is that it apparently is possible to efficiently knock down more than one target at the same time. There are a few examples of this in the literature using either chemically synthesized siRNAs (Fukuda et al., 2003; Mitra et al., 2004) or hairpin vectors (Yu et al., 2003; Gondi et al., 2004).
siRNA effects are usually thought to be extremely specific. For example, there are reports of siRNAs that discriminate between wild-type and mutant forms of p53 or Ras, differing by only a single base (Brummelkamp et al., 2002; Martinez et al., 2002). However, in other cases, siRNA has been unable to make such fine discriminations (Karasarides et al., 2004). Indeed, the selectivity of siRNAs is currently rather controversial; this is well discussed in a recent review (Dorsett and Tuschl, 2004). At the mRNA level, some reports have indicated that siRNAs do not cause global changes in gene expression as evaluated by DNA array analysis (Semizarov et al., 2003). However, others find quite the opposite, with numerous off-target effects (Jackson et al., 2003; Jackson and Linsley, 2004; Persengiev et al., 2004) that include the silencing of nontargeted genes containing as few as 11 contiguous nucleotides of identity. In addition to these effects at the message level, nonspecific actions can occur at the protein level via miRNA actions on partially matched sequences (Doench et al., 2003; Saxena et al., 2003; Dorsett and Tuschl, 2004; Hannon and Rossi, 2004; Scacheri et al., 2004). Furthermore, although siRNAs were conceived as a way of avoiding nonspecific, interferon-like effects in mammalian cells, there have been reports of such effects (Sledz et al., 2003). Finally, single-strand breakdown products of siRNA molecules can potentially stimulate changes in message and protein expression that are related to innate immunity via interactions with a Toll-like receptor on the cell surface (Heil et al., 2004). Thus, siRNAs are often far from perfect in terms of selectively silencing target genes.
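Because silencing of nontargeted genes has been reported with as few as 11 contiguous nucleotides of identity, one simple precaution is to measure the longest identical run shared by an siRNA target site and each nontarget transcript. The sketch below uses a standard dynamic-programming longest-common-substring calculation; the function names are hypothetical, the threshold default simply restates the figure from the text, and a real screen would run against a transcript database rather than single strings.

```python
def longest_shared_run(a, b):
    """Length of the longest contiguous stretch present in both sequences."""
    best = 0
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        cur = [0] * (len(b) + 1)
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                # Extend the run that ended at the previous position in both.
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best


def flag_off_target(site, transcript, threshold=11):
    """True if `transcript` shares at least `threshold` contiguous bases."""
    return longest_shared_run(site, transcript) >= threshold
```

This checks identity at the message level only; partially matched sites acting through miRNA-like translational repression would need a separate, more permissive search.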
siRNAs in the cell biologist's toolkit.
One of the principal attractions of siRNA technology is the ease of entry for the novice. An investigator can simply plug the target DNA sequence into any one of several vendor websites and generate lists of candidate siRNAs. The vendor will then produce any or all of the candidates. Various types of chemical deprotection and purification are offered, as is the choice of buying preduplexed double-stranded oligonucleotides or even pools of gene-targeted oligonucleotides (Table S2). Some companies are also beginning to provide simple chemical modifications that enhance stability or effectiveness. siRNAs are usually delivered to cells via cationic lipids or cationic polymer agents, which are similar to those used for plasmid transfection, or via electroporation. Although some delivery agents have been developed specifically for siRNAs, transfection strategies (and efficiencies) differ widely among various cell types and most often need to be optimized experimentally. siRNA technology affords another option in the form of vectors that express hairpin RNA oligonucleotides. Several such vectors are now available commercially. Although not universal, in many cases sequences that work well as chemically synthesized siRNAs also work when incorporated as stem–loop hairpins into vectors (although an extension of the stem might be advisable).
Once a set of siRNA oligonucleotides or vectors is chosen, the effects on target messages and protein levels can be studied by a variety of means, as described in the Antisense in the cell biologist's toolkit section. As with antisense and designed transcription factors, it is vital to perform key controls to ensure that selective RNA interference is actually taking place. Clearly, it is important to examine levels of target and nontarget mRNAs to ensure that selective message degradation is occurring. Ideally, this would include a broad screen such as that provided by a DNA array; minimally, several nontarget moieties should be examined. The use of “irrelevant” duplex RNA oligonucleotides can also be helpful in evaluating selectivity. Another strategy that is often advocated is to cotransfect an altered version of the target gene whose message should not be an siRNA target because of silent mutations at several bases. Because siRNAs can be quite potent, one should try to work at the lowest doses possible so as to minimize off-target effects. As mentioned above, although one might aspire to effects at picomolar levels, more commonly, good-quality siRNAs work at the 10–50 nM level. A recent review provides another useful guide to practical applications of siRNAs (Elbashir et al., 2002).
Designed transcription factors
Transcription factors (TFs) are typically modular proteins containing a DNA-binding domain that is responsible for the specific recognition of base sequences and one or more effector domains that can activate or repress transcription. TFs interact with chromatin and recruit protein complexes that serve as coactivators or corepressors. Important coactivators include the CBP–p300 complex that is involved in transcriptional activation, accompanied by histone modifications, the SWI–SNF chromatin remodeling complex, and the Mediator complex that links TFs to the basal transcription machinery. Corepressors include Sin3 and NuRD complexes, which contain histone deacetylases that convert nucleosomes to a transcriptionally incompetent state (Jepsen and Rosenfeld, 2002; Kadonaga, 2004).
In the creation of new transcription factors, novel DNA-binding domains are obtained by a library screening process; these are then combined with well-known transactivating or repressor domains to form functional designed TFs (Pabo et al., 2001; Beerli and Barbas, 2002; Falke and Juliano, 2003; Jamieson et al., 2003; Blancafort et al., 2004). Commonly used examples of transactivators include domains from the herpes virus VP16 protein or from the NFκB p65 subunit, whereas the Krüppel-associated box and mSin3 interaction domain modules are frequently used as repressors. Much of the work to date on designed transcription factors has used the Cys2–His2 zinc finger (Zif) as a DNA-binding entity. A typical C2H2 Zif is a compact module of ∼30 amino acids, stabilized by the zinc ion, and is arranged as two β sheets and an α-helix, with the helix being the primary DNA recognition moiety. The utility of the C2H2 Zif in TF design is based on the modular nature of Zif–DNA interaction and on the fact that the relationship between protein structures and base recognition is well understood. Thus, amino acids at positions –1 to 6, with respect to the start of the α-helix, fit into the DNA major groove and mediate the interaction essentially by recognizing a 3-base motif; the residues at positions –1, 3, and 6 directly interact with the 3′, central, and 5′ bases of a triplet on one DNA strand. Although additional base and backbone interactions can occur, to a first approximation, the Zif–DNA interaction is modular, with each Zif recognizing a three-base site. This allows the ready creation of polydactyl multi-Zif proteins that can recognize long stretches of DNA and, thus, provide highly selective tools for gene manipulation. For example, a 6-Zif protein recognizing 18 bases would theoretically be able to uniquely bind its target within a pool of almost 70 billion base pairs (Beerli and Barbas, 2002).
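The figure of almost 70 billion follows directly from the modular three-bases-per-finger model: an 18-base site is one of 4^18 possible sequences. The short sketch below reproduces that arithmetic and records the helix-position contacts stated above; the names `HELIX_CONTACTS` and `distinct_sites` are illustrative, and the calculation assumes idealized, fully specific recognition at every position.

```python
# Helix positions (relative to the start of the alpha-helix) and the base of
# the triplet each one contacts, as stated in the text.
HELIX_CONTACTS = {-1: "3' base", 3: "central base", 6: "5' base"}


def distinct_sites(n_fingers, bases_per_finger=3):
    """Number of distinct DNA sequences of the length read by n fingers."""
    return 4 ** (n_fingers * bases_per_finger)


print(distinct_sites(3))  # 262144
print(distinct_sites(6))  # 68719476736, i.e. almost 70 billion
```

A three-finger protein reading nine bases therefore distinguishes only about a quarter of a million sequences, which is why polydactyl designs are needed for single-gene selectivity in a mammalian genome.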
Zif selection strategies.
Zif modules with novel DNA recognition abilities can be obtained by peptide combinatorial library strategies such as phage display. Typically, three Zifs are displayed on the phage surface; two of them are fixed, whereas the third is randomized at some or all of the key residues at positions –1 to 6 (Choo and Klug, 1994; Rebar and Pabo, 1994). Phages are then screened against an oligonucleotide containing the target sequence that is immobilized on a support. Phage display is a powerful strategy because it allows the screening of large libraries (>10⁹ combinations). However, the screening process involves “naked” oligonucleotides rather than chromosomal DNA. For that reason, some groups have used in vivo screening strategies, including yeast one-hybrid (Cheng et al., 1997; Bartsevich and Juliano, 2000) and bacterial two-hybrid (Joung et al., 2000) selections in order to seek Zifs that will effectively target genes in chromatin. Whether using phage display or other approaches, several rounds of selection can often lead to a novel Zif with the ability to recognize virtually any triplet of the form GNN or ANN (triplets starting with a pyrimidine are more difficult to obtain; Dreier et al., 2001).
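When choosing a candidate binding site, it can help to split it into triplets and note which ones fall into the readily obtainable GNN or ANN classes. The sketch below does only that; the function names are hypothetical, the site is assumed to be written 5′→3′ on the recognized strand, and neither the order in which fingers are arranged along the site nor any context dependence between neighboring fingers is modeled.

```python
def triplets(site):
    """Split a DNA site (5'->3') into consecutive three-base subsites."""
    site = site.upper()
    if len(site) % 3 != 0:
        raise ValueError("site length must be a multiple of three")
    return [site[i:i + 3] for i in range(0, len(site), 3)]


def readily_targetable(triplet):
    """True for GNN or ANN triplets; pyrimidine-initial ones are harder."""
    return triplet[0] in "GA"


site = "GCGGACTTA"  # made-up 9-base site for a three-finger protein
for t in triplets(site):
    print(t, "GNN/ANN" if readily_targetable(t) else "pyrimidine-initial")
```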
At this point, the process of building a novel multi-Zif protein can follow either of two routes. One strategy is parallel selection (Fig. 3); this is founded on the idea that Zifs are modular, and, thus, individual Zifs, which are selected for recognition of specific triplets, can be strung together to recognize longer regions of DNA (Beerli and Barbas, 2002). A second strategy is serial selection; this approach takes account of the fact that the modularity of Zifs is not perfect. Serial selection, as its name implies, involves starting with one novel Zif, and then screening additional Zif modules that are optimized in terms of working well together (Isalan et al., 2001; Pabo et al., 2001). Although there are reasons for thinking that the serial selection strategy may ultimately provide superior multi-Zif proteins, it is a slow and cumbersome approach, whereas the more rapid parallel selection strategy has, in practice, yielded very effective novel TFs.
There has been substantial activity in the designed TF field over the last few years, and several groups have now established large repertoires of C2H2 Zifs with known binding specificities. In some cases, these Zif repertoires are available to the research community (for example, at http://www.scripps.edu/research/faculty.php?tsri_id=900), thus obviating the need to undertake extensive screening. An exciting recent development is the advent of “libraries of libraries” that comprise collections of modified Zifs targeting multiple three-base sites that are then randomly assembled into multi-Zif TFs. These libraries can be transfected into mammalian cells, which can then be screened for a phenotype or for up- or down-regulation of a particular gene product (Blancafort et al., 2003; Hurt et al., 2003; Magnenat et al., 2004), thus providing a strategy that is comparable to the siRNA libraries discussed above. An advantage of this approach is that it directly seeks multi-Zif TFs that function well in a chromosomal context. Once a desirable Zif-TF is identified, its expression in cells can be regulated in a variety of ways. Inducible expression systems, including those based on Tet or ecdysone, can be used (Beerli and Barbas, 2002; Xu et al., 2002). Alternatively, chemical ligands that regulate the function rather than the expression of the TF can be used (Lin et al., 2003).
Efficacy and selectivity.
Properly designed multi-Zif TFs can be very efficacious. Thus, >90% reduction in endogenous message and protein levels has been obtained by using designed repressors (Xu et al., 2002; Papworth et al., 2003; Segal et al., 2004). A comparison of the potency of designed TFs with antisense or siRNA oligonucleotides is difficult because, unlike oligonucleotides, TFs are expressed endogenously from vectors. However, designed TFs with binding affinities in the subnanomolar range are readily attained (Beerli et al., 2000; Pabo et al., 2001), suggesting that rather low concentrations within the cell can produce strong effects. The binding affinity of a Zif-TF to its DNA target can be substantially altered by a single base change in the target (Wolfe et al., 1999); however, some binding to “imperfect” sites may be tolerated (Segal et al., 2004). As with studies on antisense and siRNA, evaluation of the selectivity of designed TFs in cells has often been rather crude and simplistic. However, several recent publications have used DNA arrays to obtain a more comprehensive picture (Xu et al., 2002; Tan et al., 2003; Lee et al., 2004). As with antisense, designed TFs that target key regulatory genes are likely to cause multiple changes in gene expression. That aside, however, designed TFs seem to be highly selective when the target is not itself a key regulator, providing strong regulation of target gene mRNA with lesser effects on a few nontarget genes. Although it is conceivable that designed TFs could have side effects as a result of protein–protein interactions rather than transcriptional regulation, we are not aware of any such reports.
Designed TFs in the cell biologist's toolkit.
The use of designed TFs seems at first to be more technically daunting than the use of antisense or siRNA. Although this was true a few years ago, the existence of large pools of designed Zifs has now made the TF strategy much more accessible, as described in more detail elsewhere (Segal, 2002). For example, suppose that you wish to up- or down-regulate a specific endogenous gene and that something is known about the promoter region. A reasonable initial approach would be as follows. Scan the promoter for binding sites for endogenous TFs (particularly ones that use Zifs as DNA-binding domains, if possible). Sites that are accessible to endogenous Zifs are likely to lie within chromatin regions that are also accessible to transfected designed TFs. Look for 15–18 base sequences near these regions that are G-rich (because many existing Zifs recognize GNN sequences). Obtain information on the availability of cDNAs for Zifs that match your sequence. If available, the individual Zifs can be linked by conventional PCR techniques to form a multi-Zif protein; a transactivator or repressor domain should also be included, and the entire construct can be placed in a mammalian expression plasmid or in a viral vector. Note that there is a certain amount of controversy about the best way to link multiple Zifs (Jamieson et al., 2003); however, the consensus linker sequence TGEKP often works well. The vector encoding the designed TF can now be directly transfected into the cells of interest, and effects on gene expression can be monitored at the message and protein levels. Obviously, a number of controls are needed. These might include a vector expressing a multi-Zif TF that binds to an irrelevant target, a vector expressing only the transactivating domain or only the multi-Zif, and an “empty” vector. Ultimately, more detailed studies of specificity should include DNA array analysis.
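The step of looking for G-rich 15–18 base sequences can be roughed out computationally. The sketch below is a minimal illustration rather than an established design tool: the function names, the `min_gnn` threshold, and the toy sequence are all hypothetical. It slides a window of consecutive triplets along both strands of a promoter sequence and reports windows in which enough triplets begin with G.

```python
from typing import List, Optional, Tuple

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def gnn_sites(promoter: str, n_triplets: int = 6,
              min_gnn: Optional[int] = None) -> List[Tuple[str, int, str]]:
    """Find candidate multi-Zif target sites made of GNN triplets.

    Returns (strand, start, site) tuples. `site` is written 5' to 3' on the
    strand that carries the GNN triplets; `start` is the 0-based position of
    the window on the input (forward) sequence. By default every triplet
    must start with G; lower `min_gnn` to tolerate non-GNN triplets.
    """
    if min_gnn is None:
        min_gnn = n_triplets
    seq = promoter.upper()
    width = 3 * n_triplets
    hits = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for i in range(len(s) - width + 1):
            site = s[i:i + width]
            n_g = sum(site[j] == "G" for j in range(0, width, 3))
            if n_g >= min_gnn:
                start = i if strand == "+" else len(s) - i - width
                hits.append((strand, start, site))
    return hits


# Toy sequence invented for illustration; it is not a real promoter.
toy = "TTGGAGCTGACTTT"
for strand, start, site in gnn_sites(toy, n_triplets=3):
    print(strand, start, site)
```

In practice one would call `gnn_sites(promoter, n_triplets=5)` or `n_triplets=6` for 15- or 18-base sites, then filter the hits by proximity to known endogenous TF binding sites and by the availability of Zif modules for each triplet. Neither of those steps is modelled here.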
A comparison of gene regulation technologies
There are advantages and liabilities with each of the technologies discussed above, and the choice of a tool by a cell biologist might entail several considerations, including the overall goal of the project, ease of use, cost, effectiveness, and selectivity. At the most basic level, choices can be made based on intended use. Designed transcription factors and antisense (via splice modulation) can either up-regulate or down-regulate gene expression; siRNA is restricted to down-regulation. siRNA and designed TFs can be expressed from viral or plasmid vectors, potentially providing much more uniform delivery of the active agent than can be achieved by simple transfection; antisense is limited to delivery by transfection.
An important consideration is the ease of use, especially for the novice. siRNA is clearly the winner here. The various companies in this field have made it very simple to get started. However, entry into the antisense arena is also basically quite easy; one simply needs to design and order the oligonucleotides. One possible difference is the number of compounds that need to be screened to find an effective agent. As mentioned above, often 20 or more antisense oligonucleotides must be screened to find one effective compound. With siRNA, it is often said that the computer algorithms generate effective oligonucleotides at a rate of one in two to one in four (this is not always the case, however, and in some instances, despite using the best available vendor algorithms, we have screened dozens of siRNAs without getting an effective compound). Entry into the designed TF area takes more effort. Although a library of prefabricated Zifs may be available, these still need to be assembled into multi-Zif proteins for effective DNA binding and then linked to repressor or transactivator domains. One mitigating factor is that, unlike the oligonucleotide approaches, designed TFs often do not require a great deal of screening because they are created on a rational basis.
Cost is another important consideration. On a molar basis, siRNAs are substantially more expensive than antisense oligonucleotides (unless very exotic modifications are requested). For example (depending on the vendor), 200 nmol of a 19-mer duplexed siRNA might cost $400, whereas 200 nmol of a 19-mer phosphorothioate antisense compound might be about $60, and a 2′-O-Me phosphorothioate gapmer could cost about twice that. However, the need to screen more compounds when using antisense than with siRNAs could offset the cost differential. Designed TFs are free in terms of reagent costs; however, substantial labor would go into the production of a specific multi-Zif protein.
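The way screening burden can offset the per-compound price difference is easy to see with a rough calculation. The sketch below uses only the approximate figures quoted in this review (about $400, $60, and $120 per 200 nmol; roughly 20 antisense candidates versus 2 to 4 siRNA candidates per effective compound). The dictionary keys and function name are ours, actual vendor prices and hit rates vary, and the numbers are illustrative only.

```python
# Illustrative screening-cost comparison using the approximate figures in the text.
PRICE_PER_200_NMOL = {           # US dollars, vendor dependent
    "siRNA duplex": 400,
    "PS antisense": 60,
    "2'-O-Me PS gapmer": 120,    # "about twice" the phosphorothioate price
}


def screening_cost(reagent: str, n_candidates: int) -> int:
    """Reagent cost of buying n_candidates compounds at the 200 nmol scale."""
    return PRICE_PER_200_NMOL[reagent] * n_candidates


sirna_low = screening_cost("siRNA duplex", 2)     # hit rate of one in two
sirna_high = screening_cost("siRNA duplex", 4)    # hit rate of one in four
antisense = screening_cost("PS antisense", 20)    # about 20 candidates
gapmer = screening_cost("2'-O-Me PS gapmer", 20)

print(f"siRNA:        ${sirna_low}-${sirna_high}")
print(f"PS antisense: ${antisense}")
print(f"Gapmer:       ${gapmer}")
```

On these assumptions, the cost of finding one effective compound is of the same order for siRNA ($800 to $1,600) and phosphorothioate antisense ($1,200), whereas gapmers come out somewhat higher ($2,400).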
Each of these tools can be quite efficacious and potent for gene regulation. As mentioned above, knockdowns of 90% or more have been obtained with all of the approaches. Although some investigators suggest that the IC50 values for siRNAs are 100-fold or more lower than those for antisense oligonucleotides (Dorsett and Tuschl, 2004), this probably does not take into account recent advances in antisense chemistry. We have observed similar levels of message and protein reduction for commercial siRNAs and for second or third generation antisense oligonucleotides when used in the 10 nM range (Kang et al., 2004; Xu et al., 2004).
A key issue is selectivity: can one affect the target gene with minimal effects on other genes? Here, the winner seems to be designed TFs. Work from our laboratory and others has shown that designed TFs can substantially alter target gene expression with minimal off-target effects, as evaluated by DNA arrays (as long as a key regulatory protein is not the target; Xu et al., 2002; Tan et al., 2003; Lee et al., 2004). In contrast, early generation phosphorothioate oligonucleotides have numerous off-target effects at the message and protein levels, probably primarily as a result of their propensity to bind proteins. Gapmer antisense oligonucleotides with newer chemistries promise to be more selective; however, these types of compounds have yet to be stringently evaluated. siRNAs were originally thought to be extremely specific. However, more recent findings suggest that (in the pharmacologist's lexicon) siRNAs can be “dirty drugs.” Off-target effects at the message (siRNA) and protein (miRNA) levels are possible, as is gene induction through the innate immune system. This is not to say that siRNA cannot be used in a relatively selective manner. However, investigators using siRNA must prove selectivity rather than assuming it. Thus, although there is no magic bullet for epigenetic gene regulation, a careful and skeptical approach to using antisense, siRNA, or designed transcription factors may provide the investigator with powerful and selective tools to study the role of individual genes in cell biological processes.
Abbreviations used in this paper: ds, double stranded; HNA, anhydrohexitol nucleic acid; LNA, locked nucleic acid; miRNA, micro-RNA; MOE, methoxy-ethyl; RISC, RNA-induced silencing complex; shRNA, short hairpin RNA; siRNA, short interfering RNA; TF, transcription factor; Zif, zinc finger.