The recent Journal of General Physiology perspectives on membrane protein insertion (129:351–377) covered many valuable strategies to examine how amino acid sequence determines protein insertion into membranes and the probability that sequences form transmembrane (TM) helices. Each approach described has unique advantages, and a complete exploration of this problem clearly requires combining approaches, including approaches not discussed in the perspectives, such as the use of synthetic hydrophobic helices inserted into model membrane vesicles. Using a diverse set of biophysical methods, several groups have used the latter approach to understand fundamental issues of membrane protein structure and function, including the configuration of membrane-inserted hydrophobic helices (TM or non-TM), the effect of hydrophobic helices on bilayer structure (and vice versa), and helix–helix interaction (Bechinger, 1996; Hunt et al., 1997; Ren et al., 1997, 1999; Killian, 1998; Webb et al., 1998; Lew et al., 2000, 2003; Mall et al., 2000; Caputo and London, 2003a; Goforth et al., 2003; Liu et al., 2004; Duong-Ly et al., 2005; van Duyl et al., 2005; Aisenbrey et al., 2006; Killian and Nyholm, 2006).
Using Model Membrane Inserted Helices to Analyze the Equilibrium between Transmembrane and Nontransmembrane States
Studies on the insertion of hydrophobic sequences into model membranes usually use peptides with hydrophobic cores (primarily composed of aliphatic hydrophobic residues) flanked on both N and C termini by one or more relatively hydrophilic residues. The peptides are generally (but not always) too hydrophobic to dissolve in water. However, they can be incorporated into model membranes by directly mixing the peptides with lipids in organic solvent, followed by solvent removal or dilution (Ren et al., 1997, 1999), which usually leads to the formation of membrane-inserted helices, with the inserted state being transmembrane. However, under conditions of negative hydrophobic mismatch, in which the length of the hydrophobic sequence is much less than the width of the lipid bilayer, a membrane-bound non-TM state, in which the helix lies adjacent to the membrane surface, can form (Ren et al., 1997, 1999). This raises the question of whether the observed structures represent an equilibrium or, instead, kinetically trapped configurations. By varying membrane width in situ (by reversible addition of hydrocarbons such as decane) or by varying pH (when the hydrophobic sequence contains an ionizable residue in a suitable location), it has been shown that TM and non-TM states are in equilibrium, not kinetically trapped (Ren et al., 1997; Lew et al., 2000; Caputo and London, 2003b).
Model Membrane Systems Allow Investigation of Environmental Conditions Relevant to Control of Post-insertional TM/non-TM Equilibria
Although it is highly desirable to define the equilibrium configuration of membrane-inserted synthetic helices, it must be emphasized that the behavior observed should not be expected to always directly parallel what is predicted by hydrophobicity values derived from solvent studies. The difference in free energy between a TM state and a membrane-bound non-TM state in model membranes should be much smaller than the difference between being membrane buried vs. dissolved in aqueous solution. In addition, hydrophobic sequences in model membranes may sometimes behave differently than they do in cotranslational translocon-based experiments. In translocon-based experiments, hydrophobic segments are studied as part of a larger sequence, and if there are differences between the environment in a lipid bilayer and that in the translocon, then hydrophobic sequences could be trapped in a nonequilibrium state in vivo by long hydrophilic sequences surrounding the hydrophobic sequence.
Nevertheless, model membranes and engineered helices are important because they can be used to investigate an increasingly important problem that cannot be investigated using the translocon or simple hydrophobicity measurements in solvent: how post-insertional equilibria control hydrophobic helix structure and function after release from the translocon. For such cases the equilibrium between the TM and membrane-bound non-TM state is of most interest, as there are many proteins in which hydrophobic sequences switch into a TM configuration by membrane insertion long after biosynthesis. Examples include hydrophobic sequences in bacterial toxins, Bcl-family proteins, annexins, and several mitochondrial proteins (Qiu et al., 1996; Kienker et al., 1997; Wattenberg and Lithgow, 2001; Ladokhin et al., 2002; Rosconi and London, 2002; Jeong et al., 2004; Rosconi et al., 2004; Fujita et al., 2007). These proteins either have hydrophobic sequences flanked on one side by short hydrophilic sequences or helical hairpins linked by short hydrophilic sequences. In either case, the hydrophilic sequences must be short enough to cross membranes. In addition, the sequences switching between TM and non-TM states are often “semi-hydrophobic” in the sense that they appear to be borderline in terms of having sufficient hydrophobicity to form a TM state.
The equilibrium behavior of hydrophobic and semi-hydrophobic sequences in model membranes is also valuable because it allows for studies of experimental conditions (pH and lipid composition) largely inaccessible to translocon-based approaches and/or solvent partition studies. These variables are important when considering how post-insertional equilibria might be controlled in vivo. What happens when a membrane protein migrates between intracellular membranes with different lipid compositions and different bilayer widths? Bilayer width, for example, can have dramatic effects on hydrophobic helix configuration, and other features of lipid structure may also be important (Ren et al., 1997, 1999). What happens to a TM helix that encounters the lumen of an acidic organelle? Protonation of ionizable residues located within hydrophobic sequences at low pH can control TM stability and could affect function (Bechinger, 1996; Caputo and London, 2004; Aisenbrey et al., 2006).
Identifying Borderline Hydrophobic Sequences: Derivation of a “Hydrophobicity” Scale that Approaches the Theoretical Limit for Accuracy
The importance of TM/non-TM equilibria is related to the abundance of borderline hydrophobic (semi-hydrophobic) sequences in nature. Assessing the abundance of such sequences from genomic data requires a “hydrophobicity” type scale that accurately predicts the tendency of a sequence to form a TM structure. Otherwise, one does not know whether the apparent abundance of sequences with borderline hydrophobicity results from inaccuracies in the hydrophobicity scale. Defining the abundance of semi-hydrophobic sequences also requires a method to assign a value for the probability that particular amino acid compositions form a TM state, i.e., a statistical “apparent equilibrium constant” for the TM/non-TM equilibrium. Based on an analysis of genomic sequence data and comparison to databases of known soluble and TM sequences, it is possible to define a “TM tendency” scale that accomplishes both of these goals (Zhao and London, 2006). The scale just about reaches the theoretical limit to accuracy for “single-value” scales. That is, of all hydrophobicity scales that assign each type of amino acid a single “hydrophobicity” value, the TM tendency scale is the most accurate for identifying TM segments.
This statement requires some justification. Suppose you have a hydrophobicity scale that you think is the best possible scale. Now, using databases containing all known TM and known non-TM (mainly soluble) sequences you compare TM and soluble sequences that appear to have the same hydrophobicity. If you find that TM sequences have, for example, a higher average abundance of Ile than the population of non-TM sequences with equal hydrophobicity, while the abundance of Leu is higher in the non-TM sequences, then the ability of the scale to distinguish between TM and non-TM sequences can be improved. How? If you increase your hydrophobicity value for Ile and decrease it for Leu, then the TM sequences will now, on average, have a higher hydrophobicity value than the non-TM sequences. In other words, you can now tell them apart. This procedure, performed for each amino acid residue, is how the TM tendency scale is derived. Once the average composition of populations of TM and non-TM sequences having the same hydrophobicity is the same for each type of residue, the scale can no longer be improved. Of course, the resulting TM tendency scale is not exactly a hydrophobicity scale, just a scale that evaluates the tendency to form TM sequences more accurately than the old scale. Any hydrophobicity scale that does not fulfill the equal average composition criterion can, by definition, be improved in terms of distinguishing TM from non-TM sequences by imposing this criterion. In other words, the TM tendency scale must be the best scale for distinguishing TM and non-TM sequences.
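The refinement argument above can be illustrated with a minimal sketch. This is not the procedure of Zhao and London (2006), whose details are not given here. It is a toy version built on assumptions: sequences are scored by their mean per-residue value, "same hydrophobicity" is approximated by placing sequences in score bins of hypothetical width `bin_width`, and each residue's value is shifted by a hypothetical `step` times the TM minus non-TM abundance difference. All function names and the three-residue toy data are invented for illustration.

```python
from collections import Counter

def mean_score(seq, scale):
    """Average per-residue scale value of a sequence."""
    return sum(scale[aa] for aa in seq) / len(seq)

def composition_gap(tm_seqs, non_tm_seqs, scale, bin_width=0.5):
    """For each residue, the mean (TM minus non-TM) fractional abundance,
    compared only between sequences that share a score bin."""
    def binned(seqs):
        bins = {}
        for s in seqs:
            key = int(mean_score(s, scale) // bin_width)
            bins.setdefault(key, []).append(s)
        return bins

    def fractions(seqs):
        counts = Counter("".join(seqs))
        total = sum(counts.values())
        return {aa: counts[aa] / total for aa in scale}

    tm_bins, non_bins = binned(tm_seqs), binned(non_tm_seqs)
    shared = sorted(set(tm_bins) & set(non_bins))
    gap = {aa: 0.0 for aa in scale}
    for key in shared:
        f_tm, f_non = fractions(tm_bins[key]), fractions(non_bins[key])
        for aa in scale:
            gap[aa] += (f_tm[aa] - f_non[aa]) / len(shared)
    return gap

def refine(tm_seqs, non_tm_seqs, scale, step=1.0, rounds=20):
    """Raise values of residues enriched in TM sequences and lower values
    of residues enriched in non-TM sequences of equal score (assumed rule)."""
    scale = dict(scale)  # leave the caller's scale untouched
    for _ in range(rounds):
        gap = composition_gap(tm_seqs, non_tm_seqs, scale)
        for aa in scale:
            scale[aa] += step * gap[aa]
    return scale

# Hypothetical toy data: Ile-rich "TM" and Leu-rich "non-TM" sequences
# that the starting scale cannot tell apart.
start = {"I": 1.0, "L": 1.0, "S": -1.0}
tm = ["IIIS", "IILS"]
non_tm = ["LLLS", "LLIS"]
refined = refine(tm, non_tm, start)
```

In this toy case the starting scale gives every sequence the same score, and refinement raises the value for Ile and lowers it for Leu until the two populations no longer share a score bin, at which point the composition gap is zero and the scale stops changing.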
An obvious caveat is that the accuracy of the TM tendency scale depends on the quality of the databases of non-TM and TM sequences used to derive it. A more subtle caveat is that a perfect TM tendency scale demands that for any specific TM tendency value, each type of residue have exactly the same average abundance in the database of TM sequences and the database of non-TM sequences having that TM tendency value. Thus, if the abundance of one type of residue is inversely linked to that of another, then it could be impossible to derive a perfect scale. For the TM tendency scale we derived, the average deviation from exactly equal abundance was so small (3%) that this should not be a major concern (Zhao and London, 2006).
Of course, the statement that TM tendency is the best single-value predictive scale is not the same as saying it is the ultimate method to predict TM sequences. Additional data, such as the position of different residues within a hydrophobic sequence, the identity of residues around the hydrophobic sequence, and the presence or absence of other hydrophobic sequences within the protein containing the sequence being analyzed, can all refine predictions.
Interestingly, the comparison of the TM tendency scale to other scales showed that the second best scale was the biological hydrophobicity scale (Hessa et al., 2005), derived for simple sequences (having only two types of hydrophobic and one type of hydrophilic residue) passing through the translocon. The correlation between the two scales was unusually high (r² = 0.95). This suggests that, on average, the behavior of complex hydrophobic sequences in vivo, as judged by the TM tendency analysis, is very similar to that of simple hydrophobic sequences as tested by Hessa et al. (2005).
Sequences with a Borderline Tendency to Form TM States Are Probably Abundant
So, how abundant are borderline hydrophobicity sequences according to TM tendency? If we define borderline hydrophobicity TM sequences as having between a 50% and 90% probability of forming a TM state, analysis of genomic data suggests that such sequences, which should have a significant ability to switch between TM and non-TM states, represent a considerable fraction of all TM sequences (Zhao and London, 2006). However, this conclusion must be tempered by two considerations. First, the TM state is often stabilized by TM helix–TM helix interactions. Second, if surrounded by two large hydrophilic domains on opposite sides of the membrane, a TM sequence with borderline hydrophobicity will remain trapped in the TM state. Nevertheless, for proteins with single hydrophobic sequences bounded on one side by a short hydrophilic segment, equilibration between TM and non-TM states may be common, and be an important aspect of the conformational changes that such proteins undergo.
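To make the borderline window concrete, the sketch below converts a TM probability into an apparent equilibrium constant and an apparent free energy, assuming a simple two-state TM/non-TM model in which K_app = p/(1 − p) and ΔG_app = −RT ln K_app. The 50% and 90% limits come from the definition above; the function names and the choice of 298.15 K are assumptions made for illustration, and the sketch says nothing about how the probabilities themselves were obtained from genomic data.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)

def apparent_k(p_tm):
    """Apparent TM/non-TM equilibrium constant for a two-state model."""
    return p_tm / (1.0 - p_tm)

def apparent_dg(p_tm, temperature=298.15):
    """Apparent free energy of the TM state in kcal/mol (negative favors TM)."""
    return -R_KCAL * temperature * math.log(apparent_k(p_tm))

def is_borderline(p_tm, low=0.5, high=0.9):
    """True if the TM probability lies in the borderline window used in the text."""
    return low <= p_tm <= high

for p in (0.5, 0.9):
    print(p, apparent_k(p), round(apparent_dg(p), 2))
```

Under these assumptions the window spans K_app from 1 to 9, which corresponds to an apparent stabilization of the TM state of at most about 1.3 kcal/mol at room temperature, small enough that modest changes in pH or bilayer width could plausibly shift the balance.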
Defining Experimental Hydrophobic Helix Behavior in Model Membranes Is Important For Computational Studies
Finally, it should be noted that experimental results using hydrophobic helices in lipid bilayers are a natural complement to computational studies, as sequence, lipid composition, and pH can be modeled computationally. However, computational methods are limited by computational power. There are limits on the complexity of the system, or the time over which the analysis can be made. Knowledge of experimental behavior in model membranes is important because testing the ability of computational methods to successfully model key experimental results is an important step in identifying and refining valid shortcuts that can improve computational methods and demonstrate their power.
Olaf S. Andersen served as editor.
Abbreviation used in this paper: TM, transmembrane.