Roles of Solvent Water in the Positions of Biological Equilibria
The noncovalent binding interactions of biological molecules involve the stripping away of solvent water from regions of contact between the binding partners. Accordingly, the net strength of their interactions with other molecules can be considered to include the cost of removing the interacting molecules (at least those parts that make contact with each other) from the solvent to which they were previously exposed. Free energies of solvation of biological molecules also play a major role in determining the positions of their chemical equilibria. For example, the products of hydrolysis of ATP are so much more strongly solvated than the reactants that changing solvation fully accounts for the favorable equilibrium of hydrolysis of ATP to ADP and inorganic phosphate (Williams and Wolfenden, 1985).
It has long been suspected that solvent water plays a major role in the equilibrium structure of globular proteins in solution, which typically present the aspect of “an oil drop with a polar coat” (Kauzmann, 1959). Compared with the average environment experienced by an amino acid residue in the interior of a globular protein, an amino acid side chain's environment in the interior of a biological membrane is probably relatively isomorphous. The penetrating experiments of von Heijne and his associates (Hessa et al., 2005) have furnished quantitative information about the relative tendencies of the various amino acids to be found in deeply immersed regions of membrane proteins, and an opportunity to compare these tendencies with solvation properties of the amino acids. Those latter properties are the subject of this commentary.
Amino Acids: The Choice of a Representative Solute
How closely are the water-leaving tendencies of the amino acids related to their tendencies to appear in the interiors of folded proteins, or in the transmembrane sequences of membrane proteins? In attempting to detect the presence of specific attractive or repulsive interactions between biological molecules and the sites at which they occur, it would be desirable to have information about their relative tendencies to leave water and enter a simple cavity of low dielectric constant that does not furnish opportunities for H-bonding or other specific attractive interactions. Equipped with charged ammonium and carboxylate groups, the amino acids themselves are much too polar to leave water and enter a truly nonpolar solvent or the vapor phase at concentrations above their lower limits of detection by present methods of analysis. But with a few interesting exceptions (see below) free energies of solvation tend to behave as additive functions of the substituent groups that are present in solutes (Leo et al., 1971). For that reason, the problem of detecting the zwitterionic amino acids can be circumvented by truncating their structures, for example by replacing the common [-NH-CH(R)-CO-] function of an amino acid at an internal position in a protein by the molecule RH. An internal serine residue, for example, with R = CH2OH, is represented by methanol. Distribution experiments can then be made between water and the vapor phase (Wolfenden et al., 1981) or between water and organic solvents (Radzicka and Wolfenden, 1988) at concentrations sufficient to allow the relative concentrations of the truncated amino acids to be measured in both phases, so that their water-leaving tendencies can be compared.
As a secondary amine, proline does not have a “side chain” in the usual sense, although its distribution properties can be estimated by comparing the values obtained for a set of structurally related molecules (Gibbs et al., 1991). Because the absence of an –NH– group prevents it from occupying internal positions in α helices and β structures, proline is usually found at reverse turns that tend to lie near the surfaces of globular proteins (Rose et al., 1985). Accordingly, the properties of proline will not be considered further in this discussion.
To what extent are the relative solvation properties of amino acid side chains influenced by the chemical context in which they occur? Peptide bonds appear to be the most polar uncharged groups in biological molecules (Wolfenden, 1978), and when several substituents are present within the same molecule, their combined effects on its free energy of solvation sometimes depart from additivity. But in those few cases where such effects have been clearly documented, there is good reason to suspect the involvement of special electronic effects that are transmitted through the solute molecule itself (for review see Wolfenden et al., 1987). For example, 4-nitrophenol is considerably more hydrophilic than would be expected from comparison of the properties of benzene, nitrobenzene, and phenol. H bonding from solvent water to the nitro group of 4-nitrophenol would be expected to enhance the acidity of the hydroxyl group by inductive and resonance effects, rendering the hydroxyl group a more effective H-bond donor to solvent water than it would be in the absence of that nitro group. Both cooperativity (as in the case of nitrophenol) and anticooperativity (as in the case of the nitrophenolate ion) between functional groups, in their combined effect on free energy of solvation, seem understandable in simple chemical terms. There appears to be no reason to suppose that such effects are likely to alter the relative solvation properties of the different amino acid side chains significantly, as compared with the relative solvation properties of the corresponding amino acid residues.
The Choice of a Reference Phase
To determine the distribution coefficients of amino acid side chains between polar and nonpolar environments, what nonpolar phase would be most suitable as a reference? To test the polar-versus-nonpolar dichotomy, and its possible expression in structure, it would be desirable to obtain a simple experimental index of that characteristic, unqualified by interpretation.
Early efforts in that direction involved a comparison of solubilities in alcohol and in water, yielding free energies of transfer of many amino acids between saturated solutions in these two solvents (Cohn and Edsall, 1943; Nozaki and Tanford, 1971). A disadvantage of that approach is that solutes are present (by definition) at their limits of solubility in both solvents, where dimerization and higher states of aggregation are most likely to be encountered. But oligomerization is not easy to detect, because it is not possible to use the customary method of examining the influence (if any) of varying solute concentration on the apparent free energy of transfer. Moreover, any incorporation of solvent molecules into the crystal structure of the solute, in the presence of either solvent, would invalidate thermodynamic comparisons based on the assumption of a common reference phase. Nevertheless, those early solubility comparisons conveyed some impression of the relative hydrophobicities of many of the amino acids, at least in a qualitative sense.
Instead of measuring solubilities, with their attendant complications, it is a straightforward matter to determine the partition coefficient of solutes between water and a solvent that is not water miscible. No potentially problematic solid state is then required, and solvent-to-water distribution coefficients can be measured easily at varying solute concentrations extending toward infinite dilution (using radiolabeled solutes if necessary), to test for virial effects that might signal the presence of solute aggregation in either the aqueous or the nonpolar phase (Wolfenden and Williams, 1983).
Water-to-octanol distributions furnish a particularly convenient means of estimating the “hydrophobicities” of molecules, and octanol has attracted widespread use in the development of quantitative structure–activity relationships in medicinal chemistry. Its well-deserved popularity arises from the fact that 1-octanol and water separate easily, and the concentrations of solute can often be determined in both phases by UV-visible spectrophotometry, after the molar absorptivity has been determined in water and octanol. And because water-to-octanol distribution coefficients are distributed over a relatively narrow range near unity, their values are particularly easy to measure.
But distribution measurements involving water-immiscible alcohols, if these are undertaken for the purpose of testing the simple dichotomy between polar and nonpolar, introduce special problems of interpretation. With pKa values comparable with that of water, alcohols are effective H-bond donors. Moreover, substantial concentrations of water (2.3 M in the case of 1-octanol) are present in octanol at equilibrium. The same is true of diethyl ether, an effective H-bond acceptor that dissolves 0.7 M water at saturation (Radzicka and Wolfenden, 1994). Thus wet ether and wet octanol are considerably more polar than the pure solvents. As described later, distribution measurements indicate that the solvent properties of wet octanol resemble those of water more closely than they resemble those of a liquid hydrocarbon. Of additional concern is the observation that some polar solutes have been shown to “drag” extra water molecules when they enter solvents of moderate polarity such as ethers (Tsai et al., 1993). “Water-dragging,” in those few cases where it has been shown to occur, interferes with efforts to determine the hydrophobicity of a solute, because the species that are transferred to the nonpolar phase include not only the solute itself, but also its noncovalent hydrate. In such cases, hydrophobicity will have been overestimated accordingly. Nor is the presence of water dragging easy to detect experimentally in semipolar solvents, because the concentration of dissolved water usually exceeds the concentration of the solute whose distribution one wishes to establish, usually under conditions approaching infinite dilution. Under those circumstances, the concentration of excess water that is “dragged” into the nonpolar phase represents a marginal difference between large numbers, and is difficult to measure accurately.
We are therefore led to infer that, although wet octanol or wet ether may chance to resemble some region of a membrane or protein interior as experienced by an amino acid side chain at some position within a protein, neither octanol nor ether is likely to furnish an accurate indication of the simple dichotomy we sought to test. For that purpose, it seems preferable to use as a reference phase a nonpolar liquid that contains only traces of water at saturation, so that the behavior of solutes is not affected.
With a water content almost as low as that of the vapor phase over liquid water (∼2 × 10⁻³ M), “wet” cyclohexane furnishes one such reference phase. In the author's laboratory, experiments conducted upon many polar solutes (including water itself) have shown no evidence of self-association at moderate concentrations in cyclohexane. Because the concentration of dissolved water in cyclohexane provides a very low “background” (2 mM), it is usually a simple matter to test for the possibility of water dragging using tritiated water or proton nuclear magnetic resonance. In no case has significant water dragging been detected. Thus, water-to-cyclohexane distribution coefficients appear to be relatively free of complications that cloud the interpretation of distribution coefficients involving semipolar solvents such as octanol or ether. Apart from a solute's affinity for watery surroundings, the only variables that need be considered are the modest cost of making a cavity in cyclohexane (just large enough to accommodate the solute) and the benefit of van der Waals' interaction between the solute and the walls of the cavity (see below).
An alternative (and absolute) measure of a solute's affinity for watery surroundings is provided by its vapor pressure over dilute solution in water. That approach bypasses potential questions of interpretation presented by condensed reference phases entirely. Except at extremely high partial pressures, even very polar solutes such as undissociated acetic acid remain monomeric in the vapor phase. And at ordinary temperatures and pressures, water itself remains almost entirely monomeric in the vapor phase, where it is present at concentrations that are too low to lead to significant association of water molecules with solutes emerging from the underlying aqueous solution.
Fig. 1 relates these measures of water affinity and indicates the pitfalls that may be encountered in distribution experiments unless appropriate precautions are taken. Each of the equilibrium constants in Fig. 1, expressed in mol/liter in each phase at infinite dilution, is dimensionless. The relative polarities of different nonpolar reference phases have been tested experimentally using the side chains of the 19 amino acids as a quasi-random collection of solutes. When those results were compared, wet octanol was found to occupy a position approaching the position of water itself, whereas wet cyclohexane occupied a position approaching the position of the vapor phase (Radzicka and Wolfenden, 1988). The same ranking of polarities is observed when one compares the concentrations of water itself, at saturation in various solvents and in the vapor phase, in equilibrium with pure water (Radzicka and Wolfenden, 1994). Fig. 1 shows that equilibrium constants for transfer from the nonpolar solvent to the vapor phase (K_nonpolar>vapor) can be obtained from the other two equilibrium constants. For many solutes, these values have been shown to be closely related to molecular surface area (r² = 0.92), as might be expected if transfer of a molecule from the vapor phase into cyclohexane involved the cost of making a cavity in cyclohexane and the benefit of van der Waals' interactions with the walls of the cavity (Radzicka and Wolfenden, 1988).
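The way the third equilibrium constant follows from the other two can be sketched as a closed thermodynamic cycle. The sketch below assumes that each constant is defined as the concentration in the destination phase divided by the concentration in the origin phase; the function name and the numerical values are hypothetical and serve only to illustrate the arithmetic.

```python
def k_nonpolar_to_vapor(k_water_to_vapor: float, k_water_to_nonpolar: float) -> float:
    """Close the cycle water -> nonpolar -> vapor.

    Assumed definitions (destination over origin):
      K(water>vapor)    = [vapor] / [water]
      K(water>nonpolar) = [nonpolar] / [water]
    so that K(nonpolar>vapor) = [vapor] / [nonpolar] is their ratio.
    """
    if k_water_to_nonpolar <= 0 or k_water_to_vapor <= 0:
        raise ValueError("equilibrium constants must be positive")
    return k_water_to_vapor / k_water_to_nonpolar


# Hypothetical constants, not taken from Table I or Fig. 1
k_wv = 1.0e-4   # water -> vapor
k_wc = 2.0e-3   # water -> cyclohexane
k_cv = k_nonpolar_to_vapor(k_wv, k_wc)
print(f"K(nonpolar>vapor) = {k_cv:.3g}")
```

Because the constants multiply around the cycle, the corresponding free energies of transfer simply add.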
In practice, water-to-vapor and water-to-cyclohexane distribution coefficients can be measured only for molecules in their uncharged forms, because their ionized forms are far too polar to leave water and enter nonpolar surroundings even in the presence of a nonpolar counterion (Wolfenden, 1983). Accordingly, the distribution coefficients of amines have been determined from basic solutions (typically in 0.1 M KOH), whereas the values for carboxylic acids have normally been determined from solutions in acid (typically in 0.1 M HCl). From the resulting distribution coefficients of the uncharged species, and their known pKa values in water, it is a simple matter to calculate the fraction remaining nonionized at pH 7, and hence the effective distribution coefficient of the solute, expressed in terms of the total concentrations of all its forms (ionized and nonionized) in each phase, at neutrality. The free energies of transfer in Table I are expressed on that basis.
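The correction to pH 7 described above can be sketched with the Henderson-Hasselbalch relation. This is a minimal illustration that assumes a single ionizable group and assumes that the charged form does not enter the nonpolar phase at all; the function names are hypothetical, and for a base the pKa is taken to be that of its conjugate acid.

```python
def fraction_uncharged(pka: float, ph: float = 7.0, kind: str = "acid") -> float:
    """Fraction of a monoprotic solute that is uncharged at the given pH.

    kind="acid": neutral form is HA      (e.g. a carboxylic acid)
    kind="base": neutral form is B, pka refers to BH+ (e.g. an amine)
    """
    if kind == "acid":
        return 1.0 / (1.0 + 10.0 ** (ph - pka))
    if kind == "base":
        return 1.0 / (1.0 + 10.0 ** (pka - ph))
    raise ValueError("kind must be 'acid' or 'base'")


def effective_k(k_uncharged: float, pka: float, ph: float = 7.0, kind: str = "acid") -> float:
    """Distribution coefficient based on total aqueous solute (all ionization states)."""
    return k_uncharged * fraction_uncharged(pka, ph, kind)


# An acid with pKa 4 is about 0.1% uncharged at pH 7, so its effective
# coefficient is about 1000-fold lower than that of the uncharged species.
print(fraction_uncharged(4.0, 7.0, "acid"))
print(effective_k(1.0, 4.0, 7.0, "acid"))
```

Expressed as a free energy, each pH unit separating the pKa from neutrality adds RT ln 10 to the cost of leaving water.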
Unusually sensitive detection methods are needed to measure the vapor phase concentrations of amides, peptides, and the more polar amino acid side chains over their aqueous solutions. In the case of the aromatic amino acid side chains, UV spectrophotometry furnishes the needed sensitivity. The behavior of other amino acid side chains can be analyzed using solutes labeled with ¹⁴C. A convenient alternative that does not require isotopic labeling is to use proton NMR and compare the integrated intensities of the solute protons with those of pyrazine or dioxane, which are added to the sample as an integration standard after completion of the experiment. That method allows determination of a typical solute at a concentration of 3 × 10⁻³ M in a 5-mm NMR tube, using a single transient on a 500-MHz spectrometer, with a precision of ±5%.
In water-to-cyclohexane distribution experiments that involve extremely hydrophilic molecules, a major enhancement in sensitivity can be achieved by using a much larger volume of cyclohexane than that of water. The solubility of water in cyclohexane is so slight (40 mg/liter) that 1 ml of an aqueous solution of known initial solute concentration can be equilibrated with 1,000 ml of cyclohexane, and then recovered for analysis, so that the concentration of solute in the cyclohexane layer can be calculated by difference.
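The calculation by difference can be sketched as a simple mass balance. The sketch assumes that the phase volumes do not change on equilibration and that no solute is lost to vessel walls or to the gas phase; the function names and the example concentrations are hypothetical.

```python
def cyclohexane_concentration(c_initial: float, c_final: float,
                              v_water_ml: float, v_cyclohexane_ml: float) -> float:
    """Solute concentration (M) in cyclohexane from depletion of the aqueous layer."""
    if c_final > c_initial:
        raise ValueError("final aqueous concentration exceeds the initial value")
    return (c_initial - c_final) * v_water_ml / v_cyclohexane_ml


def k_water_to_cyclohexane(c_initial: float, c_final: float,
                           v_water_ml: float, v_cyclohexane_ml: float) -> float:
    """Distribution coefficient [cyclohexane] / [water] at equilibrium."""
    c_chx = cyclohexane_concentration(c_initial, c_final, v_water_ml, v_cyclohexane_ml)
    return c_chx / c_final


# Hypothetical run: 1 ml of 10 mM solute shaken with 1,000 ml of cyclohexane,
# after which the aqueous layer is found to contain 5 mM solute.
k = k_water_to_cyclohexane(1.0e-2, 5.0e-3, 1.0, 1000.0)
print(f"K(water>cyclohexane) = {k:.3g}")
```

With a 1,000-fold volume ratio, a coefficient as small as 10⁻³ removes half of the solute from the aqueous layer, which is the source of the gain in sensitivity.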
In evaluating the low concentrations of extremely hydrophilic molecules in the vapor phase over their aqueous solutions, a similar enhancement in sensitivity can be achieved by measuring the concentration of solute that accumulates in an efficient water trap through which large volumes of carrier gas have passed. In a typical experiment, a large volume of nitrogen (200 liters) was allowed to bubble through three “pots” (150-ml wash bottles, each containing 100 ml of an aqueous solution of [2-¹⁴C]acetic acid (0.01 M) in 0.1 M HCl), followed by a spray trap to prevent the physical transfer of liquid droplets. The carrier gas was then conducted through three wash bottles containing 0.1 M KOH, which acted as traps. As expected, >99% of the acetic acid emerging from the pots was caught in the first trap. Using either ¹⁴C or proton NMR as a means of detection, the accuracy of that method was verified for carboxylic acids and amines, by comparison with the results of analysis by direct potentiometric titration (Wolfenden, 1978). The solute concentration in the vapor phase was obtained by dividing the moles of solute accumulating in the traps by the number of liters of nitrogen carrier gas that had passed through the system (measured using a wet test flowmeter).
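The arithmetic of the carrier-gas method can be sketched as follows. The sketch assumes that the gas leaving the last pot is in equilibrium with the solution and that the aqueous concentration in that pot is not significantly depleted during the run; the function names and the trapped quantity are hypothetical and are not results from the experiment described above.

```python
def vapor_concentration(moles_trapped: float, carrier_gas_liters: float) -> float:
    """Solute concentration (mol/liter) in the gas stream leaving the pots."""
    if carrier_gas_liters <= 0:
        raise ValueError("carrier gas volume must be positive")
    return moles_trapped / carrier_gas_liters


def k_water_to_vapor(moles_trapped: float, carrier_gas_liters: float,
                     c_aqueous: float) -> float:
    """Dimensionless distribution coefficient [vapor] / [water]."""
    return vapor_concentration(moles_trapped, carrier_gas_liters) / c_aqueous


# Hypothetical numbers: 2 micromoles recovered from the traps after 200 liters
# of nitrogen have passed over a 0.01 M solution of the uncharged solute.
c_vap = vapor_concentration(2.0e-6, 200.0)
k_wv = k_water_to_vapor(2.0e-6, 200.0, 0.01)
print(f"[vapor] = {c_vap:.3g} M, K(water>vapor) = {k_wv:.3g}")
```

Summing the contents of all the traps, rather than the first alone, guards against breakthrough of solute past the first trap.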
Scales of Amino Acid Hydrophobicity and the Position of Tryptophan
Table I lists the results of side-chain distribution experiments, which are shown as free energies of transfer, at pH 7.0 and 25°C, between the vapor phase and water (v>w) (Wolfenden et al., 1981), cyclohexane and water (c>w) (Radzicka and Wolfenden, 1988), and 1-octanol and water (GU) (Guy, 1985); and for pentapeptides between octanol and water (WW) (Wimley et al., 1996). Also shown for comparative purposes is a theoretical scale (WWth) in which an attempt was made to adjust the WW scale for the effects of occlusion by neighboring residues (see Fig. 8 in White and Wimley, 1999).
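A dimensionless distribution coefficient is converted to a free energy of transfer through ΔG = −RT ln K. The sketch below assumes that K is written as the concentration in the destination phase over that in the origin phase, and it reports kcal/mol, which is an assumption about units rather than something stated in the text; the function name is hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol K)


def transfer_free_energy(k: float, temp_c: float = 25.0) -> float:
    """Free energy (kcal/mol) of transfer, origin -> destination.

    k is [destination] / [origin] at equilibrium, both in mol/liter.
    A negative result means that the destination phase is favored.
    """
    if k <= 0:
        raise ValueError("distribution coefficient must be positive")
    temp_k = temp_c + 273.15
    return -R_KCAL * temp_k * math.log(k)


# Each 10-fold change in K corresponds to about 1.36 kcal/mol at 25 °C.
print(round(transfer_free_energy(10.0), 3))
print(round(transfer_free_energy(0.1), 3))
```

Reversing the direction of transfer inverts K and therefore changes only the sign of the free energy.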
Table I also shows the observed tendencies of amino acids to be found buried in the interiors of globular proteins in solution (CH) (Chothia, 1976), or in the transmembrane sequences of membrane proteins (VH) (Hessa et al., 2005). It is worth remembering that these tendencies observed in protein folding are statistically based, and do not represent simple equilibria in a true thermodynamic sense.
As noted earlier, on a scale of environment polarities, wet octanol occupies a position approaching that of water itself, whereas wet cyclohexane occupies a position approaching that of the vapor phase. Accordingly, the span between extreme values on any single scale in Table I is much larger for water-to-vapor and water-to-cyclohexane distributions than the span for water-to-octanol distributions. The scale of virtual equilibria for transfer of amino acid side chains from water to the interior of a globular protein, or to a transmembrane protein sequence, is also compressed with respect to the scales of water-to-vapor and water-to-cyclohexane distributions. These virtual equilibria of folding (CH) or membrane insertion (VH), involving a typical amino acid side chain R, do not involve complete removal of the molecule RH from a situation in which it is fully surrounded by water to a situation in which it is fully surrounded by a liquid hydrocarbon. The amphiphilic character of residues such as lysine, with a long nonpolar “tail” terminated by a polar amino group, interferes with the predictive value of two-phase distribution coefficients. Moreover, solvent access to the R group is more limited in a polypeptide setting than in the molecule RH, and some further dampening of sensitivity might be expected to result from the effect of competition between the demanding H bonding requirements of the peptide backbone and the solvation preferences of the attached side chain.
Considering these limitations, the relationship between the experimental scales of Table I and the protein statistical scales (VH and CH) is perhaps closer than might have been expected. Of the purely experimental scales shown in bold type, the c>w and v>w scales are closely related (r² ≥ 0.8) to the folding tendencies observed in globular proteins (CH). The c>w scale is also closely related (r² = 0.8) to the tendencies of side chains to be found in transmembrane sequences (VH). The octanol-water scales GU and WW show weaker correlations (r² = 0.6 and 0.3, respectively) with the VH scale; and scale GU shows a moderate correlation with the CH scale (r² = 0.5). Theoretical scale WWth shows that r² values for experimental scale WW can be increased substantially by adjustment for occlusion effects. Perhaps the r² values for v>w, c>w, and GU might be improved by similar adjustments. That question remains to be investigated. Shown at the foot of each experimental scale in Table I is the probability that the scale in question is not related to the VH scale, i.e., that the null hypothesis is correct, estimated using the tables of Federighi (1959) for Student's t distribution with n = 19.
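One standard way to test a correlation against the null hypothesis is to convert r² into a Student's t statistic with n − 2 degrees of freedom. The sketch below is an illustration under that assumption and is not necessarily the exact procedure used with the tables cited above; the function name is hypothetical, and n is taken to be the number of paired side-chain values.

```python
import math


def t_statistic(r_squared: float, n: int) -> float:
    """Magnitude of Student's t for a correlation: t = |r| * sqrt((n - 2) / (1 - r^2))."""
    if not 0.0 <= r_squared < 1.0:
        raise ValueError("r_squared must lie in [0, 1)")
    if n < 3:
        raise ValueError("at least three paired values are needed")
    return math.sqrt(r_squared * (n - 2) / (1.0 - r_squared))


# r^2 values quoted in the text for correlation with the VH scale, n = 19
for label, r2 in [("c>w", 0.8), ("GU", 0.6), ("WW", 0.3)]:
    print(f"{label}: r2 = {r2}, t = {t_statistic(r2, 19):.2f}")
```

The resulting t values would then be looked up in a table of the t distribution with 17 degrees of freedom, or passed to a statistics library, to obtain the corresponding probability.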
The position of tryptophan deserves special scrutiny, because it constitutes the most conspicuous difference between these scales. If cyclohexane, chloroform, or the vapor phase is used as a reference phase, tryptophan appears to be less hydrophobic than phenylalanine, valine, leucine, isoleucine, or methionine. In contrast, the water-to-octanol distribution coefficients of Table I, as well as the relative solubilities of amino acids in alcohols and water (Nozaki and Tanford, 1971), seem to imply that tryptophan is the least polar amino acid. This solvent-dependent behavior of tryptophan persists with or without the introduction of peptide bonds on either side of the α-carbon atom (Radzicka and Wolfenden, 1988). Thus tryptophan derivatives appear to be very hydrophobic when, but only when, an alcohol is used as a reference phase. That behavior implies the existence of specific forces of attraction between indole derivatives and alcohols. Leo et al. (1971) have noted the ability of octanol to form H bonds, and the extended π-electron system of the indole ring might be expected to serve as an H-bond acceptor, suggesting a possible structural basis for specific attractive interactions between tryptophan and octanol.
Regardless of its chemical basis, the indole–alcohol attraction results in pronounced skewing of the octanol-based scales, increasing the probability that the null hypothesis is correct by more than four orders of magnitude as shown by comparison of the c>w and WW columns in Table I. If tryptophan is deleted from each list, the discrepancies in ordering between the differing scales are reduced almost to the vanishing point. In both transmembrane protein sequences (VH) and globular protein structures (CH), tryptophan occupies a position intermediate between the least polar amino acids (isoleucine, leucine, phenylalanine, valine, and methionine), and the more polar amino acids, just as it does in the vapor-to-water and cyclohexane-to-water scales.
Perhaps the most striking observation to emerge from the comparisons in Table I is the strength of the relationship between the ordering of the side-chain polarities of the amino acids (as measured by cyclohexane and vapor distributions) and their tendencies to appear in transmembrane sequences. That relationship is even closer than the previously noted relationship between the ordering of side-chain polarities of the amino acids and their inside/outside distribution in globular proteins (Radzicka and Wolfenden, 1988). It seems evident that the “recognition” of transmembrane helices by the endoplasmic reticulum translocon (Hessa et al., 2005) bears a close resemblance to the transfer of amino acid side chains from water to a liquid hydrocarbon.
This work was supported by National Institutes of Health grant number GM-18325 and National Science Foundation grant number PCM-7832016.