Discovering the functional mechanisms of biological systems frequently requires information that challenges the spatial and temporal resolution limits of current experimental techniques. Recent dramatic methodological advances have made all-atom molecular dynamics (MD) simulations an ever more useful partner to experiment because MD simulations capture the atomic resolution behavior of biological systems on timescales spanning 12 orders of magnitude, covering a spatiotemporal domain where experimental characterization is often difficult if not impossible. We present here our perspective on the mechanistic insights that a scientist—in particular, a membrane protein physiologist—might garner by complementing experiments with atomistic MD simulations. Drawing on case studies from our work, we illustrate the diversity of membrane proteins amenable to study by MD and the types of discoveries one can make through simulation. We discuss the strengths and limitations of MD as a tool for physiologists, and we speculate on advances that such simulations may enable in the coming years.
What might a physiologist gain by supplementing the usual experimental tools—cell lines, patch clamp rig, spectrometers, and the like—with atomistic MD simulations? Foremost is the ability to probe the biological system of interest, which may be anything from an individual protein to a large biological assembly, across a very broad range of timescales at high spatial resolution (Fig. 1). An all-atom MD simulation typically comprises thousands to millions of individual atoms representing, for example, all the atoms of a membrane protein and of the surrounding lipid bilayer and water bath (Fig. 2). The simulation progresses in a series of short, discrete time steps; the force on each atom is computed at each time step, and the position and velocity of each atom are then updated according to Newton’s laws of motion. Each atom in the system under study is thus followed intimately: its position in space, relative to all the other atoms, is known at all times during the simulation. This exquisite spatial resolution is accompanied by the unique ability to observe atomic motion over an extremely broad range of timescales—12 orders of magnitude—from ∼1 femtosecond (10−15 s), less than the time it takes for a chemical bond to vibrate, to >1 ms (10−3 s), the time it takes for some proteins to fold, for a substrate to be actively transported across a membrane, or for an action potential to be initiated by the opening of voltage-gated sodium channels. MD simulations thus allow access to a spatiotemporal domain that is difficult to probe experimentally (Fig. 1). Simulations can be particularly valuable for membrane proteins, for which experimental characterization of structural dynamics tends to be challenging.
How might this ability to “see” the atoms of a biological system moving over time truly be useful? First, one can observe qualitative behavior, such as the mechanism of permeation through membrane channels. Second, one can probe systems quantitatively, for example, determining the conductance of a single water or ion channel (Fig. 2). Third, simulations often allow one to generate novel mechanistic hypotheses, sometimes based simply on straightforward observation: as Yogi Berra once said, “You can observe a lot by watching.” Fourth, simulations of perturbed or mutated molecular systems can be used to test specific hypotheses originating from experiment, computation, or theory. The power of MD simulations is further augmented by the ability to model molecules that cannot easily be constructed experimentally.
A wide variety of physiological processes are amenable to study at the atomic level by MD simulation. Examples relevant to membrane protein function include the active transport of solutes across bilayers by antiporters and symporters; the passive transport of water, ions, and other solutes by structurally diverse channels; the interconversion of transmembrane electrochemical gradients and chemical potential energy by pumps such as the F1F0-ATPase and the Na+/K+-ATPase; the transmission of extracellular stimuli to the cell interior by G protein–coupled receptors (GPCRs) and tyrosine kinase receptors; and the structural coupling of cells and organelles to one another by integrins and membrane curvature modulators.
Uncovering the fundamental mechanisms that underlie the functions of membrane proteins, and increasingly the dynamical details of those mechanisms, is a major thrust of molecular physiology. Computational models, especially those arising from MD simulations, are useful because they can provide crucial mechanistic insights that may be difficult or impossible to garner otherwise (Pugh and Andersen, 2008). Insight from computer models has a long history: in a landmark study at the dawn of the computer age, Fermi, Pasta, Ulam, and Tsingou used simulation to discover the counterintuitive behavior of a deceptively simple oscillator (Fermi et al., 1965). The authors noted that “the results of our computations show features which were, from the beginning, surprising to us.” A half-century later, computational methods have progressed sufficiently to enable a true partnership of MD simulations and experiments that can yield mechanistic insights into complex problems of biological interest.
Using examples from our work on membrane proteins, we present several case studies that illustrate the breadth of physiological processes that can be investigated using MD simulations (Karplus and Kuriyan, 2005). Each study addressed unique mechanistic questions, and thus each entailed a distinct approach. Our work was inspired by earlier studies of membrane proteins, including simulations of ion and water channels (Roux and Schulten, 2004; de Groot and Grubmüller, 2005), transporters (Khalili-Araghi et al., 2009), and GPCRs (Martínez-Mayorga et al., 2006; Spijker et al., 2006).
Permeation through a water channel: aquaporin 0 (AQP0).
AQP0, the most abundant membrane protein in the mammalian eye lens, differs from most other aquaporins in two ways: it transports water significantly more slowly (Yang and Verkman, 1997), and AQP0 molecules in adjacent cells may link to form intercellular junctions (Gonen et al., 2005). Comparison of the high-resolution structures of the nonjunctional AQP0 tetramer (x ray) and the junctional AQP0 octamer (electron microscopy) has stimulated intense debate on whether water even permeates AQP0 junctions (Harries et al., 2004; Gonen et al., 2005), a question that has not been answered experimentally because of the difficulty in measuring the permeability of a junctional octamer.
To address this question, we used MD to simulate several AQP0 assemblies, including a nonjunctional tetramer embedded in a single lipid bilayer and a junctional octamer embedded in two juxtaposed bilayers (Jensen et al., 2008). Our results indicated that both AQP0 forms do in fact conduct water: the computed single-channel (monomeric) water permeabilities of the AQP0 assemblies were all essentially the same and within error of the experimental value determined for nonjunctional AQP0 (Yang and Verkman, 1997). We found further that the distinct gating action of two tyrosine residues unique to AQP0—one a static gate and the other a dynamic gate—explained AQP0’s unusually low water permeability. Our results also shed light on the possible biological significance of this low permeability: as a cellular adhesion molecule, AQP0 may need to reduce the rate of water permeation to maintain mechanical stability of the junction.
Reconciling discordant experimental results: β2-adrenergic receptor (β2AR).
The recently determined β2AR crystal structures (Cherezov et al., 2007; Rasmussen et al., 2007)—the first of ligand-activated GPCRs, the largest class of drug targets—raised new questions about the atomic-level mechanisms of GPCR signaling. Biochemical experiments had previously suggested that, in their inactive state, β2AR and many other GPCRs possess a network of salt bridges, known as the “ionic lock,” which was believed to form a molecular switch for receptor activation (Ballesteros et al., 2001). Although β2AR was crystallized in a presumably inactive state, the ionic lock was disrupted in the crystal structures, provoking uncertainty about the true conformation(s) of the inactive state as well as the role of the ionic lock in receptor activation and signaling (Rasmussen et al., 2007, Lefkowitz et al., 2008).
To address these questions, we performed MD simulations of β2AR in multiple wild-type and mutant forms (Dror et al., 2009). In simulations of wild-type β2AR, the ionic lock formed reproducibly within a few hundred nanoseconds, with the salt bridge network matching that suggested by the earlier biochemical studies (Ballesteros et al., 2001). In microsecond-timescale simulations, we observed that the lock remained formed most of the time but occasionally broke for tens or even hundreds of nanoseconds. Mutations associated with increased activity of the unliganded receptor increased the fraction of time the lock was broken in our simulations. To obtain the high-resolution β2AR structure, a flexible loop of the receptor had been replaced by a small, rigid protein, T4 lysozyme (Cherezov et al., 2007). We hypothesized that this replacement altered the ionic lock conformational equilibrium. Indeed, when we simulated the crystallographic β2AR–T4 lysozyme fusion protein, the lock-broken conformation predominated. Our results suggest that inactive β2AR exists in a context-sensitive equilibrium and that the lock-formed conformation may predominate in the inactive wild-type receptor, despite the broken lock in the crystal structures.
Ion secretion by an alternating access mechanism: the NhaA antiporter.
The Na+/H+ antiporter NhaA, a prototypical membrane transporter, uses the bacterial transmembrane proton gradient to drive secretion of one intracellular sodium ion for every two protons taken up from the periplasm (Hunte et al., 2005). Using a large set of MD simulations, we deduced an NhaA transport mechanism; the proposed mechanism was substantiated with complementary experiments on NhaA mutants (Arkin et al., 2007). The resulting mechanism provides a molecular realization of Jardetzky’s 40-year-old alternating access model (Jardetzky, 1966): one aspartate residue near the center of the protein acts as the Na+ binding site, while a second, nearby aspartate acts as an accessibility control site, whose protonation state determines whether the Na+ binding site is accessible to the periplasm or to the cytoplasm.
Experimentally, the NhaA transport cycle takes roughly one millisecond. Because MD simulations anywhere near this length were infeasible when we performed our study, we relied instead on shorter simulations (≤100 ns each) initiated with NhaA in different configurations thought to represent, or lead to, key functional intermediates. Simulations were started with the sodium ion at different putative NhaA binding sites and with all possible protonation states for the two aspartates known experimentally to be indispensable for transport. Additional simulations also led to a model for the mechanism of NhaA’s experimentally observed pH sensitivity. Moreover, MD-based free energy calculations suggested that NhaA preferably binds Li+ over Na+, and Na+ over K+, hinting at how its cation selectivity arises.
Permeation and gating of an ion channel: Kv1.2.
Potassium channels have recently reemerged as drug targets; conversely, unwanted blockade of the hERG K+ channel represents perhaps the most serious pervasive safety concern in current drug development. Ion channels have been the subject of numerous MD studies, and such simulations, combined with experimental data, have produced the most accurate atomic-level information on the energetics of permeation (Bernèche and Roux, 2001; Morais-Cabral et al., 2001).
We recently used MD simulations of Kv1.2 to make the first direct atomic resolution observations of permeation and pore domain gating in a voltage-gated K+ channel (Jensen et al., 2010). To traverse the Kv1.2 pore (Long et al., 2007), ions pass through a large, water-filled hydrophobic cavity and then through the narrow selectivity filter (Fig. 2). Our analysis of hundreds of individual ion permeation events—driven by experimentally accessible depolarizing voltages—revealed a detailed conduction mechanism, resembling the Hodgkin-Keynes “knock-on” model, in which translocation of two selectivity filter–bound ions is driven by a third, incoming ion; formation of this knock-on intermediate was found to be the rate-determining step in permeation. These simulations also provided quantitative results, including a stoichiometric ratio of permeating water molecules to K+ ions of about one, in agreement with streaming potential measurements (e.g., Alcayaga et al., 1989), and a single-channel conductance within a factor of three of the experimental value.
At reverse or zero voltages, we observed pore closure—gating of the pore domain—which took place by a novel “hydrophobic gating” mechanism: water evacuated the hydrophobic cavity, causing the open, conducting pore to collapse into a closed, nonconducting conformation. In our simulations, in which the voltage sensor domains of Kv1.2 were omitted, the pore closed within a few microseconds. Experimentally, intact Kv1.2 closes on millisecond timescales. Thus, our simulation results support the idea that the voltage sensors act to prevent pore collapse into an intrinsically more stable, closed conformation (Yifrach and MacKinnon, 2002). Hydrophobic gating may be a fundamental principle underlying the function of voltage-sensitive K+ channels, and this mechanism could help to explain the evolutionary conservation of hydrophobic cavities in structurally diverse ion channels (Jensen et al., 2010).
Strengths and limitations
How does one decide whether addressing a particular biological question using MD simulations is likely to be feasible and productive? In this section, we address the major strengths and limitations of MD as a technique for molecular physiology.
Experimentally, phenomena that occur very quickly are typically harder to characterize than those that occur over long timescales, particularly if high spatial resolution is desired. In MD simulations, the opposite is true: the longer the physical time to be simulated, the more wall-clock time is required to execute the simulation. The highest atomic vibrational frequencies limit each time step to a few femtoseconds, such that simulating even a microsecond requires nearly a billion sequential time steps, with evaluation of the forces on every atom required at each step. Because the force on each atom depends on the position of every other atom (through long-range electrostatic effects), parallelizing the simulation by splitting it across multiple processors requires substantial inter-processor communication, which limits the speedup achievable through parallelization. Thus, MD simulations have historically been most powerful for simulating motions that take place on sub-microsecond timescales. Of course, many of the events of greatest interest in molecular physiology—for example, conformational changes underlying functional mechanisms of proteins—take place on timescales of microseconds or longer (Fig. 1).
Recent advances in parallel algorithms, software, and hardware have made simulations beyond a microsecond practical on commodity computer clusters (Bhatele et al., 2008; Hess et al., 2008; Klepeis et al., 2009). Our four case studies, for example, were performed using novel parallelization methods on large commodity clusters (Bowers et al., 2006). We also recently completed a specialized supercomputer designed especially for MD simulations, named Anton, which has enabled simulations on timescales as long as a millisecond (Shaw et al., 2009). These advances all broaden the applicability of MD to molecular physiology, and facilitate direct comparisons of simulations to experimental data.
Several approaches allow the use of MD to study conformational changes on timescales exceeding those of individual atomistic simulations. First, one can simulate specific parts of a longer-timescale process by initializing simulations from appropriately chosen protein configurations thought to represent key mechanistic intermediates (see NhaA antiporter study above). Second, one can use techniques devised to allow short-timescale simulations to sample a broader swath of conformational space, for example, by applying biasing forces to guide a simulation in a particular direction or by pushing the simulation over energy barriers (Lei and Duan, 2007). Such approaches are particularly powerful when prior information about the expected conformational change is available, but they can also introduce unrealistic behavior into the simulation. Third, one can use coarse-grained simulations, in which each simulated particle represents multiple atoms, to substantially reduce computational requirements at the cost of reduced spatial resolution and, in some cases, reduced accuracy (Bond et al., 2007; see also the complementary approaches described in the accompanying Perspectives in this issue by Bahar and by Silva and Rudy).
Accuracy and errors.
MD simulation, like any experimental technique, suffers from various sources of error. Careful study design can mitigate these errors, and careful interpretation of the simulation results can help prevent erroneous conclusions.
Errors associated with MD simulations are of two types. The first represents errors due to limitations on the length and number of simulations that can be performed within resource constraints. One must ask whether results are robust to minor changes in the initial conditions, whether observations represent initial transients or steady-state behavior, and whether the simulations have reached all the relevant conformational states. Repeated simulations—typically starting from the same structure, but with different initial velocities—help ensure robustness, but do not always substitute for longer simulations: some minimum simulation time may be needed to reach conformations differing substantially from the starting structure, and repeated simulations may experience the same initial transients. Our recent experiences with long simulations have indicated that, in some cases, transient behavior can last for a microsecond or longer. When possible, a simulation long enough to capture repeated transitions between states is desirable, as this provides evidence for an equilibrium on the simulated timescale. Each of our simulations of the β2AR, for example, was long enough for the protein to cycle repeatedly between the lock-broken and lock-formed conformations.
The second type of error stems from the physical models that underlie MD simulations. These models, known as molecular mechanics force fields (or simply force fields), specify the force on each atom given the positions of the other atoms and the network of covalent bonds connecting them. The biomolecular force fields in widespread use were developed, over several decades, through extensive fitting of models to experimental data and ab initio quantum mechanics calculations. The most recent generation of force fields has been shown to reproduce a variety of experimental data, particularly for proteins (Buck et al., 2006; Hornak et al., 2006), although several shortcomings remain, and more will likely be revealed through longer-timescale simulations. Known shortcomings include poor accuracy in the heights of energy barriers separating different conformations and biases toward α-helical over β-sheet conformations or vice versa.
From a user’s perspective, force field error has two implications. First, force field parameters must be selected with care; parameters validated on specific molecules (e.g., proteins or lipids) are typically much more reliable than parameters generated automatically. Second, force field errors tend to impact certain results more than others. The molecular conformations predicted by MD simulations tend to be more robust than the relative populations of those different conformations because populations are sensitive to minor errors in relative energies. Rates can be particularly difficult to estimate by MD because they depend strongly on the heights of energy barriers. Nevertheless, MD can produce reasonably accurate rate estimates under certain conditions, as illustrated by the previously discussed studies of water and potassium ion permeation through AQP0 and Kv1.2, respectively.
Several other factors should be considered when designing an MD simulation study. One is system size: the computation required is roughly proportional to the number of atoms, and the relevant dynamics of large systems often take place on timescales longer than those of small systems. A simulation of a hydrated globular protein will typically involve 10,000–100,000 atoms, whereas a membrane protein simulation may involve 40,000–400,000 atoms. Much larger MD simulations, containing many millions of atoms, have also been performed (Schulten et al., 2008).
Classical MD simulations treat covalent bonds as unchanging. To simulate chemical reactions, one must use alternative techniques such as quantum mechanics/molecular mechanics simulations, in which the bulk of the system is simulated as in classical MD, but a small part is evaluated using more computationally intensive quantum mechanical approaches (Senn and Thiel, 2009; see also the accompanying Perspective in this issue by Bucher and Rothlisberger). By default, protonation states remain fixed during a simulation, but several methods have been developed to allow them to evolve dynamically, at the cost of additional computation (Mongan and Case, 2005).
Although sufficiently long and accurate MD simulations may eventually be used to predict protein structure, present-day simulations generally require an initial structure. The quality of that structure is a major factor in the reliability of the simulation results; a 2-Å resolution crystal structure, for example, provides a much better starting point than either a 4-Å resolution crystal structure or a homology model.
We expect that the capabilities of MD as a method for membrane protein physiology will grow substantially in the future. Some of the most immediate changes will likely result from recent advances that allow dramatically faster simulations: Anton accelerates MD simulations by orders of magnitude compared with the previous state of the art (Shaw et al., 2009), and its application to membrane proteins has just begun. In addition to extending simulations to previously inaccessible timescales on which many events of physiological interest occur, these speedups mean that simulations that previously required months now take just days. Recent software advances provide an efficient means to rapidly analyze and process the large quantities of data produced by such high-throughput simulations (Tu et al., 2008). These combined developments enable an iterative process of discovery in which the results of one simulation guide the setup of the next—a sea change in simulation workflow that brings MD simulations in line with experimental methods: hypothesize, test, revise hypothesis, test, and so on.
By facilitating increasingly direct comparisons between simulated and experimental results, longer simulations will also foster improved force fields (Lindorff-Larsen et al., 2010). More accurate force fields, in turn, will enable ever better quantitative measurements by simulation, including the determination of kinetic quantities (e.g., rates of transport or conformational change) as well as thermodynamic quantities (e.g., free energies of ligand binding, computed in a manner that properly accounts for protein flexibility).
Continued improvements in computer power will also facilitate simulations of larger molecular systems at atomistic detail over useful timeframes. Increasingly, MD is being applied not only to study the behavior of individual proteins, but also to the study of larger macromolecular complexes (Schulten et al., 2008).
MD simulation has already begun to function as a “computational microscope,” facilitating discovery in spatial and temporal domains that would otherwise be inaccessible. Over time, we expect that observations from this computational microscope will become a routine source of primary data in biology, alongside data from traditional and novel experimental methods. In several other fields—aircraft design, computer chip design, and nuclear stockpile stewardship, to name a few—simulations are often performed in preference to experiments because simulations may be cheaper, more flexible, and more accurate. The complexity of biological systems is such that experiments, especially in vivo experiments, will continue to play an indispensable role in molecular physiology, but simulations may become an equal partner, offering a view of cell biology at a level of spatial and temporal detail inaccessible through experiment alone.