The characterization of lipid binding to lipid transfer proteins (LTPs) is fundamental to understand their molecular mechanism. However, several structures of LTPs, and notably those proposed to act as bridges between membranes, do not provide the precise location of their endogenous lipid ligands. To address this limitation, computational approaches are a powerful alternative methodology, but they are often limited by the high flexibility of lipid substrates. Here, we develop a protocol based on unbiased coarse-grain molecular dynamics simulations in which lipids placed away from the protein can spontaneously bind to LTPs. This approach accurately determines binding pockets in LTPs and provides a working hypothesis for the lipid entry pathway. We apply this approach to characterize lipid binding to bridge LTPs of the Vps13-Atg2 family, for which the lipid localization inside the protein is currently unknown. Overall, our work paves the way to determine binding pockets and entry pathways for several LTPs in an inexpensive, fast, and accurate manner.
Introduction
Eukaryotic cells are organized into separate membrane-bound organelles, each with a characteristic lipid composition necessary for its proper functioning (van Meer et al., 2008). To achieve this complex homeostatic balance, lipids need to be selectively mobilized between different organelles in response to both extra- and intracellular stimuli (Fagone and Jackowski, 2009).
Intracellular lipid trafficking is largely achieved by two mechanisms—vesicular and non-vesicular transport. Non-vesicular lipid trafficking is mediated by a large group of lipid transfer proteins (LTPs), which encapsulate lipids in the hydrophobic cavity of a soluble lipid transfer domain (LTD) and transfer them between membranes of different organelles within the cell (Wong et al., 2017, 2019). Many LTPs have been discovered in the last few years (Gatta et al., 2015; Levine, 2022b; Kim et al., 2022), suggesting that intracellular lipid transport is more widespread and prominent than originally thought. Concomitantly, with the identification of new LTPs, new mechanisms of lipid transport are also being proposed (Wong et al., 2019; Dall’Armellina et al., 2023). Notably, certain LTPs have been suggested to transfer lipids between organelles by physically bridging them at membrane contact sites and by establishing long hydrophobic tunnels between the two organellar membranes (Wong et al., 2017, 2019; Reinisch and Prinz, 2021; Wozny et al., 2023).
Despite this plethora of new data on LTP cellular functions (Wong et al., 2017, 2019; Chiapparino et al., 2016), a detailed understanding of their mechanistic mode of lipid transport remains limited. This knowledge gap likely originates from the extreme and inherent molecular complexity of the process (Wong et al., 2017, 2019; Wong and Levine, 2016). Specifically, LTPs must recognize target donor and acceptor organellar membranes with high selectivity, and they must overcome non-negligible free energy barriers to extract and release lipids into these membranes. In addition, how transport directionality is achieved, as well as the auxiliary role of accessory proteins that bind LTPs and potentially affect the lipid transport cycle, remains largely unclear (Wong and Levine, 2016; Wong et al., 2017, 2019).
While structural studies are extremely helpful in this context, they are insufficient to paint a complete picture of the lipid transport mechanism. For example, even when a high-resolution structure of the LTP is available, the co-transported lipids are often missing, possibly as a consequence of the dynamic behavior of lipids inside the protein cavity. In addition, only few high-resolution structures of LTPs are available and, especially for those proposed to work via a bridge-like mechanism, AlphaFold (AF)-derived models are often used to provide a mechanistic interpretation of the functional data (Castro et al., 2022; Paul et al., 2022; Cai et al., 2022; Hanna et al., 2023; Guillén-Samander et al., 2022; Cai et al., 2022; van Vliet et al., 2022; Neuman et al., 2022; Levine, 2022a; Dall’Armellina et al., 2023; Jumper et al., 2021). Finally, basic mechanistic features such as the exact entry/exit pathways for the loading and unloading of the lipids from the protein remain largely unknown.
To overcome these limitations, computational approaches hold promise to identify lipid binding poses for LTPs. A classical strategy in this regard is molecular docking (Bender et al., 2021). While this technique is widely used to determine ligand binding pockets in proteins (Forli et al., 2016), the high flexibility of lipid chains, possibly also inside the protein binding cavity, results in a major challenge in the case of LTPs. Recently, promising attempts to use artificial intelligence (AI)–derived tools to obtain the structure of protein–ligand complexes by directly building the protein structures around the corresponding molecules have been proposed (Krishna et al., 2023, Preprint), but these methods still perform quite poorly for low-affinity ligands (Krishna et al., 2023, Preprint). Alternatively, both unbiased atomistic molecular dynamics (MD) simulations, in which the ligand of interest is placed in the bulk solvent and eventually binds to the protein (Dror et al., 2011), as well as enhanced sampling MD methods (Niitsu et al., 2019; Limongelli et al., 2013), have been used to determine binding pathways for small molecules (Shan et al., 2022; Kuzmanic et al., 2020). However, since these processes occur over relatively long timescales, these approaches are inherently computationally expensive and thus difficult to extend to high-throughput investigations.
To alleviate these computational bottlenecks, a promising computational protocol based on unbiased coarse-grain (CG) MD simulations to determine protein–ligand interactions for small drug-like molecules has been recently proposed (Souza et al., 2020). There, the authors elegantly demonstrate that chemical-specific CG simulations (using the MARTINI force field [Souza et al., 2021]) can be used to simulate protein–ligand binding using a brute force approach, paving the way for potentially screening numerous drugs and proteins in a high-throughput fashion (Souza et al., 2020).
In this work, we explore the possibility of extending this protocol to the complex cases of lipids in LTPs. By tuning the computational parameters, we succeed in observing the spontaneous binding of lipids to LTPs and their eventual entry into the experimentally determined lipid binding pocket whenever this structural information is available. Our approach identifies a dominant entry pathway of lipids into LTPs, sheds light on the lipid-binding activity of poorly characterized proteins, and provides a structural view of the lipid density inside bridge-like LTPs (BLTPs), such as Vps13 and Atg2, that have been proposed to act as bulk lipid transporters but for which the lipid localization inside the cavity has remained uncharacterized so far. Our results pave the way for an improved characterization of the molecular mode of action of LTPs, and potentially for the in silico identification of new ones, in a high-throughput fashion at minimal computational cost.
Results
Brute force unbiased CG-MD simulations can reproduce crystallographic poses of lipids bound to LTPs
To determine whether unbiased CG-MD simulations can correctly identify lipid–protein interactions for LTPs, we first adapted a protein–ligand protocol recently proposed for small drug-like molecules using the CG MARTINI force field (Souza et al., 2020). In this protocol, drug-like molecules are placed in bulk solvent in the presence of a target protein, and during unbiased MD simulations, they can spontaneously relocate into the protein binding pocket due to the smoothness of the free energy profile in the CG representation (Cascella and Vanni, 2016). We reasoned that a similar protocol could work for lipid binding to LTPs even if lipid molecules carry extremely hydrophobic moieties: in the absence of alternative hydrophobic binding partners (such as a lipid bilayer) the lipid of interest could potentially be able to find the protein cavity of LTPs within reasonable time scales.
To test this hypothesis, we first identified several proteins that have been co-crystallized with a single lipid they are proposed to transport (Table 1, rows 1–13). These include ceramide-1-phosphate transfer protein (CPTP), fatty acid binding protein-1 (FABP1), mitoguardin-2 (Miga2), maintenance of lipid asymmetry protein C (MlaC), phosphatidyl-choline transfer protein (PCTP), neurofibromin-1 (NF1), oxysterol binding protein homolog-6 (Osh6), oxysterol binding protein homolog-4 (Osh4), phosphatidyl-inositol transfer protein-α (PITP), chromatin structure-remodeling complex subunit Sfh1, and StART domain-containing protein-11 (StARD11, also called ceramide transfer protein [CERT]), apolipoprotein M (ApoM), and Niemann-Pick disease type C2 protein (NPC2).
Starting from the x-ray structures, we next set up the following protocol: each protein was stripped off its co-crystallized ligand(s), placed in a box of water with 0.12 M NaCl with a lipid randomly inserted into the bulk solvent, and 1-μs-long unbiased MD simulations were performed (Fig. 1 A and Video 1). For each protein, five independent replicas were carried out. Hydrophobic tails of all simulated lipids were initially modeled as oleoyl tails (18:1), while headgroups were chosen to be similar to the ones in the corresponding crystal structure of the protein (Table 1). In case parameters for a given lipid in the crystal structure were unavailable in the MARTINI force field, structurally similar headgroups were chosen for comparison. For example, 1,2-dioleoyl-sn-glycero-3-phosphate (DOPA) was used instead of ceramide-1-phosphate.
To assess lipid binding, we used two distinct metrics. First, we collected the time traces of the minimum distance between the protein and the lipid (Data S1 A). Second, to discriminate between bona fide binding events inside the lipid-binding pocket versus futile lipid-binding events on the external surface of the protein, we used as a metric the solvation number of the lipid tails (defined as the number of water molecules within 5 Å of the lipid acyl chains) in the last 100 ns of the trajectory of each replica (Fig. 1 B). We selected this property since a lipid bound to an external surface of the protein would be surrounded by a larger number of water molecules as opposed to lipids bound to a hydrophobic cavity. Direct comparison between lipid solvation, the minimum distance between lipid and protein, and the center of mass distance between the CG lipid and the x-ray ligand (Data S1 A) confirms that lipid solvation is a good metric for lipid binding, marking a clear distinction between externally bound lipids and lipids that bind within the hydrophobic cavity.
For the 13 proteins we tested, we observed two distinct behaviors: in some cases (8/13: CPTP, MlaC, NF1, PCTP, PITP, Sfh1, ApoM, NPC2; Fig. 1 B), the lipid could always bind spontaneously and irreversibly to the protein of interest, as indicated by the time traces of the minimum distances between the protein and the lipid and of the lipid tail solvation (Data S1 A). Solvation analysis indicates that the lipid tails are desolvated at the end of the trajectory, thus residing inside a hydrophobic binding pocket (Data S1, A and B, light purple).
In other cases (4/13: FABP1, StARD11, Osh6, Osh4), no entry of the lipid was observed (Data S1, A and B, gray), and the lipid could not find the binding pocket within the 1-μs elapsed simulation time. In one case (Miga2), the lipid entered the binding pocket only in three of the five replicas (Data S1 A and Fig. 1 B).
Next, we focused our attention on the false negative results (Fig. 1 B). We reasoned that the origin of such behavior could be related to high free energy barriers originating from protein dynamics, conformational changes potentially associated with lipid entry in physiological conditions and that are not well-reproduced by our CG simulations, or the inability of the lipid to find the entry pathway during the simulation time because of too limited sampling time. To qualitatively test these hypotheses, we performed three tests. First, we extended our simulations from 1 to 3 μs. This approach improved our results for Osh6 (Data S1, A and B), with two replicas displaying lipid binding in the extended simulation, and for Miga2, in which we observed lipid binding in one additional replica (Data S1, A and B). Next, we tested our protocol on lid-less Osh6 (Osh6-Δ69) and Osh4 (Osh4-Δ29) since oxysterol-related proteins (ORP) domains, like other LTDs, are known to have a lid-like helical region at the N-terminus that regulates lipid entry into the protein (Lipp et al., 2019; Moser von Filseck et al., 2015). Indeed, while we observed lipid entry for full-length Osh6 in only two replicas (in 3 μs), and no lipid binding for full-length Osh4, all five replicas displayed lipid entry for both Osh6-Δ69 and Osh4-Δ29 (Fig. 1 B and Data S1 A). Interestingly, in two of the replicas (see Data S1 A), the cholesterol molecule bound to Osh4-Δ29 exits the hydrophobic cavity, coming back to the bulk solvent. This phenomenon can be related to the protective role that the lid may have on the cavity, preventing the ligand from interacting with the solvent and thus unbinding. Analogously, the first six residues of FABP1 could also act as a lid, closing the entry to the hydrophobic cavity. However, the removal of the small lid did not lead to binding, although one replica underwent binding and unbinding of oleic acid (see Data S1 A).
Third, we opted to decrease the elastic network force constant that keeps the protein conformation close to its crystallographic structure in CG simulations, thus promoting increased flexibility. Indeed, upon reduction of the force constant (from 1,000 to 300 kJ mol−1 nm−2), we could observe spontaneous lipid entry in the cavity of StARD11 (Fig. 1 B and Data S1 A).
Finally, to test whether the desolvated lipid-binding pose we observed is consistent with the one determined by x-ray crystallography, we computed the distance between the center of mass of the hydrophobic tails in the experimental structure and the corresponding one of the CG lipids, exclusively for MD frames, with a low lipid tail solvation (<2 water molecules, consistent with the mean value over the positive LTPs of 1.5 molecules on average) (Fig. 1 C).
Even though our simulations display a large variability, in all cases, we could observe values very close to the x-ray distance (Fig. 1, C and D), thus identifying binding poses very close to the crystallographic structure, with the sole exception of StARD11, where CG-MD simulations suggest a lipid binding pose with the lipid tails located more deeply inside the protein cavity than in the corresponding structure from x-ray crystallography (Fig. 1 D). The observation that our protocol provides additional desolvated lipid poses with respect to the crystal structure is potentially interesting. On the one hand, this observation could be simply attributed to the inaccuracy of our CG approach. On the other hand, this observation could originate from the ability of our protocol to identify additional lipid-binding poses. These conformations might be important along the lipid entry/exit pathway, and their presence is consistent with the hypothesis that the lipid must not be trapped in highly stable states inside the pocket to facilitate its uptake and release from membranes by the LTPs.
CG-MD simulations cannot reproduce the experimentally determined lipid specificity of LTPs
Next, we investigated whether our assay could reproduce the experimentally determined sensitivity for specific lipids. To do so, we performed identical simulations with multiple lipids that are not known to bind to the tested LTPs. In detail, for all proteins we tested a lipid with an identical polar head but with two saturated chains (dipalmitoyl-phostidylX [DPPX]), a dioleoyl phosphatidyl (DOPX) lipid with a different headgroup, cholesterol, triolein (TO), and a sphingolipid (N-stearoyl-D-erythro-sphingosylphosphorylcholine [DPSM]). In most cases, we found a similar binding behavior (Fig. S1), indicating that our protocol is generally unable to discriminate between different lipids. However, for some proteins (CPTP, MlaC, Osh6-Δ69, or those that bind smaller fatty acids, such as ApoM and Miga2 [Hong et al., 2022; Sevvana et al., 2009]), the use of a non-natural lipid substrate led to significantly worse binding prediction (Fig. S1), as was the case for almost all LTPs in the presence of TO, likely because of its bigger size.
Finally, analysis of the polar head placement in our simulations reveals a large variability in the binding poses between replicas (Fig. S2) as a consequence of the lack of sensitivity for the different headgroups. Rather, the polar heads can be stabilized by both interaction with the protein and by the surrounding solvent in our CG simulations.
Furthermore, to investigate the ability of our protocol to discriminate between different lipids, we applied it to two proteins, the takeout protein and the juvenile hormone binding protein (JHBP), that possess Synaptotagmin-like mitochondrial-lipid-binding (SMP) domains that are structurally similar to that of known LTPs. The takeout protein and the JHBP do not transport lipids, but they transport hydrophobic, lipid-like molecules: ubiquinone-8 and juvenile hormone III, respectively. In the presence of 1,2-dioleoyl-sn-glycero-3-phosphocholine (DOPC) and 1,2-dioleoyl-sn-glycero-3-phospho-L-serine (DOPS) lipids, we observed spontaneous binding and very modest lipid solvation for both proteins (Fig. S3). Taken together, these results indicate that while our approach succeeds in the identification of hydrophobic lipid-binding pockets for LTPs, it has limitations for what concerns its ability to recapitulate the experimental selectivity for specific lipids and thus cannot be used to interrogate lipid specificity of LTPs.
Unbiased CG-MD simulations can discriminate between bona fide LTPs and negative control proteins that contain non-lipid-specific large cavities
The observation that our protocol is sometimes unable to reproduce the experimentally determined binding of lipids to LTPs suggests that our approach might not be biased toward false positive results. To further stress-test our protocol in this direction, we next investigated whether we could correctly flag proteins that do not transport lipids despite the presence of a large cavity in their structure. To do so, we selected 10 proteins that are not known to solubilize lipids but possess a cavity of large volume (Chwastyk et al., 2020) (Table 2), which is either hydrophobic or hydrophilic, as negative controls. Lipid binding to these negative control proteins was assessed using the same protocol described above, using DOPC as a model lipid.
For these negative controls, the lipids bound stably to the proteins during the CG simulations in many instances (Data S1 B). However, upon computing the spatial density of the lipids during the last 100 ns of the simulation trajectory, a superficial and outspread occupancy map was observed for the negative control proteins (Fig. 2 A). Similarly, time traces of lipid solvation show that in negative control proteins, the lipid remains highly solvated along the entire trajectory, even when bound to the protein (Data S1 B). This observation suggests that non-specific interactions of the lipid with different exterior surfaces of the proteins, rather than specific interactions in a hydrophobic pocket, characterize protein–lipid interactions with the negative controls in our assay. To quantify this observation, calculation of the solvation number of the lipid indicates that lipid solvation was significantly higher in the case of the negative controls (mean value = 6.4, Fig. 2 B) than that in the positive-result LTPs (mean value = 1.5, Fig. 2 B). This indicates that, for negative controls, lipids bound almost exclusively to external surfaces of the proteins despite the presence of a cavity. Precisely, out of 50 control simulations (five for each protein), we observed a single exception in one of the replicas of concanavalin A (PDB ID: 3ENR, Fig. 2 B, and Data S1 B), further suggesting that our approach is quite robust against false positive results.
Entry of the lipid into the protein cavity occurs via a dominant pathway
We next focused our attention on the path of lipid entry inside the LTP cavity as this process is critical to understand their mechanism of action. Since our protocol does not describe the physiological conditions of lipid entry/exit, with the lipid initially placed in bulk solvent rather than in a lipid bilayer, we wanted to investigate whether lipid entry was following a single dominant, possibly physiological, pathway or if rather the protein structure provided several entry points for fully water-solvated lipids.
Visualization of the simulation trajectories revealed the presence of a dominant pathway of lipid entry for all LTPs. By mapping the center of mass of the lipid during the trajectories and measuring the fraction of replicas in which the corresponding pathway was observed, we noticed that this pathway was unique for most LTPs (7 out of 12 positive results, Fig. 3 A) and observed with a high probability for the other LTPs. In almost all cases (including CPTP [Rogers and Geissler, 2023], Osh6 [Moser von Filseck et al., 2015], and PCTP [Khelashvili et al., 2019]), the pathway observed in these simulations corresponds to the one that has been proposed in literature. For all proteins, the averaged contact frequency between the LTPs residues and the simulated lipids during the last 100 ns of the simulations (Data S1 C) as well as the time traces of the contacts between the relevant residues and simulated lipids for ApoM, NF1, Osh4-Δ29, and PITP (Data S1 D) are shown in supplementary information to highlight the molecular resolution of our protocol.
Finally, to test the robustness of the observed entry pathway, we adopted two strategies. First, for the case of ApoM, we compared our pathway with that proposed in all-atom (AA) simulations (Zhang et al., 2016). Our data on the observed binding pathway (see Fig. S4 A) as well as lipid–protein residue contacts (see Fig. S4 B and Fig. S7), even with the intrinsic limitation of the lack of conformational changes in our CG simulations, is in excellent agreement with the one suggested by AA simulations.
Second, we designed mutations to block the entry process observed in the CG simulations. To do so, we performed identical simulations after introducing disulfide bridge mutations along the lipid entry pathway—I1603C-W1641C in the case of NF1 and V73C-210C in the case of PITP (Fig. 3, B and C). The residues were chosen such that the distance between Cα atoms of the residues is <8 Å. We observed that in both cases, the presence of the disulfide bridge abolished lipid entry into the protein entirely, as shown by the lipid solvation in Fig. 3, B and C.
Brute-force CG-MD simulations can characterize the lipid-binding cavity of poorly characterized LTPs
As potential applications of our protocol, we first focused on proteins that have been proposed to transport lipids, but for which no experimental lipid-bound structure was determined, or for which direct experimental evidence of lipid binding is lacking. These include, for example, the recently characterized yeast ceramide transporter survival factor 1 (Svf1) (Limar et al., 2023), sorting nexin-25 (SNX25) (Paul et al., 2022), a protein that has been proposed to belong to a new class of LTPs with phox homology-associated (PXA) and C-terminal phox homology-associated (PXC) domains that form a hydrophobic channel, nucleus vacuole junction-3 (Nvj3), which is also expected to belong to this class (Paul et al., 2022), lipid droplet ergosterol cortex 1 (Lec1)/Ypr097w, which has been proposed as a LTP due to its superficial similarity to the previously mentioned class of LTPs (Paul et al., 2022), and the repeating β groove (RBG) proteins mitochondrial distribution and morphology protein-31 (Mdm31) and AsmA (Neuman et al., 2022). Using the AF structure as a starting point, our protocol was indeed able to propose a lipid binding pose for these proteins (Fig. 4, A–E).
The identified lipid-binding pocket of Svf1 is located between the two lipocalin domains (blue and violet in Fig. 4 A), which is in good agreement with what has been recently suggested using blind docking (Limar et al., 2023). For SNX25, our protocol suggests that lipid binding, which occurs in four out of five replicas (see Data S1 E), takes place in a long and conserved hydrophobic pocket formed by the PXA and PXC domains (red and orange in Fig. 4 B, respectively), in agreement with what was recently proposed based on structural considerations (Paul et al., 2022). The same results were obtained for the related Nvj3 protein, with lipid binding in all replicas along the hydrophobic channel between the predicted PXA-PXC domains (red and orange in Fig. 4 C, respectively). AsmA and Mdm31 are two prokaryotic RBG proteins distantly related to the eukaryotic BLTP superfamily, which includes well-studied LTPs such as vacuolar protein sorting 13 (Vps13) and autophagy-related protein 2 (Atg2). Our protocol displays lipid binding in all replicas for AsmA and in three out of five replicas for Mdm31 (Data S1 E) within the hydrophobic cavity formed by the RBG domains in a similar manner as their eukaryotic counterparts (Fig. 4, D and E). On the other hand, Lec1 displays lipid binding in only two out of five replicas (Data S1 E), with one more replica binding to the external surface of the protein. However, it is noteworthy that the two binding events occur within the highly conserved hydrophobic cavity formed by the PXYn (N-terminal phox homology-associated domain from yeast) and PXYc (C-terminal phox homology-associated domain from yeast) domains (green and orange in Fig. 4 F, respectively) (Paul et al., 2022).
In summary, our protocol can identify potential hydrophobic pockets of LTPs that are poorly characterized at the structural level, and our results are in good agreement with alternative methods such as docking or structural analysis.
LTPs that can bind multiple lipids in their cavity possess several continuous lipid interacting regions
Next, we applied our protocol to BLTPs (Braschi et al., 2022; Neuman et al., 2022). Recently, many LTPs have been proposed to transport lipids via a bridge-like mechanism by establishing a continuous hydrophobic conduit between membrane organelles (Neuman et al., 2022; Wong et al., 2019). Within this model, LTPs could bind many lipids concomitantly, and they could contribute to bulk lipid transport between organelles, such as in autophagosome formation (Osawa et al., 2019; Ghanbarpour et al., 2021) or lysosomal repair (Radulovic et al., 2022, Preprint; Tan and Finkel, 2022). Yet, available high- or medium-resolution structures of BLTPs are scarce and entirely devoid of lipids, with the sole exception of one single phosphatidylethanolamine molecule in the structure of a small region of Atg2 (Osawa et al., 2019).
To investigate lipid occupancy in BLTPs, we adapted our protocol to work in the presence of multiple lipids. To do so, we performed lipid-addition simulations iteratively to avoid lipid–lipid interactions (leading to micelle formation) in the solvent before binding to the protein. As a first test to validate our protocol, we investigated the SMP domain of extended synaptotagmin-2 (E-Syt2) dimer (Schauder et al., 2014; Saheki and De Camilli, 2017; Fernández-Busnadiego et al., 2015). While this protein is proposed to work via a shuttle-like mechanism (Schauder et al., 2014; Zhang et al., 2022), it has been co-crystallized with multiple lipids in its cavity (two diacylglycerols and two detergents), providing a natural positive control for our protocol.
Indeed, we observed that with the iterative lipid addition process, at least five lipids can be accommodated in the cavity of the SMP domain as indicated by the low solvation of the lipid tails (Fig. 5 D). Intriguingly, the average spatial density of lipids in E-Syt2 spans the entire hydrophobic conduit rather than a specific lipid binding pocket (Fig. 5 A).
We next applied our protocol to Atg2 and Vps13 as these proteins have been proposed to bind multiple lipids at once. We initially restricted our analysis to the respective chorein domains of these proteins, as these regions have been solved at high resolution (2.7 and 3.0 Å, respectively [Osawa et al., 2019; Kumar et al., 2018]). Again, our protocol shows that the proteins are able to bind multiple lipids and form a continuous lipid-filled tunnel spanning the entirety of the hydrophobic cavity formed by the chorein domain (Fig. 5, B and C, and individual time traces for Atg2 in Fig. S5).
Taken together, these data indicate that our protocol is able to predict the simultaneous binding of multiple lipids to BLTPs, and it confirms that the lipid binding mode for these proteins is distinct from those proposed to work in a shuttle-like fashion. Of note, in all cases, we observed that the presence of multiple lipids further decreases their solvation (Fig. 5, D–F). This suggests that the presence of lipids inside the binding cavity could promote sequestration of lipids from the lipid bilayer and, hence, lipid transport.
Next, we investigated multiple proteins that belong to the BLTP superfamily, cold sensitive for fermentation protein 1 (Csf1, also known as BLTP1 or Tweek), Vps13, and Atg2 in their entirety (Fig. 6, A–C). For Atg2, we used the AF structure (identifier AF-P53855-F1) and residues 1–1,344 as the C-terminal region consists of several helices with low predicted local distance difference test scores that are not part of the hydrophobic cavity. In the case of Csf1, the structure of three different fragments was predicted using AF and then aligned to get the whole protein structure (2,958 residues), which was considered entirely. Finally, for Vps13, a similar protocol was employed, but residues 1,860–3,144 were discarded as they do not belong to the extended chorein domain that forms the hydrophobic cavity.
Similar to the results described above, using an adapted iterative lipid addition process (see Materials and methods and Video 2), we observed that the proteins can accommodate multiple lipids at the same time. Precisely, we observed the binding of 15 lipids in Atg2, 53 in Csf1, and 49 in Vps13 with low tail solvation. Despite the large number of lipids in the cavity, the use of an elastic network to restrain the secondary structure prevents large protein conformational changes, thus indicating that the presence of multiple lipids is compatible with the initial AF models. A similar number of lipids (15) has been proposed for ATG2A, a human ortholog of Atg2 with the same length, using structural analysis (Wang et al., 2023, Preprint). The average solvation data of the lipid tails (Fig. 6 F) are consistent with our previous analysis on shuttle LTPs (average <1). Interestingly, even though the cavities are almost completely filled with lipids, the headgroups are arranged in such a way that they face the solvent (Fig. S6 and Fig. S7). Further, the spatial density maps indicate a long hydrophobic conduit, but they also suggest the possible presence of bottlenecks in Atg2 and Csf1, which could be related to regulatory mechanisms.
Finally, to highlight the practical applications of our protocol, we designed mutations that block the hydrophobic cavity of Atg2, similar to those tested in ATG2A experiments (Valverde et al., 2019; Tan and Finkel, 2022). In these experiments, a small ring of hydrophobic residues was mutated to charged ones, resulting in impaired lipid transport in vitro. To mimic this approach, we selected four hydrophobic residues (F88, L180, L208, L311) close to the chorein motif of Atg2, where the lipid spatial density is large and uniform (Fig. 6 D), and we mutated them to charged residues (arginine and glutamate). We next performed a 500-ns-long unbiased CG-MD simulation with the 15 lipids already inside the Atg2 mutant and computed their spatial density map at the end of the simulation. The resulting map (Fig. 6 D) demonstrates that the new ring of charged residues generates a bottleneck where the lipids can no longer reside. A similar strategy has also been experimentally tested on Vps13 (Li et al., 2020), showing that mutations in the middle of the hydrophobic cavity (V690D/L692R/L694E/I715K/A717D/M720K/I722D/I761R/I768E/F790D/M796D/L798R/V802E/I816R/G820D/L827E) do not impair the lipid-binding ability of Vps13 but do abrogate the Vps13 function in sporulation. As for the previous case, our protocol clearly indicates the formation of a significant bottleneck in the Vps13 mutant that could be responsible for the observed loss-of-function (Fig. 6 E).
Discussion
In the last few years, lipid transport by proteins has emerged as a central process in membrane and organelle homeostasis, but its molecular mechanisms remain largely unclear. To bridge this gap, we present here a computational protocol based on brute-force unbiased CG-MD simulations, adapted from a recently described strategy to determine protein–ligand interactions for hydrophilic small drug-like molecules (Souza et al., 2020), to propose a structural model for the binding pose of lipids inside LTPs. Our approach provides a physics-based hypothesis on the structure of the lipid–LTP complex that could contribute toward a mechanistic interpretation of in vitro and cellular experiments.
Our protocol is easy to reproduce (requiring only publicly available open-source software) and computationally cheap, as most simulations take only a few hours to perform in a standalone graphics processing unit–accelerated workstation. In addition, it does not require previous knowledge of the binding pocket, and, by taking advantage of physicochemical properties of both lipids and proteins, can easily distinguish between polar and hydrophobic interactions, thus fully considering the amphipathic nature of most lipids. In addition, our iterative approach allows to propose a binding mode for LTPs binding multiple lipids in a dynamic way, with each lipid able to rearrange its localization inside the binding pocket upon the binding of subsequent lipids. This provides a clear advantage over docking methods, where the binding pose of each individual lipid is mostly static and can’t be easily modified upon the docking of subsequent molecules.
On the other hand, we acknowledge that our protocol has three main limitations. First, it is mostly unable to discriminate between different lipids. As such, it can’t be used to investigate lipid specificity of LTPs. We foresee that future improvements in CG force fields or the use of multiscale strategies (e.g., by backmapping to all-atom simulations), possibly coupled with free-energy calculations, could help in this direction.
Second, our protocol can generate false negative results, as we could not straightforwardly identify any lipid-binding pose for a few well-characterized LTPs, rather requiring a posteriori refinements in the simulation setup. This is very likely a consequence of the conformational plasticity of LTPs, which can adopt different conformations in their apo and holo states (Srinivasan et al., 2023, Preprint). Hence, if the initial protein structure is in a “closed” conformation (for example, in the presence of a lid that precludes lipid entry), our protocol is unable to identify the correct binding pathway and pose since our CG approach can’t reproduce protein conformational flexibility as the protein structure is restricted by an elastic network. We expect that enhancing conformational sampling to generate a diverse set of starting protein structures, e.g., by all-atom MD simulations (Srinivasan et al., 2023, Preprint) or by machine-learning-based approach (Janson et al., 2023; Degiacomi, 2019) could help mitigate this issue in the future.
Third, since in our simulations the lipid is initially placed in the surrounding aqueous solution, our protocol does not reproduce the physiological process of lipid entry, in which the lipid enters the LTP’s cavity directly from a donor membrane, nor that of lipid desorption. However, the good agreement between the observed lipid entry pathway and that proposed for multiple LTPs (Osh4, Osh6, ApoM, PITP…) indicates that the pathway proposed by our protocol warrants further experimental investigation, possibly by site-directed mutagenesis studies.
In contrast, a strong advantage of our method is its robustness towards false positives. This suggests that whenever a desolvated lipid-binding pose is found, there is a high likelihood that this interaction has meaningful consequences for protein function and that this warrants further experimental investigations. Specifically, we expect that our method will be extremely useful in at least three main areas. First, it will allow for generation of mechanistic hypotheses for subsequent experimental validation for what concerns the molecular details of lipid transport, such as the identification of regulatory mechanisms, including posttranslational modifications or potential bottlenecks along the transport pathway. Second, it will provide lipid-bound structures to potentially investigate the mechanism of lipid release by LTPs into lipid bilayers in silico, thus providing new avenues toward the characterization of the lipid transport mechanism in more realistic conditions. Third, we expect it will become a powerful tool for the direct identification of novel LTPs in silico, with the quality of the AI-based structural predictions as the main limiting factor.
Materials and methods
Software details
MD simulations of all systems were performed with the GROMACS (v 2021.x) (Van Der Spoel et al., 2005) package using the MARTINI 3 force field (Marrink et al., 2007). Molecular images were rendered using Visual Molecular Dynamics (VMD) (Humphrey et al., 1996).
Protein selection
LTPs were selected to demonstrate the protocol’s flexibility across different families, folds/structures, and bound lipids. LTPs that function via both—shuttle and tunnel mechanisms—were chosen for the study. The non-LTPs with RCSB PDB IDs 1PO0, 2IKC, 2INP, 1JGJ, and 1ZCD tested in the study were chosen to represent proteins that have large cavities and can geometrically accommodate one or more molecules that resemble lipids in size. These controls were chosen from Table 1 of Chwastyk et al. (2020), which contains 50 structures with the largest hydrophilic and hydrophobic cavities. T4 Lysozyme was included in the negative controls to test a protein with a neutral cavity while CDK2 was included to test an enzyme. The other three non-LTPs were chosen to represent proteins that bind diverse molecules such as sugars (hence concanavalin A [PDB ID 3ENR]—a carbohydrate-binding protein), DNA (hence transcription factor T [PDB ID 1XBR]), and hormones (hence thyroid hormone receptor [PDB ID 1NAX]).
System setup
The atomistic structures of the protein were obtained from the RCSB PDB (Berman et al., 2000) and were converted to CG models using the martinize (de Jong et al., 2013) script. An additional elastic network with a force constant of 1,000 kJ mol−1 nm−2 was used to restrain the secondary structure of the protein, with a 0 nm elastic bond lower cutoff and 0.8 nm upper cutoff. A force constant of 300 kJ mol−1 nm−2 was also tested for StARD11 and FABP1. Side chain corrections were applied to all proteins. The CG proteins were then placed in the center of a cubic box in which the distance between the protein and the edge of the box was at least 2.0 nm, and one lipid molecule was randomly placed in the bulk solvent in each of the replicas without any distance restrictions. The system was then solvated, considering a Van der Waals (VdW) distance of 0.21 nm, and ionized with 0.12 M NaCl. For BLTPs, the structure was predicted using AF version 2.0, and, in the case of Vps13 and Csf1, different fragments were predicted and then aligned using common residues to get the structure of the entire protein. For LTPs that could bind multiple lipids in their cavity, the lipid addition procedure was repeated iteratively, such that the structure of the protein with n lipids bound in it served as the starting structure for the addition of (n+1)th lipid, solvating and ionizing again every time a new lipid was added. This protocol was concluded once no more lipid entry into the hydrophobic cavity was observed. Rather, excess lipids would bind at the extremities of the cavity and remain solvated. In the case of Vps13 and Csf1, due to their large size, the protocol was applied separately to different fragments (two and three fragments, respectively) with similar size (see Table 1). Once filled, they were aligned to the complete structure and simulated.
Simulation details for protein in water/LTP entry setup
Five independent replicas of 1 μs each were simulated for each protein–lipid system, with the exception of FABP1, FABP1 ∆6, Miga2, StARD11, Osh6, Osh6-∆69, Osh4, and Osh4-∆29, for which five independent replicas were simulated for 3 μs each. Initial equilibration was carried out by performing energy minimization using the steepest descent algorithm, followed by a short MD run of 125 ps. Production runs were performed at 310 K using a velocity-rescale thermostat (Bussi et al., 2007), with separate temperature coupling for protein and non-protein particles. The md integrator was used for the production runs, with a time step of 25 fs. The Parrinello-Rahman barostat (Parrinello and Rahman, 1981) was used to maintain the pressure at 1 bar, along with an isotropic pressure coupling scheme and a nstpcouple parameter set to 10. The Coulombic terms were calculated using reaction field, with an epsilon (dielectric constant) of 15 and a cut-off distance of 1.1 nm. A cut-off scheme was used for the VdW terms, with a cut-off distance of 1.1 nm and the Verlet cut-off scheme for the potential shift (de Jong et al., 2016). The non-bonded interactions were calculated by generating a pair-list using the Verlet scheme with a buffer tolerance of 0.005. The system setup and simulation parameters are in line with the recently proposed protocol for studying protein–ligand binding with the MARTINI force field (Souza et al., 2020).
The Osh6-Δ69 protein was modeled by removing residues 36–69 that correspond to the lid of the ORP domain. Similarly, Osh4-∆29 was modeled by removing residues 2–29, and FABP1-∆6 was modeled by removing residues 1–6. Disulfide bridges in mutants NF1 1603C-W1641C and PITP V73C-210C were added using the CHARMM-GUI PDB Reader and Manipulator (Jo et al., 2008) web server before coarse-graining the structure using the martinize script. The BLTP mutants were modeled with Pymol v2.3.0 (Schrodinger, 2015) and aligned to the wild-type BLTPs filled with lipids after coarse-graining with the martinize script.
Analysis
The minimum distance between the protein and the lipid was computed using the gmx mindist tool. An in-house tcl script was used to calculate the solvation number of the lipid, counting the number of water molecules within 5.0 Å of the lipid tail beads (headgroup and backbone were excluded) for each frame of the trajectory, for each replica. Only the last 100 ns of the trajectories were used for the boxplot analysis. In the case of cholesterol, all beads but the “ROH” one, which corresponds to the polar hydroxyl group, were considered for the analysis.
Spatial density maps for lipids were computed with the Volmap plugin of VMD (https://www.ks.uiuc.edu/Research/vmd/plugins/volmapgui/) with a resolution of 0.5 Å, a uniform weight of one (the number density), and the default atom size by averaging over a concatenated trajectory containing the last 100 ns of each replica.
To compute the distance to the lipid tails of the crystallographic ligands, the backbone beads of the CG-MD trajectories were aligned to the N atoms of the protein crystal structure using the fit option of gmx trjconv. Then, a tcl script was used to compute the distance between the center of mass of the hydrophobic beads of the lipid and the center of mass of the lipid hydrophobic tails in the crystal structure for each frame of every replica. A similar procedure was used to compute the distance to the headgroups of the ligands.
To determine the lipid entry pathways, the center of mass of the lipid along the aligned trajectories was computed and saved in a separate PDB file using gmx traj tool. Then, the resulting files were visualized using VMD, representing each center of mass as a sphere, with different colors for different replicas, and removing the points that correspond to the lipid in bulk solvent. The percentage of occurrence of each pathway was calculated by dividing the number of replicas in which the lipid entered the LTP via the dominant pathway over the total number of replicas in which the lipid entered the protein cavity.
The analysis of the residues involved in protein–lipid contacts during the last 100 ns of the simulations was performed using gmx select, and a contact was identified if any of the beads of the protein residue was within 5.0 Å of any of the lipid beads in that frame. The time trace of contacts for selected residues was computed using gmx mindist using the same criteria for the contacts.
All scripts for system preparation and analysis are available, with instructions, at https://github.com/danialv4/Unbiased_simulations_characterize_lipid_binding/, along with the input files and gro files for the lipids used in the simulations.
Final frames of all protein–lipid pairs considered in this work are also provided in the same public repository.
Online supplemental material
Data S1 shows all the time traces of lipid tail solvation, minimum distance between protein and lipid, and distance to the x-ray ligand of all the protein–lipid pairs considered in Fig. 1; the time traces of lipid tail solvation and protein–lipid distance of all the control proteins (Fig. 2) with DOPC; an analysis of protein–lipid contacts per residue for several LTPs during the last 100 ns of simulations; the time traces of protein–lipid contacts per residue, upon binding, for few LTPs; and the time traces of lipid tail solvation and protein–lipid minimum distance for the protein–lipid pairs considered in Fig. 4. Fig. S1 shows the lipid tail solvation of the simulations with LTPs and alternative lipids. Fig. S2 illustrates the distance between the headgroups of the simulated lipids and the ones of the x-ray ligands. Fig. S3 contains the time traces of solvation and distance as well as the plots of lipid tail solvation for the simulations with Takeout and JHBP proteins. Fig. S4 illustrates the results of the entry pathway and specific residues present in the hydrophobic cavity of ApoM for comparison with all-atom studies. Fig. S5 collects the time traces of lipid tail solvation for the iterative addition of four lipids to the chorein domain of Atg2. Fig. S6 demonstrates the differences in solvation between the headgroups and lipid tails of BLTPs, and Fig. S7 is an explicit representation of their disposition. Video 1 displays the application of our protocol, as in Fig. 1 A. Video 2 shows the iterative addition of 15 lipids to one of the BLTPs (Atg2).
Data availability
All scripts for system preparation and analysis are available, with instructions, at https://github.com/danialv4/Unbiased_simulations_characterize_lipid_binding/, along with the input files and gro files for the lipids used in the simulations. Final frames of all protein–lipid pairs considered in this work are also provided in the same public repository.
Acknowledgments
This work was supported by the Swiss National Science Foundation (PP00P3_194807 and PP00P3_163966). This work was also supported by grants from the Swiss National Supercomputing Centre under project ID s1030, s1132, and s1221 and has also received funding from the European Research Council under the European Union’s Horizon 2020 research and innovation program (grant agreement no. 803952). D. Alvarez acknowledges support from the Margarita Salas program 2021–2023 funded by Ministerio de Universidades (MU-21-UP2021-030-53773022). S. Vanni and A.T. John Peter acknowledge support from the Novartis Forschungsstiftung via a FreeNovation grant.
Author contributions: S. Srinivasan: Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Visualization, Writing—original draft, Writing—review & editing, D. Alvarez: Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Software, Validation, Visualization, Writing—original draft, Writing—review & editing, A.T. John Peter: Methodology, S. Vanni: Conceptualization, Funding acquisition, Methodology, Project administration, Resources, Supervision, Writing—original draft, Writing—review & editing.
References
Author notes
S. Srinivasan and D. Álvarez contributed equally to this paper.
Disclosures: The authors declare no competing interests exist.