The timing and magnitude of force generation by a muscle depend on complex interactions in a compliant, contractile filament lattice. Perturbations in these interactions can result in cardiac muscle diseases. In this study, we address the fundamental challenge of connecting the temporal features of cardiac twitches to underlying rate constants and their perturbations associated with genetic cardiomyopathies. Current state-of-the-art metrics for characterizing the mechanical consequence of cardiac muscle disease do not utilize information embedded in the complete time course of twitch force. We pair dimension reduction techniques and machine learning methods to classify underlying perturbations that shape the timing of twitch force. To do this, we created a large twitch dataset using a spatially explicit Monte Carlo model of muscle contraction. Uniquely, we modified the rate constants of this model in line with mouse models of cardiac muscle disease and varied mutation penetrance. Ultimately, the results of this study show that machine learning models combined with biologically informed dimension reduction techniques can yield excellent classification accuracy of underlying muscle perturbations.
Introduction
Vertebrate hearts are comprised of highly specialized cells that cyclically contract throughout an organism’s lifespan. Each cell is composed of axially connected contractile units called sarcomeres (Powers et al., 2021; Willingham et al., 2020). Sarcomeres, in turn, contain axially oriented, compliant protein filaments—the interdigitating thick and thin filaments. Thin filaments extend from Z-disks (which bracket the ends of the sarcomere) and contain a double-helical strand of actin monomers to which myosin motors can bind. Along the thin filaments are troponin–tropomyosin protein complexes, which are responsible for calcium-dependent regulation of contraction. The myosin molecular motors that ultimately power contraction constitute the bulk of the thick filament; they extend radially from the thick filament backbone in the form of a three-start helix (Fig. 1). For a review of the structure and function of the sarcomere, see Powers et al. (2021).
Sarcomere contraction follows the electrical activation of the cell. Electrical activation triggers a cascade of biochemical and biophysical events, beginning with an influx of calcium into the sarcomere and the calcium-dependent activation of the thin filaments. The activation of the thin filaments results in the troponin–tropomyosin complex exposing actin-binding sites to myosin motors, which probabilistically bind and form force-generating cross-bridges between the filaments. During activation and force generation, the thin filament regulatory protein complexes and myosin motors undergo a series of state transitions determined by a host of rate constants (Fig. 1 D). Perturbations to any of these complex inter- and intrafilament interactions can lead to abnormal contractility and, ultimately, to pathological conditions (Davis et al., 2016; Feest et al., 2014; McKenna et al., 2017; Powers et al., 2020; van der Velden and Stienen, 2019).
Several metrics have been developed to characterize the mechanical consequences of cardiac muscle diseases caused by underlying dysfunction of sarcomeric proteins. For unloaded cells, measures such as the magnitude of cell/sarcomere shortening or the rate of shortening and relaxation are commonly reported (Davis et al., 2016; Feest et al., 2014; Toepfer et al., 2020). Additionally, under isometric conditions, peak twitch forces and the time constants of their activation and relaxation can be measured from single-cell or multicellular preparations (Powers et al., 2020; Zaunbrecher et al., 2019). A recently developed and successful metric for classifying cardiomyopathy phenotype is the tension–time index (TTI) associated with twitch forces. This index is computed from the integral of the tension–time relationship of a twitch (the impulse of a twitch): the area of pathological twitch is subtracted from that of a control twitch, both normalized to the peak force of the control twitch. The TTI has been shown by several groups to be predictive of cardiomyopathy phenotype (hypertrophic or dilated) in murine- or human-induced pluripotent stem-cell-derived (hiPSC-CM) cardiomyocytes (Davis et al., 2016; Hinson et al., 2015; Powers et al., 2020; Sewanan et al., 2019). TTI is likely predictive because it is a summative measurement that involves both activation and relaxation kinetics. It is also an integration function and provides a measure of noise reduction. Myocytes sense the abnormal tension and remodel with the activation of pathways such as MEK–ERK1 and calcineurin.
While the TTI usefully compresses time course data into a single number, integration necessarily loses the temporal features associated with a twitch. Currently, it is unknown if the time course of twitch force embeds information about the underlying sarcomeric function. We suggest that there is meaningful information about the underlying complex interactions within the sarcomere reflected in the shape of a cardiac twitch. This information could be encoded in the subtle temporal features of activation and relaxation that alter the overall shape of the tension–time trace.
Connecting temporal features of cardiac twitches to underlying rate constants and their perturbations associated with genetic cardiomyopathies remains a fundamental challenge. Addressing this challenge requires (1) large data sets of twitches and (2) the development and validation of data analytic methods. Therefore, we used biophysical simulations to produce large data sets for feature extraction and building classifiers. Here, we combine computational simulations of cardiac twitches with machine learning (ML) methods to address two key questions: (1) what features of cardiac twitches best reflect perturbations in the biochemical and biophysical determinants of force generation? and (2) can subtle differences in the features be used to classify disease states given variation and stochasticity in data? For the simulations we use here, we modify our previously published spatially explicit model of muscle force generation to account for calcium activation of the thin filament (Powers et al., 2018; Tanner et al., 2012; Williams et al., 2013). We use calcium transients derived from previously measured experimental data (Sparrow et al., 2019) to drive activation in our simulations, and compute the temporal pattern of force during a twitch. We then introduce perturbations to specific rate constants that mimic alterations in thin and thick filament activation associated with genetic cardiomyopathies. The twitch forces from these simulations mirror changes observed experimentally. Harnessing this large, labeled data set, we used ML methods to leverage temporal features embedded in the twitches for classifying disease states. This study reveals that dynamic features of twitches can delineate normal and pathological conditions; additionally, it lays groundwork for linking observed muscle function to underlying molecular mechanisms.
Materials and methods
Spatially explicit half-sarcomere model
We build upon previous spatially explicit models of muscle contraction (Daniel et al., 1998; Powers et al., 2018; Tanner et al., 2012; Williams et al., 2013). In this current study, we have incorporated both thin filament regulation and titin mechanics into our previous models and have added a data-driven calcium transient in order to simulate a twitch (Powers et al., 2018; Sparrow et al., 2019; Tanner et al., 2012).
As many aspects of this model have been described previously (Daniel et al., 1998; Powers et al., 2018; Tanner et al., 2012; Williams et al., 2013), we only briefly summarize it here. Our spatially explicit multifilament model of the half-sarcomere is a stochastic, Monte Carlo–based, kinetic model consisting of eight actin thin filaments and four myosin thick filaments arranged in a double-hexagonal lattice (a 2:1 thin to thick filament ratio, Fig. 1 A). Since we model known protein mutations with unclear twitch dynamics, it is particularly important to simulate relevant interfilament interactions explicitly (such as troponin activity), instead of modeling overall force output from the muscle.
Crossbridge state transitions are based on a three-state crossbridge model for attachment, force generation, and detachment. At each time-step, we cycle through the myosin motors, allowing each a series of random number generator checks to determine crossbridge state updates, transitioning through three states, XB1–XB3, as shown in Fig. 1 D. Thin-filament activation is modeled with the following states: (1) calcium binding to cTnC, (2) change in the cTnC–cTnI interaction (newly added to the model), and (3) thin filament activation via movement of tropomyosin that allows myosin binding (Fig. 1 D). Our starting point (base model) for all transition rates and spring constants is based on recent analyses using this model from (Powers et al, 2018) and is summarized in Fig. 1 D and Table 1.
Simulations parameters
The simulations were implemented in Python, and the code is available in our Github repository (Asencio et al, 2019). The spatially explicit model is prescribed by dozens of parameters that may be organized into two categories: biochemical parameters and experimental parameters. The four experimental parameters are sarcomere length (1,150 nm), calcium concentration (pCa 4.0), lattice spacing (12.85 nm), and Poisson ratio (constant spacing, 0.0). By default, these parameters remain constant throughout the course of a simulation. The simulation framework also allows sarcomere length, calcium concentration, and lattice spacing to be defined by time series, allowing for temporal control and specification of experimental conditions.
Following a set of simulations using the base parameters that represent control twitches (see Table 1), we undertook a set of parameter modifications that reflect various pathological conditions (sarcomeric mutations). In addition to adjusting the magnitude of parameter values, the spatially explicit nature of our model allows us to vary the penetrance (or mutant protein incorporation) of any specific mutation by randomly assigning parameter changes to specific subunits within the sarcomere. For example, an increase in the binding of calcium to troponin that is associated with the L48Q mutation of cardiac troponin C (cTnC) can be applied to a random subset of thin filament subunits. The extent of mutation (change in binding) and the fraction of sites (penetrance) can be adjusted accordingly.
While there are myriad parameter variations one could explore, we focus here on two thin filament mutations and one thick filament mutation. For the thin filament, we modeled the effects of mutations associated with the mutations L48Q cTnC and D65A cTnC that have been previously characterized (Gillis et al., 2007; Wang et al., 2012). The L48Q cTnC mutation increases Ca2+ binding to cTnC (Kreutziger et al., 2011; Wang et al., 2012), while the D65A cTnC mutation does not activate the thin filament due to the inability to bind Ca2+ (Gillis et al., 2007). We adjusted the Ca2+ to cTnC binding rate (rt;12) based on experimental recombinant L48Q cTnC measurements (Tikunova and Davis, 2004; the default rate of 1.7 × 108 s−1 mol Ca−1 was changed to 7.4 × 108 s−1 mol Ca−1). Additionally, we changed the equilibrium constant K1 from the default value of 1.47 to 18.0 × 105 mol Ca−1. For D65A cTnC, we prevent TnC–TnI interaction caused by calcium binding by setting the k23 forward rate to 0. We also simulated thick filament gain of function by decreasing the myosin detachment rate (rx,31; Fig. 1D), which depends on the forces acting on the myosin head, by a factor of 0.5. That gain of function is similar to small molecules (Lehman et al., 2022; Powers et al., 2019) or myosin mutations (Anderson et al., 2018) that increase the number of myosin motors interacting with the thin filament.
Isometric twitch simulations
Isometric twitch analysis
We use several metrics to characterize the dynamics of twitches for simulations of both control (baseline condition) and abnormal sarcomeres. As mentioned above, the twitch tension index (TTI) is computed from the difference in the force impulse (integral of the tension time relationship) of a wild type (control) twitch and that of one associated with some pathological condition, both normalized to the peak force of the control twitch (Davis et al., 2016). That difference, in turn, is then represented as a fraction of the control impulse. We also calculated more traditional metrics of twitch dynamics, including the peak force (maximum tension); the activation time, as measured by the time to half-peak force; and the relaxation time, as measured by the time between peak force and that of half-peak force during relaxation.
Singular value decomposition feature analysis
Metrics such as those above represent considerable dimension reduction relative to the detailed time history of the peak force. Singular value decomposition (SVD) can reveal additional temporal features that may vary with changes in parameters associated with twitch forces. Indeed, SVD is a robust, reliable, and efficient method for feature extraction and dimension reduction. In prior studies, we have used such approaches to reveal temporal features of motor control and force generation or temporal features underlying sensory encoding (Sponberg et al., 2015). Further, SVD lies at the heart of principal component analysis (PCA), which recasts the data into a space where twitches with different shapes can be differentiated.
We used the NumPy package in Python for SVD and PCA. We then determined the variation in the total dataset represented by a single component by normalizing to the most prominent component (Fig. 3 B). Finally, we projected the Frobenius norm of each twitch onto the two most prominent eigenvectors to produce a PCA scoring in each principal component.
ML algorithm
We used a gradient-boosting decision tree algorithm housed in the xGBoost library to train a mechanism-free model to classify the perturbation type based on a set of given features about the twitch. This random forest (RF) method has been widely used in fields such as bioinformatics and medical imaging (Borstelmann, 2020; Retson et al., 2019). We have used it previously to probe complex data relationships between the temporal dynamics of myofilament lattice spacing and the electrical activation of skeletal muscle in a model organism (Malingen et al., 2020). RF is suitable for handling large data sets with high dimensionality (like twitches). The data were partitioned into training (75%) and testing (25%) sets. We used a variety of inputs for the RF including full twitch, TTI, the first two principal components (PC1/2), and traditional activation/relaxation parameters (Fig. 5).
Classifier optimization
To increase the efficiency of the RF, we implemented a genetic algorithm to tune the parameters of the RF. The RF hyperparameters that are tuned include the number of trees, learning rates, and maximum size of decision trees. These settings are passed down in a way similar to genetic inheritance—two trees that did well combine their settings—and there is also a chance for mutation in the values. Each round of this optimizes the objective (overall scoring percent), solely by changing the general settings of the forest. We ran the genetic algorithm for seven generations with a population of 10 trees, recombined to optimize hyperparameters. The algorithm was set to maximize classification accuracy. Then we average the data and calculate 95% confidence intervals. Finally, we used one-way analysis of variance (ANOVA) with Dunnett’s multiple comparisons test to compare accuracy between the different conditions.
Results
Isometric twitches
Simulations of a constant length sarcomere with our prescribed calcium transient produce twitch forces which have temporal behaviors consistent with those measured experimentally (Fig. 2; Davis et al., 2016; Powers et al., 2020; Prodanovic et al., 2022). Moreover, a peak force of ∼100 pN is generated by our model myofilament lattice which has a cross-sectional area of 6,500 nm2 corresponding to a stress (force/area) of 1.5 × 104 Nm−2 (which is equal to 15 mN mm−2), a value consistent with experimentally measured twitch stresses in cardiac muscle (Powers et al., 2020). Our 50%D65A cTnC simulations show the expected decrease in maximal twitch force and negative TTI without affecting activation or relaxation kinetics (Table 2). Conversely, replacing as little as 25% of the control cTnC with L48Q cTnC resulted in significantly higher maximal force (fmax) and faster activation (decreased t50%). In this case, relaxation is also slowed, as measured by the time to 50% relaxation, with an associated increase in TTI. When the number of thin filaments with L48Q cTnC is increased, TTI and fmax are increased, while t50%max is decreased. In our myosin simulation condition, we observe increased fmax and TTI without changing activation kinetics.
SVD and PCA clustering
Our SVD analysis reveals that the first two components describe the majority of the variation in the data (Fig. 3), as reflected by the magnitude of their eigenvalues. As expected, the first component appears visually similar to an average twitch, while the second component reflects changes in the timing of both activation and relaxation that are characteristic of L48Q cTnC twitches. These changes reflect those seen in Fig. 2, where an average control twitch was subtracted from an averaged trace for each of our conditions. The principal components provide a low-dimensional representation of the twitch time-course that is not fully conveyed by other metrics. Although PCA is agnostic about underlying biophysical mechanisms, changes in kinetics result in changes to twitch timing, represented by clustering in the PCA.
Fig. 4 B shows the result of projecting our twitch data onto the two most prominent eigenvectors (PC1 and PC2). This projection shows clusters of twitches sharing similar time course behavior. Our D65A treatment, in particular, shows the greatest difference from the other twitches along the PC1 dimension. Further, we are able to separate the 75% L48Q and myosin modification twitches, which have similar TTI as seen in Table 1.
Classifier accuracy
The accuracy with which the random forest classifies perturbation type depends on how the input features are engineered. We considered a total of eight scenarios, providing the model with different combinations of input features: (1) metrics involving relaxation, including force at 50% relaxation, time to 50% relaxation, and time to 90% relaxation, (2) metrics involving activation, which include peak force, time to peak force, and time to 50% peak force, (3) common twitch summary metrics, which include peak tension, time to peak tension, and half-max width of the twitch, (4) the TTI, (5) the projection onto the two dominant principal components, (6) the full-time history of the twitches (all the data), (7) the time history of the twitch along with the TTI, and (8) the projection onto the two dominant principal components combined with the TTI.
All of the options provide a level of classification accuracy well above chance (14.3%), even the relaxation time alone. Our results show that the combination of PC1, PC2, and TTI as input features for the RF provides the highest classification accuracy (78.5 ± 0.1%), while the full twitch time histories combined with TTI are a close second. Interestingly, by identifying relevant information for phenotype classification using three dimension-reduction techniques, we achieved better classifier accuracy than a model using the full twitch time series alone.
Our data also show that the addition of the TTI to either PC1 and PC2 or the full twitch significantly improves classifier accuracy. The traditional activation and relaxation parameters do not perform as well as the full twitch, PC1 and PC2, TTI, or their combinations. In Fig. 5, we show the accuracy for each of the input features compared to each other and relative to classification by chance (dashed line in Fig. 5).
The overall accuracy of the classifier shown in Fig. 5 A does not entirely reflect our ability to classify a specific category. A confusion matrix (Fig. 6) can reveal how each training set performs for each particular variant. For instance, the confusion matrix shows that the combination of PC1, PC2, and TTI performs better than the full twitch and TTI at classifying 100 and 75% L48Q and control, respectively. For decreased Rx;31 and 50%D65A, the two training sets yield similar levels of classification. However, the combination of the projection onto the two dominant principal components combined with the TTI performs slightly better than the combination of full twitch and TTI. Not surprisingly, the more subtle effects corresponding to low penetrance of L48Q are more challenging for all of the classifiers to identify, although the confusion matrices show that this is not uniformly the case. It is notable that misclassifications of 25% L48Q, in particular, are usually one-off diagonal in the confusion matrices for the best-performing classifiers, indicating they are often mistaken for either a higher penetrance (50% L48Q) or control.
Discussion
In this study, we combined computational simulations of cardiac twitches with ML methods to address two key questions: (1) what features of cardiac twitches best indicate perturbations in the biochemical and biophysical determinants of force generation? and (2) can subtle differences in the features be used to classify disease states given variation and stochasticity in data? Indeed, it is an open question whether the full time-course of twitch force embeds information about the underlying sarcomeric properties and function. Thus, elucidating the connection between temporal features of cardiac twitches and underlying perturbations to rate constants associated with genetic cardiomyopathies remains a fundamental challenge. To address this challenge, we used an extensive data set of simulated twitches (along with their associated myofilament perturbations) and developed a series of ML methods to classify disease types from simulated cardiac twitches.
Four key results emerged from this study. First, we simulated cardiac twitch forces that reflect those in both wild type and disease conditions found in mouse model systems. Importantly, we found these simulations to reasonably reflect both the magnitude (normalized to area) and time course of experimentally measured twitch force. Second, using simple dimension reduction methods (e.g., SVD and PCA), we isolated temporal features of twitch forces that correspond to changes in the kinetics of thin filament regulation. Third, even in the presence of significant stochasticity arising from the Monte Carlo–based simulations, the results of dimension reduction approaches applied to twitch data can be used as features that enhance classifier performance. And fourth, the classifier accuracy depends quite strongly on the features of the twitch that the model has access to, and there is an interesting structure in the classifier discrimination capability for each feature set.
While we conclude that ML methods hold great promise for classifying disease states, even with subtle genetic penetrance, it is important to describe some of the limitations in the present study. First, we used the same intralattice calcium transient for all of our simulations (Sparrow et al., 2019). In reality, there is evidence that altering sarcomere function via mutations or small molecules results in concomitant changes in intracellular calcium levels (Davis et al., 2016; Psaras et al., 2021; Sparrow et al., 2019; Sparrow et al., 2020). Additionally, while our spatially explicit model incorporates thin filament regulatory kinetics, it does not currently incorporate any cooperative thin filament activation mechanisms as had been suggested by Tanner et al. (2012). Further, we used a three-state approximation of the crossbridge cycle in our biophysical simulation although this is a simplification (Powers et al., 2021). The current version of our model does not account for thick filament regulation, which is known to be involved in both the pathogenesis and treatment of cardiomyopathies (Lehman et al., 2022; Nag and Trivedi, 2021). In future studies, we will add a new myosin population as modeled by other groups (Kosta et al., 2022; Mijailovich et al., 2021a). Our initial model parameters were not optimized to any particular species and that may limit the ability of the model to match experimental measurements. Finally, while our ML methods performed well, we have not comprehensively tested different classification algorithms to determine which might have optimal performance.
Physiologically reasonable twitches can be produced from a spatially explicit model
Despite these limitations, we were able to generate physiologically reasonable twitches using our spatially explicit model of the sarcomere. Our original multifilament model of the sarcomere was, to our knowledge, the first spatially explicit model of muscle force generation (Daniel et al., 1998). Prior models of the mechanochemistry of force generation were based on mass action kinetics and could not account for the discrete geometrical relationship between individual myosin heads and the actin-binding sites (Huxley, 1957; Pate and Cooke, 1989). Our model accounts for lattice geometry, and filament compliance and has the ability to independently alter functional properties of individual components in the system. Since its original two-filament formulation, our model has been extended to a 3-D lattice of compliant myosin and actin filaments (Chase et al., 2004) and integrates extensional and torsional mechanics of the crossbridge (Williams et al., 2010) and, most recently, the non-linear dynamics of titin (Powers et al., 2018). This now accounts for both radial and longitudinal forces generated by a twitch that are important for ventricular pressure development. With the advent of advanced computing, using spatially explicit models in muscle biology is more practical and is more widely used. The most recent published version of our model incorporated titin stiffness as a variable but did not account for thin filament regulation, which we had in the previous version (Williams et al., 2013). Titin has a critical role in myocyte elasticity, stretch-dependent activation, and intracellular mechanosensing, and its variants are the leading cause of genetic DCM (Kellermayer et al., 2019).
In this study, we incorporate both thin filament regulation and account for titin stiffness. Thin filament activation is modeled as the following states: (1) Ca2+ binding to cTnC, (2) change in the cTnC–cTnI interaction (newly added), and (3) thin filament activation via movement of tropomyosin that allows myosin binding. Each transition in the thin filament activation pathway contains tunable forward and reverse rate constants. The state transitions follow from prior published results from our group and others (Aboelkassem et al., 2015; Siddiqui et al., 2016; Tanner et al., 2012; Table 1). Now, for the first time, we have adapted the model so that subunits can have different rate properties, making it possible to simulate a lattice where mutated proteins may have varying penetrance. This modification is essential since genetic cardiomyopathies are typically heterozygous, resulting in at most only 50% abnormal protein. Finally, in order to simulate twitches from our sarcomere model, we drive force generation and relaxation via a calcium transient based on published intralattice measurements (Sparrow et al., 2019).
Using our model, we simulate the sarcomeric twitch force resulting from underlying perturbations in binding rates. The binding rates were drawn from in vitro biochemical assays that target the changes observed in models of cardiac muscle disease. For the L48Q cTnC modification, we used experimentally measured KCa (Ca2+ binding affinity to the troponin complex) and koff (Ca2+ dissociation rate) values to adjust the first transition rate (rt,12). For D65A cTnC, we prevent TnC–TnI interaction caused by calcium binding by setting the k23 forward rate to 0. For our thick filament (myosin) modifications, we decrease the myosin detachment rate (rx,31) by half in order to increase the number of attached myosins.
The resulting twitch simulations captured some of the twitch metrics recorded in experiments (Mijailovich et al., 2021b,;Powers et al., 2020). However, our ability to make direct comparisons to experimental data from L48Q cTnC mouse is limited by the fact that we used a published guinea pig calcium transient, which is similar to human and with different kinetics than a mouse calcium transient (Prodanovic et al., 2022). In addition, for simplicity, the current version of our model did not incorporate thin filament cooperativity, which affects activation and relaxation kinetics. As seen in Table 1, increasing cTnC Ca2+ binding by introducing the L48Q cTnC modification increased maximal force and activation rate and prolonged relaxation. While our findings in 25% and 50% cTnC L48Q simulations show a similar trend seen in experimental data from transgenic mice with ∼30% L48Q cTnC incorporation (Powers et al., 2020), the extent of prolonged relaxation is underestimated. This is likely secondary to having a different Ca2+ transient and a lack of thin filament cooperativity. It was reassuring that the TTI in our 25% L48Q cTNC was calculated to be 0.5 × 104, which is similar to the published value of about 0.3 × 104 in experimental data with ∼30% L48Q cTnC (Powers et al., 2020). Increasing amount of L48Q cTnC is linearly correlated with all four parameters reported in Table 1. Similar findings have been reported for the L48Q cTnC simulations using MUSICO spatially explicit model (Mijailovich et al., 2021b). As expected, inactivation of thin filaments with the D65A cTnC decreases maximal force and TTI. There is no change in t50%fmax, while the t50%relax is decreased, suggesting faster relaxation. This behavior has been seen in the I61Q cTnC which has a faster Ca2+ dissociation rate compared with WT cTnC (Davis et al., 2016,;Wang et al., 2013). While there are no experimental data from D65A cTnC measurements, when a 50:50 mixture of WT and D65A cTnC was exchanged into demembranated trabecula, there was an ∼70% decrease in maximal force and reduced Hill coefficient, indicating decreased cooperative activation (Korte et al., 2012). Our myosin modification resulted in increased force and TTI, prolonged relaxation, and no change in t50%fmax. Such hypercontractility and prolonged relaxation are reported in experimental studies of myosin mutations that show similar mechanisms of increased crossbridge binding (Toepfer et al., 2020). Taken together, these results show that the perturbations in model thin filament or thick filament rate constants inspired by diseased states or small molecules result in changes in both the magnitude and the shape of twitches.
Temporal features combined with machine learning methods yield better disease classification than traditional metrics alone
How can the differences between the twitches be systematically distinguished? Traditional twitch metrics such as maximal force or time to 50% activation or relaxation are able to distinguish features in extreme conditions (e.g., control and D65A cTnC). However, such metrics are unable to capture more subtle differences between conditions (e.g., 50 and 75% L48Q cTnC). This is apparent in our twitch data set, where traditional twitch metrics overlap for different perturbation types. To better capture subtle differentiating features in the temporal dynamics of our twitch data set, we employed dimension reduction techniques. Using SVD of the force vs. time matrix, we were able to capture the majority of the variation in the data within the first two eigenmodes (Fig. 3 A). As expected, the first eigenmode resembles the activation and relaxation dynamics of a twitch. The second eigenmode captures the activation and relaxation features seen in our L48Q cTnC twitches. This can be seen by comparing the second eigenmode (Fig. 3 A) to the average control twitch subtracted from the average of each of the L48Q cTnC conditions (Fig. 2 D). Since SVD lies at the heart of PCA, we next projected the data onto the first two components. As seen in Fig. 4, projecting the data onto the two most prominent features (principal components), we were able to identify unique clusters and spread out the twitches. Since the second component mostly describes the L48Q cTnC behavior, the projection onto this component primarily separates out the conditions with different amounts of L48Q cTnC.
Next, we trained eight different machine learning models to classify underlying biophysical perturbations given twitch data. Each model received a different set of features from which to classify underlying perturbation, and we show in Fig. 5 A which feature sets enable the most accurate model predictions for our large, diverse data set. Due to the stochastic process of myosin interacting with actin, the simulated forces fluctuate in time (Fig. 2 A), resulting in a challenging classification problem for a model presented with only twitch time course data. As shown in Fig. 5 A, using the full twitch or the projection onto PC1 and PC2 resulted in about 70% accurate identification of a given twitch. This accuracy is higher than classifiers that used TTI alone as the input. This is not surprising as several of our conditions have a significant amount of overlap in this metric, and indicates that important information is included in the timing of a twitch. However, adding the TTI to either full twitch or projection onto PC1 and PC2 results in the highest classifier accuracy. This demonstrates the utility of integration, which reduces the noise in a full twitch. This is also an example of how applying domain-specific knowledge to better engineer features can improve model performance.
In addition to overall classification accuracy, we use confusion matrices to distinguish which perturbations are correctly classified and, if they are misclassified, into which classes they fall. Interestingly, we observed that some perturbations were classified more or less accurately based on the feature set a model used, while others were classed similarly by all models. For instance, the D65A cTnC twitches are easily distinguished using any of the input features. Remarkably, even the relaxation, activation, or summary features that do not perform well overall provide enough information to classify D65A cTnC with 100% accuracy. However, these input features perform poorly in classifying 50 or 75% L48Q cTnC. Using the TTI, projections onto PC1 and PC2 or the full twitch as input features yield models with higher classification accuracy. However, there are specific challenging classifications that deserve discussion. For example, the model using TTI alone performs poorly in classifying decreased rx,31. This is likely because it has a similar TTI to 100% L48Q. In contrast, the models using either the projections onto PC1 and PC2 or the full twitch perform much better in classifying decreased rx,31, and when TTI is combined with PC1/PC2 or the full twitch, the accuracy is further increased. Similar improvements are noted across all groups as seen in Fig. 6. In addition to these observations, the confusion matrices also show that for the top-performing models many misclassifications are between different penetrance of the mutant L48Q subunits. Since mutation penetrance is a continuous value that we have artificially binned into classes (rather than distinct mutation classes), it can be argued that these miss-classes are still correctly indicating the type of underlying rate change.
ML methods have potential in diagnosis and management of genetic cardiomyopathies
While our current study evaluated a limited number of conditions, after more extensive validation, one can envision the utility of ML and/or simulations in the care of patients with genetic cardiomyopathies. One simple use of this approach would be to model a new mutation in cTnC by providing the computational model with Ca2+ binding or troponin–tropomyosin interaction rate estimates that are associated with experimental recombinant protein measurements. These would provide a direct test of the model. Alternatively, one can use techniques such as adaptive steered molecular dynamics (Hantz and Lindert, 2021) to quantify how mutations alter calcium binding affinity. This approach can replace the need for generating any experimental transition rates for input into our spatially explicit model. Additional and related ML methods can be used to address the inverse problem of (1) identifying the space of rate constants that best explain observed twitch data and (2) predicting changes in underlying kinetics (rates) that can be adjusted to compensate for abnormal transition rates that correlate with patient-specific mutations. Lastly, a similar pipeline could be applied to a large data set of myocardial time strain curves by using speckle-tracking echocardiography. PCA has been applied to such data to distinguish physiologic from non-physiologic strain curves (Yahav et al., 2020).
Conclusion
The time course of muscle contraction echoes the underlying inter- and intrafilament interactions that shaped it. In disease, perturbed filament interactions alter the timing of force generation, but typical metrics capture minimal timing information. Using a spatially explicit model to simulate disease states, we created a large, labeled dataset that connects biophysical perturbations with resulting twitch time courses; this data set was the training ground for novel ML classifiers. We demonstrate that ML methods herald great promise for classifying the underlying mechanistic changes that reshape twitch forces in disease.
Data availability
All the data generated for the manuscript are available from the corresponding author upon reasonable request.
Acknowledgments
Henk L. Granzier served as editor.
We gratefully acknowledge advice and counsel from Dr. Michael Regnier.
This work was supported by the Center for Translational Muscle Research grant NIH P30AR074990, American Heart Association Collaborative Sciences Award (20CSA35310217), Locke Charitable Trust, and NIH grant R01 HL157169 are gratefully acknowledged. A. Asencio is supported by an American Heart Association research supplement to promote diversity in science. J. Davis is supported by National Institutes of Health grants R01 HL141187, HL142624, and HL162229.
Author contributions: F. Moussavi-Harami, T. Daniel, and J. Davis conceived the project and designed experiments. A. Asencio and S. Malingen performed simulation and ML analysis. J.D. Powers and K.B. Kooiker performed and analyzed experimental data. A. Asencio, T. Daniel, and F. Moussavi-Harami wrote the manuscript. A. Asencio, S. Malingen, K.B. Kooiker, J.D. Powers, J. Davis, T. Daniel, and F. Moussavi-Harami edited the manuscript.
References
This work is part of a special issue on Myofilament Function 2022.
Data availability
All the data generated for the manuscript are available from the corresponding author upon reasonable request.
Author notes
Co-first authors
Disclosures: The authors declare no competing interests exist.