The integration of signals originating at different times and/or locations defines the stimulus features extracted and represented by a sensory system. As such, understanding this issue is central to understanding sensory coding. Here, we focus on spatial integration by ganglion cells, the output cells of the retina. Responses of both photoreceptors and ganglion cells to a variety of light stimuli have been thoroughly described, and we have abundant anatomical information about retinal cell types and connectivity. For these reasons, the retina provides an excellent opportunity to study sensory integration from both empirical and mechanistic perspectives. Many of the issues and computational principles that emerge are likely to apply to other sensory systems.
Recent work on retinal processing has seen dramatic progress in two areas: (1) studies of the mechanisms shaping light responses as they traverse the retina; and (2) studies of the empirical properties of coding at the level of the ganglion cell output signals. These different approaches to studying retinal processing provide quite different pictures of how the retina works: mechanistic studies have emphasized nonlinear processing that shapes signals as they traverse the circuit (Singer, 2007), whereas empirical coding studies typically model spatial and temporal integration in the retinal circuitry as a linear process (Field and Chichilnisky, 2007).
This distinction matters. Nonlinearities are at the core of most interesting and/or important computations in the retina and other neural circuits. Indeed, linear integration cannot explain several aspects of ganglion cell responses—for example, the fidelity of ganglion cell responses to sparse input signals. Thus, ganglion cell responses in starlight, when photons arrive rarely at individual rod photoreceptors, rely on a thresholding nonlinearity between rods and rod bipolar cells that selectively retains signals from the few rods absorbing photons while rejecting noise from the other rods (Field et al., 2005). This nonlinearity can improve the signal-to-noise ratio of the retinal output 100-fold. To be effective, it is critical that the nonlinearity occur before, rather than after, integrating rod inputs. Similar considerations apply to many other computations.
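The importance of the order of operations can be illustrated with a toy simulation. The sketch below is not a model of the measured rod pathway: the pool size, noise level, photon amplitude, and threshold are all assumed values chosen only to make the point. It compares pooling noisy rod signals directly with thresholding each rod before pooling.

```python
import random
import statistics

def pooled_response(n_rods, n_photon_rods, threshold, rng,
                    photon_amp=1.0, noise_sd=0.3):
    """Return (linear_sum, thresholded_sum) for one simulated trial.

    Every rod carries Gaussian noise; the first `n_photon_rods` rods also
    carry a single-photon response of size `photon_amp` (assumed values).
    """
    rods = [rng.gauss(0.0, noise_sd) + (photon_amp if i < n_photon_rods else 0.0)
            for i in range(n_rods)]
    linear_sum = sum(rods)                                   # pool without threshold
    thresholded_sum = sum(r for r in rods if r > threshold)  # threshold, then pool
    return linear_sum, thresholded_sum

def snr(flash_trials, dark_trials):
    """Mean flash-minus-dark difference divided by the dark standard deviation."""
    return ((statistics.mean(flash_trials) - statistics.mean(dark_trials))
            / statistics.stdev(dark_trials))

rng = random.Random(0)
n_rods, threshold, n_trials = 1000, 0.9, 400   # illustrative, not fit to data
flash = [pooled_response(n_rods, 2, threshold, rng) for _ in range(n_trials)]
dark = [pooled_response(n_rods, 0, threshold, rng) for _ in range(n_trials)]

snr_linear = snr([f[0] for f in flash], [d[0] for d in dark])
snr_threshold = snr([f[1] for f in flash], [d[1] for d in dark])
print(f"SNR with linear pooling:      {snr_linear:.2f}")
print(f"SNR with threshold then pool: {snr_threshold:.2f}")
```

Applying the same threshold after summation could not recover this advantage, because by then the noise from the many rods that absorbed no photon has already been mixed with the signal.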
Here, we discuss some of the successes and failures of models for how retinal ganglion cells integrate signals over space. We relate these models to mechanistic descriptions of the operation of retinal circuitry and highlight some of the issues required to bring these different approaches together. Bridging this gap will require functional models that are more tightly constrained by the growing knowledge about retinal anatomy and physiology. This will in turn help place signal-processing mechanisms in a functional context. Several past studies have embraced the added complexity of such models and described their functional features (Demb, 2008; Gollisch and Meister, 2010).
Essential features of retinal circuitry
Visual stimuli are encoded at the input to the retina by the responses of the rod and cone photoreceptors. This initial encoding consists of light intensity over space, time, and, in the case of cones, wavelength. The photoreceptor signals provide in many ways a camera-like representation of the world. Encoding in the retinal output is qualitatively different: responses of 15–20 different types of retinal ganglion cells reflect distinct features of the spatial and temporal pattern of photoreceptor activity (Field and Chichilnisky, 2007).
Feature selectivity in ganglion cells relies on both convergence and divergence of signals as they traverse the retina (Masland, 2001). Thus, cone signals diverge to ∼10 anatomically defined types of bipolar cells in mammals (Fig. 1 A). Most cone bipolar cells receive input from 5–10 cones, and bipolar cells of different types exhibit different biophysical properties (DeVries, 2000). The parallel processing initiated in the bipolar cells appears to be largely maintained by the selective synaptic contacts made by one or two bipolar cell types to a given ganglion cell type. In total, most ganglion cells receive excitatory input from tens to hundreds of bipolar cells and hundreds of cones. A notable exception is the midget circuitry in the primate fovea; in this circuit, a midget ganglion cell receives input from a single cone via a single midget bipolar cell.
A second class of interneuron, the amacrine cell, also plays a key role in parallel processing. Amacrine cells receive excitatory input from bipolar cells and provide inhibitory input to bipolar cells, ganglion cells, and other amacrine cells. Most retinal neurons other than ganglion cells are not thought to generate action potentials, although some types of amacrine and bipolar cells provide exceptions. Amacrine cells exhibit substantially greater anatomical and physiological heterogeneity than bipolar cells (Masland, 2001). We have an impoverished understanding of their function.
Although we have a relatively clear picture of the anatomical connections that enable ganglion cells to collect input from different regions of space, we lack a concise functional framework that accurately captures how signals in different locations in space are integrated to control a ganglion cell’s spike output.
Successes and failures of linear and near-linear models for spatial integration
Integration of photoreceptor signals by ganglion cells is classically described in terms of a cell’s receptive field. The utility of this description depends on whether spatial integration of photoreceptor inputs can be described as a linear or nonlinear process. Linear integration would mean that the response produced by light in one region of space does not depend on light inputs in other regions; that is, the receptive field would generalize across different stimuli used to measure it. Nonlinear integration can cause inputs in different spatial regions to interact, producing poor generalization of linear receptive field properties measured using different stimuli.
Empirical models have long been used to capture the receptive field properties of ganglion cells. Early work emphasized a “difference-of-Gaussians” description in which ganglion cell firing is controlled by the difference between input signals in linear center and surround regions (Fig. 1 B) (Kuffler, 1952; Barlow, 1953). A strictly linear model requires that responses to stimuli in two regions of space add when the stimuli are presented together, and that the responses to a stimulus and its inverse are equal and opposite. These requirements are almost never met; for example, stimuli that activate only the receptive field surround often produce little or no response, but the same stimuli are able to partially or fully cancel responses generated by activation of the receptive field center. Such nonlinear response properties could be a result of nonlinearities in the retinal circuitry or of rectifying nonlinearities in spike generation and the requirement that firing rates are nonnegative. Inclusion of a post-integration rectifying nonlinearity improves the ability of difference-of-Gaussians models to capture interactions between center and surround.
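The center-surround interaction described above can be sketched with a one-dimensional difference-of-Gaussians weighting followed by rectification. The widths, relative strengths, and stimulus geometry are hypothetical values chosen for illustration, not measurements from any cell.

```python
import math

def gaussian(x, sigma):
    """Unit-area Gaussian profile."""
    return math.exp(-0.5 * (x / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def dog_weight(x, sigma_c=1.0, sigma_s=3.0, k_c=1.0, k_s=0.8):
    """Difference-of-Gaussians weight: narrow center minus broad surround."""
    return k_c * gaussian(x, sigma_c) - k_s * gaussian(x, sigma_s)

def response(stimulus, xs, dx, rectify=True):
    """Weighted linear sum over space, optionally followed by rectification."""
    drive = sum(dog_weight(x) * s for x, s in zip(xs, stimulus)) * dx
    return max(drive, 0.0) if rectify else drive

dx = 0.1
xs = [i * dx for i in range(-150, 151)]                 # positions from -15 to 15
center = [1.0 if abs(x) <= 1.0 else 0.0 for x in xs]    # spot covering the center
annulus = [1.0 if abs(x) >= 3.0 else 0.0 for x in xs]   # surround-only stimulus
both = [c + a for c, a in zip(center, annulus)]

r_center = response(center, xs, dx)
r_annulus = response(annulus, xs, dx)
r_both = response(both, xs, dx)
print(f"center only: {r_center:.3f}")
print(f"annulus only: {r_annulus:.3f}")
print(f"center + annulus: {r_both:.3f}")
```

In this sketch the annulus alone drives the summed signal negative, so the rectified output is zero, yet the same annulus reduces the response to the center spot.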
Linear–nonlinear (LN) models are direct descendants of the difference-of-Gaussians models. In an LN model, the input stimulus is passed through a spatiotemporal linear filter L(x,t) followed by a static (time-invariant) nonlinearity N (Fig. 1 C) (Chichilnisky, 2001). The linear filter and static nonlinearity are usually estimated from stimuli that are randomly modulated in space and time; because all of the time dependence in the model is captured by the linear filter, the model components are uniquely determined by the data up to one overall scale factor. Thus, L(x,t) provides the best linear predictor of the cell’s response given the stimulus and can be calculated independently of the nonlinearity. N corrects this linear prediction for nonlinearities, for example, those in spike generation, and is unique given L(x,t). L(x,t) provides a measure of a cell’s spatial and temporal tuning (space and time projections in Fig. 1 C). Importantly, LN models retain the assumption that signals are integrated linearly in space followed by a single post-integration nonlinearity (Fig. 1 C).
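A minimal sketch of this estimation procedure, for a purely temporal stimulus, is given below. It simulates a hypothetical LN cell (the filter shape, gain, and rectifying nonlinearity are assumptions), recovers the filter from the spike-triggered average of a Gaussian white-noise stimulus, and then estimates the nonlinearity by comparing the filtered stimulus with the observed spike probability.

```python
import math
import random

rng = random.Random(1)
n_taps, n_samples = 12, 20000

# Hypothetical OFF-type biphasic filter; index k is the lag in time bins.
true_filter = [-math.sin(math.pi * k / 6) * math.exp(-k / 4) for k in range(n_taps)]
stimulus = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]   # white noise

def generator(filt, stim, t):
    """Linear stage: dot product of the filter with the recent stimulus."""
    return sum(filt[k] * stim[t - k] for k in range(len(filt)))

def spike_probability(g):
    """Assumed static nonlinearity: rectification, capped at one spike per bin."""
    return min(0.3 * max(g, 0.0), 1.0)

times = range(n_taps - 1, n_samples)
spike_times = [t for t in times
               if rng.random() < spike_probability(generator(true_filter, stimulus, t))]

# Linear filter estimate: average stimulus preceding each spike.
sta = [sum(stimulus[t - k] for t in spike_times) / len(spike_times)
       for k in range(n_taps)]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))

similarity = cosine(sta, true_filter)   # shape agreement, ignoring overall scale

# Nonlinearity estimate: bin the filtered stimulus, average spikes in each bin.
spike_set = set(spike_times)
pairs = sorted((generator(sta, stimulus, t), t in spike_set) for t in times)
n_bins = 10
size = len(pairs) // n_bins
nonlinearity = []
for b in range(n_bins):
    chunk = pairs[b * size:(b + 1) * size]
    mean_drive = sum(g for g, _ in chunk) / size
    spike_fraction = sum(s for _, s in chunk) / size
    nonlinearity.append((mean_drive, spike_fraction))

print(f"cosine similarity between estimate and true filter: {similarity:.3f}")
for mean_drive, spike_fraction in nonlinearity:
    print(f"filtered stimulus {mean_drive:+.2f} -> spike probability {spike_fraction:.3f}")
```

The filter is compared by cosine similarity because, as noted above, the two model components are determined only up to an overall scale factor.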
Fig. 2 shows the components of an LN model computed from the responses of an OFF parasol ganglion cell to a temporally (but not spatially) modulated light input. Fig. 2 A shows the firing rate (bottom) measured in response to multiple repeats of the same random stimulus (top). The nonlinearity in the cell’s response is clear: the firing rate can only be modulated upwards because the cell has a near-zero maintained firing rate.
Fig. 2 C shows the linear filter L(t) and nonlinearity N measured from the spike response. The negative dip in the linear filter indicates that the cell preferentially responds to decreases in light intensity, integrated over a time of ~50 ms. The biphasic shape of the linear filter indicates that the cell responds most strongly to changes in light intensity rather than constant light. The nonlinearity compares the measured firing rate (y axis) with the predicted rate given by the correlation of the stimulus preceding a spike with the linear filter (x axis). The firing rate is near zero if the preceding stimulus has a time course similar to the linear filter but the opposite polarity. High firing rates result from stimuli with a high positive correlation with the linear filter. In other words, the cell’s firing rate is strongly modulated for decreases but not increases in light intensity.
The rectification indicated by the nonlinearity is fairly typical of that measured in OFF ganglion cells for such stimuli; ON cells often show less pronounced rectification (Demb et al., 2001a; Chichilnisky and Kalmar, 2002; Zaghloul et al., 2003). The LN model provides an empirical characterization of the cell’s response, but the interpretation of model components in terms of circuit elements is ambiguous. In particular, the nonlinearity could occur in spike generation and/or at upstream locations. Fig. 2 B (top) shows excitatory synaptic inputs to the same cell; these are also strongly rectified. For simplicity, we convert the currents to conductances (Fig. 2 B, bottom), that is, Gexc(t) = Iexc(t)/(V−Vexc), where Vexc is the reversal potential and V is the voltage at which the cell was held during measurement of the currents in Fig. 2 B. Fig. 2 D shows the components of an LN model for the excitatory conductance. In this case, the linear filter is the best linear estimator of the conductance given the stimulus, and the nonlinearity compares that estimate with measured conductance. The nonlinearity for excitatory inputs closely resembles that computed for spike responses (Fig. 2 D, open circles), suggesting that much of the nonlinear computation occurs upstream of spike generation (Demb et al., 1999, 2001a).
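The current-to-conductance conversion can be written as a short helper. The numerical values in the example, including a reversal potential of 0 mV, are assumptions for illustration and are not taken from the recordings in Fig. 2.

```python
def excitatory_conductance(current_pa, holding_mv, reversal_mv=0.0):
    """Convert a synaptic current (pA) to a conductance (nS): G = I / (V - Vexc).

    The default reversal potential of 0 mV is an assumed value.
    """
    driving_force_mv = holding_mv - reversal_mv
    if driving_force_mv == 0:
        raise ValueError("holding potential equals the reversal potential")
    return current_pa / driving_force_mv   # pA / mV = nS

# Hypothetical current trace (pA) recorded at a holding potential of -60 mV.
currents_pa = [0.0, -60.0, -300.0, -120.0]
conductances_ns = [excitatory_conductance(i, -60.0) for i in currents_pa]
print(conductances_ns)
```

Because the driving force is constant at a fixed holding potential, the conversion rescales the trace without changing its time course or its rectification.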
Excitatory inputs to a ganglion cell are provided by converging inputs from many bipolar cells. Thus, nonlinearities in the excitatory inputs occur before the integration of signals across space that takes place in the ganglion cell dendrites. In the case of Fig. 2, the stimulus is uniform in space and the location of the nonlinearity has little bearing on the predictive power of the model. It will affect, however, the ability to generalize to new stimuli. We will return to this issue in the context of stimuli with spatial structure below.
Difference-of-Gaussians and LN models have been successful in several ways. They can separate ganglion cells into functional types based on their spatial (Chichilnisky and Kalmar, 2002), temporal (Segev et al., 2006), and chromatic tuning (Chichilnisky and Baylor, 1999; Field et al., 2009). LN models have also been used to quantify steady-state adaptation by measuring how the linear filter and nonlinearity change when the mean or contrast of the light inputs is changed (Demb, 2008).
Several groups have created enhanced LN-style models to account for various aspects of the spike response that are not captured in the original model. Keat et al. (2001) introduced a post-spike feedback term to make the model output dependent on recent spike history (e.g., Fig. 1 C). Such models can estimate the probability of different stimulus trajectories given the spike response of a cell; that is, they determine the stimulus features that can be inferred from the spike response and the reliability of such inferences (Paninski, 2004; Pillow et al., 2005). These models have been extended to account for correlated activity by including a spike-dependent coupling term between nearby cells (Pillow et al., 2008). Even for these more complex models, the likelihood criterion used to fit model parameters has a single global maximum, and hence optimal parameters can be identified using standard numerical approaches (Paninski, 2004).
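One common way to write this class of models is sketched below for a purely temporal stimulus. The notation is ours and is assumed rather than taken from any single cited study: k is the stimulus filter, h the post-spike feedback kernel, μ a constant offset, f the output nonlinearity, and t_i the observed spike times in a recording of duration T.

```latex
\lambda(t) = f\!\left( \int k(\tau)\, s(t-\tau)\, d\tau \;+\; \sum_{t_j < t} h(t - t_j) \;+\; \mu \right),
\qquad
\log \mathcal{L} = \sum_{i} \log \lambda(t_i) \;-\; \int_0^T \lambda(t)\, dt
```

When f is convex and log-concave (an exponential is one example), the log-likelihood is concave in k, h, and μ, which is the condition behind the single global maximum (Paninski, 2004).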
LN models including a feedback term have been especially useful in describing how adaptive mechanisms dynamically shape firing patterns. Berry et al. (1999) used an LN model with a contrast gain–control feedback to account for a retinal ganglion cell’s ability to correct for its own delay and respond to the leading edge of a moving stimulus. Ostojic and Brunel (2011) recently used several different models to capture the temporal aspects of a firing pattern, finding that an adaptive LN model in which the filter changed based on the recent spike pattern did the best job at capturing the details of a cell’s firing rate to a modulated stimulus.
LN models with and without post-spike feedback are all elaborations on a common form: linear spatial integration, followed by a nonlinear step, which in full generality is both time and spike history dependent. Although each model performs well for the tasks for which it was designed, an increasing number of phenomena in ganglion cell responses defy explanation in such a framework (Gollisch and Meister, 2010), and no model with a post-spatial integration nonlinearity has successfully predicted the responses of ganglion cells to natural or naturalistic stimuli. We argue below that models of this type are fundamentally limited because many of the nonlinear processing steps in the retina occur before spatial integration.
Y cells and their brethren: a dramatic failure of linear models
The idea that nonlinear spatial subunits exist within the ganglion cell receptive field is more than 40 years old. Recent work on the properties of synaptic transmission in the retina is beginning to reveal a more mechanistic understanding of this venerable functional abstraction.
Enroth-Cugell and Robson (1966) provided the first clear demonstration of nonlinear spatial integration in cat retinal ganglion cells (Fig. 3 A). They classified the recorded cells as X cells, which integrated their spatial inputs linearly, or Y cells, which integrated space nonlinearly. To test whether a cell was X or Y type, they presented a large sine-wave grating to the cell at several different positions. If the cell integrates light and dark inputs linearly in space (X type; Fig. 3 A, left), at some position these inputs should cancel and the cell should fail to respond to the grating. Such cancellation would occur in the integration of signals over space and hence would not depend on a final-stage nonlinearity. If the cell instead integrates nonlinearly in space (Y cell; Fig. 3 A, right), cancellation of the responses from dark and light regions is never complete, and the cell responds to the presentation of the grating at all positions. Many cells in cat exhibited such a spatial nonlinearity. Y-type cells have since been described in mouse (Stone and Pinto, 1993), rabbit (Caldwell and Daw, 1978), guinea pig (Demb et al., 1999), and monkey (de Monasterio, 1978; Petrusca et al., 2007; Crook et al., 2008).
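The logic of this null test can be sketched with two toy receptive fields. The code below is a static simplification (the experiment modulated the grating in time), and the receptive field width, subunit size and spacing, and grating period are all hypothetical. An X-like model sums the grating with a single Gaussian weighting; a Y-like model rectifies the output of small subunits before pooling them with the same weighting.

```python
import math

def gauss(x, sigma):
    return math.exp(-0.5 * (x / sigma) ** 2)

dx = 0.05
xs = [i * dx for i in range(-200, 201)]               # positions, arbitrary units
subunit_centers = [0.5 * j for j in range(-12, 13)]   # hypothetical subunit grid

def grating(phase, period=8.0, contrast=1.0):
    """Sine-wave grating sampled at the positions in xs."""
    return [contrast * math.sin(2 * math.pi * x / period + phase) for x in xs]

def linear_rf(stim, sigma_rf=2.0):
    """X-like model: one weighted sum over all of space."""
    return sum(gauss(x, sigma_rf) * s for x, s in zip(xs, stim)) * dx

def subunit_rf(stim, sigma_rf=2.0, sigma_sub=0.25):
    """Y-like model: rectify each subunit's local sum, then pool."""
    total = 0.0
    for c in subunit_centers:
        local = sum(gauss(x - c, sigma_sub) * s for x, s in zip(xs, stim)) * dx
        total += gauss(c, sigma_rf) * max(local, 0.0)
    return total

phases = [i * math.pi / 8 for i in range(16)]          # one full cycle of positions
linear_by_phase = [abs(linear_rf(grating(p))) for p in phases]
subunit_by_phase = [subunit_rf(grating(p)) for p in phases]

print(f"linear model:  min {min(linear_by_phase):.4f}, max {max(linear_by_phase):.4f}")
print(f"subunit model: min {min(subunit_by_phase):.4f}, max {max(subunit_by_phase):.4f}")
```

The linear model has a grating position at which light and dark regions cancel, whereas the subunit model responds at every position because rectification discards the negative half of each subunit's signal before the signals are summed.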
Because Y cells respond nonlinearly to small regions of light or dark, they are sensitive to gratings of higher spatial frequency than expected from the extent of their linear receptive field (Fig. 3 B) (Enroth-Cugell and Robson, 1966; Hochstein and Shapley, 1976). The functional consequences of this high spatial frequency sensitivity have not been explored in detail. By measuring the responses of Y cells to gratings at different spatial frequencies and contrasts, Victor and Shapley (1979) established a model for nonlinear spatial integration of subunits in a ganglion cell receptive field in which each subunit had a nonlinear weight and a gain control. Their model did not take a strong stance on the anatomical substrate of the subunits, only pointing out the possibility that they corresponded to bipolar cells.
Demb et al. (1999, 2001a) used a combination of intracellular recordings and pharmacology to identify the elements of the neural circuit responsible for Y-type behavior in guinea pig ganglion cells. They found that the nonlinear responses from the receptive field center were driven by excitation from bipolar cells—likely the same bipolar cells that provide linear input to the center—and that nonlinear responses from the surround were sensitive to block of Na+ channels and hence likely involved spiking amacrine cells. These studies established a framework for connecting nonlinear ganglion cell responses to the known elements of upstream circuitry. They also provide a glimpse at the complexity of the nonlinear mechanisms shaping spatial integration in ganglion cells.
Nonlinear retinal processing
Nonlinear synaptic and cellular processes abound in the retina, as in other neural circuits. Responsible mechanisms include the voltage dependence of calcium channels that control transmitter release, the nonlinear dependence of transmitter release on intracellular calcium concentration, history dependence of synaptic transmission via synaptic depression or facilitation, and active conductances in retinal interneurons or ganglion cell dendrites. These nonlinear mechanisms are spread across circuit elements that collect information from differently sized regions of visual space and hence can, in principle, influence processing on multiple spatial scales.
We will discuss only a few of the best-characterized examples of nonlinear computations in the retinal circuitry in the most physiologically realistic conditions. Nonlinearities are often revealed by experiments that push cells and circuits well out of their normal operating range. To evaluate the importance of such nonlinearities for the processing of light responses, it is important to view them in the context of the physiological operating range of cells and synapses.
Linear synaptic transmission requires that equal contrast light increments and decrements cause equal and opposite postsynaptic responses. Such symmetry requires a high sustained rate of neurotransmitter release if a synapse is to transmit a wide range of signals. The same issue, applied to spike generation and the requirement that a truly linear cell maintain a high spontaneous firing rate, motivated the inclusion of a post-integration nonlinearity in the LN model framework. To support the encoding of both positive and negative contrasts, photoreceptors and bipolar cells both use graded potentials rather than spikes, and the output synapses of both cell types have a special presynaptic structure, the ribbon (Matthews and Fuchs, 2010).
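The dependence of linearity on the tonic release rate can be made concrete with a toy calculation. The rates, gain, and sign convention below are arbitrary assumptions: release is taken to increase with positive contrast, and the only constraint imposed is that the release rate cannot be negative.

```python
def release_rate(contrast, tonic_rate, gain=10.0):
    """Release rate in arbitrary units; it cannot fall below zero."""
    return max(tonic_rate + gain * contrast, 0.0)

def modulation(contrast, tonic_rate):
    """Change in release relative to the tonic (zero-contrast) rate."""
    return release_rate(contrast, tonic_rate) - tonic_rate

for tonic in (20.0, 2.0):   # high versus low sustained release (assumed values)
    increment = modulation(+0.5, tonic)
    decrement = modulation(-0.5, tonic)
    print(f"tonic rate {tonic:4.1f}: increment {increment:+.1f}, decrement {decrement:+.1f}")
```

With a high tonic rate the two responses are equal and opposite; with a low tonic rate the decrease in release is clipped at zero and the synapse rectifies.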
The linearity of retinal ribbon synapses has been the subject of several studies (Shapley, 2009). At the first synapse in the retina, rods make contact with rod bipolar cells, and cones make contact with cone bipolar cells and horizontal cells. Sakai and Naka (1987) found that a linear filter adequately described the voltage responses of catfish horizontal cells and bipolar cells to a randomly varying light input. A nearly linear relationship between light intensity and voltage has also been observed in salamander bipolar cells (Rieke, 2001; Baccus and Meister, 2002; Thoreson et al., 2003). The linearity of the rod synaptic output originates from a near-linear dependence of the rate of exocytosis on calcium concentration in the physiological range of rod voltages (Rieke and Schwartz, 1996; Thoreson et al., 2004); this near-linear calcium dependence is produced by a highly calcium-sensitive component of exocytosis (Thoreson et al., 2004). The rod’s high calcium sensitivity and linearity differ from the situation at most central synapses and at bipolar ribbon synapses, where exocytosis requires higher calcium concentrations and depends nonlinearly on increases in calcium (Neher and Sakaba, 2008).
Processes downstream of transmitter release from the photoreceptors can create nonlinearities in bipolar cell light responses. Burkhardt and Fahey (1998) compared the responses of salamander cones and bipolar cells to contrast increments and decrements. Although cones responded near-linearly for steps up to 100% contrast, some bipolar cells exhibited clear nonlinearities for ~20% contrast steps. Differences between this work and the studies supporting linearity of transmission are likely the result of differences in the cell types studied and the larger and more rapid changes in contrast used by Burkhardt and Fahey (1998). At low light levels, signal transfer from rods to rod bipolar cells in mouse retina acts to (nonlinearly) threshold the rod responses (van Rossum and Smith, 1998; Field and Rieke, 2002), an operation that is critical to the sensitivity of photon detection by ganglion cells. This nonlinearity originates in the transduction cascade linking metabotropic glutamate receptors to channels in the rod bipolar cell dendrites (Sampath and Rieke, 2004).
Even if signals arrive at bipolar cells proportionate to the light collected by the photoreceptors, nonlinearities in the bipolar output could lead to nonlinear spatial integration in the ganglion cell. Indeed, a ganglion cell’s excitatory synaptic input is often both profoundly rectified (see Fig. 2) (Zaghloul et al., 2003) and history dependent because of rapid adaptational mechanisms (Demb, 2008). For example, contrast adaptation (Demb, 2008) has been observed in the voltage responses of bipolar cells, in spatial subunits of the retinal ganglion cell receptive field, and in a ganglion cell’s excitatory synaptic inputs. Further, the synapse between rod bipolar cells and AII amacrine cells depresses after single-photon events (Dunn and Rieke, 2008) and voltage steps (Singer and Diamond, 2006). The effect of nonlinearities in the output of bipolar cells could be mitigated by similarly rectified inhibitory input from amacrine cells (Werblin, 2010). Inhibitory feedback circuits provided by some amacrine cells, however, enhance nonlinear transfer by decreasing the tonic release rate from the bipolar cell (Freed et al., 2003).
Active dendritic conductances can also cause nonlinearities in signal processing. NMDA receptors used in ganglion cell signaling are one example (Manookin et al., 2010). The computations underlying directionally selective responses provide additional examples (first described by Barlow and Levick, 1965; Demb, 2007). First, voltage-sensitive dendritic processing causes starburst amacrine cells to respond more strongly to stimuli moving from the soma toward the dendritic tips than vice versa (Euler et al., 2002). Second, directionally selective ganglion cells sharpen the direction tuning that they inherit from starburst cells by generating spikelets at multiple locations within their dendrites (Oesch et al., 2005).
Synaptic inputs to many ganglion cell types exhibit pronounced nonlinearities. Excitatory synaptic inputs can have nonlinearities that are similar to those in a cell’s spike output (Fig. 2), and the few inhibitory inputs that have been studied appear to be nonlinear as well. Thus, much of the nonlinearity in a ganglion cell’s spike output is already present in its synaptic inputs (Demb et al., 1999, 2001a) and hence occurs before spatial integration. In the case of excitatory inputs, this suggests that spots of light positioned within the relatively small receptive fields of the bipolar cells will interact differently than those that are spaced between bipolar cells, and functional models based on linear integration of inputs across space will fail to capture these interactions. Light stimuli that preferentially stimulate particular amacrine cells (like directional stimuli for the starburst cells) are also likely to produce inhibition in a ganglion cell that cannot be captured by a model with linear spatial integration.
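The predicted difference between spots that fall within one bipolar cell receptive field and spots that fall in neighboring ones can be sketched as follows. The model is deliberately minimal and its details are assumptions: subunits are nonoverlapping, equally weighted, and perfectly rectifying, and each spot is described only by its position and contrast.

```python
def subunit_model(spots, n_subunits=4, width=1.0):
    """Sum the spots within each subunit, rectify, then pool across subunits.

    `spots` is a list of (position, contrast) pairs; subunit j covers
    positions in [j * width, (j + 1) * width).
    """
    drive = [0.0] * n_subunits
    for position, contrast in spots:
        drive[int(position // width)] += contrast
    return sum(max(d, 0.0) for d in drive)

def linear_model(spots):
    """Linear spatial integration with uniform weights."""
    return sum(contrast for _, contrast in spots)

# A bright and a dark spot with the same separation in both cases.
same_subunit = [(0.2, +1.0), (0.7, -1.0)]   # both inside subunit 0
straddling = [(0.7, +1.0), (1.2, -1.0)]     # split across subunits 0 and 1

for name, spots in (("same subunit", same_subunit), ("straddling", straddling)):
    print(f"{name}: linear {linear_model(spots):.1f}, subunit {subunit_model(spots):.1f}")
```

The linear model predicts complete cancellation in both cases, whereas the subunit model predicts cancellation only when the two spots share a subunit.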
A framework for the functional characterization of ganglion cell selectivity that includes nonlinear spatial integration
We are only beginning to appreciate the functional consequences of nonlinear spatial integration by retinal ganglion cells. Early work by Lettvin et al. (1959) described ganglion cell feature selectivity in terms of features inspired by natural scenes, characterizing cells as “dimming detectors, convexity detectors, and moving edge detectors.” The focus of coding studies in the retina shifted with the adoption of LN models, but recent studies have described ganglion cell selectivity for features like the approach of a dark object (Münch et al., 2009), the reversal of direction of a moving object (Schwartz et al., 2007), or the differential motion of foreground and background (Olveczky et al., 2003; Baccus et al., 2008). Gollisch and Meister (2008) presented a phase-shifted edge stimulus, like the one used by Enroth-Cugell and Robson (1966), and found that a model with linear spatial integration failed to capture the distribution of first spike latencies they observed. A model with rectifying spatial subunits (both ON and OFF type) was able to fit their data. Similar models that include a nonlinear step before spatial integration have been successful in accounting for the responses of ganglion cells to particular classes of stimuli (Gollisch and Meister, 2010). Such models are typically not fit to the data parametrically like LN models. In particular, the nonlinear step is often modeled as a straight rectification rather than an arbitrary function (Baccus et al., 2008; Gollisch and Meister, 2008). This could limit the ability of such models to generalize to arbitrary spatial stimuli.
What role does nonlinear spatial integration play in the types of information relayed by different ganglion cell types? We are far from understanding how visual information is segregated into the parallel pathways defined by each ganglion cell type. Even for the Y cell, we have only fragmentary clues about feature selectivity. As noted above, nonlinear subunits provide the Y cell with the ability to respond to much higher spatial frequencies than would be predicted by the size of the receptive field center (Fig. 3 B). Demb et al. (2001b) showed that this leads to the Y cell’s ability to respond to “second-order motion,” the movement of a high spatial frequency contrast pattern with no change in mean luminance across the ganglion cell receptive field. Nonlinear subunits might also enable the ganglion cell to signal the location of small objects within the receptive field or to distinguish between texture patterns with information at small spatial scales, but these ideas have not been tested experimentally.
Anatomical work continues to identify the cell types of the retina and their connections, and physiology is offering new insights into the ways signals are transmitted through the circuit. These advances will allow the next generation of functional models of ganglion cell behavior to move away from linear spatial integration as they confront the complexities of the nonlinearities in the retinal circuit. There are both challenges and opportunities associated with this new approach. Nonlinear spatial integration adds considerable complexity as ganglion cell sensitivity can no longer be described by a traditional receptive field. Instead, the nonlinearities of individual circuit elements, like the bipolar cells, must be measured and understood mechanistically so that they in turn can be modeled and their impact on responses to novel stimuli predicted. Although a linear receptive field can be mapped with white noise stimuli, mapping the locations and properties of subunits in the nonlinear receptive field will require the synthesis of new stimuli and analysis techniques. The general class of models that includes a nonlinearity before spatial integration can capture an enormous variety of spatial transformations (Funahashi, 1989; Hornik et al., 1989), and such models are likely to generalize across stimuli, even natural scenes, better than linear models.
We thank Jon Demb and William Grimes for helpful comments.
Support was provided by the Helen Hay Whitney Foundation (to G. Schwartz), the National Institutes of Health (grant EY11850 to F. Rieke), and the Howard Hughes Medical Institute (to F. Rieke).
Robert A. Farley served as guest editor.