Two papers from PNI use a new mathematical framework to study internal cognitive states

A person’s internal mental state can profoundly alter how they experience the world. When a listener is in a jovial mood, a hilarious joke triggers raucous laughter. If that same person is feeling down, the same joke will be received tepidly. The experience of an inner mental state feels nearly universal and is easy to express in words. But identifying an animal’s inner state based only on its actions is a significant challenge in behavioral science. Two new studies from labs at the Princeton Neuroscience Institute apply an expressive new mathematical tool to identify internal states based only on subjects’ behavior and to examine how changes in internal state shape the choices subjects make.

Both studies were collaborations with PNI Professor Jonathan Pillow; the first with the International Brain Laboratory, an international consortium of brain researchers, and the second with PNI Professor Ilana Witten. The first study, published in Nature Neuroscience, was led by graduate student Zoe Ashwood. Ashwood and her colleagues showed how mice and humans switch between behavioral strategies when solving decision-making tasks, leading to new insights into why animals sometimes make the wrong choice even in easy situations. The second study, led by PNI postdoctoral researcher Scott Bolkan and PNI graduate student Iris Stone, was also published in Nature Neuroscience. The research team used a combination of virtual reality, light-controlled brain inactivation, and the same mathematical tool as Ashwood to identify how the striatum, a brain region long known to be involved in making decisions, controls decisions in a way that depends on the animal’s internal state.

In both studies, the authors further developed and used a novel mathematical framework to identify inner mental states from subjects making decisions. Remarkably, the framework can discover inner states using only the sequence of decisions the subjects made. The key to the framework, which they call a ‘GLM-HMM’, is the combination of two well-understood mathematical models. One half (known as a Generalized Linear Model, or GLM) combines task-related and task-unrelated variables to determine the probability that a subject makes a particular choice. The other half (known as a Hidden Markov Model, or HMM) stipulates that the details of the GLM change over time, unbeknownst to the experimenters, based on changes in a hidden (or unobserved) state. Joining the two produces the GLM-HMM.
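To make the idea concrete, here is a minimal sketch of how a GLM-HMM generates behavior. It is a toy illustration only: the weights, transition probabilities, and the `simulate` function are illustrative assumptions, not values or code from either study. Each hidden state has its own GLM mapping the stimulus to a choice probability, and a Markov chain governs which state is active on each trial.

```python
import numpy as np

rng = np.random.default_rng(0)

# Per-state GLM weights: [stimulus weight, bias]. Values are illustrative only.
# State 0 ("engaged") weights the stimulus heavily; states 1 and 2 ("biased")
# largely ignore it and lean toward one side.
glm_weights = np.array([
    [6.0,  0.0],   # engaged: choice tracks the stimulus
    [0.5,  2.0],   # biased toward rightward choices
    [0.5, -2.0],   # biased toward leftward choices
])

# HMM transition matrix: states are "sticky", so the simulated animal tends
# to stay in the same state for many consecutive trials.
transitions = np.array([
    [0.98, 0.01, 0.01],
    [0.01, 0.98, 0.01],
    [0.01, 0.01, 0.98],
])

def simulate(n_trials=500):
    """Simulate stimuli, hidden states, and choices from a 3-state Bernoulli GLM-HMM."""
    states, choices, stimuli = [], [], []
    z = 0  # start in the engaged state
    for t in range(n_trials):
        s = rng.uniform(-1, 1)                             # signed stimulus strength
        x = np.array([s, 1.0])                             # regressors: stimulus + offset
        p_right = 1 / (1 + np.exp(-glm_weights[z] @ x))    # logistic GLM for this state
        c = int(rng.random() < p_right)                    # 1 = rightward choice
        states.append(z); choices.append(c); stimuli.append(s)
        z = rng.choice(3, p=transitions[z])                # hidden state evolves as a Markov chain
    return np.array(states), np.array(choices), np.array(stimuli)

states, choices, stimuli = simulate()
```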

How can internal states be identified when they cannot be seen? A key insight is that a reliable relationship between direct observations (like decisions) and unseen, internal states can be used to infer those states. Consider the following example: imagine you remain home for several consecutive days and try to guess the weather (an unseen variable) based on the mood of your housemate (a direct observation), who goes out and returns each day. Chances are high—but by no means guaranteed—that it is sunny if they are in a good mood and rainy if they are in a bad mood. Thus mood can reliably, albeit imperfectly, predict weather. Furthermore, past weather can also be used to predict future weather, and thus can help refine your guess based on your housemate’s mood. HMMs formalize these intuitions, allowing researchers to use direct observations to make educated guesses about unseen variables that change over time.
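Continuing the toy sketch above (again, an illustration rather than the published analysis code), the forward-backward algorithm shows how an HMM turns these intuitions into a computation: on each trial it combines the evidence from the observed choice with what neighboring trials imply about the hidden state, yielding a probability for each state on every trial. In the actual studies, the GLM weights and transition probabilities are themselves fit to the data (for example via expectation-maximization) rather than assumed, but the state-inference step has the same flavor.

```python
def state_posteriors(choices, stimuli, glm_weights, transitions, pi0=None):
    """Forward-backward pass: probability of each hidden state on every trial,
    given only the observed stimuli and choices (the 'housemate's mood')."""
    n, k = len(choices), len(glm_weights)
    pi0 = np.full(k, 1.0 / k) if pi0 is None else pi0

    # Likelihood of each observed choice under each state's GLM.
    X = np.stack([stimuli, np.ones(n)], axis=1)
    p_right = 1 / (1 + np.exp(-X @ glm_weights.T))            # shape (n trials, k states)
    lik = np.where(choices[:, None] == 1, p_right, 1 - p_right)

    # Forward pass: propagate past state beliefs through the transition matrix,
    # then weight them by the current trial's evidence.
    alpha = np.zeros((n, k))
    alpha[0] = pi0 * lik[0]
    alpha[0] /= alpha[0].sum()
    for t in range(1, n):
        alpha[t] = (alpha[t - 1] @ transitions) * lik[t]
        alpha[t] /= alpha[t].sum()

    # Backward pass: let future observations refine earlier guesses.
    beta = np.ones((n, k))
    for t in range(n - 2, -1, -1):
        beta[t] = transitions @ (lik[t + 1] * beta[t + 1])
        beta[t] /= beta[t].sum()

    post = alpha * beta
    return post / post.sum(axis=1, keepdims=True)

# Posterior probability of each hidden state on each simulated trial.
posteriors = state_posteriors(choices, stimuli, glm_weights, transitions)
```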

Figure 1. (a) Mice performed a decision-making task. They turned a wheel left or right according to the location of a visual stimulus. (b) Ashwood found that mice switched between three strategies when performing this task. In state 1, the model parameters indicated that mice were attending to the stimulus. For states 2 and 3, the model parameters indicated mice were weighing the stimulus less and instead following their own biases. (c) For each state, Ashwood examined the probability a mouse made a left or right decision depending on the strength of the visual stimulus. Mice performed the task almost perfectly in state 1 and displayed a strong bias in states 2 and 3. For example, in state 2, stimuli that strongly favored a rightward choice (Stimulus = 100) elicited a rightward choice on just over 50% of trials. (d) By examining the probability that a mouse was in each state on each successive trial, Ashwood found that mice would remain in a state for many successive trials before switching to a new state. For example, here the mouse remained in a biased state (3) for many trials before switching to the engaged state (1). Adapted from Ashwood et al 2022, Nature Neuroscience.

Ashwood’s approach was similar, except that the relationship between unobserved and observed variables (how she predicted internal state from decisions) was more intricate. To study how internal states affect decision-making, researchers from the IBL trained mice to turn a wheel left or right depending on the location of a visual stimulus (see Figure 1). Much like predicting the weather, Ashwood tried to predict the internal state of the animal (was it feeling impulsive? was it ‘paying attention’ to the stimulus?) based only on its decisions. She found three distinct states that corresponded to unique decision-making ‘strategies’ the mouse used at various times. This was surprising because each behavioral trial (i.e., a unique decision following a unique stimulus) was in most ways identical to any other: although the stimulus varied over trials, the correct ‘strategy’ for performing the task was always the same.

When her model inferred that a mouse was in state 1, it indicated that the animal was using the visual stimulus to make its decision (as it should). This state corresponded to an ‘engaged’ strategy in which the animal was actively focusing on the task. When the model inferred that a mouse was in state 2 or state 3, however, it indicated that the mouse was ignoring the stimulus and repeatedly choosing one direction or the other according to its own biases. States 2 and 3 corresponded to ‘disengaged’ strategies in which the animal was no longer attending to the task.

The discovery of these two additional states offered insight into a long-standing question in behavioral science: why do subjects (including humans) make mistakes when the correct decision is obvious? Historically these errors, referred to as ‘lapses’, were thought to arise unpredictably over the course of a sequence of behavioral trials, reflecting randomly occurring failures of attention or movement. Ashwood, however, found that when in the ‘engaged’ state, mice performed the task nearly perfectly and did especially well on the easiest trials. In contrast, when in a ‘disengaged’ state, mice performed poorly even on very easy trials, precisely because they based their decisions on their own biases instead of the stimulus. These findings suggest that lapses are not randomly occurring failures of attention, but rather result from a switch in internal state that causes the mouse to stop attending to the stimulus for many successive trials.

Ashwood’s results offer a fresh take on animal decision-making, but they also introduce new questions. For example, the finding that mice persistently make choices according to their own internal bias instead of the stimulus is surprising, and it leaves open the question of why they would adopt such a losing strategy. One possibility is that mice form idiosyncratic habits, including less-than-ideal ones, as they learn the rules of the task and search for the strategy that provides the most reward. “Amongst other possibilities, these states could reflect the explore-exploit tradeoff that is often discussed in reinforcement learning communities,” says Ashwood. “The mouse is not aware of the fact that the rules of the task don't change over time, so it could be the case that, by switching into one of the 'disengaged' states, the mouse is actually checking that a better strategy doesn't exist.” Future studies are needed to answer these questions and to better understand how internal states are shaped as mice learn this task.

The discovery of internal states that correspond to unique decision-making strategies raises the question of how these state changes relate to changes within the brain. The study by Bolkan and Stone used the GLM-HMM framework to ask whether the striatum, a brain region with a critical role in decision-making, shows signatures of these internal states.

To do this, the authors trained mice to perform a slightly more complex decision-making task. To provide the full experimental flexibility the authors required, the mice performed this task in virtual reality (see Figure 2). Each mouse had to keep track of the number of sensory cues on either side as it ran down a virtual corridor. To understand the striatum’s role in the task, the authors inserted channelrhodopsin, a light-sensitive protein, into neurons of the striatum. The authors could then use laser light to briefly inactivate the striatum, effectively silencing it as the mouse performed the task.

Figure 2. (a) Mice navigated in a virtual environment. A projected image on a screen created a virtual world for the mouse. The image moved as the mouse moved on a spherical treadmill, creating the illusion that the mouse was moving through a virtual environment. (b) The mice performed a more complex decision-making task, in which a series of visual-tactile cues was presented to the animal as it navigated a virtual corridor. The mouse had to keep track of which side had the greater number of cues and turn in that direction at the end of the virtual maze to receive a liquid reward. (c) The authors used the GLM-HMM to find three distinct states that governed the choices of the mice. For each state, the authors examined the probability that a mouse made a left or right decision depending on the difference in the number of sensory cues. When in state 2, optical stimulation of the striatum strongly impacted the animal’s decision, causing a bias in one direction. However, when in state 1, the animal’s performance was nearly identical whether or not the striatum was optically stimulated, suggesting that a different brain region controlled decisions on these trials. Adapted from Bolkan, Stone et al 2022, Nature Neuroscience.

Using the GLM-HMM, Stone also found that mice expressed three internal states that guided their decisions. State 3 was reminiscent of Ashwood’s disengaged states, in which the mice ignored sensory information.

States 1 and 2, on the other hand, provided a clear signature of state dependence in the striatum. When in these states, mice performed the task equally well. But light inactivation of the striatum during state 2 trials caused the mice to ignore the sensory stimulus in favor of their own biases, while during state 1 trials it had no effect on the animal’s decision-making strategy. These findings suggest that, over successive trials, mice switch between different neural circuits to guide their choices: the striatum guiding behavior on state 2 trials and another brain region taking control on state 1 trials.

Precisely what triggers these state switches and what advantage having multiple neural circuits underlying the same behavior might serve are questions that remain unanswered. “We're coming to find out in neuroscience that our view of how neural circuits drive behavior has been overly simplistic,” says Stone. “It's not necessarily that one circuit always underlies the same behavior. Our neural representations can change based on mood, context, experience, and a host of other variables.”

It’s also not clear if the striatum’s state-dependent role in decision-making is unique, but the researchers have their hunches: “The striatum is perhaps a particularly useful brain region to study to see this phenomenon but I think it's very likely that we will see this state-dependent mapping between brain and behavior replicated in other brain regions as well,” says Stone.

By Brian DePasquale