A recent MIT study on mouse behavior in reward-based tasks showed that mice, while capable of learning the best strategy, often deviate from it, suggesting a complex decision-making process. This finding, using a new analysis tool called blockHMM, has potential implications for neurological research, particularly in understanding conditions like schizophrenia and autism.
In a simple game that humans typically ace, mice learn the winning strategy, too, but refuse to commit to it, new research shows.
Neuroscience discoveries ranging from the nature of memory to treatments for disease have depended on reading the minds of mice, so researchers need to truly understand what the rodents’ behavior is telling them during experiments. In a new study that examines learning from reward,
Exploring Mice’s Decision-Making Strategies
While the mouse motif of departing from the optimal strategy could be due to a failure to hold it in memory, says lead author and Sur Lab graduate student Nhat Le, another possibility is that mice don’t commit to the “win-stay, lose-shift” approach because they don’t trust that their circumstances will remain stable or predictable. Instead, they might deviate from the optimal regime to test whether the rules have changed. Natural settings, after all, are rarely stable or predictable.
“I’d like to think mice are smarter than we give them credit for,” Le says.
But regardless of which reason may cause the mice to mix strategies, adds co-senior author Mehrdad Jazayeri, associate professor in BCS and the McGovern Institute for Brain Research, it is important for researchers to recognize that they do and to be able to tell when and how they are choosing one strategy or another.
Analyzing Mice Behavior With New Methods
“This study highlights the fact that, unlike the accepted wisdom, mice doing lab tasks do not necessarily adopt a stationary strategy, and it offers a computationally rigorous approach to detect and quantify such non-stationarities,” he says. “This ability is important because when researchers record the neural activity, their interpretation of the underlying algorithms and mechanisms may be invalid when they do not take the animals’ shifting strategies into account.”
The research team, which also includes co-author Murat Yildirim, a former Sur Lab postdoc who is now an assistant professor at the Cleveland Clinic Lerner Research Institute, initially expected that the mice might adopt one strategy or the other. They simulated the results they’d expect to see if the mice either adopted the optimal strategy of inferring a rule about the task or more randomly surveyed whether left or right turns were being rewarded. Mouse behavior on the task, even after days, varied widely, but it never resembled the results simulated by just one strategy.
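To make the two simulated extremes concrete, here is a minimal Python sketch of a left/right task in which the rewarded side flips every 15 to 25 trials, played by a pure "win-stay, lose-shift" agent and by a purely random one. It is an illustration only, not the authors' simulation code: the function names, the trial count, and the assumption that the correct side is always rewarded are all hypothetical choices made for this example.

```python
import random


def simulate(agent, n_trials=1000, seed=0):
    """Return the fraction of rewarded trials for one simulated agent.

    Assumption for illustration: the correct side is rewarded every time
    (deterministic reward), and the rewarded side flips every 15-25 trials.
    """
    rng = random.Random(seed)
    rewarded_side = rng.choice([0, 1])   # 0 = left, 1 = right
    trials_left = rng.randint(15, 25)    # length of the current block
    choice = rng.choice([0, 1])          # arbitrary first choice
    n_rewarded = 0
    for _ in range(n_trials):
        rewarded = choice == rewarded_side
        n_rewarded += rewarded
        # The agent picks its next choice from the last choice and its outcome.
        choice = agent(choice, rewarded, rng)
        trials_left -= 1
        if trials_left == 0:             # block ends: the rule reverses
            rewarded_side = 1 - rewarded_side
            trials_left = rng.randint(15, 25)
    return n_rewarded / n_trials


def win_stay_lose_shift(prev_choice, rewarded, rng):
    """Repeat a rewarded choice; switch sides after an unrewarded one."""
    return prev_choice if rewarded else 1 - prev_choice


def random_explorer(prev_choice, rewarded, rng):
    """Ignore feedback and pick a side at random."""
    return rng.choice([0, 1])


if __name__ == "__main__":
    print("win-stay, lose-shift:", simulate(win_stay_lose_shift))
    print("random exploration:  ", simulate(random_explorer))
```

Under these assumptions the rule-following agent misses only the first trial after each reversal, while the random agent hovers near chance. The mice described here fell between those two extremes.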
To differing, individual extents, mouse performance on the task reflected variance along three parameters: how quickly they switched directions after the rule switched, how long it took them to transition to the new direction, and how loyal they remained to the new direction. Across 21 mice, the raw data represented a surprising diversity of outcomes on a task that neurotypical humans uniformly optimize. But the mice clearly weren’t helpless. Their average performance significantly improved over time, even though it plateaued below the optimal level.
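One way to picture those three parameters is as the shape of a choice curve within a block. The sketch below uses a sigmoid with an offset (how many trials pass before the switch), a slope (how sharp the transition is), and a lapse rate (how often the animal strays after switching). This parameterization and its names are assumptions chosen for illustration, and may not match the exact formula used in the study.

```python
import math


def p_correct(trial, offset, slope, lapse):
    """Probability of choosing the newly rewarded side on a given trial.

    Hypothetical parameterization for illustration:
      offset - trial at which the curve is halfway to its plateau
      slope  - steepness of the transition (larger = more abrupt)
      lapse  - fraction of choices that stay off-target after the switch
    """
    return (1.0 - lapse) / (1.0 + math.exp(-slope * (trial - offset)))


if __name__ == "__main__":
    # A fast, sharp, loyal switcher versus a slow, gradual, lapse-prone one.
    for trial in range(0, 16, 3):
        fast = p_correct(trial, offset=1.0, slope=3.0, lapse=0.02)
        slow = p_correct(trial, offset=6.0, slope=0.5, lapse=0.25)
        print(f"trial {trial:2d}: fast={fast:.2f} slow={slow:.2f}")
```

Two animals with the same average reward rate can have quite different curves, which is one reason raw performance alone hides the variety in behavior.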
In the task, the rewarded side switched every 15-25 turns. The team realized the mice were using more than one strategy in each such “block” of the game, rather than just inferring the simple rule and optimizing based on that inference. To disentangle when the mice were employing that strategy or another, the team harnessed an analytical framework called a Hidden Markov Model (HMM), which can computationally tease out when one unseen state is producing a result versus another unseen state. Le likens it to what a judge on a cooking show might do: inferring which chef contestant made which version of a dish based on patterns in each plate of food before them.
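For readers curious about how an HMM recovers an unseen state, the following is a toy Python sketch of standard Viterbi decoding on a two-state model. It operates on individual correct/incorrect choices, so it is the textbook HMM rather than the team's modified version, and every probability in it is a made-up value chosen for illustration.

```python
import math

# Hypothetical two-state model: the animal is either following the rule
# or exploring. All probabilities below are illustrative assumptions.
STATES = ("rule", "explore")
START = {"rule": 0.5, "explore": 0.5}
TRANS = {
    "rule": {"rule": 0.9, "explore": 0.1},
    "explore": {"rule": 0.1, "explore": 0.9},
}
# Observation 1 = correct choice, 0 = incorrect choice.
EMIT = {
    "rule": {1: 0.95, 0: 0.05},
    "explore": {1: 0.5, 0: 0.5},
}


def viterbi(observations):
    """Return the most likely hidden-state sequence for a list of 0/1 outcomes."""
    # Log-probability of the best path ending in each state at the first trial.
    score = {
        s: math.log(START[s]) + math.log(EMIT[s][observations[0]])
        for s in STATES
    }
    backpointers = []
    for obs in observations[1:]:
        new_score, pointers = {}, {}
        for s in STATES:
            # Best previous state from which to arrive at state s.
            best_prev = max(STATES, key=lambda p: score[p] + math.log(TRANS[p][s]))
            pointers[s] = best_prev
            new_score[s] = (
                score[best_prev]
                + math.log(TRANS[best_prev][s])
                + math.log(EMIT[s][obs])
            )
        score = new_score
        backpointers.append(pointers)
    # Trace back from the best final state to recover the full path.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    return path[::-1]


if __name__ == "__main__":
    outcomes = [1] * 10 + [0, 1, 0, 0, 1, 0, 1, 0, 0, 1]
    print(viterbi(outcomes))
```

A long run of correct choices is best explained by the rule-following state, while a stretch of near-chance outcomes is best explained by exploration. The decoder never observes either state directly. It infers them from the pattern of results, much as the cooking-show judge does in Le's analogy.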
Before the team could use an HMM to decipher their mouse performance results, however, they had to adapt it. A typical HMM might apply to individual mouse choices, but here the team modified it to explain choice transitions over the course of whole blocks. They dubbed their modified model the blockHMM. Computational simulations of task performance using the blockHMM showed that the algorithm is able to infer the true hidden states of an artificial agent. The authors then used this technique to show the mice were persistently blending multiple strategies, achieving varied levels of performance.
“We verified that each animal executes a mixture of behavior from multiple regimes instead of a behavior in a single domain,” Le and his co-authors wrote. “Indeed 17/21 mice used a combination of low, medium, and high-performance behavior modes.”
Further analysis revealed that the strategies afoot were indeed the “correct” rule inference strategy and a more exploratory strategy consistent with randomly testing options to get turn-by-turn feedback.
Future Research Directions
Now that the researchers have decoded the peculiar approach mice take to reversal learning, they are planning to look more deeply into the brain to understand which brain regions and circuits are involved. By watching brain cell activity during the task, they hope to discern what underlies the decisions the mice make to switch strategies.
By examining reversal learning circuits in detail, co-author Mriganka Sur says, it’s possible the team will gain insights that could help explain why people with schizophrenia show diminished performance on reversal learning tasks. Sur added that some people with autism spectrum disorders also persist with newly unrewarded behaviors longer than neurotypical people, so his lab will also have that phenomenon in mind as they investigate.
Yildirim, too, is interested in examining potential clinical connections.
“This reversal learning paradigm fascinates me since I want to use it in my lab with various preclinical models of neurological disorders,” he says. “The next step for us is to determine the brain mechanisms underlying these differences in behavioral strategies and whether we can manipulate these strategies.”
Reference: “Mixtures of strategies underlie rodent behavior during reversal learning” by Nhat Minh Le, Murat Yildirim, Yizhi Wang, Hiroki Sugihara, Mehrdad Jazayeri and Mriganka Sur, 14 September 2023, PLOS Computational Biology.
DOI: 10.1371/journal.pcbi.1011430
Funding for the study came from The