Neuroscientists led by Institute Professor Ann Graybiel found that untrained monkeys performing a simple visual scanning task gradually developed efficient patterns that allowed them to minimize the time it took to receive their reward.
Photo: Institute Professor Ann Graybiel. Credit: Donna Coveney
The task was designed to mimic natural scenarios — a nearly infinite number of choices for the monkeys to make and an unpredictable reward structure. “We wanted to create an environment that would be similar to the world we walk around in every day — an environment where there are lots of choices the animal can make,” says Theresa Desrochers, an MIT graduate student and lead author of a paper describing the work in the Proceedings of the National Academy of Sciences the week of Oct. 25.
The findings not only help reveal how the brain forms habits but also could shed light on neurological disorders in which amplified habit formation results in highly repetitive behavior, such as Tourette’s syndrome, obsessive-compulsive disorder and schizophrenia, says Graybiel.
Graybiel and Desrochers took an unusual approach to their study. In most behavioral studies of monkeys, the researchers first train the animals to perform a task, then begin experiments. In this case, Graybiel and Desrochers wanted to see if the monkeys could learn a simple visual free-scanning task with no training at all.
The researchers measured the monkeys’ eye movements and brain activity as they looked at a grid of either four or nine dots. In each trial, after a period in which the monkey simply looked around, a different dot was chosen at random to be “baited,” meaning that the trial succeeded once the monkey’s gaze landed on that dot. At the end of each successful trial, the monkey received a food reward.
While the task itself is simple, it is capable of generating a rich variety of behavior, due to the number of choices available to the animals.
“There are cases in which people have looked at the way the brain makes sequential decisions in a really simple task, but nobody has gone in with a really complex state space,” says Read Montague, professor of neuroscience at Baylor College of Medicine, who was not involved with the study. A complex state space, in this case, means that the animals have a nearly infinite number of choices.
Video: The sequential pathways that monkeys used to look at a grid of dots. They received a reward when their gaze landed on one of the dots. Credit: Theresa Desrochers and Daniel Gibson
The monkeys performed about 1,000 such trials a day, and over several months they developed increasingly cost-effective sequences for looking at all of the dots, meaning that they reached the target dot faster.
The changes were gradual: The animals would use one pattern for five to 10 days, then shift to a slightly different pattern. When looking at the entire mass of data, the researchers couldn’t tell what was driving these changes. However, a trial-by-trial analysis revealed that very small variations in the scanning patterns could reduce the overall time to receive the reward, which would then reinforce that behavior and lead the monkey to adopt the new pattern.
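To make that mechanism concrete, here is a minimal Python sketch of how small variations that lower cost can accumulate into an efficient scanning pattern. It is an illustration only, not the researchers’ model or analysis: the 3-by-3 grid coordinates, the rule of swapping two dots in the sequence, and the names `mean_cost` and `improve` are all assumptions made for the example.

```python
import math
import random

# Hypothetical 3x3 grid of dot positions (the study used grids of four or nine dots).
DOTS = [(x, y) for y in range(3) for x in range(3)]


def mean_cost(pattern):
    """Average eye-movement distance needed to reach a randomly baited dot.

    Assumes the gaze starts on pattern[0], visits the dots in order, and that
    every dot is equally likely to be the baited one.
    """
    travelled = 0.0  # distance covered so far along the scan path
    total = 0.0      # sum of distances to each possible baited dot
    for prev, cur in zip(pattern, pattern[1:]):
        travelled += math.dist(DOTS[prev], DOTS[cur])
        total += travelled
    return total / len(pattern)


def improve(pattern, steps, rng):
    """Try small variations and keep one only if it lowers the mean cost."""
    best = list(pattern)
    best_cost = mean_cost(best)
    for _ in range(steps):
        # Swap two dots in the sequence, leaving the starting dot fixed.
        i, j = rng.sample(range(1, len(best)), 2)
        candidate = list(best)
        candidate[i], candidate[j] = candidate[j], candidate[i]
        cost = mean_cost(candidate)
        if cost < best_cost:  # the cheaper pattern is "reinforced"
            best, best_cost = candidate, cost
    return best, best_cost


rng = random.Random(0)
start = list(range(len(DOTS)))
rng.shuffle(start)  # an arbitrary initial scanning pattern
final, final_cost = improve(start, 2000, rng)
print("initial mean cost:", mean_cost(start))
print("final mean cost:  ", final_cost)
```

In this toy version the pattern changes only when a variation shortens the average path to the reward, so the cost can only stay the same or fall. Real behavior is noisier, and the sketch makes no claim about how the monkeys’ brains evaluate cost.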
“The upshot was that tiny little changes in cost — how far they moved the eyes — seemed to be driving these shifts until they did it as optimally as they could, despite the fact that they had never been instructed,” says Graybiel.
This suggests that primates have an “inborn tendency to maximize reward and minimize cost,” says Graybiel. She and Desrochers believe the same kind of phenomenon, known as reinforcement learning, may also guide human habit formation.
“When you drive to work, it’s never going to take exactly the same amount of time. You might try one different street to avoid a stoplight, or some other subtle variation. At some point, you may completely shift,” says Desrochers.
Desrochers and Graybiel plan to design studies that will test whether humans show the same kind of habit-forming behavior in an eye-scanning task similar to the one the monkeys learned. They also hope to discover which parts of the brain control habit formation. They believe that the basal ganglia, which play a role in learning, and the prefrontal cortex, which is involved in planning, are likely candidates.