
Summary: The brain uses two separate dopamine-based learning systems: one for evaluating outcomes and another for reinforcing repeated actions. Known as reward prediction error (RPE) and action prediction error (APE), these systems help explain how habits form and why they can become difficult to break.

While RPE helps us learn from outcomes, APE strengthens behaviors we repeat often, enabling more efficient multitasking by freeing up cognitive resources. The study showed that damage to the brain’s tail of the striatum, where APE is encoded, prevents mice from forming habits, indicating this region’s essential role in habitual learning.

Key Facts:

  • Second Learning System Identified: Action prediction error (APE) allows habits to form by reinforcing frequent actions.
  • Brain Region Implicated: The tail of the striatum plays a key role in habitual learning and is distinct from regions tied to outcome-based decisions.
  • Clinical Potential: Insights may help develop treatments for addiction and Parkinson’s by targeting the APE system.

Source: Sainsbury Wellcome Center

Neuroscientists at the Sainsbury Wellcome Centre (SWC) at UCL have discovered that the brain uses a dual system for learning through trial and error.

This is the first time a second learning system has been identified, which could help explain how habits are formed and provide a scientific basis for new strategies to address conditions related to habitual learning, such as addictions and compulsions.

Published today in Nature, the study in mice could also have implications for developing therapeutics for Parkinson’s.

“Essentially, we have found a mechanism that we think is responsible for habits. Once you have developed a preference for a certain action, then you can bypass your value-based system and just rely on your default policy of what you’ve done in the past.

“This might then allow you to free up cognitive resources to make value-based decisions about something else,” explained Dr Marcus Stephenson-Jones, Group Leader at SWC and lead author of the study.

The researchers uncovered a dopamine signal in the brain that acts as a different kind of teaching signal to the one previously known.

Dopamine signals in the brain were already understood to form reward prediction errors (RPE), where they signal to the animal whether an actual outcome is better or worse than expected. In this new study, the scientists discovered that, in parallel to RPE, there is an additional dopamine signal, called action prediction error (APE), which updates how often an action is performed.

These two teaching signals give animals two different ways of learning to make a choice, learning to choose either the most valuable option or the most frequent option.
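The difference between the two teaching signals can be sketched in a few lines of Python. This is a minimal illustration, not the model used in the study: the function names, the learning rate `lr`, and the simple delta-rule updates are all assumptions chosen for clarity. The RPE update moves an option's estimated value toward the reward received, while the APE update moves the estimated frequency of each action toward what was actually done, with no reference to reward.

```python
def rpe_update(value, reward, lr=0.1):
    """Value update driven by a reward prediction error (illustrative)."""
    rpe = reward - value  # positive if the outcome was better than expected
    return value + lr * rpe


def ape_update(policy, chosen, lr=0.1):
    """Default-policy update driven by an action prediction error (illustrative).

    `policy` maps each action to how often it is expected to be taken.
    The update ignores reward entirely: it only tracks what was done.
    """
    updated = {}
    for action, expected in policy.items():
        taken = 1.0 if action == chosen else 0.0
        ape = taken - expected  # large when the action was unexpected
        updated[action] = expected + lr * ape
    return updated


# Hypothetical example: a shopper who picks the same sandwich 20 times.
policy = {"cheese": 0.5, "ham": 0.5}
for _ in range(20):
    policy = ape_update(policy, chosen="cheese")
```

After repeated choices, the entry for the chosen action drifts toward 1 and the other toward 0, which is one way to picture a "default policy" being stored without comparing the value of the options.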

“Imagine going to your local sandwich shop. The first time you go, you might take your time choosing a sandwich and, depending on which you pick, you may or may not like it.

“But if you go back to the shop on many occasions, you no longer spend time wondering which sandwich to select and instead start picking one you like by default. We think it is the APE dopamine signal in the brain that is allowing you to store this default policy,” explained Dr Stephenson-Jones.

The newly discovered learning system provides a much simpler way of storing information than having to directly compare the value of different options. This might free up the brain to multi-task.

For example, once you have learned to drive, you can also hold a conversation with someone during your journey. While your default system is doing all the repetitive tasks to drive the car, your value-based system can decide what to talk about.

Previous research discovered the dopamine neurons needed for learning reside in three areas of the midbrain: the ventral tegmental area, substantia nigra pars compacta, and substantia nigra pars lateralis.

While some studies showed that these neurons were involved in coding for reward, earlier research found that half of these neurons code for movement, but the reason remained a mystery.

RPE neurons project to all areas of the striatum apart from one, called the tail of the striatum, whereas the movement-specific neurons project to all areas apart from the nucleus accumbens.

This means that the nucleus accumbens exclusively signals reward, and the tail of the striatum exclusively signals movement.

By investigating the tail of the striatum, the team were able to isolate the movement neurons and discover their function.

To test this, the researchers used an auditory discrimination task in mice, which was originally developed by scientists at Cold Spring Harbor Laboratory. Co-first authors Dr Francesca Greenstreet, Dr Hernando Martinez Vergara and Dr Yvonne Johansson used a genetically encoded dopamine sensor, which showed that dopamine release in this area was related to movement, not reward.

“When we lesioned the tail of the striatum, we found a very characteristic pattern. We observed that lesioned mice and control mice initially learn in the same way, but once they get to about 60-70% performance, i.e. when they develop a preference (for example, for a high tone go left, for a low tone, go right), then the control mice rapidly learn and develop expert performance, whereas the lesioned mice only continue to learn in a linear fashion.

“This is because the lesioned mice can only use RPE, whereas the control mice have two learning systems, RPE and APE, which contribute to the choice,” explained Dr Stephenson-Jones.

To further understand this, the team silenced the tail of the striatum in expert mice and found that this had a catastrophic effect on their performance in the task.

This showed that while in early learning animals form a preference using the value-based system based on RPE, in late learning they switch to exclusively using APE in the tail of the striatum to store these stable associations and drive their choice.

The team also used extensive computational modelling, led by Dr Claudia Clopath, to understand how the two systems, RPE and APE, learn together.
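One simple way to picture two systems contributing to a single choice is a weighted softmax over learned values and habit strengths. The sketch below is purely illustrative and is not the computational model used in the study: the function name, the weights `w_value` and `w_habit`, and the inverse temperature `beta` are all assumptions. Setting `w_habit` to zero mimics, very loosely, an animal that can rely only on the value-based system.

```python
import math


def choice_probabilities(values, habits, w_value=1.0, w_habit=1.0, beta=3.0):
    """Combine value and habit into choice probabilities (illustrative only).

    `values` and `habits` map each action to a learned quantity; the two are
    mixed with hypothetical weights and passed through a softmax.
    """
    actions = list(values)
    scores = [beta * (w_value * values[a] + w_habit * habits[a]) for a in actions]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return {a: e / total for a, e in zip(actions, exps)}


# Hypothetical numbers: equal values, but a strong habit of going left.
values = {"left": 0.6, "right": 0.6}
habits = {"left": 0.9, "right": 0.1}
with_habit = choice_probabilities(values, habits)
value_only = choice_probabilities(values, habits, w_habit=0.0)
```

In this toy setup, the habit term tips the choice toward "left" even though the two options are equally valuable, while the value-only case splits the choice evenly.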

These findings hint at why it is so hard to break bad habits and why replacing an action with something else may be the best strategy.

If you replace an action consistently enough, such as chewing on nicotine gum instead of smoking, the APE system may be able to take over and form a new habit on top of the other one.

“Now that we know this second learning system exists in the brain, we have a scientific basis for developing new strategies to break bad habits. Up until now, most research on addictions and compulsions has focused on the nucleus accumbens.

“Our research has opened up a new place to look in the brain for potential therapeutic targets,” commented Dr Stephenson-Jones.

This research also has potential implications for Parkinson’s, which is known to be caused by the death of midbrain dopamine neurons, specifically in substantia nigra pars compacta. The type of cells that have been shown to die are movement-related dopamine neurons, which may be responsible for coding APE.

This may explain why people with Parkinson’s experience deficits in habitual behaviours such as walking but not in more flexible behaviours such as ice skating.

“Suddenly, we now have a theory for paradoxical movement in Parkinson’s. The movement-related neurons that die are the ones that drive habitual behaviour.

“And so, movement that uses the habitual system is compromised, but movement that uses your value-based flexible system is fine. This gives us a new place to look in the brain and a new way of thinking about Parkinson’s,” concluded Dr Stephenson-Jones.

The research team is now testing whether APE is really needed for habits. They are also exploring what exactly is being learned in each system and how the two work together.

 

Summary: Scientists have identified the neural circuitry responsible for assigning emotional value, positive or negative, to social encounters. Two key neuromodulators, serotonin and neurotensin, were found to control opposing emotional responses in a brain region responsible for learning and memory.

In a mouse model of autism spectrum disorder (ASD), activating serotonin receptors restored the ability to form positive impressions from social interactions. The findings could pave the way for therapies that target emotional imbalances in disorders like ASD and schizophrenia.

Key Facts:

  • Emotional Tagging: Serotonin and neurotensin in the hippocampus determine whether a social interaction feels positive or negative.
  • Reversing Deficits: Stimulating serotonin 1B receptors restored positive social impressions in a mouse model of ASD.
  • Therapeutic Potential: The study reveals specific neuromodulatory targets that could inform future treatments for social cognitive deficits.

Source: Mount Sinai Hospital

Mount Sinai researchers have identified for the first time the neural mechanisms in the brain that regulate both positive and negative impressions of a social encounter, as well as how an imbalance between the two could lead to common neuropsychiatric disorders like autism spectrum disorder (ASD) and schizophrenia.

The study, published April 30 in Nature, also describes how activating a serotonin receptor in the brain of a mouse model of ASD restored positive emotional value (also known as “valence”), with encouraging implications for the development of future therapies.

“The ability to recognize and distinguish unpleasant from pleasant interactions is essential for humans to navigate their social environment,” says Xiaoting Wu, PhD, Assistant Professor of Neuroscience at the Icahn School of Medicine at Mount Sinai, and senior author of the study.

“Until now, it has been unclear how the brain assigns positivity or negativity—‘valence’—to social experiences, and how that information can be flexibly updated in a constantly changing environment.”

At the center of this complex neural circuitry is the hippocampus, located deep in the temporal lobe of the brain and responsible for forming new memories, learning, and emotions.

The Mount Sinai researchers described how two neuromodulators—serotonin and neurotensin, which influence processes such as mood, arousal, and neural plasticity—are released into the hippocampal subregion known as ventral CA1, where they control opposing social valence assignment.

Both neurotransmitters impact distinct populations of ventral CA1 neurons through their respective receptors, serotonin 1B and neurotensin 1.

While deficits in social valence are known to be prevalent in many neuropsychiatric disorders, their underlying neural mechanisms and pathophysiology have remained elusive.

“Through our work we’ve provided the first foundational insights into the neural basis of social valence,” notes Dr. Wu.

“We have demonstrated that the neuromodulators serotonin and neurotensin signal opposing valence, revealing a fundamental principle of brain function in the form of a neuromodulatory switch that allows behavioral adaptation based on social history.”

Specifically, the team developed a novel social cognitive paradigm that involved exposing mice to negative and positive social encounters. In the negative social encounter, the test mouse was exposed to a mean/aggressive mouse; in the positive encounter, the mouse was exposed to a potential mate.

In both assays, the mice had a negative or neutral interaction (in the negative assay) or a positive or neutral interaction (in the positive assay), and then chose which mouse they would like to spend more time with.

Without prior experience, the mice did not have a preference, but with the experience, they associated a mouse with a positive or negative valence and then learned to avoid the “bad” mouse or approach the “good” mouse.

Just as importantly, the team uncovered specific drug targets for positive and negative valence, knowledge that could potentially factor into future treatments.

Specifically, serotonin acting on the serotonin 1B receptor generates a positive impression of a social encounter, while neurotensin acting on the neurotensin 1 receptor creates a negative impression. Imbalanced emotional processing of those two social experiences is known to be a debilitating symptom of ASD.

Consequently, by activating the serotonin 1B receptor, researchers were able to restore a positive impression associated with rewarding social experiences.

“We identified a specific neuromodulator receptor which we then targeted to rescue social cognitive deficits in a mouse model of ASD,” Dr. Wu explains.

“On a broader scale, our work provides critical insights into complex social behaviors while revealing potential therapeutic targets that can be leveraged to improve social cognitive deficits in common neuropsychiatric disorders.”

Funding: This work was supported by funding from NIH K99 Career Development Award (grant no. MH122697), NIMH BRAINS R01 Award (grant no. MH136228), Alkermes Pathways Award, NARSAD Young Investigator Award and Friedman Brain Institute Scholar Award.