The Cognitive and Neural Mechanisms of Human Reversal Learning and Their Applications in Psychopathology

Xiang Jie; Feng Tingyong

doi:10.16719/j.cnki.1671-6981.20250204


Journal of Psychological Science ›› 2025, Vol. 48 ›› Issue (2) : 295-305. DOI: 10.16719/j.cnki.1671-6981.20250204

General Psychology, Experimental Psychology & Ergonomics


Abstract

In today's rapidly changing environment, cognitive and behavioral flexibility is becoming increasingly crucial. Reversal learning refers to the ability to adjust previously learned responses or strategies when the environment or the rules change, and it reflects an individual's cognitive or behavioral flexibility. The reversal learning paradigm, originally applied in animal studies and later extended to human research, is widely used to assess cognitive flexibility. In a classic reversal learning paradigm, participants choose between two stimuli to obtain a reward; after the stimulus-outcome contingencies are reversed, they must adjust their choice. In probabilistic reversal learning, feedback is delivered only probabilistically, which makes the task harder and probes adaptability to change under uncertainty. Although numerous studies have explored the cognitive mechanisms, neural bases, and psychopathological applications of reversal learning, systematic reviews of its cognitive and neural foundations and its applications remain scarce. This review combines cognitive computational modeling and magnetic resonance imaging (MRI) research to examine the cognitive processing models and neural mechanisms underlying reversal learning. It further analyzes the applications of reversal learning in psychopathology, with the goal of promoting the flexible use of reversal learning in future research and providing theoretical support for psychopathological studies.
Reinforcement learning models (RLMs) allow a fine-grained analysis of the cognitive processes involved in reversal learning. These models divide reversal learning into decision-making, feedback reception, and learning stages, and provide computational metrics for each stage: value estimation (Q value), decision bias (P value), and decision stability (β value) during the decision-making stage; feedback sensitivity (ρ), feedback strength (R), feedback valence, and prediction error (PE) during feedback reception; and learning rate (α) during the learning stage.
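The three stages named above can be illustrated with a minimal sketch of a simple reinforcement learning agent on a two-option probabilistic reversal task. This is not the specific model of any study reviewed here. It assumes a two-option softmax choice rule with inverse temperature β, a prediction error of the form PE = ρ·R − Q, and an update Q ← Q + α·PE. The function names, the reward probability of 0.8, and the reversal point are hypothetical choices made for illustration.

```python
import math
import random


def choice_prob(q_a, q_b, beta):
    """Decision stage: probability P of choosing option A.

    Two-option softmax (logistic) rule; a larger beta makes choices
    more deterministic, i.e. more stable. Assumed form, for illustration.
    """
    return 1.0 / (1.0 + math.exp(-beta * (q_a - q_b)))


def update_q(q, reward, alpha, rho=1.0):
    """Feedback and learning stages for the chosen option.

    rho scales the received feedback R (feedback sensitivity),
    PE = rho * R - Q is the prediction error, and alpha is the
    learning rate. Returns (new_q, pe).
    """
    pe = rho * reward - q
    return q + alpha * pe, pe


def simulate(n_trials=200, reversal_at=100, p_reward=0.8,
             alpha=0.3, beta=5.0, rho=1.0, seed=0):
    """Simulate one agent; the better option switches at `reversal_at`.

    All parameter values are hypothetical defaults.
    """
    rng = random.Random(seed)
    q = [0.0, 0.0]          # Q values for options 0 and 1
    choices = []
    for t in range(n_trials):
        good = 0 if t < reversal_at else 1       # contingency reversal
        p0 = choice_prob(q[0], q[1], beta)       # decision stage
        choice = 0 if rng.random() < p0 else 1
        p_win = p_reward if choice == good else 1.0 - p_reward
        reward = 1.0 if rng.random() < p_win else 0.0  # probabilistic feedback
        q[choice], _ = update_q(q[choice], reward, alpha, rho)
        choices.append(choice)
    return choices, q


if __name__ == "__main__":
    choices, q = simulate()
    after = choices[150:]
    print("share of option 1 late after reversal:", sum(after) / len(after))
    print("final Q values:", q)
```

In such a sketch, lowering α slows adaptation after the reversal, and lowering β makes choices noisier. Model-based studies typically estimate these parameters by fitting the model to each participant's choices rather than fixing them in advance.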
The decision-making process in reversal learning primarily engages the fronto-parietal network, including the ventromedial prefrontal cortex (vmPFC), dorsomedial prefrontal cortex (dmPFC), and parietal cortex. During the feedback phase, the processing of feedback intensity is mainly associated with the bilateral orbitofrontal cortex (OFC). Positive feedback processing primarily involves the medial OFC, ventral striatum (VS), anterior cingulate cortex (ACC), amygdala, and other brain regions associated with reward processing and emotional responses. In contrast, the processing of negative feedback and reward prediction errors (PEs) relies on the fronto-parietal control network (dorsolateral prefrontal cortex [dlPFC], ventrolateral prefrontal cortex [vlPFC], inferior frontal gyrus [IFG], and superior parietal lobule), the salience network (inferior frontal cortex, insula, and dorsal anterior cingulate cortex [dACC]), the emotion processing network (OFC and amygdala), the reward and motivation system (including the striatum and its specific regions such as dorsolateral striatum, ventral pallidum, ventral striatum, ventromedial prefrontal cortex, medial OFC, anterior insula, and dorsal anterior cingulate cortex), and the default mode network (medial prefrontal cortex [mPFC] and dorsal posterior cingulate cortex [dPCC]). This pattern points to a functional dissociation between the brain regions involved in processing positive and negative feedback. Finally, the learning phase predominantly engages the coordinated activity of the mPFC and dACC. Together, these brain regions support successful reversal learning by integrating feedback information, monitoring errors, and adapting behavior.
Reversal learning paradigms have been widely used in research on neuropsychiatric conditions, including attention deficit hyperactivity disorder (ADHD), autism spectrum disorder (ASD), addictive behaviors, obsessive-compulsive disorder (OCD), depression, and schizophrenia. For instance, individuals with ADHD may exhibit altered learning rates, leading to suboptimal responses to feedback, while those with OCD might show excessive sensitivity to negative feedback, resulting in rigid and perseverative behaviors. By integrating computational models and MRI techniques, reversal learning research not only reveals the dynamic characteristics of cognitive processing but also provides new perspectives for understanding cognitive flexibility deficits in various neuropsychiatric disorders.
Based on this analysis of existing research, there are three potential directions for future work. First, cognitive computational models should be refined to analyze more precisely the cognitive processing mechanisms underlying reversal learning. Second, neural computational models should be updated to gain deeper insight into the neural basis of reversal learning. Finally, individual differences should be fully considered when applying reversal learning to psychopathology.

Key words

reversal learning / flexibility / reinforcement learning model / magnetic resonance imaging (MRI)

Cite this article

Xiang Jie, Feng Tingyong. The Cognitive and Neural Mechanisms of Human Reversal Learning and Their Applications in Psychopathology[J]. Journal of Psychological Science. 2025, 48(2): 295-305. https://doi.org/10.16719/j.cnki.1671-6981.20250204
