Reinforcement learning and Bayesian inference provide complementary models for the unique advantage of adolescents in stochastic reversal

During adolescence, youth venture out, explore the wider world, and are challenged to learn how to navigate novel and uncertain environments. We investigated how performance changes across adolescent development in a stochastic, volatile reversal-learning task that uniquely taxes the balance of persistence and flexibility. In a sample of 291 participants aged 8–30, we found that in the mid-teen years, adolescents outperformed both younger and older participants. We developed two independent cognitive models, based on Reinforcement learning (RL) and Bayesian inference (BI). The RL parameter for learning from negative outcomes and the BI parameters specifying participants’ mental models were closest to optimal in mid-teen adolescents, suggesting a central role in adolescent cognitive processing. By contrast, persistence and noise parameters improved monotonically with age. We distilled the insights of RL and BI using principal component analysis and found that three shared components interacted to form the adolescent performance peak: adult-like behavioral quality, child-like time scales, and developmentally-unique processing of positive feedback. This research highlights adolescence as a neurodevelopmental window that can create performance advantages in volatile and uncertain environments. It also shows how detailed insights can be gleaned by using cognitive models in new ways.

Maria K. Eckstein, Sarah L. Master, Ronald E. Dahl, Linda Wilbrecht, Anne G.E. Collins, Reinforcement learning and Bayesian inference provide complementary models for the unique advantage of adolescents in stochastic reversal, Developmental Cognitive Neuroscience, Volume 55, 2022, 101106, ISSN 1878-9293, https://doi.org/10.1016/j.dcn.2022.101106, https://www.sciencedirect.com/science/article/pii/S1878929322000494
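
As a rough illustration of the Bayesian inference (BI) idea described in this abstract, the sketch below tracks the belief that one of two options is currently the rewarding one in a stochastic, volatile reversal task. It is a minimal sketch under stated assumptions, not the authors' fitted model: the function name `bayes_update`, the parameter names `p_reward`, `p_switch`, and `p_reward_wrong`, and all default values are hypothetical and chosen only for illustration.

```python
def bayes_update(p_left, choice, reward,
                 p_reward=0.75, p_switch=0.05, p_reward_wrong=0.0):
    """Update the belief that "left" is currently the correct option.

    p_left          -- prior probability that left is correct
    choice          -- "left" or "right" (the option just chosen)
    reward          -- 1 if the choice was rewarded, else 0
    p_reward        -- assumed reward probability for the correct option
    p_switch        -- assumed per-trial probability of a reversal
    p_reward_wrong  -- assumed reward probability for the incorrect option
    All defaults are illustrative assumptions, not values from the paper.
    """
    def likelihood(correct_side):
        # Probability of the observed outcome if `correct_side` were correct.
        p = p_reward if choice == correct_side else p_reward_wrong
        return p if reward else 1.0 - p

    # Bayes rule over the two hypotheses (left correct vs. right correct).
    numerator = p_left * likelihood("left")
    denominator = numerator + (1.0 - p_left) * likelihood("right")
    # Guard: an outcome that is impossible under both hypotheses leaves
    # the belief unchanged.
    posterior = numerator / denominator if denominator > 0 else p_left

    # Account for a possible reversal before the next trial.
    return posterior * (1.0 - p_switch) + (1.0 - posterior) * p_switch
```

Starting from an even belief of 0.5, a rewarded choice of "left" raises the belief that left is correct, while an unrewarded one lowers it; a larger `p_switch` pulls the belief back toward 0.5 more strongly on every trial.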


Modeling changes in probabilistic reinforcement learning during adolescence

In the real world, many relationships between events are uncertain and probabilistic. Uncertainty is also likely to be a more common feature of daily experience for youth because they have less experience to draw from than adults. Some studies suggest probabilistic learning may be inefficient in youths compared to adults, while others suggest it may be more efficient in youths in mid-adolescence. Here we used a probabilistic reinforcement learning task to test how youth aged 8-17 (N = 187) and adults aged 18-30 (N = 110) learn about stable probabilistic contingencies. Performance increased with age through the early twenties, then stabilized. Using hierarchical Bayesian methods to fit computational reinforcement learning models, we show that all participants’ performance was better explained by models in which negative outcomes had minimal to no impact on learning. The performance increase over age was driven by 1) an increase in learning rate (i.e., a decrease in integration time scale) and 2) a decrease in noisy/exploratory choices. In mid-adolescence (ages 13-15), salivary testosterone and learning rate were positively related. We discuss our findings in the context of other studies and hypotheses about adolescent brain development.

Liyu Xia, Sarah L. Master, Maria K. Eckstein, Beth Baribault, Ronald E. Dahl, Linda Wilbrecht, Anne Gabrielle Eva Collins, Modeling changes in probabilistic reinforcement learning during adolescence, July 2021, https://doi.org/10.1371/journal.pcbi.1008524
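
To make the modeling terms in this abstract concrete, the sketch below shows a generic delta-rule value update with separate learning rates for rewarded and unrewarded outcomes, plus a softmax choice rule whose inverse temperature controls how noisy choices are. This is a minimal sketch and not the authors' model code: the names `update_value`, `choice_probabilities`, `alpha_pos`, `alpha_neg`, and `beta` are assumptions introduced for illustration.

```python
import math


def update_value(q, reward, alpha_pos, alpha_neg):
    """Delta-rule update with outcome-specific learning rates.

    Setting alpha_neg to 0 mimics a learner on whom negative outcomes
    have no impact (an illustrative assumption, not a fitted value).
    """
    alpha = alpha_pos if reward > 0 else alpha_neg
    return q + alpha * (reward - q)


def choice_probabilities(values, beta):
    """Softmax over option values; a lower beta gives noisier choices."""
    top = max(values)  # subtract the max for numerical stability
    exps = [math.exp(beta * (v - top)) for v in values]
    total = sum(exps)
    return [e / total for e in exps]
```

With `alpha_neg = 0`, an unrewarded trial leaves the value unchanged, whereas a rewarded trial moves it toward 1 at a speed set by `alpha_pos`. A higher learning rate means recent outcomes dominate the estimate, which corresponds to a shorter integration time scale.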


Learning Rates Are Not All the Same: The Interpretation of Computational Model Parameters Depends on the Context

Reinforcement Learning (RL) has revolutionized the cognitive and brain sciences, explaining behavior from simple conditioning to problem solving, across the life span, and anchored in brain function. However, discrepancies in results are increasingly apparent between studies, particularly in the developmental literature. To better understand these, we investigated to what extent parameters generalize between tasks and models, and capture specific and uniquely interpretable (neuro)cognitive processes. 291 participants aged 8-30 years completed three learning tasks in a single session, and their behavior was fitted using state-of-the-art RL models. RL decision noise/exploration parameters generalized well between tasks, decreasing between ages 8-17. Learning rates for negative feedback did not generalize, and learning rates for positive feedback showed intermediate generalizability, dependent on task similarity. These findings can explain discrepancies in the existing literature. Future research therefore needs to carefully consider task characteristics when relating findings across studies, and develop strategies to computationally model how context impacts behavior.

Maria K Eckstein, Sarah L Master, Liyu Xia, Ronald E Dahl, Linda Wilbrecht, Anne Gabrielle Eva Collins, Learning Rates Are Not All the Same: The Interpretation of Computational Model Parameters Depends on the Context (May 2021), https://www.biorxiv.org/content/10.1101/2021.05.28.446162v1
doi: https://doi.org/10.1101/2021.05.28.446162


What do Reinforcement Learning Models Measure? Interpreting Model Parameters in Cognition and Neuroscience

Reinforcement learning (RL) is a concept that has been invaluable to research fields including machine learning, neuroscience, and cognitive science. However, what RL entails partly differs between fields, leading to difficulties when interpreting and translating findings.

This paper lays out these differences and zooms in on cognitive (neuro)science, revealing that we often overinterpret RL modeling results, with severe consequences for future research. Specifically, researchers often assume—implicitly—that model parameters generalize between tasks, models, and participant populations, despite overwhelming negative empirical evidence for this assumption. We also often assume that parameters measure specific, unique, and meaningful (neuro)cognitive processes, a concept we call interpretability, for which empirical evidence is also lacking.

We conclude that future computational research needs to pay increased attention to these implicit assumptions when using RL models, and suggest an alternative framework that resolves these issues and allows us to unleash the potential of RL in cognitive (neuro)science.

Maria Eckstein, Linda Wilbrecht, Anne Collins, What do Reinforcement Learning Models Measure? Interpreting Model Parameters in Cognition and Neuroscience (May 2021), https://psyarxiv.com/e7kwx/

 


The Unique Advantage of Adolescents in Probabilistic Reversal: Reinforcement Learning and Bayesian Inference Provide Adequate and Complementary Models

During adolescence, youth venture out, explore the wider world, and are challenged to learn how to navigate novel and uncertain environments. We investigated whether adolescents are uniquely adapted to this transition, compared to younger children and adults. In a stochastic, volatile reversal learning task with a sample of 291 participants aged 8-30, we found that adolescents 13-15 years old outperformed both younger and older participants. We developed two independent cognitive models, one based on reinforcement learning (RL) and the other on Bayesian inference (BI), and used hierarchical Bayesian model fitting to assess developmental changes in underlying cognitive mechanisms. Choice parameters in both models improved monotonically. By contrast, RL update parameters and BI mental-model parameters peaked closest to optimal values in 13-to-15-year-olds. Combining both models using principal component analysis yielded new insights, revealing that three readily interpretable components contributed to the early- to mid-adolescent performance peak. This research highlights early- to mid-adolescence as a neurodevelopmental window that may be better suited to behavioral adjustment in volatile and uncertain environments. It also shows how increasingly detailed insights can be gleaned by invoking different cognitive models.

Maria K. Eckstein, Sarah L. Master, Ronald E. Dahl, Linda Wilbrecht, Anne G.E. Collins, The Unique Advantage of Adolescents in Probabilistic Reversal: Reinforcement Learning and Bayesian Inference Provide Adequate and Complementary Models (Mar. 2021) https://www.biorxiv.org/content/10.1101/2020.07.04.187971v2.full
doi: https://doi.org/10.1101/2020.07.04.187971


Choice suppression is achieved through opponent but not independent function of the striatal indirect pathway in mice

The dorsomedial striatum (DMS) plays a key role in action selection, but little is known about how direct and indirect pathway spiny projection neurons (dSPNs and iSPNs) contribute to choice suppression in freely moving animals. Here, we used pathway-specific chemogenetic manipulation during a serial choice foraging task to test opposing predictions for iSPN function generated by two theories: 1) the ‘select/suppress’ heuristic which suggests iSPN activity is required to suppress alternate choices and 2) the network-inspired Opponent Actor Learning model (OpAL) which proposes that the weighted difference of dSPN and iSPN activity determines choice. We found that chemogenetic activation, but not inhibition, of iSPNs disrupted learned suppression of nonrewarded choices, consistent with the predictions of the OpAL model. Our findings suggest that iSPNs’ role in stopping and freezing does not extend in a simple fashion to choice suppression. These data may provide insights critical for the successful design of interventions for addiction or other conditions in which suppression of behavior is desirable.

Kristen Delevich, Benjamin Hoshal, Anne GE Collins, Linda Wilbrecht, Choice suppression is achieved through opponent but not independent function of the striatal indirect pathway in mice, BioRxiv, https://www.biorxiv.org/content/10.1101/675850v3
doi: https://doi.org/10.1101/675850
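
The phrase "weighted difference of dSPN and iSPN activity determines choice" can be illustrated with a short sketch. The code below is only a schematic of that one idea, not the OpAL implementation used in the paper: the names `actor_values`, `choice_probabilities`, `beta_g`, and `beta_n`, and the example weights, are assumptions made for illustration.

```python
import math


def actor_values(go_weights, nogo_weights, beta_g=1.0, beta_n=1.0):
    """Net action value as a weighted difference of two opponent actors.

    go_weights   -- per-action weights standing in for dSPN ("Go") activity
    nogo_weights -- per-action weights standing in for iSPN ("NoGo") activity
    beta_g, beta_n -- hypothetical gains on each pathway
    """
    return [beta_g * g - beta_n * n for g, n in zip(go_weights, nogo_weights)]


def choice_probabilities(values):
    """Softmax over net action values."""
    top = max(values)  # subtract the max for numerical stability
    exps = [math.exp(v - top) for v in values]
    total = sum(exps)
    return [e / total for e in exps]
```

In this scheme an action is favored when its "Go" weight outweighs its "NoGo" weight relative to the alternatives, so changing the gain on either pathway shifts choice through the difference rather than through either pathway acting alone.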

 


Distentangling the systems contributing to changes in learning during adolescence

Multiple neurocognitive systems contribute simultaneously to learning. For example, dopamine and basal ganglia (BG) systems are thought to support reinforcement learning (RL) by incrementally updating the value of choices, while the prefrontal cortex (PFC) contributes different computations, such as actively maintaining precise information in working memory (WM). It is commonly thought that WM and PFC show more protracted development than RL and BG systems, yet their contributions are rarely assessed in tandem. Here, we used a simple learning task to test how RL and WM contribute to changes in learning across adolescence. We tested 187 subjects aged 8 to 17 and 53 adults (aged 25-30). Participants learned stimulus-action associations from feedback; the learning load was varied to fall within or exceed WM capacity. Participants aged 8-12 learned more slowly than participants aged 13-17, and were more sensitive to load. We used computational modeling to estimate subjects’ use of WM and RL processes. Surprisingly, we found more protracted changes in RL than WM during development. RL learning rate increased with age until age 18 and WM parameters showed more subtle, gender- and puberty-dependent changes early in adolescence. These results can inform education and intervention strategies based on the developmental science of learning.

Sarah L. Master, Maria K. Eckstein, Neta Gotlieb, Ronald Dahl, Linda Wilbrecht, Anne G.E. Collins, Distentangling the systems contributing to changes in learning during adolescence, Developmental Cognitive Neuroscience (2019),
https://doi.org/10.1016/j.dcn.2019.100732, http://www.sciencedirect.com/science/article/pii/S1878929319303196
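
One common way to model joint WM and RL contributions is a mixture policy in which reliance on WM shrinks once the number of items to learn exceeds WM capacity. The sketch below illustrates that general idea only; it is an assumption about the model family, not the authors' exact specification, and the names `wm_weight`, `mixture_policy`, `rho`, and `capacity` are hypothetical.

```python
def wm_weight(rho, capacity, set_size):
    """Weight given to working memory for a block with `set_size` items.

    rho      -- assumed baseline reliance on WM (between 0 and 1)
    capacity -- assumed WM capacity, in items
    The weight drops proportionally once set_size exceeds capacity.
    """
    return rho * min(1.0, capacity / set_size)


def mixture_policy(p_wm, p_rl, weight):
    """Blend WM and RL action probabilities with the given WM weight."""
    return [weight * w + (1.0 - weight) * r for w, r in zip(p_wm, p_rl)]
```

For instance, with `rho = 0.9` and `capacity = 3`, a two-item block keeps the full WM weight of 0.9, while a six-item block halves it, so slower RL learning contributes more to choices under high load.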
