TY  - JOUR
A1  - Kastius, Alexander
A1  - Schlosser, Rainer
T1  - Dynamic pricing under competition using reinforcement learning
JF  - Journal of revenue and pricing management
N2  - Dynamic pricing is considered a way to gain an advantage over competitors in modern online markets. Past advances in Reinforcement Learning (RL) have provided more capable algorithms that can be used to solve pricing problems. In this paper, we study the performance of Deep Q-Networks (DQN) and Soft Actor Critic (SAC) in different market models. We consider tractable duopoly settings, where optimal solutions derived by dynamic programming techniques can be used for verification, as well as oligopoly settings, which are usually intractable due to the curse of dimensionality. We find that both algorithms provide reasonable results, while SAC performs better than DQN. Moreover, we show that under certain conditions, RL algorithms can be forced into collusion by their competitors without direct communication.
KW  - Dynamic pricing
KW  - Competition
KW  - Reinforcement learning
KW  - E-commerce
KW  - Price collusion
Y1  - 2021
U6  - https://doi.org/10.1057/s41272-021-00285-3
SN  - 1476-6930
SN  - 1477-657X
VL  - 21
IS  - 1
SP  - 50
EP  - 63
PB  - Springer Nature Switzerland AG
CY  - Cham
ER  -

TY  - JOUR
A1  - Balta Beylergil, Sinem
A1  - Beck, Anne
A1  - Deserno, Lorenz
A1  - Lorenz, Robert C.
A1  - Rapp, Michael Armin
A1  - Schlagenhauf, Florian
A1  - Heinz, Andreas
A1  - Obermayer, Klaus
T1  - Dorsolateral prefrontal cortex contributes to the impaired behavioral adaptation in alcohol dependence
JF  - NeuroImage: Clinical : a journal of diseases affecting the nervous system
N2  - Substance-dependent individuals often lack the ability to adjust decisions flexibly in response to the changes in reward contingencies. Prediction errors (PEs) are thought to mediate flexible decision-making by updating the reward values associated with available actions.
In this study, we explored whether the neurobiological correlates of PEs are altered in alcohol dependence. Behavioral and functional magnetic resonance imaging (fMRI) data were simultaneously acquired from 34 abstinent alcohol-dependent patients (ADP) and 26 healthy controls (HC) during a probabilistic reward-guided decision-making task with dynamically changing reinforcement contingencies. A hierarchical Bayesian inference method was used to fit and compare learning models with different assumptions about the amount of task-related information subjects may have inferred during the experiment. Here, we observed that the best-fitting model was a modified Rescorla-Wagner type model, the “double-update” model, which assumes that subjects infer the knowledge that reward contingencies are anti-correlated, and integrate both actual and hypothetical outcomes into their decisions. Moreover, comparison of the best-fitting model's parameters showed that ADP were less sensitive to punishments than HC. Hence, decisions of ADP after punishments were loosely coupled with the expected reward values assigned to them. A correlation analysis between the model-generated PEs and the fMRI data revealed a reduced association between these PEs and the BOLD activity in the dorsolateral prefrontal cortex (DLPFC) of ADP. A hemispheric asymmetry was observed in the DLPFC when positive and negative PE signals were analyzed separately. The right DLPFC activity in ADP showed a reduced correlation with positive PEs. On the other hand, ADP, particularly the patients with high dependence severity, recruited the left DLPFC to a lesser extent than HC for processing negative PE signals. These results suggest that the DLPFC, which has been linked to adaptive control of action selection, may play an important role in cognitive inflexibility observed in alcohol dependence when reinforcement contingencies change.
Particularly, the left DLPFC may contribute to this impaired behavioral adaptation, possibly by impeding the extinction of the actions that no longer lead to a reward.
KW  - Alcohol dependence
KW  - Prediction error
KW  - Reinforcement learning
KW  - Reversal learning
KW  - Dorsolateral prefrontal cortex
KW  - Decision-making
Y1  - 2017
U6  - https://doi.org/10.1016/j.nicl.2017.04.010
SN  - 2213-1582
VL  - 15
SP  - 80
EP  - 94
PB  - Elsevier
CY  - Oxford
ER  -

TY  - JOUR
A1  - Sebold, Miriam Hannah
A1  - Garbusow, Maria
A1  - Jetzschmann, P.
A1  - Schad, Daniel
A1  - Nebe, S.
A1  - Schlagenhauf, Florian
A1  - Heinz, A.
A1  - Rapp, Michael Armin
A1  - Romanczuk-Seiferth, Nina
T1  - Reward and avoidance learning in the context of aversive environments and possible implications for depressive symptoms
JF  - Psychopharmacology
N2  - Background: Aversive stimuli in the environment influence human actions. This includes valence-dependent influences on action selection, e.g., increased avoidance but decreased approach behavior. However, it is still unclear how aversive stimuli interact with complex learning and decision-making in the reward and avoidance domain. Moreover, the underlying computational mechanisms of these decision-making biases are unknown. Methods: To elucidate these mechanisms, 54 healthy young male subjects performed a two-step sequential decision-making task, which allows computational modeling of different aspects of learning, e.g., model-free (habitual) and model-based (goal-directed) learning. We used a within-subject design, crossing task valence (reward vs. punishment learning) with emotional context (aversive vs. neutral background stimuli). We analyzed choice data, applied a computational model, and performed simulations. Results: Whereas model-based learning was not affected, aversive stimuli interacted with model-free learning in a way that depended on task valence. Thus, aversive stimuli increased model-free avoidance learning but decreased model-free reward learning.
The computational model confirmed this effect: the parameter lambda that indicates the influence of reward prediction errors on decision values was increased in the punishment condition but decreased in the reward condition when aversive stimuli were present. Further, choice data simulated with the inferred computational parameters captured our effects. Exploratory analyses revealed that the observed biases were associated with subclinical depressive symptoms. Conclusion: Our data show that aversive environmental stimuli affect complex learning and decision-making in a manner that depends on task valence. Further, we provide a model of the underlying computations of this affective modulation. Finally, our finding of increased decision-making biases in subjects reporting subclinical depressive symptoms matches recent reports of amplified Pavlovian influences on action selection in depression and suggests a potential vulnerability factor for mood disorders. We discuss our findings in the light of the involvement of the neuromodulators serotonin and dopamine.
KW  - Reward learning
KW  - Avoidance learning
KW  - Reinforcement learning
KW  - Computational psychiatry
KW  - Decision-making
KW  - Affective modulation
KW  - Depression symptoms
Y1  - 2019
U6  - https://doi.org/10.1007/s00213-019-05299-9
SN  - 0033-3158
SN  - 1432-2072
VL  - 236
IS  - 8
SP  - 2437
EP  - 2449
PB  - Springer
CY  - New York
ER  -

TY  - JOUR
A1  - Sebold, Miriam Hannah
A1  - Nebe, Stephan
A1  - Garbusow, Maria
A1  - Guggenmos, Matthias
A1  - Schad, Daniel
A1  - Beck, Anne
A1  - Kuitunen-Paul, Sören
A1  - Sommer, Christian
A1  - Frank, Robin
A1  - Neu, Peter
A1  - Zimmermann, Ulrich S.
A1  - Rapp, Michael Armin
A1  - Smolka, Michael N.
A1  - Huys, Quentin J. M.
A1  - Schlagenhauf, Florian
A1  - Heinz, Andreas
T1  - When Habits Are Dangerous: Alcohol Expectancies and Habitual Decision Making Predict Relapse in Alcohol Dependence
JF  - Biological psychiatry : a journal of psychiatric neuroscience and therapeutics ; a publication of the Society of Biological Psychiatry
N2  - BACKGROUND: Addiction is supposedly characterized by a shift from goal-directed to habitual decision making, thus facilitating automatic drug intake. The two-step task allows distinguishing between these mechanisms by computationally modeling goal-directed and habitual behavior as model-based and model-free control. In addicted patients, decision making may also strongly depend upon drug-associated expectations. Therefore, we investigated model-based versus model-free decision making and its neural correlates as well as alcohol expectancies in alcohol-dependent patients and healthy controls and assessed treatment outcome in patients. METHODS: Ninety detoxified, medication-free, alcohol-dependent patients and 96 age- and gender-matched control subjects underwent functional magnetic resonance imaging during the two-step task. Alcohol expectancies were measured with the Alcohol Expectancy Questionnaire. Over a follow-up period of 48 weeks, 37 patients remained abstinent and 53 patients relapsed as indicated by the Alcohol Timeline Followback method. RESULTS: Patients who relapsed displayed reduced medial prefrontal cortex activation during model-based decision making. Furthermore, high alcohol expectancies were associated with low model-based control in relapsers, while the opposite was observed in abstainers and healthy control subjects. However, reduced model-based control per se was not associated with subsequent relapse.
CONCLUSIONS: These findings suggest that poor treatment outcome in alcohol dependence does not simply result from a shift from model-based to model-free control but is instead dependent on the interaction between high drug expectancies and low model-based decision making. Reduced model-based medial prefrontal cortex signatures in those who relapse point to a neural correlate of relapse risk. These observations suggest that therapeutic interventions should target subjective alcohol expectancies.
KW  - Alcohol dependence
KW  - Alcohol expectancy
KW  - Goal-directed control
KW  - Medial prefrontal cortex
KW  - Reinforcement learning
KW  - Treatment outcome
Y1  - 2017
U6  - https://doi.org/10.1016/j.biopsych.2017.04.019
SN  - 0006-3223
SN  - 1873-2402
VL  - 82
SP  - 847
EP  - 856
PB  - Elsevier
CY  - New York
ER  -

TY  - JOUR
A1  - Friedel, Eva
A1  - Schlagenhauf, Florian
A1  - Beck, Anne
A1  - Dolan, Raymond J.
A1  - Huys, Quentin J. M.
A1  - Rapp, Michael Armin
A1  - Heinz, Andreas
T1  - The effects of life stress and neural learning signals on fluid intelligence
JF  - European archives of psychiatry and clinical neuroscience : official organ of the German Society for Biological Psychiatry
N2  - Fluid intelligence (fluid IQ), defined as the capacity for rapid problem solving and behavioral adaptation, is known to be modulated by learning and experience. Both stressful life events (SLES) and neural correlates of learning [specifically, a key mediator of adaptive learning in the brain, namely the ventral striatal representation of prediction errors (PE)] have been shown to be associated with individual differences in fluid IQ. Here, we examine the interaction between adaptive learning signals (using a well-characterized probabilistic reversal learning task in combination with fMRI) and SLES on fluid IQ measures. We find that the correlation between ventral striatal BOLD PE and fluid IQ, which we have previously reported, is quantitatively modulated by the amount of reported SLES.
Thus, after experiencing adversity, basic neuronal learning signatures appear to align more closely with a general measure of flexible learning (fluid IQ), a finding complementing studies on the effects of acute stress on learning. The results suggest that an understanding of the neurobiological correlates of trait variables like fluid IQ needs to take socioemotional influences such as chronic stress into account.
KW  - Reinforcement learning
KW  - Prediction error signal
KW  - Ventral striatum
KW  - Stress
KW  - Intelligence
Y1  - 2015
U6  - https://doi.org/10.1007/s00406-014-0519-3
SN  - 0940-1334
SN  - 1433-8491
VL  - 265
IS  - 1
SP  - 35
EP  - 43
PB  - Springer
CY  - Heidelberg
ER  -

TY  - JOUR
A1  - Sebold, Miriam Hannah
A1  - Deserno, Lorenz
A1  - Nebe, Stephan
A1  - Schad, Daniel
A1  - Garbusow, Maria
A1  - Haegele, Claudia
A1  - Keller, Juergen
A1  - Juenger, Elisabeth
A1  - Kathmann, Norbert
A1  - Smolka, Michael N.
A1  - Rapp, Michael Armin
A1  - Schlagenhauf, Florian
A1  - Heinz, Andreas
A1  - Huys, Quentin J. M.
T1  - Model-based and model-free decisions in alcohol dependence
JF  - Neuropsychobiology : international journal of experimental and clinical research in biological psychiatry, pharmacopsychiatry, Biological Psychology/Pharmacopsychology and Pharmacoelectroencephalography
N2  - Background: Human and animal work suggests a shift from goal-directed to habitual decision-making in addiction. However, the evidence for this in human alcohol dependence is as yet inconclusive. Methods: Twenty-six healthy controls and 26 recently detoxified alcohol-dependent patients underwent behavioral testing with a 2-step task designed to disentangle goal-directed and habitual response patterns. Results: Alcohol-dependent patients showed less evidence of goal-directed choices than healthy controls, particularly after losses. There was no difference in the strength of the habitual component. The group differences did not survive controlling for performance on the Digit Symbol Substitution Task.
Conclusion: Chronic alcohol use appears to selectively impair goal-directed function, rather than promoting habitual responding. It appears to do so particularly after nonrewards, and this may be mediated by the effects of alcohol on more general cognitive functions subserved by the prefrontal cortex.
KW  - Alcohol dependence
KW  - Decision-making
KW  - Reinforcement learning
KW  - Dopamine
KW  - Computational psychiatry
Y1  - 2014
U6  - https://doi.org/10.1159/000362840
SN  - 0302-282X
SN  - 1423-0224
VL  - 70
IS  - 2
SP  - 122
EP  - 131
PB  - Karger
CY  - Basel
ER  -