Variable feedback strength impairs learning, and that explains range adaptation of dopaminergic activity

  • Date: 26.08.2019
  • Time: 11:15 - 12:15
  • Speaker: Moritz Möller
  • University of Oxford
  • Location: Max-Planck-Ring 8
  • Room: 203
  • Host: Peter Dayan

Reinforcement learning algorithms explain many aspects of dopaminergic responses to rewarding stimuli, which we now interpret as reward prediction errors. However, one well-documented phenomenon, the adaptation of those responses to the range of experienced rewards, remains enigmatic. Assuming that this adaptation serves a purpose, we investigated its impact on trial-and-error learning. We found that range adaptation impairs reward prediction but enables effective policy optimisation from feedback of variable strength. Our result suggests that the brain might rely on range adaptation to compensate for changes in feedback scale. This hypothesis is easy to test experimentally and, if true, implies that improving action selection takes priority over accurate reward prediction.
