Neural Reinforcement Learning

The general area of neural reinforcement learning encompasses a wide variety of questions in and around affectively charged decision-making in humans and other animals.

One general area of current interest is learning over the long run: how animals come to understand the demands of a task over multiple sessions (sometimes including shaping trials, in which they are typically asked to solve simplified versions of the task), and how they build representations that support accurate predictors of reward and generalize to find good actions. We are taking various theoretical and experimental approaches to study this, notably including analyzing the huge body of behavioural data available to us by virtue of our membership in the International Brain Lab (IBL; a consortium of 10 theoretical and 11 experimental labs around the world devoted to studying a single decision-making task in mice). We are also continuing our long-standing examination of long-run learning of a complex spatial alternation task (Kastner et al., 2022), and studying how human subjects learn a complex serial reaction time task over thousands of trials (Éltető, Nemeth, et al., 2023). Our methods include sophisticated non-parametric Bayesian models.
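
To give a flavour of the non-parametric Bayesian machinery involved, the following is a minimal Python sketch of a truncated stick-breaking construction of a Dirichlet process prior – a basic building block of such models – and not the actual models used in the studies above. The function name stick_breaking_weights and the parameters alpha and truncation are illustrative assumptions.

```python
# Minimal sketch (not the cited papers' models): a truncated stick-breaking
# construction of a Dirichlet process prior. For illustration, component k
# could stand for a distinct behavioural strategy; the prior lets the
# effective number of strategies grow with the amount of data.
import numpy as np

def stick_breaking_weights(alpha: float, truncation: int, seed=None) -> np.ndarray:
    """Draw mixture weights from a DP(alpha) prior, truncated at `truncation` components."""
    rng = np.random.default_rng(seed)
    betas = rng.beta(1.0, alpha, size=truncation)                 # stick-breaking proportions
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - betas[:-1])))
    return betas * remaining                                      # weights sum to (almost) 1

# With small alpha, a few components dominate; larger alpha spreads the mass more widely.
weights = stick_breaking_weights(alpha=2.0, truncation=20, seed=0)
print(weights.round(3), weights.sum().round(3))
```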

The IBL is not only the source of a huge trove of millions of behavioural choices from hundreds of mice; in 2023, we also submitted the main fruit of the first round of data collection – a brain-wide map of decision-making (International Brain Lab et al., 2023) – for which I was one of the lead authors.

A second area, encompassing various projects, is metacognition and metacontrol. Along with an information-theoretic measure of metacognitive efficiency (Dayan, 2023) and a detailed analysis of the use of a popular process model of metacognition to determine search (Schulz et al., 2023), we have examined the costs of cognition (Master et al., 2023), which help determine the goals for metacontrol, and also structural ways of improving the (internal) evaluation of decision trees (Éltető & Dayan, 2023). We are currently developing a method for measuring metacognitive efficiency in RL tasks (such as restless bandits), which violate the assumptions of common metacognitive measures (Ershadmanesh et al., 2023).
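
As a hedged illustration of why such tasks strain standard metacognitive analyses – which typically assume a fixed mapping from stimuli to correct responses – here is a minimal Python sketch of a two-armed restless bandit in which the reward probabilities drift, so the "correct" option itself changes over time; it is not the method under development. The agent and the parameters drift_sd, lr and inverse_temp are illustrative assumptions.

```python
# Minimal sketch: a restless two-armed bandit with drifting reward probabilities.
# An agent tracks values with a delta rule and reports a confidence-like quantity
# (its softmax choice probability); "correctness" must be defined relative to the
# currently better arm, which changes across trials.
import numpy as np

rng = np.random.default_rng(1)
n_trials, drift_sd, lr, inverse_temp = 200, 0.05, 0.3, 5.0
p_reward = np.array([0.6, 0.4])      # drifting reward probabilities
q = np.zeros(2)                      # learned action values

records = []
for t in range(n_trials):
    p_choice = np.exp(inverse_temp * q) / np.exp(inverse_temp * q).sum()
    choice = rng.choice(2, p=p_choice)
    reward = float(rng.random() < p_reward[choice])
    q[choice] += lr * (reward - q[choice])                 # delta-rule update
    correct = int(choice == np.argmax(p_reward))           # "correct" relative to the current best arm
    records.append((correct, p_choice[choice]))            # accuracy and confidence proxy
    p_reward = np.clip(p_reward + rng.normal(0, drift_sd, 2), 0.05, 0.95)

accuracy = np.mean([r[0] for r in records])
mean_conf = np.mean([r[1] for r in records])
print(f"accuracy={accuracy:.2f}, mean confidence={mean_conf:.2f}")
```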

A third general area of interest concerns game-theoretic interactions amongst intentional agents, which we have also long studied from the perspective of the computational psychiatry of personality disorders. Here, we are developing theoretical methods, and are also using them to study aspects of metacognition in deception. We have worked on a simple example of a cognitive hierarchy (Schulz, Alon et al., 2023), with extensions in a variety of directions (including to computational psychiatry, as above).
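
As a hedged sketch of the basic level-k logic underlying cognitive hierarchies – and not the model of Schulz, Alon et al. (2023) – the following Python example computes iterated best responses in a placeholder 2x2 game: a level-0 player mixes uniformly, and each level-k player best-responds to a level-(k-1) opponent.

```python
# Minimal level-k sketch on a generic 2x2 game; the payoff matrices are placeholders.
import numpy as np

# row_payoffs[a, b] = payoff to the row player when row plays a and column plays b
row_payoffs = np.array([[1.0, -1.0],
                        [-1.0, 1.0]])          # matching-pennies-style placeholder
col_payoffs = -row_payoffs.T                    # column player's payoffs, indexed [col action, row action]

def best_response(payoffs: np.ndarray, opponent_mix: np.ndarray) -> np.ndarray:
    """Pure best response (as a one-hot mixture) to an opponent's mixed strategy."""
    expected = payoffs @ opponent_mix
    reply = np.zeros_like(expected)
    reply[np.argmax(expected)] = 1.0
    return reply

def level_k(k: int, payoffs_self: np.ndarray, payoffs_other: np.ndarray) -> np.ndarray:
    """Strategy of a level-k player: level 0 is uniform, level k best-responds to level k-1."""
    if k == 0:
        return np.full(payoffs_self.shape[0], 1.0 / payoffs_self.shape[0])
    opponent = level_k(k - 1, payoffs_other, payoffs_self)
    return best_response(payoffs_self, opponent)

for k in range(4):
    print(f"level-{k} row player:", level_k(k, row_payoffs, col_payoffs))
```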

Our new model of the aesthetic value of images (Brielmann & Dayan, 2022) gained some evidentiary support (Brielmann, Berentelg & Dayan, 2023) and was extended to boredom (Brielmann & Dayan, 2023) and, in a related study, to complexity (Nath et al., 2023). We extended our work on off-line replay (Antonov et al., 2022) to allow for exploration (Antonov et al., 2023); exploration was Sutton's original computational rationale for replay.
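
For concreteness, here is a hedged Python sketch of the classic idea alluded to in that last sentence – Sutton's Dyna architecture, in which off-line replay of modelled transitions carries an exploration bonus for long-untried state-action pairs (the Dyna-Q+ variant) – rather than a reimplementation of Antonov et al. (2023). The chain environment and all parameter values are illustrative assumptions.

```python
# Minimal Dyna-Q sketch with a Dyna-Q+ style exploration bonus: real transitions
# update the values and a learned model, and replayed model transitions add a
# bonus for state-action pairs that have not been tried recently, so replay
# itself drives exploration.
import numpy as np

rng = np.random.default_rng(0)
n_states, n_actions = 6, 2                    # chain: action 1 moves right, action 0 moves left
gamma, lr, epsilon, kappa, n_replay = 0.95, 0.5, 0.1, 1e-3, 20

Q = np.zeros((n_states, n_actions))
model = {}                                    # (s, a) -> (reward, next_state)
last_tried = np.zeros((n_states, n_actions))  # steps since each pair was last tried

def step(s, a):
    """Chain environment: reward 1 for pressing right at the last state, else 0."""
    s_next = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
    reward = 1.0 if (s == n_states - 1 and a == 1) else 0.0
    return reward, s_next

s = 0
for t in range(2000):
    a = rng.integers(n_actions) if rng.random() < epsilon else int(np.argmax(Q[s]))
    r, s_next = step(s, a)
    Q[s, a] += lr * (r + gamma * Q[s_next].max() - Q[s, a])      # direct RL update
    model[(s, a)] = (r, s_next)                                   # model learning
    last_tried += 1
    last_tried[s, a] = 0
    for _ in range(n_replay):                                     # off-line replay from the model
        sr, ar = list(model)[rng.integers(len(model))]
        rr, snr = model[(sr, ar)]
        bonus = kappa * np.sqrt(last_tried[sr, ar])               # exploration bonus for stale pairs
        Q[sr, ar] += lr * (rr + bonus + gamma * Q[snr].max() - Q[sr, ar])
    s = s_next

print(np.argmax(Q, axis=1))   # learned greedy policy along the chain
```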
