Univ Online Talks: Patrick Rebeschini

Professor Patrick Rebeschini
This talk introduces the multi-armed bandit problem – one of the foundational challenges in reinforcement learning.
Named after the dilemma of choosing between different slot machines (one-armed bandits), this framework goes far beyond gambling. Today, it plays a central role in areas such as online advertising, design optimisation, and dynamic pricing. Bandit algorithms are also crucial for improving user feedback systems and fine-tuning large language models such as ChatGPT.
At its core, the bandit problem is about decision-making: how do we balance exploration (trying new options) with exploitation (choosing what has worked best thus far)?
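To make that trade-off concrete, here is a minimal sketch (not taken from the talk) of an epsilon-greedy strategy, one of the simplest bandit algorithms: most of the time it picks the machine with the best average reward seen so far, and occasionally it tries a random one to keep learning. The win rates and parameter values below are illustrative assumptions only.

```python
import random

def epsilon_greedy(true_means, n_rounds=1000, epsilon=0.1, seed=0):
    """Illustrative epsilon-greedy bandit: true_means are the hidden win
    probabilities of each slot machine (arm)."""
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms        # number of pulls per arm
    estimates = [0.0] * n_arms   # running average reward per arm

    total_reward = 0.0
    for _ in range(n_rounds):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)                           # explore: random arm
        else:
            arm = max(range(n_arms), key=lambda a: estimates[a])  # exploit: best arm so far
        reward = 1.0 if rng.random() < true_means[arm] else 0.0   # Bernoulli payout
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm] # incremental mean update
        total_reward += reward
    return estimates, total_reward

# Example: three machines with hidden win rates 0.2, 0.5 and 0.7.
# As rounds accumulate, the estimates concentrate around the true values
# and most pulls shift towards the best machine.
est, reward = epsilon_greedy([0.2, 0.5, 0.7])
print(est, reward)
```

The key design choice is the value of epsilon: too small and the algorithm may lock onto a mediocre machine, too large and it wastes pulls on options it already knows are worse.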
In this talk, Professor Patrick Rebeschini, Professor of Statistics and Machine Learning at the University of Oxford and Tutorial Fellow at Univ, offers an accessible and engaging introduction to the statistical thinking that underpins this powerful concept, and why it matters in today’s digital world.
We hope that you enjoy the talk.
If you would like to hear more about AI, we are hosting the Univ Seminar: Creativity and AI on Thursday 3 July 2025.
Published: 23 June 2025