James Fox

フォクス・ジェームズ

LISA / University of Oxford

James Fox is Research Director of LISA, where he oversees research prioritisation and strategy. He is also awaiting examination of his Computer Science PhD on technical AI safety, supervised by Tom Everitt, Michael Wooldridge, and Alessandro Abate. His doctoral research focused on game theory, causality, reinforcement learning, and agent foundations.

Matt MacDermott

マクデルモット・マット

Imperial College London / Causal Incentives Group

Matt MacDermott is a PhD student at Imperial College London, supervised by Francesco Belardinelli, and a member of the Causal Incentives Group, where he works with Tom Everitt of Google DeepMind.

Towards Causal Foundations of Safe AGI

Friday, April 5th, 15:00–15:50

Causality has captivated philosophers for centuries, not only because the exact relationship between a cause and its effect is hard to pin down, but also because it underpins so many other concepts of interest. In this talk, we will start by exploring how causality is linked to several fundamental issues in AI safety, including incentives, misspecification, generalisation, deception, and corrigibility. Next, we will posit that agency, which is central to many AGI threat models, is itself a causal concept. Finally, we will connect these theoretical foundations to practical applications by showcasing our approaches to developing ‘agency detectors’.