What to optimize for – from robot arms to frontier AI - Anca Dragan

Anca Dragan

International Conference on Machine Learning 2025 · Invited Talk

In this keynote at ICML 2025, Anca Dragan, who co-leads post-training for Gemini at Google DeepMind, addresses one of the most fundamental and persistent challenges in AI: understanding and optimizing for what humans *truly* want. Drawing on over a decade of research spanning robotics and frontier large language models (LLMs), Dragan argues that the problem is not just specifying a reward function but inferring and aligning with the *intended* reward, which often cannot be observed directly and must be recovered from human signals that are noisy and subject to bias and irrationality.

AI review

Dragan's ICML 2025 keynote is a well-structured, intellectually honest tour of the reward misspecification problem across a decade of her research, from cooperative inverse reinforcement learning (IRL) in robotics to reward hacking under RLHF in frontier LLMs. The talk is strongest as a problem taxonomy: it identifies systematic bias in human feedback, coverage gaps, and LLM judge fragility as distinct failure modes, and it situates them within a coherent Bayesian IRL framing. What it is not, and does not claim to be, is a technical contribution: there are no new theorems, no new algorithms, no definitive empirical results…