Temporal Difference Flows

Jesse Farebrother, Matteo Pirotta, Andrea Tirinzoni, Rémi Munos, Alessandro Lazaric, Ahmed Touati

International Conference on Machine Learning 2025 · Oral

This article covers "Temporal Difference Flows," presented at ICML 2025 by Jesse Farebrother, who conducted the research during an internship at Meta alongside collaborators Matteo Pirotta, Andrea Tirinzoni, Rémi Munos, Alessandro Lazaric, and Ahmed Touati. The talk introduces a novel approach to learning **geometrically discounted horizon models**, often referred to as **gamma models** and closely related to the **successor representation** in reinforcement learning. These models predict future states directly, discounted by their temporal distance, offering a compelling alternative to traditional one-step world models, whose long-term predictions degrade through compounding errors.
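The geometric discounting behind gamma models can be made concrete with a small sketch: a state drawn from the discounted occupancy measure is obtained by stepping the environment and stopping with probability 1 − gamma after each step. The function name below is hypothetical and this is an illustration of the concept, not the authors' method.

```python
import random

def sample_discounted_future_state(step, s0, gamma=0.95, rng=random):
    """Sample a state from the geometrically discounted occupancy measure:
    always take one step, then keep stepping with probability gamma.
    The horizon T of the returned state follows Geometric(1 - gamma),
    so E[T] = 1 / (1 - gamma)."""
    s = step(s0)                 # at least one environment step
    while rng.random() < gamma:  # continue with probability gamma
        s = step(s)
    return s

# Toy deterministic chain: the state is an integer, each step adds 1,
# so the sampled state equals the sampled horizon T.
rng = random.Random(0)
samples = [sample_discounted_future_state(lambda s: s + 1, 0, gamma=0.9, rng=rng)
           for _ in range(10_000)]
mean_horizon = sum(samples) / len(samples)  # should be near 1/(1-0.9) = 10
```

On the toy chain the empirical mean horizon concentrates around 1/(1 − gamma), which is exactly the "temporal distance" weighting a gamma model must capture.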

AI review

Temporal Difference Flows is a technically coherent contribution that applies flow matching to the longstanding problem of learning successor measures (geometrically discounted state distributions) in model-based RL. The core insight — that coupling the noise variable X0 across the bootstrapped target generation and the conditional flow matching objective reduces gradient variance — is clean and plausible. The reported empirical gains are striking if they hold up. My reservations are about depth: the theoretical characterization of variance (scaling as gamma^2) is suggestive but not fully…