Reinforcement Unlearning

Dayong Ye

Network and Distributed System Security (NDSS) Symposium 2025 · Day 3 · Machine Unlearning

As machine learning systems become more widely deployed, the ability of a trained model to "forget" specific information has become important: privacy regulations can require it, deployed models need to adapt, and errors need to be corrected. This talk, "Reinforcement Unlearning," presented by Dayong Ye from the University of Technology Sydney, brings **Machine Unlearning** to **Reinforcement Learning (RL)**. It addresses how an RL agent can selectively remove specific training experiences and the knowledge derived from them, so that it behaves as if those experiences never occurred, without degrading its performance in the other environments it has learned.

AI review

Legitimate academic research on a real problem: RL unlearning is underexplored, and the dual-method framing (policy fine-tuning vs. environment poisoning) is a coherent contribution. But this is a conference paper presentation, not a security talk, and the threat modeling is thin: the 'defensive implications' section reads like grant-writing padding rather than a rigorous analysis of adversarial scenarios.
