PrivORL: Differentially Private Synthetic Dataset for Offline Reinforcement Learning

Chen GONG

Network and Distributed System Security (NDSS) Symposium 2026 · Day 3 · Privacy & Measurement

PrivORL is the first framework for generating **differentially private synthetic datasets** for **offline reinforcement learning (RL)**. In domains where RL training data contains sensitive information (medical decision-making, autonomous driving), directly sharing datasets poses privacy risks. PrivORL addresses this by training a **diffusion model-based synthesizer** with **DP-SGD** to generate synthetic trajectory and transition data that preserves utility while providing formal privacy guarantees. A novel **curiosity-driven module** improves data diversity by 60%, and the framework handles both **transition-level** and **trajectory-level** privacy protection units.
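To make the DP-SGD ingredient concrete, here is a minimal NumPy sketch of a single update step: each privacy unit's gradient is clipped, the clipped gradients are summed, Gaussian noise is added, and the result is averaged. It is a generic illustration of DP-SGD, not PrivORL's implementation. The function name `dp_sgd_step` and all of its parameters are hypothetical, and the diffusion model and the curiosity-driven module are omitted entirely. It assumes one row of `per_unit_grads` per privacy unit: a single transition under transition-level protection, or the aggregated gradient of a whole trajectory under trajectory-level protection.

```python
import numpy as np

def dp_sgd_step(params, per_unit_grads, clip_norm, noise_multiplier, lr, rng):
    """One generic DP-SGD update (illustrative; not the PrivORL code).

    per_unit_grads has shape (batch_size, num_params), one row per privacy unit.
    """
    # L2 norm of each unit's gradient.
    norms = np.linalg.norm(per_unit_grads, axis=1, keepdims=True)
    # Scale rows down so that no unit contributes more than clip_norm.
    scale = np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    clipped = per_unit_grads * scale
    # Gaussian noise calibrated to the clipping bound.
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=params.shape)
    # Noisy average gradient over the batch.
    noisy_mean = (clipped.sum(axis=0) + noise) / per_unit_grads.shape[0]
    return params - lr * noisy_mean

# Usage: two units with two parameters each; the first gradient exceeds the bound.
rng = np.random.default_rng(0)
grads = np.array([[3.0, 4.0], [0.0, 0.0]])
new_params = dp_sgd_step(np.zeros(2), grads, clip_norm=1.0,
                         noise_multiplier=1.1, lr=0.1, rng=rng)
```

Clipping bounds how much any single unit can move the parameters, which is why the choice of unit (transition or trajectory) determines what the privacy guarantee actually covers.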

AI review

The first DP synthetic data method for offline RL, addressing a real problem in sharing medical and autonomous-driving data. The curiosity-driven module is the most interesting contribution, improving performance by 60%. However, this is fundamentally a machine learning and privacy paper with no security attacks, defenses, or exploitation; its security relevance is limited to enabling private data sharing for RL training.
