DLBox: New Model Training Framework for Protecting Training Data

Jaewon Hur

Network and Distributed System Security (NDSS) Symposium 2025 · Day 2 · ML Security

The growth of artificial intelligence, particularly deep learning, has increased the demand for large datasets to train sophisticated models. A significant hurdle in this ecosystem is the secure sharing of sensitive training data between data owners and AI developers. This talk, presented by Jaewon Hur from Georgia Tech, introduces **DLBox**, a model training framework designed to protect proprietary and sensitive training data from unauthorized leakage during model development. DLBox addresses the challenge of keeping the data useful for model training while enforcing strict controls that prevent its misuse or exfiltration.

AI review

Solid systems security research that attacks a real and underappreciated problem: the trust gap between data owners and ML developers. The DGM rule formalization is a clean conceptual contribution, and the AMD SEV-SNP + proxy variable architecture is a credible engineering answer, not a hand-wavy 'just use TEEs' non-solution. The 4% overhead claim, if it holds up under scrutiny, makes this actually deployable rather than just academically interesting.
