Advanced Bypass Techniques and a Novel Detection Approach

Black Hat USA 2025 · Day 1 · Briefings

Static model scanners used to vet AI models from repositories like Hugging Face are fundamentally unable to catch malicious code embedded in model files, because exhaustively analyzing arbitrary code is an NP-hard problem. Itay Ravia of Aim Security demonstrates dozens of bypass techniques against every major static scanner and proposes dynamic tracing (running models in a sandbox and observing their system calls) as the only reliable detection method.
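To make the idea of observing behavior at load time concrete, here is a minimal Python sketch. It is not the tool from the talk and it is not a sandbox: it swaps system-call tracing for Python's in-process audit hooks (`sys.addaudithook`, Python 3.8+), which is a much weaker vantage point, and the payload still runs in the current process. The `WATCHED` set, the `trace_load` helper, and the benign payload (which only calls `os.getpid`) are assumptions made for illustration.

```python
import os
import pickle
import sys

# Audit events treated as noteworthy in this sketch (an illustrative choice).
WATCHED = {"pickle.find_class", "os.system", "subprocess.Popen", "socket.connect"}

_events = []        # events captured during the current load
_recording = False  # audit hooks cannot be removed, so gate recording with a flag


def _audit_hook(event, args):
    # Called by the interpreter for every audit event; keep only watched ones.
    if _recording and event in WATCHED:
        _events.append((event, args))


sys.addaudithook(_audit_hook)


def trace_load(data):
    """Unpickle data and return (result, audit events seen during the load)."""
    global _recording
    _events.clear()
    _recording = True
    try:
        result = pickle.loads(data)
    finally:
        _recording = False
    return result, list(_events)


# Benign hand-written protocol-0 pickle: GLOBAL os.getpid, MARK, TUPLE, REDUCE, STOP.
# Loading it calls os.getpid(), a harmless stand-in for an attacker's callable.
BENIGN_PAYLOAD = b"cos\ngetpid\n(tR."

result, events = trace_load(BENIGN_PAYLOAD)
print(result == os.getpid())
print(events)
```

The point of the sketch is that the import of `os.getpid` is recorded because it actually happened during the load, regardless of how the bytes were arranged to hide it. A real deployment along the lines the talk proposes would run the load in an isolated environment and watch system calls from outside the process.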

AI review

Ravia correctly diagnoses that static model scanners are structurally broken: they face an NP-hard problem, rely on deny-lists, and emulate opcodes only partially. The Pickle stack desynchronization demos are clean, reproducible bypasses that carry the argument without hand-waving. The dynamic sandbox proposal is the right answer.
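The deny-list weakness mentioned in the review can be illustrated with a deliberately naive static check. This is a sketch under stated assumptions, not the logic of any real scanner: `DENY_LIST` and `flagged_imports` are hypothetical, and the check reads only the `GLOBAL` opcode through the standard `pickletools.genops` disassembler. The two byte strings are hand-written protocol-0 pickles that are scanned but never loaded.

```python
import pickletools

# Hypothetical deny-list of (module, name) pairs; real scanners ship longer lists.
DENY_LIST = {("os", "system"), ("builtins", "eval"), ("builtins", "exec")}


def flagged_imports(data):
    """Return deny-listed imports found in GLOBAL opcodes, without loading the pickle."""
    hits = []
    for opcode, arg, _pos in pickletools.genops(data):
        # GLOBAL carries its argument as "module name" separated by one space.
        if opcode.name == "GLOBAL":
            module, name = arg.split(" ", 1)
            if (module, name) in DENY_LIST:
                hits.append((module, name))
    return hits


# Imports os.system and would call it with "echo hi" if loaded (it is not loaded here).
LISTED = b"cos\nsystem\n(S'echo hi'\ntR."
# Imports os.getpid, which is not on the list, so the scan reports nothing.
UNLISTED = b"cos\ngetpid\n(tR."

print(flagged_imports(LISTED))
print(flagged_imports(UNLISTED))
```

Two limits are visible even at this scale: any callable missing from the list passes unnoticed, and opcodes the check does not model (for example `STACK_GLOBAL`, which takes its module and name from the stack) are skipped entirely.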
