Trajectory-Aware Post-Training of Open-Weight Models for Security Agents
Aaron Brown, Madhur Prashant
[un]prompted 2026 — AI Security Practitioner Conference · Day 1 · 2
Frontier models score 80% on isolated cybersecurity tasks but 0% on multi-stage operations. At the conference, Aaron Brown of AWS released Open Trajectory Gym, an open-source training gymnasium that combines supervised fine-tuning on expert traces with online reinforcement learning to produce smaller, specialized security models that outperform their base counterparts on real penetration testing benchmarks.
AI review
Brown dropped an open-source training gymnasium at the conference, one hour before his talk, and walked through exactly how he got a 27B model from 12.5% to meaningfully better performance on multi-stage pentesting benchmarks in under a day. This is practitioners building the tools the field needs. The composition gap data is damning and the open release is generous.
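
One way to build intuition for the composition gap is a toy calculation: if an agent succeeds at each isolated step with some fixed probability, and a multi-stage operation needs every step to succeed, the end-to-end success rate falls off geometrically with chain length. The sketch below assumes independent steps and reuses the 80% figure as a per-step rate. Both are illustrative assumptions, not the methodology or data from the talk, and `chain_success` is a hypothetical helper.

```python
def chain_success(per_step: float, steps: int) -> float:
    """Probability that every step in a chain succeeds.

    Assumes each step succeeds independently with the same probability,
    which is a simplification made for illustration only.
    """
    return per_step ** steps


if __name__ == "__main__":
    # 0.8 mirrors the isolated-task score quoted in the abstract; treating it
    # as a per-step success rate is an assumption.
    for steps in (1, 5, 10, 20):
        print(f"{steps:>2} steps: {chain_success(0.8, steps):.1%}")
```

This toy model only shows that a high score on isolated tasks need not carry over to long chains. It is not a reconstruction of the benchmark results reported in the talk.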