Exploring the AI Automation Boundary for Threat Hunting at Datadog

Arthi Nagarajan

[un]prompted 2026 — AI Security Practitioner Conference · Day 2

Datadog's threat hunting team spent six to nine months discovering exactly where AI can and cannot help in a real-world security operations environment. Their Hunting Copilot evolved through multiple architectures — from a naive single agent to a multi-agent orchestrator-expert framework — achieving a 41% reduction in total hunt time during A/B testing. The central lesson: schema field discovery from live data, not documentation, is the key to semantic accuracy, and human experts must remain in the driver's seat.
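The "schema from live data" idea can be sketched in a few lines: instead of trusting documentation, sample recent events and keep only the field paths the data actually contains, so a query-writing agent never references a field that does not exist. This is a minimal illustrative sketch, not Datadog's implementation; all function and field names here are hypothetical.

```python
def flatten(obj, prefix=""):
    """Yield dotted field paths (e.g. 'network.client.ip') from a nested event dict."""
    for key, value in obj.items():
        path = f"{prefix}{key}"
        if isinstance(value, dict):
            yield from flatten(value, prefix=f"{path}.")
        else:
            yield path

def discover_fields(events, min_coverage=0.1):
    """Return field paths observed in live events, with the fraction of
    events carrying each field, filtered by a coverage threshold."""
    counts = {}
    for event in events:
        for field in flatten(event):
            counts[field] = counts.get(field, 0) + 1
    total = len(events)
    return {f: c / total for f, c in counts.items() if c / total >= min_coverage}

# Hypothetical sample of live log events:
sample = [
    {"network": {"client": {"ip": "10.0.0.1"}}, "service": "api"},
    {"network": {"client": {"ip": "10.0.0.2"}}, "service": "api", "usr": {"id": "a"}},
]
print(discover_fields(sample))
# {'network.client.ip': 1.0, 'service': 1.0, 'usr.id': 0.5}
```

Grounding field names in observed events is what prevents the hallucinated-field-name problem the review below calls out: a field that never appears in the sample simply never reaches the agent's vocabulary.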

AI review

Six to nine months of honest operational failure, clearly reported. The 41% reduction in hunt time isn't a benchmark number; it's the measured result of an A/B test on a real hunt week, and that's the only metric that matters. The 'schema from live data, not documentation' insight will save people months of debugging hallucinated field names.

Watch on YouTube