Claude: Climbing a CTF Scoreboard Near You
Keane Lucas
DEF CON 33 · Day 3 · Main Stage
Keane Lucas from Anthropic's Frontier Red Team presented a detailed experimental study of Claude's performance on Capture the Flag (CTF) competitions across a broad range of security categories. The r
AI review
An Anthropic Frontier Red Team researcher systematically evaluates Claude's performance across CTF challenge categories, documenting solve rates by difficulty and category and analyzing failure modes. The talk demonstrates that prompting strategy (agentic loops, uncertainty handling, step decomposition) significantly affects performance, and frames the results as an early warning about the trajectory of AI-assisted offensive capability.