From Prompts to Pwns: Exploiting and Securing AI Agents

Black Hat USA 2025 · Day 1 · Briefings

NVIDIA's AI Red Team demonstrated live prompt injection attacks against Microsoft Copilot, PandasAI (CVE disclosed), and Cursor IDE — including exploits that achieved remote code execution via a GitHub issue, a malicious pip package in a PR, and ASCII-smuggled instructions hidden in crowdsourced rule files. Their defensive framework reframes agent security using two new axioms: "least autonomy is the new least privilege" and "assume prompt injection is the new assume breach." ---

AI review

NVIDIA's red team does competent prompt injection demos against Copilot, PandasAI, and Cursor, adds a clean autonomy taxonomy and a workable defensive framework. Solid tradecraft, nothing you couldn't have assembled from existing public research, but the ASCII smuggling via Unicode tag characters is a technique worth knowing and the 'least autonomy is the new least privilege' framing is genuinely quotable.

Watch on YouTube