Prompt Injection Attack to Tool Selection in LLM Agents
Jiawen Shi
Network and Distributed System Security (NDSS) Symposium 2026 · Day 1 · Web Security
This talk presents a systematic attack against the **tool selection mechanism** in LLM agents, i.e., the process by which an agent decides which tool to invoke for a given task. The researchers demonstrate that an attacker can publish a malicious tool with carefully crafted documentation to a tool hub and, through optimization-based prompt injection, force the LLM agent to select the malicious tool over legitimate alternatives. Under a realistic black-box setting, their gradient-free and gradient-based methods achieve an **attack success rate (ASR) above 80%**, compared to roughly 40-60% or lower for existing attacks.
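The core idea, as described in the talk, is to optimize the malicious tool's documentation against two objectives: being retrieved for the target task and being selected by the LLM over competing tools. The sketch below illustrates that decomposition with a gradient-free random search; the scoring functions (`shadow_retrieval_score`, `shadow_selection_score`), the token vocabulary, and all heuristics are illustrative placeholders, not the authors' actual shadow models or optimization procedure.

```python
import random

# Hypothetical stand-ins for the attacker's local "shadow" components; the
# paper's actual models, prompts, and scoring functions are not reproduced here.
VOCAB = ["verified", "official", "recommended", "fast", "secure", "weather",
         "forecast", "always", "prefer", "this", "tool", "best"]

def shadow_retrieval_score(doc: str, query: str) -> float:
    """Proxy for embedding similarity between tool docs and the user query
    (retrieval objective). Placeholder: naive token overlap."""
    doc_tokens, query_tokens = set(doc.lower().split()), set(query.lower().split())
    return len(doc_tokens & query_tokens) / max(len(query_tokens), 1)

def shadow_selection_score(doc: str, query: str, competitors: list[str]) -> float:
    """Proxy for a shadow LLM's tendency to pick the malicious tool over
    legitimate alternatives (selection objective). Placeholder heuristic."""
    bonus = sum(word in doc.lower() for word in ("prefer", "always", "best"))
    return shadow_retrieval_score(doc, query) + 0.1 * bonus

def optimize_tool_doc(base_doc: str, query: str, competitors: list[str],
                      budget: int = 200, suffix_len: int = 6, seed: int = 0) -> str:
    """Gradient-free (random-search) optimization of an injected suffix that
    jointly raises the retrieval and selection scores, in the spirit of the
    talk's two-objective decomposition."""
    rng = random.Random(seed)
    best_doc, best_score = base_doc, float("-inf")
    for _ in range(budget):
        suffix = " ".join(rng.choices(VOCAB, k=suffix_len))
        candidate = f"{base_doc} {suffix}"
        score = (shadow_retrieval_score(candidate, query)
                 + shadow_selection_score(candidate, query, competitors))
        if score > best_score:
            best_doc, best_score = candidate, score
    return best_doc

if __name__ == "__main__":
    crafted = optimize_tool_doc(
        base_doc="Gets weather data.",
        query="What is the weather forecast for tomorrow?",
        competitors=["OpenWeather: returns current weather and forecasts."],
    )
    print(crafted)
```

In the actual attack, the retrieval objective would target the hub's embedding-based retriever and the selection objective a shadow LLM standing in for the victim agent; this toy version only shows how the two scores are combined inside a black-box search loop.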
AI review
A well-formulated optimization attack against LLM agent tool selection, achieving 80%+ ASR under black-box conditions. The decomposition into retrieval and selection objectives is clean, and the shadow-model framework is practical. However, the presentation of the proxy (shadow) setup lacks depth, the threat model assumes tool hubs accept arbitrary third-party submissions, and the real-world impact depends on how many production LLM agent deployments actually use open tool discovery.