ThinkTrap: Denial-of-Service Attacks against Black-box LLM Services via Infinite Thinking

Yunzhe Li

Network and Distributed System Security (NDSS) Symposium 2026 · Day 1 · AI Security

This talk introduces **ThinkTrap**, a novel denial-of-service (DoS) attack against cloud-hosted large language model services that exploits the fundamental autoregressive nature of LLM inference. By crafting optimized prompts that force models into extremely long output generation, an attacker operating within normal rate limits can monopolize GPU resources and drive system throughput to near zero. The attack is **black-box** (no access to model internals), **query-efficient** (requiring approximately 10,000 optimization queries), and works across **eight public LLM services** including DeepSeek and GPT variants.
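The cost asymmetry behind the attack can be illustrated with a back-of-the-envelope calculation. All numbers below are hypothetical placeholders, not figures from the talk; they only show why short prompts that trigger maximum-length generation can hold GPU decode slots far longer than a rate limiter anticipates.

```python
# Hypothetical numbers for illustration only; not measurements from the talk.
prompt_tokens = 50          # attacker's short, optimized prompt
max_output_tokens = 32_768  # assumed maximum generation length of the service
decode_rate = 50            # assumed decode speed in tokens/second per request

# Each malicious request occupies a decode slot for this long:
seconds_per_request = max_output_tokens / decode_rate  # hundreds of seconds

# With a rate limit of 60 requests/minute (1 request/second), Little's law
# gives the steady-state number of decode slots the attacker holds:
arrival_rate = 60 / 60  # requests per second
concurrent_slots_held = arrival_rate * seconds_per_request
print(f"{seconds_per_request:.0f} s per request, "
      f"~{concurrent_slots_held:.0f} slots held concurrently")
```

Even a single client staying well inside its rate limit can therefore keep hundreds of concurrent generations in flight, which is the mechanism by which throughput for other users collapses.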

AI review

A clean demonstration that LLM services can be DoS'd by forcing maximum-length outputs through optimized prompts, all within normal rate limits. The gradient-free optimization in a 200-dimensional projected space is an interesting technique, and the attack is genuinely practical. However, the concept is relatively straightforward -- making a model talk too much -- and the presentation quality was notably poor, making the technical details hard to follow.
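The talk does not spell out the optimizer, but the idea of gradient-free search in a low-dimensional projected space can be sketched as follows. Everything here is illustrative: the random-projection construction, the hill-climbing loop, and the toy objective are assumptions standing in for the paper's actual black-box objective (which would score how long the model's output is).

```python
import numpy as np

def projected_random_search(objective, full_dim, proj_dim=200,
                            iters=300, step=0.1, seed=0):
    """Gradient-free hill climbing in a low-dimensional projected space.

    Rather than searching the full `full_dim`-dimensional input space,
    optimize a `proj_dim`-dimensional vector z and lift it through a fixed
    random projection P, querying the black-box `objective` only on P @ z.
    """
    rng = np.random.default_rng(seed)
    P = rng.standard_normal((full_dim, proj_dim)) / np.sqrt(proj_dim)
    z = np.zeros(proj_dim)
    best = objective(P @ z)
    for _ in range(iters):
        cand = z + step * rng.standard_normal(proj_dim)
        score = objective(P @ cand)
        if score > best:  # keep the perturbation only if it improves the score
            z, best = cand, score
    return P @ z, best

# Toy stand-in for the black-box signal (e.g. observed output length):
# any scalar function of the high-dimensional input works for the sketch.
target = np.ones(10_000)
f = lambda x: -float(np.linalg.norm(x - target))

x_opt, score = projected_random_search(f, full_dim=10_000)
```

Searching in a 200-dimensional subspace keeps the query budget manageable: each iteration costs one black-box query regardless of the full input dimension, which is consistent with the roughly 10,000-query budget mentioned in the abstract.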

Watch on YouTube