Beyond Jailbreak: Unveiling Risks in LLM Applications Arising from Blurred Capability Boundaries

Yunyi Zhang

Network and Distributed System Security (NDSS) Symposium 2026 · Day 1 · Web Security

This talk presents a comprehensive evaluation of security risks in **LLM-based applications** that arise not from traditional jailbreaking but from poorly defined **capability boundaries**. While most security research focuses on bypassing LLM safety guardrails, this work examines the gap between what application developers intend their apps to do and what the underlying models actually allow. The researchers analyzed **800,000 applications** across four major platforms, collected **10,000 public application prompts**, and identified three distinct risk categories: **capability downgrade** (degrading expected performance), **capability upgrade** (expanding beyond intended scope), and **capability jailbreak** (bypassing all constraints).
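As a rough illustration of how the three categories relate, the sketch below sorts a single observed interaction into one of them. It is not the researchers' methodology: the `Observation` fields, the `classify` function, and the precedence order (jailbreak, then upgrade, then downgrade) are all assumptions made for this example.

```python
from dataclasses import dataclass
from enum import Enum


class Risk(Enum):
    NONE = "none"
    DOWNGRADE = "capability downgrade"
    UPGRADE = "capability upgrade"
    JAILBREAK = "capability jailbreak"


@dataclass(frozen=True)
class Observation:
    """Hypothetical labels an evaluator might assign to one interaction."""
    intended_task_ok: bool         # app still performs its intended job as expected
    did_out_of_scope_task: bool    # app performed a task outside its intended scope
    ignored_all_constraints: bool  # app dropped every constraint it was given


def classify(obs: Observation) -> Risk:
    # Assumed precedence: the most severe category wins.
    if obs.ignored_all_constraints:
        return Risk.JAILBREAK
    if obs.did_out_of_scope_task:
        return Risk.UPGRADE
    if not obs.intended_task_ok:
        return Risk.DOWNGRADE
    return Risk.NONE


# Example: a hypothetical recipe assistant that agrees to write code
# would count as a capability upgrade under these assumptions.
print(classify(Observation(True, True, False)).value)
```

Treating the categories as mutually exclusive keeps the sketch short; a real evaluation might instead record several risks for the same application.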

AI review

A large-scale empirical study of LLM application security that goes beyond jailbreaking to examine capability boundary violations across 800,000 applications. The taxonomy of downgrade/upgrade/jailbreak risks is useful, and the finding that half of the applications have zero functional constraints is a damning indictment of the ecosystem. The scale is impressive, but the technical depth on exploitation is shallow: this is measurement science, not attack research.
