Exploring ChatGPT's Capabilities on Vulnerability Management
Peiyu Liu, Junming Liu, Lirong Fu, Kangjie Lu, Yifan Xia, Xuhong Zhang, Wenzhi Chen, Haiqin Weng, Shouling Ji, Wenhai Wang
33rd USENIX Security Symposium · Day 1 · USENIX Security '24
This talk presents a comprehensive evaluation of **ChatGPT**'s capabilities across the entire **vulnerability management lifecycle**. Amid growing interest in applying **Large Language Models (LLMs)** to code-related analysis, the authors evaluate how ChatGPT performs on six distinct vulnerability management tasks. Unlike prior work, which often focused on isolated aspects of software engineering, this research provides the first large-scale, holistic assessment of an LLM's utility across the lifecycle, from initial bug reporting to final patch commit.
AI review
This research delivers a much-needed, rigorously executed evaluation of ChatGPT's actual capabilities across the entire vulnerability management lifecycle. It systematically dissects where LLMs excel and where they fail, offering critical insights into the impact of prompt engineering and practical considerations for integrating these tools into security workflows. This isn't just another "AI-powered" fluff piece; it's a substantive, data-driven assessment that cuts through the hype.