Let LLM Learn: When Your Static Analyzer Actually Gets It
Black Hat USA 2025 · Day 1 · Briefings
Existing SAST tools like CodeQL deliberately over-restrict their rules to minimize false positives, inadvertently suppressing real vulnerabilities before any LLM ever sees them. This research proposes inverting that design: use large language models during the *development* phase to relax and optimize SAST rules, achieving roughly three times the recall while maintaining near-100% precision, then apply a path-segmentation and Chain-of-Thought reasoning framework at runtime to triage the resulting flood of findings without the prohibitive per-path LLM inference cost.

---
AI review
The core insight — LLMs belong in SAST rule *development*, not post-scan triage — is the right diagnosis of a real structural problem, and tripling CodeQL recall while holding precision near 100% is a result worth paying attention to. The path-segmentation and operation-database approach to triage scaling is practical engineering with a clean conceptual model. Zhong shows genuine depth.
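To make the triage-scaling idea concrete, here is a minimal sketch of how path segmentation with an operation database could cut LLM inference cost. All names (`mock_llm_judge`, `operation_db`, segment size, the toy operation labels) are hypothetical illustrations, not the talk's actual implementation: each source-to-sink path is split into segments, one cached verdict is stored per unique segment, and overlapping paths reuse prior judgments instead of triggering fresh model calls.

```python
# Hypothetical sketch of path-segmentation triage: rather than sending
# every full source-to-sink path to an LLM, split each path into segments
# and cache one verdict per unique segment in an "operation database",
# so paths that share segments reuse earlier analyses.

llm_calls = 0  # counts how often we would actually invoke the model

def mock_llm_judge(segment: tuple) -> bool:
    """Stand-in for a Chain-of-Thought LLM call. For this demo it
    returns True iff the segment contains a sanitizing operation."""
    global llm_calls
    llm_calls += 1
    return "sanitize" in segment

operation_db: dict[tuple, bool] = {}  # cache: segment -> sanitized?

def segments(path: list[str], size: int = 2):
    """Split an operation path into consecutive chunks of `size` ops."""
    for i in range(0, len(path), size):
        yield tuple(path[i:i + size])

def path_is_vulnerable(path: list[str]) -> bool:
    """A path stays a finding only if no segment sanitizes the data."""
    for seg in segments(path):
        if seg not in operation_db:
            operation_db[seg] = mock_llm_judge(seg)
        if operation_db[seg]:          # sanitizer found -> not exploitable
            return False
    return True

paths = [
    ["source", "concat", "exec"],              # tainted, no sanitizer
    ["source", "concat", "sanitize", "exec"],  # sanitized mid-path
    ["source", "concat", "exec", "log"],       # shares a segment with path 1
]
results = [path_is_vulnerable(p) for p in paths]
# Three paths yield six segment judgments, but caching reduces the
# actual model invocations to four.
```

The design point this illustrates is the amortization: per-path LLM cost grows with the number of paths, while per-segment cost grows only with the number of *distinct* operation sequences, which is much smaller when rule relaxation floods the output with overlapping findings.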