Why Most ML Vulnerability Detection Fails (And What Actually Worked for Kernel Bugs)

Jenny Guanni Qu

[un]prompted 2026 — AI Security Practitioner Conference · Day 2

Most ML models applied to vulnerability detection fail because researchers start with complex architectures before establishing what simple baselines can already do. Jenny Qu, who trained in math AI at Caltech and is backed by Pebblebed Ventures, applied rigorous ML methodology to 125,000 labeled Linux kernel commits and discovered that context length, curriculum design, and hard negative selection matter more than model sophistication — and that the most surprising result came from a model that used just three numbers.
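To make the "beat the dumb baselines first" point concrete, here is a minimal sketch of what a three-number baseline on labeled commits might look like. The talk does not say which three features Qu used; the features below (lines added, files touched, author's prior commit count) and the synthetic data are purely illustrative stand-ins.

```python
# Hypothetical sketch of a "three numbers" commit-classification baseline.
# Feature choices and data here are assumptions, not the talk's actual setup.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import average_precision_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Stand-in for 125,000 labeled kernel commits: three scalars per commit
# and a binary "introduces a vulnerability" label.
n = 125_000
X = np.column_stack([
    rng.poisson(40, n),   # lines added in the commit (assumed feature)
    rng.poisson(3, n),    # files touched (assumed feature)
    rng.poisson(200, n),  # author's prior commit count (assumed feature)
])
# Synthetic labels loosely correlated with commit size, ~2% positive rate.
logits = 0.02 * X[:, 0] + 0.3 * X[:, 1] - 0.005 * X[:, 2] - 5.0
y = (rng.random(n) < 1 / (1 + np.exp(-logits))).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# The "dumb" baseline: plain logistic regression on three numbers. Any
# proposed architecture should clearly beat this before its added
# complexity is justified.
baseline = LogisticRegression(max_iter=1000).fit(X_train, y_train)
scores = baseline.predict_proba(X_test)[:, 1]
print(f"baseline average precision: {average_precision_score(y_test, scores):.3f}")
```

Average precision is the natural metric here because vulnerability-introducing commits are rare, so accuracy against the majority class tells you almost nothing.

---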

AI review

A methodologically rigorous ML paper compressed into a short talk, delivered by someone who knows what she's doing. The 'beat the dumb baselines first' rule and the curriculum learning inversion finding are genuinely useful. But the talk is brief, the results are preliminary, and the gap between detection and exploitation is significant.

Watch on YouTube