GoldenFuzz: Generative Golden Reference Hardware Fuzzing

Lichao Wu

Network and Distributed System Security (NDSS) Symposium 2026 · Day 3 · Fuzzing

Traditional hardware fuzzers rely on random mutation strategies that lack semantic understanding of processor behavior. This talk presents **GoldenFuzz**, a pre-silicon hardware fuzzer that uses a **customized GPT-2 language model** to generate semantically valid instruction sequences for RISC-V CPU testing. The key innovation is a two-stage fuzzing pipeline: first fuzzing a fast **golden reference model** (a software ISA simulator) to rapidly refine the language model's fuzzing policy, then applying the refined policy to the actual hardware design under test. Using **Direct Preference Optimization (DPO)** for coverage-guided policy refinement, GoldenFuzz discovered **7 new vulnerabilities** across open-source and proprietary RISC-V cores, including critical logic errors in privileged mode handling on a commercial core, while achieving **28% speedup** over state-of-the-art approaches.
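The coverage-guided refinement loop described above can be sketched in miniature. The snippet below is a hypothetical illustration, not GoldenFuzz's implementation: function names, the pairing scheme (top-half coverage vs. bottom-half), and the `beta` value are all assumptions. It shows the two ingredients the talk describes: ranking generated instruction sequences by the coverage they achieve on the golden reference model, and scoring preference pairs with the standard DPO loss.

```python
import math

def dpo_loss(pi_chosen, pi_rejected, ref_chosen, ref_rejected, beta=0.1):
    """Direct Preference Optimization loss for one preference pair.

    Inputs are sequence log-probabilities under the trainable policy
    model (pi_*) and the frozen reference model (ref_*). beta=0.1 is an
    illustrative hyperparameter, not a value from the paper."""
    margin = beta * ((pi_chosen - ref_chosen) - (pi_rejected - ref_rejected))
    return -math.log(1.0 / (1.0 + math.exp(-margin)))  # -log sigmoid(margin)

def coverage_preference_pairs(sequences, coverage_of):
    """Build (chosen, rejected) pairs from coverage feedback.

    Sequences are ranked by the coverage they achieve on the fast golden
    reference model (the software ISA simulator); high-coverage samples
    are paired against low-coverage ones. The top-half/bottom-half
    pairing here is one simple choice, assumed for illustration."""
    ranked = sorted(sequences, key=coverage_of, reverse=True)
    mid = len(ranked) // 2
    return list(zip(ranked[:mid], ranked[mid:]))
```

In this framing, each fuzzing round on the golden reference model yields coverage scores, the pairs feed the DPO objective to update the GPT-2 policy, and only the refined policy is then run against the slower hardware design under test.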

AI review

A well-executed application of language models to hardware fuzzing that produces real results (7 vulnerabilities, including findings on a commercial core). The golden reference model pre-refinement is a clever optimization, and the DPO-based policy learning is technically sound. However, this is the pre-silicon RISC-V companion to Fuzzilicon and is inherently less impactful: it targets simpler open-source designs rather than production x86 processors.

Watch on YouTube