"Len or index or count, anything but v1": Predicting Variable Names in Decompilation Output with Transfer Learning

Kuntal Kumar Pal, Ati Priya Bajaj, Pratyay Banerjee, Audrey Dutcher, Mutsumi Nakamura, Zion Leonahenahe Basque

IEEE Symposium on Security and Privacy 2024 · Day 3 · Continental Ballroom 4

Naming things is a notoriously hard problem in computer science, and reverse engineering feels it acutely. This talk, presented by Ati Priya Bajaj of Arizona State University and her co-authors, introduces **VarBERT**, a transfer learning-based approach that predicts meaningful variable names and their origins in decompiled code. Binary decompilation, which lifts machine code back into high-level pseudocode, is a critical step for security analysts, malware researchers, and anyone working with legacy systems. Compilation is lossy, however: it discards source-level artifacts such as the original variable names, so decompilers fall back on generic placeholders such as `a1`, `a2`, or `v4`.

AI review

This research takes on the pervasive "naming things" problem in reverse engineering. VarBERT's transfer learning approach, backed by meticulous data engineering, improves the interpretability of decompiled code by predicting meaningful variable names, which makes it a valuable tool for analysts who read decompiler output.
