MEADOW: Memory-efficient Dataflow and Data Packing for Low Power Edge LLMs

Abhishek Moitra, Arkapravo Ghosh, Shrey Agrawal, Karthik Swaminathan, Priyadarshini Panda

Conference on Machine Learning and Systems 2025 · Day 4 · Session 12: Edge and Cloud Systems

The rapid scaling of Large Language Models (LLMs) has unlocked a vast array of applications, from sophisticated chatbots to real-time translation systems. However, deploying these increasingly massive models on resource-constrained edge devices presents significant challenges, primarily due to their immense computational and memory demands. This talk by Abhishek Moitra and collaborators from IBM Research introduces **MEADOW**, a novel framework designed to address the memory bottleneck inherent in deploying LLMs on low-power edge hardware, specifically Field-Programmable Gate Arrays (FPGAs).

AI review

MEADOW is a legitimate piece of systems research — lossless weight packing plus a custom attention dataflow for memory-bound FPGA inference — with real benchmark numbers on real hardware. The work is honest about its scope (edge FPGAs, not ASICs or MCUs) and the roofline framing is useful. But this article is a reconstructed summary, not a transcript of someone actually showing their implementation, and the engineering detail stops just short of reproducible. You get the architecture sketch, not the HDL. Solid MLSys-tier work, but it doesn't change how most engineers building software-side…
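The roofline framing mentioned above is worth making concrete: single-batch LLM decode streams every weight from DRAM per token, so arithmetic intensity is tiny and bandwidth, not compute, sets the ceiling. A minimal sketch of that argument follows; all hardware numbers are illustrative assumptions, not figures from the MEADOW talk.

```python
# Roofline sketch: is single-batch LLM decode memory-bound on an edge FPGA?
# All hardware numbers below are assumed for illustration, not from the paper.

def attainable_gflops(arithmetic_intensity, peak_gflops, bandwidth_gbs):
    """Classic roofline: attainable perf = min(compute roof, AI * bandwidth)."""
    return min(peak_gflops, arithmetic_intensity * bandwidth_gbs)

# Per decode step, a dense layer with an (out x in) weight matrix performs
# roughly 2*out*in FLOPs while streaming out*in weights from DRAM
# (assume 1 byte per weight after packing; activation traffic is negligible).
out_dim, in_dim = 4096, 4096
flops = 2 * out_dim * in_dim
bytes_moved = out_dim * in_dim
ai = flops / bytes_moved  # ~2 FLOPs/byte: firmly in memory-bound territory

peak_gflops = 200.0   # assumed FPGA compute peak
bandwidth = 10.0      # assumed DRAM bandwidth, GB/s

print(ai)                                            # 2.0
print(attainable_gflops(ai, peak_gflops, bandwidth)) # 20.0, far below peak
```

With an arithmetic intensity of ~2 FLOPs/byte, performance saturates at a small fraction of the compute roof, which is why techniques that shrink bytes moved (such as the lossless weight packing described here) translate almost directly into speedup.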