Optimizing LLM Queries in Relational Data Analytics Workloads

Shu Liu, Asim Biswal, Amog Kamsetty, Ion Stoica, Joseph E. Gonzalez, Matei Zaharia

Conference on Machine Learning and Systems 2025 · Day 3 · Session 6: Edge and Cloud Systems

This talk, presented by Shu Liu of UC Berkeley with collaborators from UC Berkeley and Stanford University, addresses the challenge of optimizing Large Language Model (LLM) queries within relational data analytics workloads. While much attention has focused on conversational LLM interactions, Liu and the team highlight an equally significant but often overlooked use case: LLMs as operators within analytical queries over structured data. The core problem is the prohibitive cost and latency of running LLMs over large volumes of relational text, which can amount to hours of computation and thousands of dollars even for moderately sized datasets.
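To make the pattern concrete, here is a minimal sketch of an LLM used as a row-wise operator over a relational table. The `call_llm` helper, the column names, and the file path are hypothetical placeholders rather than anything from the talk; any chat-completion client could stand in.

```python
import pandas as pd

def call_llm(prompt: str) -> str:
    """Placeholder for a chat-completion client (hypothetical helper)."""
    raise NotImplementedError("wire up your LLM provider here")

# One LLM call per row: the expensive pattern the talk targets.
df = pd.read_parquet("reviews.parquet")  # relational table with a text column
df["sentiment"] = df["review"].map(
    lambda review: call_llm(
        f"Label the sentiment of this review as positive, negative, "
        f"or neutral:\n{review}"
    )
)
positive_share = (df["sentiment"] == "positive").mean()
```

At thousands or millions of rows, each of these per-row invocations pays full prompt-processing cost, which is where the optimization below comes in.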

AI review

Solid systems paper from the Berkeley/Stanford crowd that identifies a real, underappreciated inefficiency in LLM-powered analytics pipelines and proposes a practical, well-characterized fix. The core insight — that column ordering in relational-to-prompt transformations dramatically affects prefix cache hit rates, and that a greedy reordering heuristic gets you within 1-2% of optimal in seconds instead of hours — is genuinely useful and non-obvious. The numbers are real (3.4x speedup, 79% cost reduction on Anthropic pricing), the algorithm is described well enough to implement, and the…
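To illustrate the reordering idea, here is a minimal sketch of a greedy column-ordering heuristic, assuming the goal is to place columns whose values repeat across many rows first, so that serialized rows share long common prefixes that a prefix (KV) cache can reuse. This is a plausible reconstruction from the description above, not the authors' exact algorithm; all function names are made up.

```python
from typing import List
import pandas as pd

def greedy_column_order(df: pd.DataFrame) -> List[str]:
    """Greedily pick the next column that keeps serialized prefixes most shared.

    At each step, choose the column that yields the fewest distinct value
    combinations with the columns already placed, so as many rows as
    possible continue to share a common prompt prefix.
    """
    ordered: List[str] = []
    remaining = list(df.columns)
    while remaining:
        best = min(
            remaining,
            key=lambda col: df.groupby(
                ordered + [col], sort=False, dropna=False
            ).ngroups,
        )
        ordered.append(best)
        remaining.remove(best)
    return ordered

def serialize_row(row: pd.Series, order: List[str]) -> str:
    """Render one row as prompt text with fields in the chosen order."""
    return "\n".join(f"{col}: {row[col]}" for col in order)

# Example: 'category' (2 distinct values) is placed before the unique columns,
# so rows in the same category share a cacheable prefix.
df = pd.DataFrame({
    "product_id": [101, 102, 103, 104],
    "category": ["tools", "tools", "toys", "toys"],
    "review": ["great drill", "handle broke", "kids love it", "paint chips"],
})
order = greedy_column_order(df)  # e.g. ['category', 'product_id', 'review']
prompts = [serialize_row(row, order) for _, row in df.iterrows()]
```

Sorting rows so that those sharing the leading column values are sent consecutively would further increase cache hits; the sketch only covers the column-ordering step the review singles out.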