FLStore: Efficient Federated Learning Storage for Non-training Workloads

Ahmad Faraz Khan, Samuel Fountain, Ahmed M. Abdelmoniem, Ali R. Butt, Ali Anwar

Conference on Machine Learning and Systems 2025 · Day 4 · Session 11: Federated Learning

This article examines **FLStore**, a novel architecture designed to make federated learning (FL) pipelines more efficient by optimizing the storage and processing of **non-training workloads**. Presented by Ahmad Faraz Khan and Samuel Fountain of Virginia Tech, the work addresses a critical bottleneck in federated learning systems: the high latency and cost of tasks beyond the core model training loop. While FL has gained significant traction across diverse applications like healthcare, financial systems, and edge computing (e.g., Apple's voice recognition, Google's Gboard), existing cloud aggregators struggle to efficiently handle the growing complexity and data requirements of tasks such as personalization, malicious client identification, and contribution calculation.
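To make "non-training workloads" concrete: these are analyses run over client model updates rather than the training step itself. The sketch below shows one such workload, a generic cosine-similarity heuristic for flagging clients whose updates oppose the consensus direction. This is an illustrative stand-in, not FLStore's actual detection method; the function name and threshold are assumptions.

```python
import numpy as np

def flag_suspicious_clients(updates, threshold=0.0):
    """Flag clients whose update direction opposes the mean update.

    `updates` maps client id -> flattened model-update vector.
    A generic cosine-similarity heuristic for illustration only,
    not FLStore's actual malicious-client identification method.
    """
    ids = list(updates)
    vecs = np.stack([updates[c] for c in ids])
    mean = vecs.mean(axis=0)
    # Cosine similarity of each client's update against the mean update.
    sims = vecs @ mean / (
        np.linalg.norm(vecs, axis=1) * np.linalg.norm(mean) + 1e-12
    )
    return [c for c, s in zip(ids, sims) if s < threshold]

updates = {
    "client_a": np.array([1.0, 0.9, 1.1]),
    "client_b": np.array([0.9, 1.0, 1.0]),
    "client_c": np.array([-1.0, -1.1, -0.9]),  # opposes the others
}
print(flag_suspicious_clients(updates))  # ['client_c']
```

The point of the example is what such workloads consume: full per-round update vectors from many clients, which is exactly the data a plain cloud aggregator must repeatedly fetch from storage.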

AI review

FLStore makes a real and underappreciated point — non-training workloads in federated learning are expensive and poorly optimized — and offers a genuine architectural response: push compute into the cache via serverless functions rather than pulling data to a separate compute plane. The numbers are impressive if you take them at face value (65-71% latency reduction, 92-99% cost reduction). But the article is a write-up of a talk, not a reproducible implementation, and the gaps between the claims and anything I could verify or extend are wide enough to keep this out of must-watch territory.
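The architectural idea, compute shipped to the cache rather than data shipped to a compute plane, can be sketched in a few lines. The class and method names below are illustrative assumptions, not FLStore's interface; the sketch only shows the locality pattern the talk describes.

```python
import numpy as np

class LocalityAwareCache:
    """Toy cache that executes functions where the data lives.

    Illustrates the compute-to-data pattern: instead of shipping
    cached per-round client updates to a separate compute plane,
    a serverless-style function is shipped to the cache and runs
    next to the data. Names and API are illustrative, not FLStore's.
    """

    def __init__(self):
        self._rounds = {}  # round number -> list of update vectors

    def put(self, rnd, update):
        self._rounds.setdefault(rnd, []).append(np.asarray(update))

    def invoke(self, fn, rnd):
        # The function travels to the data; only the (small) result
        # crosses the wire, never the bulk of cached updates.
        return fn(self._rounds[rnd])

cache = LocalityAwareCache()
cache.put(1, [1.0, 2.0])
cache.put(1, [3.0, 4.0])

# Example non-training workload: norm of the mean update for round 1.
result = cache.invoke(
    lambda ups: float(np.linalg.norm(np.mean(ups, axis=0))), 1
)
print(result)
```

Under this pattern the reported latency and cost reductions are at least plausible in direction: the expensive part of non-training workloads is moving update data, and colocating compute with the cache removes that movement.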