The Underlying Logic of Language Models: Transformers and Circuits
Jiaoda Li, Ryan Cotterell, Franz Nowak, Anej Svete
International Conference on Machine Learning 2025 · Tutorial
This talk, delivered by Anej Svete as part of a broader session with Jiaoda Li, Ryan Cotterell, and Franz Nowak at ICML 2025, develops a foundational understanding of transformer-based language models through the lens of **Boolean circuit complexity theory**. Moving beyond a purely empirical perspective, the speakers propose a formal framework for analyzing the computational capabilities and inherent limitations of transformers by mapping their operations to **directed acyclic graphs (DAGs)**, which are then interpreted as Boolean circuits. The core objective is to precisely characterize which problems transformers can and cannot solve, connecting their architectural properties (such as a fixed number of layers and the attention mechanism) to well-established complexity classes such as **AC0**, **TC0**, and **NC1**.
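To make the circuit view concrete, here is a minimal Python sketch (not from the tutorial; the `Gate` class and gate names are illustrative assumptions) of a Boolean circuit as a DAG of gates. The contrast it encodes is the standard one the talk relies on: circuits built from AND/OR/NOT gates correspond to **AC0**, while adding threshold (MAJORITY) gates yields the larger class **TC0**.

```python
# Illustrative sketch only (not code from the tutorial): a Boolean circuit as a DAG
# whose internal nodes are gates and whose leaves are input bits.

from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Gate:
    """One node of a circuit DAG: a Boolean function of its children's outputs."""
    op: Optional[Callable[[List[bool]], bool]] = None
    children: List["Gate"] = field(default_factory=list)
    value: bool = False  # used only by leaf (input) gates

    def eval(self) -> bool:
        if not self.children:  # leaf: an input bit
            return self.value
        return self.op([c.eval() for c in self.children])


# Gate types allowed in AC0 circuits: AND, OR, NOT (unbounded fan-in, constant depth).
AND = lambda bits: all(bits)
OR = lambda bits: any(bits)
NOT = lambda bits: not bits[0]

# The extra gate that separates TC0 from AC0: a threshold / MAJORITY gate.
# MAJORITY is provably not computable by constant-depth AND/OR/NOT circuits,
# which is why threshold gates strictly enlarge the class.
MAJ = lambda bits: sum(bits) > len(bits) // 2


if __name__ == "__main__":
    x = [Gate(value=b) for b in (True, False, True, True)]
    # An AC0-style circuit: OR over pairwise ANDs of the inputs.
    ac0_circuit = Gate(OR, [Gate(AND, [x[0], x[1]]), Gate(AND, [x[2], x[3]])])
    # A TC0-style circuit: a single MAJORITY gate over all inputs.
    tc0_circuit = Gate(MAJ, x)
    print(ac0_circuit.eval(), tc0_circuit.eval())  # True True
```

In the framework summarized above, each arithmetic operation in a transformer's forward pass becomes (a constant number of) such gates, so the whole network unrolls into a constant-depth circuit; whether that circuit needs threshold gates depends on the numerical precision assumed.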
AI review
A competent and honest presentation of an active research program connecting transformer expressivity to Boolean circuit complexity — a direction the community genuinely needs. The core framing (constant-precision transformers are in AC0, log-precision are in TC0, CoT adds depth, padding adds width) is well-grounded in a real line of theoretical CS work going back to Barrington, Straubing, and more recently Merrill, Sabharwal, and collaborators. The talk appears to communicate these results clearly. However, based on the article's account, this reads more like an excellent synthesis and…