The Underlying Logic of Language Models: Introduction

Jiaoda Li, Ryan Cotterell, Franz Nowak, Anej Svete

International Conference on Machine Learning 2025 · Tutorial

This talk, the introductory segment of a broader tutorial presented at ICML 2025, addresses the fundamental question of "what language models can compute." Presented by Anej Svete, Jiaoda Li, and Franz Nowak from Ryan Cotterell's lab at ETH Zurich, the session sets the stage for a rigorous scientific inquiry into the underlying computational logic of modern language models, particularly **Transformers**. The speakers acknowledge the astonishing capabilities of models like ChatGPT, demonstrated by their ability to generate intricate sonnets on abstract topics, but immediately pivot to these models' persistent, often baffling failures on seemingly simple tasks, such as basic arithmetic, and on combinatorial reasoning problems like the **knapsack problem**. This dichotomy highlights a critical gap in our understanding: despite their emergent intelligence, the precise computational boundaries and internal mechanisms of these models remain largely opaque.
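For readers unfamiliar with the knapsack problem the speakers cite, here is a minimal sketch of the classic dynamic-programming solution; the specific instance (weights, values, capacity) is hypothetical and chosen only for illustration, not drawn from the tutorial.

```python
def knapsack(weights: list[int], values: list[int], capacity: int) -> int:
    """Classic 0/1 knapsack via dynamic programming.

    Returns the maximum total value achievable without exceeding the
    capacity. Runs in O(n * capacity) time, i.e., pseudo-polynomial --
    the kind of iterative bookkeeping that is trivial for this loop but
    that language models often get wrong when prompted to do it in text.
    """
    # best[c] = best value achievable using total weight at most c
    best = [0] * (capacity + 1)
    for w, v in zip(weights, values):
        # Iterate capacities downward so each item is used at most once.
        for c in range(capacity, w - 1, -1):
            best[c] = max(best[c], best[c - w] + v)
    return best[capacity]


if __name__ == "__main__":
    # A tiny hypothetical instance: three items, capacity 5.
    # Optimal is 7 (take the items with weights 2 and 3).
    print(knapsack(weights=[2, 3, 4], values=[3, 4, 5], capacity=5))
```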

AI review

This is the introductory segment of a multi-part ICML 2025 tutorial on the expressivity of Transformer language models, framed through formal language theory, circuit complexity, and algebraic decomposition. As a standalone artifact, it presents no new theorems, no experimental results, and no technical findings; it is a roadmap and a motivation. Evaluated charitably as a tutorial introduction, it is competently framed and points toward a genuinely important and under-served research agenda. Evaluated as a technical contribution, there is simply nothing to evaluate yet.