ConceptAttention: Diffusion Transformers Learn Highly Interpretable Features
Alec Helbling, Tuna Han Salih Meral, Benjamin Hoover, Pinar Yanardag, Polo Chau
International Conference on Machine Learning 2025 · Oral
In this talk from ICML 2025, Alec Helbling, a PhD student at Georgia Tech, introduced **ConceptAttention**, a method for visualizing where textual concepts appear within images generated by **Diffusion Transformer (DiT)** models. As text-to-image and, increasingly, text-to-video diffusion models push the boundaries of generative AI, their complexity (billions of parameters, high-dimensional internal representations, and the simultaneous processing of vast numbers of text tokens and image patches) makes them notoriously difficult to interpret. ConceptAttention addresses this interpretability gap by producing high-quality saliency maps that highlight which parts of an image correspond to specific textual concepts.
AI review
ConceptAttention is a competent, practically motivated interpretability method for Diffusion Transformer models that produces sharper saliency maps than cross-attention by projecting contextualized text-concept vectors against image output vectors, with a causal masking trick to probe arbitrary vocabulary concepts without modifying generation. The core mechanism is sensible and the results appear visually compelling. However, the theoretical justification for *why* the output vectors carry this semantic structure is absent — 'emergent properties' is invoked without definition or analysis —…
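The core projection the review describes can be sketched in a few lines. This is an illustrative assumption, not the paper's implementation: shapes, names, and the plain dot-product-plus-softmax scoring are placeholders for the actual DiT attention internals. The idea is that contextualized concept vectors (which, per the causal-masking trick, attend to the sequence but are masked out of the image tokens' attention so generation is unchanged) are projected against the image output vectors to score each patch against each concept.

```python
import numpy as np

def concept_saliency(image_out, concept_out):
    """Hypothetical sketch of a ConceptAttention-style saliency map.

    image_out:   (num_patches, d) image output vectors from a DiT layer
    concept_out: (num_concepts, d) contextualized concept vectors;
                 in the method these are causally masked so they read
                 from the sequence without influencing generation.
    Returns a (num_patches, num_concepts) map: a softmax over concepts
    of the dot products, so each patch distributes mass across concepts.
    """
    logits = image_out @ concept_out.T            # (patches, concepts)
    logits -= logits.max(axis=-1, keepdims=True)  # numerical stability
    probs = np.exp(logits)
    return probs / probs.sum(axis=-1, keepdims=True)

# Toy example: 4 image patches, 3 concepts, feature dimension 8.
rng = np.random.default_rng(0)
sal = concept_saliency(rng.normal(size=(4, 8)), rng.normal(size=(3, 8)))
print(sal.shape)  # (4, 3); each row sums to 1
```

Reshaping a column of this map back to the patch grid and upsampling would give the per-concept heatmap over the image; the actual method operates on the attention layers' output space rather than raw features, which the review credits for the maps being sharper than cross-attention.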