ReferSplat: Referring Segmentation in 3D Gaussian Splatting

Shuting He, Guangquan Jie, Changshuo Wang, Yun Zhou, Shuming Hu, Guanbin Li, Henghui Ding

International Conference on Machine Learning 2025 · Oral

The talk "ReferSplat: Referring Segmentation in 3D Gaussian Splatting" introduces a framework for natural language interaction with 3D scenes represented with **3D Gaussian Splatting (3DGS)**. Presented by Tang Xiaoli from Nanyang Technological University on behalf of a collaborative team from Shanghai University of Finance and Economics, Fudan University, Nanyang Technological University, and Sun Yat-sen University, this oral paper addresses a gap in 3D scene understanding: interpreting and localizing objects from arbitrary, free-form natural language descriptions. Earlier methods often rely on fixed-pattern class names, which limits their applicability in dynamic, real-world scenarios.

AI review

ReferSplat introduces a reasonable engineering combination — per-Gaussian feature vectors, cross-modal attention, and contrastive learning — applied to referring segmentation in 3D Gaussian Splatting scenes. The system works and beats baselines on a new dataset the authors themselves constructed. But the paper is fundamentally empirical work dressed in system-contribution framing, with no theoretical grounding, a benchmark that cannot be independently validated, and technical modules that are straightforward applications of existing cross-modal learning machinery without formal analysis of…