📄 Abstract
Navigating to a designated goal using visual information is a fundamental
capability for intelligent robots. Most classical visual navigation methods are
restricted to single-goal, single-modality, and closed-set goal settings. To
address the practical demands of multi-modal, open-vocabulary goal queries and
multi-goal visual navigation, we propose LagMemo, a navigation system that
leverages a language 3D Gaussian Splatting memory. During exploration, LagMemo
constructs a unified 3D language memory. With incoming task goals, the system
queries the memory, predicts candidate goal locations, and integrates a local
perception-based verification mechanism to dynamically match and validate goals
during navigation. For fair and rigorous evaluation, we curate GOAT-Core, a
high-quality core split distilled from GOAT-Bench tailored to multi-modal
open-vocabulary multi-goal visual navigation. Experimental results show that
LagMemo's memory module enables effective multi-modal open-vocabulary goal
localization, and that LagMemo outperforms state-of-the-art methods in
multi-goal visual navigation. Project page:
https://weekgoodday.github.io/lagmemo
Authors (8)
Haotian Zhou
Xiaole Wang
He Li
Fusheng Sun
Shengyu Guo
Guolei Qi
+2 more
Submitted
October 28, 2025
Key Contributions
LagMemo is a novel navigation system that enables multi-modal, open-vocabulary, multi-goal visual navigation by leveraging a 'language 3D Gaussian Splatting memory'. It constructs a unified 3D language memory during exploration, queries it for goal locations, and uses a verification mechanism to dynamically match and validate goals during navigation.
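The query, candidate prediction, and verification steps described above can be illustrated with a toy sketch. This is not the authors' implementation: the memory here is a plain list of 3D points with hand-made feature vectors instead of a language 3D Gaussian Splatting memory, and every name (`MemoryPoint`, `query_memory`, `navigate_to_goal`, the `verify` callback) is hypothetical. Cosine similarity is an assumed stand-in for whatever matching the real system uses.

```python
import math
from dataclasses import dataclass


@dataclass
class MemoryPoint:
    """Toy memory element (hypothetical): a 3D position plus a feature vector."""
    position: tuple[float, float, float]
    feature: list[float]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def query_memory(memory, goal_embedding, top_k=2):
    """Return the positions of the top_k memory points most similar to the goal."""
    ranked = sorted(
        memory,
        key=lambda p: cosine(p.feature, goal_embedding),
        reverse=True,
    )
    return [p.position for p in ranked[:top_k]]


def navigate_to_goal(memory, goal_embedding, verify, top_k=2):
    """Visit candidates in ranked order; return the first one that verify() accepts."""
    for candidate in query_memory(memory, goal_embedding, top_k):
        if verify(candidate):  # stand-in for local perception-based verification
            return candidate
    return None  # no candidate passed verification


# Toy data: two points resemble the goal, one does not.
memory = [
    MemoryPoint((1.0, 0.0, 2.0), [0.9, 0.1, 0.0]),
    MemoryPoint((4.0, 0.0, 1.0), [0.8, 0.2, 0.1]),
    MemoryPoint((0.0, 0.0, 5.0), [0.0, 0.1, 0.9]),
]
goal = [1.0, 0.0, 0.0]

candidates = query_memory(memory, goal)

# Assumed verifier: pretend local perception rejects the top-ranked candidate.
result = navigate_to_goal(memory, goal, verify=lambda pos: pos == (4.0, 0.0, 1.0))
```

In this sketch the top-ranked candidate fails verification, so the agent falls back to the second candidate, which shows why a verification step after memory retrieval can matter when the memory returns a plausible but wrong location.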
Business Value
LagMemo paves the way for more intelligent and adaptable robots in complex, dynamic environments, with potential applications in autonomous delivery, exploration of unknown environments, and logistics and warehousing.