LagMemo: Language 3D Gaussian Splatting Memory for Multi-modal Open-vocabulary Multi-goal Visual Navigation

Abstract

Navigating to a designated goal using visual information is a fundamental capability for intelligent robots. Most classical visual navigation methods are restricted to single-goal, single-modality, and closed-set goal settings. To address the practical demands of multi-modal, open-vocabulary goal queries and multi-goal visual navigation, we propose LagMemo, a navigation system that leverages a language 3D Gaussian Splatting memory. During exploration, LagMemo constructs a unified 3D language memory. With incoming task goals, the system queries the memory, predicts candidate goal locations, and integrates a local perception-based verification mechanism to dynamically match and validate goals during navigation. For fair and rigorous evaluation, we curate GOAT-Core, a high-quality core split distilled from GOAT-Bench tailored to multi-modal open-vocabulary multi-goal visual navigation. Experimental results show that LagMemo's memory module enables effective multi-modal open-vocabulary goal localization, and that LagMemo outperforms state-of-the-art methods in multi-goal visual navigation. Project page: https://weekgoodday.github.io/lagmemo

Authors (8)

Haotian Zhou

Xiaole Wang

He Li

Fusheng Sun

Shengyu Guo

Guolei Qi

+2 more

Submitted

October 28, 2025

arXiv Category

cs.RO

Key Contributions

LagMemo is a novel navigation system that enables multi-modal, open-vocabulary, multi-goal visual navigation by leveraging a 'language 3D Gaussian Splatting memory'. It constructs a unified 3D language memory during exploration, queries it for goal locations, and uses a verification mechanism to dynamically match and validate goals during navigation.
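
The query-then-verify flow described above can be illustrated with a toy sketch. This is not the authors' implementation: the memory here is a plain list of 3D points with feature vectors rather than a Gaussian Splatting scene, cosine similarity is an assumed matching rule, and the names MemoryEntry, query_memory, navigate_to_goal, and the verify callback are all hypothetical.

```python
from __future__ import annotations

import math
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class MemoryEntry:
    """Toy memory element: a 3D position plus a language-aligned feature (assumed layout)."""
    position: tuple
    feature: list


def cosine(a: list, b: list) -> float:
    """Cosine similarity between two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def query_memory(memory: list, goal_feature: list, top_k: int = 2) -> list:
    """Rank entries by similarity to the goal embedding and keep the top_k candidates."""
    ranked = sorted(memory, key=lambda e: cosine(e.feature, goal_feature), reverse=True)
    return ranked[:top_k]


def navigate_to_goal(
    memory: list,
    goal_feature: list,
    verify: Callable[[MemoryEntry], bool],
    top_k: int = 2,
) -> Optional[tuple]:
    """Try candidates in ranked order; return the first position the verifier accepts."""
    for candidate in query_memory(memory, goal_feature, top_k):
        # In a real system this step would drive to the candidate and check it with
        # onboard perception; here `verify` is a stand-in callback.
        if verify(candidate):
            return candidate.position
    return None


# Made-up example data: three remembered points and one goal embedding.
example_memory = [
    MemoryEntry((1.0, 0.0, 2.0), [0.9, 0.1, 0.0]),
    MemoryEntry((4.0, 0.0, 1.0), [0.8, 0.2, 0.1]),
    MemoryEntry((0.0, 0.0, 5.0), [0.0, 0.1, 0.9]),
]
example_goal = [1.0, 0.0, 0.0]
# Hypothetical verifier that rejects the best-ranked candidate and accepts the second.
found = navigate_to_goal(example_memory, example_goal, lambda e: e.position[0] > 2.0)
```

Keeping verification as a separate callback mirrors the split the summary describes between memory-based candidate prediction and local perception-based validation: a candidate that ranks highly in memory can still be rejected on arrival, and the next candidate is tried.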

Business Value

Paves the way for more intelligent and adaptable robots in complex, dynamic environments, with applications such as autonomous delivery, exploration of unknown territories, and logistics and warehousing.

Paper Metadata

Innovation Type

System/Algorithmic

Deployment Feasibility

Moderate; requires significant computational resources for 3D reconstruction and memory management.

Limitations Addressed

Restrictions of classical visual navigation methods to single-goal, single-modality, and closed-set goal settings.

Performance Gains

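Per the abstract, the memory module enables effective multi-modal open-vocabulary goal localization, and LagMemo outperforms state-of-the-art methods in multi-goal visual navigation.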

Technical Tags

Visual navigation, Multi-modal navigation, Open-vocabulary goals, Multi-goal navigation, 3D Gaussian Splatting, Language memory, Robot exploration, Goal verification, Embodied AI, Robotics

Research Topics

Robotic navigation, Embodied AI, Language grounding, Memory systems for AI, 3D scene understanding

Methods & Architectures

LagMemo system, Language 3D Gaussian Splatting memory, Memory querying, Goal prediction, Perception-based verification, 3D Gaussian Splatting, Memory networks

Applications & Tasks

Robotics; Autonomous systems; Virtual environments; Single-goal, single-modality, closed-set navigation; Multi-modal, open-vocabulary, multi-goal navigation; Efficient robot exploration; Navigating to a designated goal using visual information; Handling multi-modal and open-vocabulary goal queries; Performing multi-goal visual navigation

Datasets & Benchmarks

Datasets

GOAT-Core, GOAT-Bench

Navigation success rate, Efficiency, Robustness to multi-modal/open-vocabulary goals
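
As a rough illustration of how success rate and an efficiency measure can be computed over navigation episodes, the sketch below uses Success weighted by Path Length (SPL), a common efficiency metric in embodied navigation. Whether the paper reports SPL specifically is an assumption; this page only says "Efficiency". The episode fields (success, shortest, taken) and the sample values are hypothetical.

```python
def success_rate(episodes: list) -> float:
    """Fraction of episodes in which the agent reached its goal."""
    return sum(1 for e in episodes if e["success"]) / len(episodes)


def spl(episodes: list) -> float:
    """Success weighted by Path Length: mean of S_i * l_i / max(p_i, l_i).

    l_i is the shortest-path length to the goal and p_i the length actually
    travelled; failed episodes contribute 0.
    """
    total = 0.0
    for e in episodes:
        if not e["success"]:
            continue
        denom = max(e["taken"], e["shortest"])
        # Guard for the degenerate case where the agent starts on the goal.
        total += e["shortest"] / denom if denom > 0 else 1.0
    return total / len(episodes)


# Made-up episodes: one inefficient success, one optimal success, one failure.
sample_episodes = [
    {"success": True, "shortest": 5.0, "taken": 10.0},
    {"success": True, "shortest": 4.0, "taken": 4.0},
    {"success": False, "shortest": 6.0, "taken": 9.0},
]
sample_sr = success_rate(sample_episodes)
sample_spl = spl(sample_episodes)
```

SPL never exceeds the success rate, so reporting both separates "did the agent arrive" from "how direct was the route".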

Related Fields

Robotics, Computer vision, Natural Language Processing, 3D reconstruction, Memory-augmented neural networks

Keywords

Visual Navigation, Robotics, 3D Gaussian Splatting, Language Memory, Open Vocabulary, Multi-goal, Embodied AI, Exploration, Goal Specification, Autonomous Systems, Scene Memory

Academic Context

#Robotic navigation, #Embodied AI, #Language grounding, #Memory systems for AI, #3D scene understanding

Commercial Potential

Potential Products

Autonomous mobile robots for complex environments; Advanced simulation platforms for robot training; Navigation systems for drones and ground vehicles

Target Industries

Robotics; Logistics; Warehousing; Autonomous Vehicles; Exploration (e.g., space, underwater)

Use Case Examples

A robot navigating a warehouse to find a specific item described in natural language; An autonomous drone exploring an unknown area and identifying multiple targets; A humanoid robot performing tasks in a home environment based on verbal instructions

Competitive Edge

Addresses limitations of prior visual navigation systems by enabling open-vocabulary, multi-goal, and multi-modal navigation through a novel memory system.

Market Opportunity

Rapidly growing market for autonomous robots and AI-powered navigation systems.

Revenue Models

Licensing of navigation software; Development of specialized robotic platforms; Integration into existing robotic systems

Resource Requirements

Compute Needs

High, especially for 3D reconstruction and memory management during exploration.

Data Requirements

Requires datasets with 3D environments and associated language descriptions.

Deployment Constraints

Real-time performance and robustness in dynamic environments are key challenges.

Scalability

Scalability depends on the efficiency of the 3D reconstruction and memory querying mechanisms.

Production Readiness

Maturity Level

Research prototype

Time to Market

3-5 years for robust commercial deployment.

Patent Potential

High, related to the language-guided 3D memory and navigation strategy.
