A Comprehensive Survey on World Models for Embodied AI

Abstract

Embodied AI requires agents that perceive, act, and anticipate how actions reshape future world states. World models serve as internal simulators that capture environment dynamics, enabling forward and counterfactual rollouts to support perception, prediction, and decision making. This survey presents a unified framework for world models in embodied AI. Specifically, we formalize the problem setting and learning objectives, and propose a three-axis taxonomy encompassing: (1) Functionality: Decision-Coupled vs. General-Purpose; (2) Temporal Modeling: Sequential Simulation and Inference vs. Global Difference Prediction; (3) Spatial Representation: Global Latent Vector, Token Feature Sequence, Spatial Latent Grid, and Decomposed Rendering Representation. We systematize data resources and metrics across robotics, autonomous driving, and general video settings, covering pixel prediction quality, state-level understanding, and task performance. Furthermore, we offer a quantitative comparison of state-of-the-art models and distill key open challenges: the scarcity of unified datasets and the need for evaluation metrics that assess physical consistency over pixel fidelity; the trade-off between model performance and the computational efficiency required for real-time control; and the core modeling difficulty of achieving long-horizon temporal consistency while mitigating error accumulation. Finally, we maintain a curated bibliography at https://github.com/Li-Zn-H/AwesomeWorldModels.
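To make the rollout idea in the abstract concrete, here is a minimal Python sketch, not taken from the paper. It assumes a hypothetical one-dimensional point-mass environment (`true_step`) and a deliberately imperfect stand-in for a learned world model (`model_step`). All names, the 0.95 gain, and the time step are illustrative assumptions. The sketch imagines one action plan forward, imagines a counterfactual plan from the same start state, and measures how the model's position error grows with the horizon.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical toy state: (position, velocity) of a 1-D point mass.
State = Tuple[float, float]
StepFn = Callable[[State, float], State]

DT = 0.1  # assumed integration time step


def true_step(state: State, action: float) -> State:
    """Ground-truth dynamics of the toy environment (action = acceleration)."""
    pos, vel = state
    return (pos + vel * DT, vel + action * DT)


def model_step(state: State, action: float) -> State:
    """Stand-in for a learned world model: same structure, slightly wrong gain."""
    pos, vel = state
    return (pos + vel * DT, vel + 0.95 * action * DT)


def rollout(step: StepFn, state: State, actions: Sequence[float]) -> List[State]:
    """Apply `step` repeatedly, feeding each predicted state back in."""
    trajectory = [state]
    for action in actions:
        state = step(state, action)
        trajectory.append(state)
    return trajectory


start: State = (0.0, 0.0)
plan_a = [1.0] * 20   # candidate plan: accelerate forward
plan_b = [-1.0] * 20  # counterfactual plan: accelerate backward instead

imagined_a = rollout(model_step, start, plan_a)  # forward rollout in the model
imagined_b = rollout(model_step, start, plan_b)  # counterfactual rollout
real_a = rollout(true_step, start, plan_a)       # what the environment would do

# Per-step absolute position error of the model along plan_a.
errors = [abs(m[0] - r[0]) for m, r in zip(imagined_a, real_a)]
print(f"error at step 5: {errors[5]:.4f}, at step 20: {errors[-1]:.4f}")
```

Because each prediction is fed back in as the next input, the small per-step gain error compounds over the horizon. This is a toy instance of the error-accumulation challenge the abstract names for long-horizon rollouts.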

Authors (4)

Xinqing Li

Xin He

Le Zhang

Yun Liu

Submitted

October 19, 2025

arXiv Category

cs.CV

Key Contributions

This survey provides a unified framework and a three-axis taxonomy (functionality, temporal modeling, spatial representation) for understanding world models in embodied AI. It systematizes problem settings, learning objectives, data resources, and metrics across robotics, autonomous driving, and general video settings, and offers a quantitative comparison of state-of-the-art models.

Business Value

Accelerates the development of more intelligent and capable AI agents for robotics and autonomous systems by providing a structured understanding of world models and their evaluation.

Paper Metadata

Innovation Type

Survey/Framework

Deployment Feasibility

N/A (Survey paper)

Limitations Addressed

Lack of a unified understanding and taxonomy for world models; inconsistent data resources and evaluation metrics across different embodied AI tasks

Technical Tags

world models, embodied AI, reinforcement learning, predictive models, simulation, agent perception, agent action, temporal modeling, spatial representation

Research Topics

Embodied AI, World Models, Robotics, Reinforcement Learning, Agent Perception and Action

Methods & Architectures

World Models, Simulation, Forward Rollouts, Counterfactual Rollouts, Taxonomy of World Models

Applications & Tasks

Robotics, Autonomous Driving, General AI Agents, Agent Decision Making, Environment Prediction, Perception, Action Planning

Enabling agents to perceive, act, and anticipate

Improving agent decision-making through internal simulation

Related Fields

Artificial Intelligence, Machine Learning, Robotics, Computer Vision, Reinforcement Learning

Keywords

world models, embodied AI, robotics, autonomous driving, reinforcement learning, simulation, agent, perception, action, prediction, survey, taxonomy, decision making

Academic Context

#Embodied AI, #World Models, #Robotics, #Reinforcement Learning, #Agent Perception and Action

Commercial Potential

Target Industries

Robotics, Automotive, AI Research

Use Case Examples

Training robots to navigate complex environments; developing autonomous vehicles that can anticipate traffic scenarios

Competitive Edge

Provides a foundational overview and categorization of a critical component (world models) for embodied AI research.

Market Opportunity

Large and growing market for AI agents in robotics and autonomous systems.

Revenue Models

N/A (Survey paper)

Resource Requirements

Compute Needs

N/A (Survey paper)

Data Requirements

N/A (Survey paper)

Deployment Constraints

N/A (Survey paper)

Scalability

N/A (Survey paper)

Production Readiness

Maturity Level

Foundational Research

Time to Market

N/A (Survey paper)

Patent Potential

Low
