Clone Deterministic 3D Worlds with Geometrically-Regularized World Models

Abstract

A world model is an internal model that simulates how the world evolves. Given past observations and actions, it predicts the future of both the embodied agent and its environment. Accurate world models are essential for enabling agents to think, plan, and reason effectively in complex, dynamic settings. Despite rapid progress, current world models remain brittle and degrade over long horizons. We argue that a central cause is representation quality: exteroceptive inputs (e.g., images) are high-dimensional, and lossy or entangled latents make dynamics learning unnecessarily hard. We therefore ask whether improving representation learning alone can substantially improve world-model performance. In this work, we take a step toward building a truly accurate world model by addressing a fundamental yet open problem: constructing a model that can fully clone and overfit to a deterministic 3D world. We propose Geometrically-Regularized World Models (GRWM), which enforces that consecutive points along a natural sensory trajectory remain close in latent representation space. This approach yields significantly improved latent representations that align closely with the true topology of the environment. GRWM is plug-and-play, requires only minimal architectural modification, scales with trajectory length, and is compatible with diverse latent generative backbones. Across deterministic 3D settings and long-horizon prediction tasks, GRWM significantly increases rollout fidelity and stability. Analyses show that its benefits stem from learning a latent manifold with superior geometric structure. These findings support a clear takeaway: improving representation learning is a direct and useful path to robust world models, delivering reliable long-horizon predictions without enlarging the dynamics module.
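To make the abstract's point about long-horizon degradation concrete, the following is a minimal toy sketch, not taken from the paper. It assumes a hypothetical scalar deterministic world (`true_step`) and a hypothetical learned model (`model_step`) whose single coefficient is slightly off, and it rolls both forward open-loop so that each prediction is fed back in as the next input. All names and constants are illustrative assumptions.

```python
def rollout(step, x0, horizon):
    """Apply a one-step function repeatedly, feeding each output back in."""
    states = [x0]
    for _ in range(horizon):
        states.append(step(states[-1]))
    return states


def true_step(x):
    # Hypothetical deterministic world dynamics (illustrative only).
    return 0.99 * x + 1.0


def model_step(x):
    # Hypothetical learned model: same form, slightly wrong coefficient.
    return 0.995 * x + 1.0


def rollout_errors(x0=0.0, horizon=200):
    """Absolute gap between model rollout and true rollout at every step."""
    truth = rollout(true_step, x0, horizon)
    pred = rollout(model_step, x0, horizon)
    return [abs(p - t) for p, t in zip(pred, truth)]


errors = rollout_errors()
for t in (2, 10, 50, 200):
    print(f"step {t:3d}: abs error = {errors[t]:.4f}")
```

In this toy setting the one-step error is small, yet the gap keeps widening as the horizon grows, because every prediction inherits the error of the one before it. Real world models operate on learned latents rather than a scalar, so this only illustrates the compounding effect, not the paper's experimental setup.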

Authors (5)

Zaishuo Xia

Yukuan Lu

Xinyi Li

Yifan Xu

Yubei Chen

Submitted

October 30, 2025

arXiv Category

cs.LG

Key Contributions

The paper proposes Geometrically-Regularized World Models (GRWM) to address the brittleness and long-horizon degradation of current world models. GRWM regularizes the latent space so that consecutive points along a natural sensory trajectory remain close to one another. The stated aim is to improve representation quality, which makes dynamics learning more tractable and supports cloning deterministic 3D worlds.
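This summary does not give the exact form of the GRWM regularizer, so the sketch below is only a generic temporal-closeness penalty in that spirit: the mean squared Euclidean distance between consecutive latent vectors along a trajectory. The function name `temporal_closeness_penalty` and the example trajectories are hypothetical and not from the paper.

```python
def temporal_closeness_penalty(latents):
    """Mean squared Euclidean distance between consecutive latent vectors.

    latents: list of equal-length lists of floats, ordered by time.
    This is an assumed, generic smoothness term, not the paper's loss.
    """
    if len(latents) < 2:
        return 0.0
    total = 0.0
    for z_prev, z_next in zip(latents, latents[1:]):
        total += sum((a - b) ** 2 for a, b in zip(z_prev, z_next))
    return total / (len(latents) - 1)


# Two hypothetical latent trajectories with the same start and end points.
smooth = [[0.0, 0.0], [0.1, 0.0], [0.2, 0.1], [0.3, 0.1]]
jumpy = [[0.0, 0.0], [2.0, -1.0], [-1.5, 3.0], [0.3, 0.1]]

print("smooth penalty:", temporal_closeness_penalty(smooth))
print("jumpy penalty: ", temporal_closeness_penalty(jumpy))
```

A trajectory whose latents move in small steps scores lower than one that jumps around. Note that a penalty of this kind is trivially minimized by mapping every observation to the same point, so in practice it would be combined with another objective (for example, a reconstruction term) that prevents collapse. How GRWM handles this is not stated in this summary.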

Business Value

Enables more robust and predictable simulation environments for training AI agents in robotics, autonomous driving, and gaming, leading to safer and more efficient development cycles.

Paper Metadata

Innovation Type

Algorithmic

Deployment Feasibility

Moderate. Requires significant computational resources for training and simulation, but the core concepts are applicable to various simulation platforms.

Limitations Addressed

Brittleness of current world models
Degradation over long horizons
Difficulty in learning dynamics from high-dimensional, entangled latents

Technical Tags

world models, representation learning, 3D world simulation, geometric regularization, dynamics learning, exteroceptive inputs, latent representations, deterministic environments, overfitting, embodied agents

Research Topics

World Modeling, Representation Learning, 3D Scene Understanding, Embodied AI, Simulation

Methods & Architectures

Geometrically-Regularized World Models (GRWM), Representation Learning, World Models

Applications & Tasks

Robotics, Autonomous Systems, Game Development, Simulation Environments
Learning dynamics in complex environments, Improving long-horizon prediction, Handling high-dimensional inputs, Learning disentangled representations
Cloning deterministic 3D worlds, Predicting future states, Learning environment dynamics

Related Fields

Reinforcement Learning, Computer Vision, Robotics, Simulation

Keywords

world model, representation learning, 3D world, geometric regularization, dynamics, simulation, embodied AI, planning, prediction, latent space, deterministic, overfitting

Commercial Potential

Potential Products

Advanced simulation engines, AI training platforms

Target Industries

Gaming, Robotics, Automotive, Virtual Reality

Use Case Examples

Training autonomous driving agents in realistic 3D environments
Developing AI for complex robotic manipulation tasks
Creating more believable non-player characters in video games

Competitive Edge

Aims to surpass existing world models by focusing on representation quality and geometric constraints, potentially offering greater accuracy and stability in simulated environments.

Market Opportunity

Large (simulation and AI training markets)

Revenue Models

Licensing of simulation technology, platform services

Resource Requirements

Compute Needs

High (for training world models and running simulations)

Data Requirements

Requires data from deterministic 3D environments.

Deployment Constraints

Computational cost, need for accurate 3D environment data.

Scalability

Scalability depends on the complexity of the 3D world and the agent's actions.

Production Readiness

Maturity Level

Research

Time to Market

Long (requires significant R&D)

Patent Potential

Low to Moderate (depends on specific algorithmic innovations)
