
ArtiLatent: Realistic Articulated 3D Object Generation via Structured Latents


📄 Abstract

We propose ArtiLatent, a generative framework that synthesizes human-made 3D objects with fine-grained geometry, accurate articulation, and realistic appearance. Our approach jointly models part geometry and articulation dynamics by embedding sparse voxel representations and associated articulation properties, including joint type, axis, origin, range, and part category, into a unified latent space via a variational autoencoder. A latent diffusion model is then trained over this space to enable diverse yet physically plausible sampling. To reconstruct photorealistic 3D shapes, we introduce an articulation-aware Gaussian decoder that accounts for articulation-dependent visibility changes (e.g., revealing the interior of a drawer when opened). By conditioning appearance decoding on articulation state, our method assigns plausible texture features to regions that are typically occluded in static poses, significantly improving visual realism across articulation configurations. Extensive experiments on furniture-like objects from PartNet-Mobility and ACD datasets demonstrate that ArtiLatent outperforms existing approaches in geometric consistency and appearance fidelity. Our framework provides a scalable solution for articulated 3D object synthesis and manipulation.
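To make the articulation properties named in the abstract (joint type, axis, origin, range, part category) concrete, the following is a minimal Python sketch of how such parameters can move a part between states, for example sliding a drawer open. It is not the paper's implementation: the `Joint` class, the `articulate` function, and the "revolute"/"prismatic" labels are hypothetical names chosen for illustration, and the rotation uses the standard Rodrigues formula.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class Joint:
    """Hypothetical container for the articulation properties listed in the abstract."""
    joint_type: str               # "revolute" or "prismatic" (illustrative labels)
    axis: Vec3                    # joint axis direction; normalized before use
    origin: Vec3                  # a point on the axis (used by revolute joints)
    limits: Tuple[float, float]   # (min, max): radians if revolute, length units if prismatic
    part_category: str            # e.g. "drawer" or "door"


def _unit(v: Vec3) -> Vec3:
    """Return v scaled to unit length."""
    n = math.sqrt(v[0] ** 2 + v[1] ** 2 + v[2] ** 2)
    if n == 0.0:
        raise ValueError("joint axis must be non-zero")
    return (v[0] / n, v[1] / n, v[2] / n)


def articulate(points: List[Vec3], joint: Joint, state: float) -> List[Vec3]:
    """Move a part's points to the requested joint state, clamped to the joint range."""
    lo, hi = joint.limits
    q = min(max(state, lo), hi)
    kx, ky, kz = _unit(joint.axis)

    if joint.joint_type == "prismatic":
        # Translate every point by q along the axis.
        return [(x + q * kx, y + q * ky, z + q * kz) for x, y, z in points]

    if joint.joint_type == "revolute":
        # Rodrigues rotation about the axis passing through the joint origin:
        # v' = v cos(q) + (k x v) sin(q) + k (k . v) (1 - cos(q))
        ox, oy, oz = joint.origin
        c, s = math.cos(q), math.sin(q)
        moved = []
        for x, y, z in points:
            vx, vy, vz = x - ox, y - oy, z - oz
            dot = kx * vx + ky * vy + kz * vz
            cx, cy, cz = ky * vz - kz * vy, kz * vx - kx * vz, kx * vy - ky * vx
            moved.append((
                ox + vx * c + cx * s + kx * dot * (1 - c),
                oy + vy * c + cy * s + ky * dot * (1 - c),
                oz + vz * c + cz * s + kz * dot * (1 - c),
            ))
        return moved

    raise ValueError(f"unsupported joint type: {joint.joint_type}")


# Usage: a drawer that slides along +z, and a door hinged on the z axis.
drawer = Joint("prismatic", (0.0, 0.0, 1.0), (0.0, 0.0, 0.0), (0.0, 0.4), "drawer")
door = Joint("revolute", (0.0, 0.0, 1.0), (0.0, 0.0, 0.0), (0.0, math.pi), "door")
opened_drawer = articulate([(0.0, 0.0, 0.0)], drawer, 1.0)   # request exceeds range, so it is clamped
opened_door = articulate([(1.0, 0.0, 0.0)], door, math.pi / 2)
```

Clamping the state to the joint range is one simple way to keep poses physically plausible. Moving a part in this way exposes surfaces that are hidden in the closed pose, which is the visibility change the abstract says the articulation-aware decoder accounts for.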

Authors (4)

Honghua Chen

Yushi Lan

Yongwei Chen

Xingang Pan

Submitted

October 24, 2025

arXiv Category

cs.CV


Key Contributions

ArtiLatent generates articulated 3D objects by jointly modeling part geometry and articulation dynamics using structured latents within a VAE and latent diffusion model framework. It features an articulation-aware Gaussian decoder for photorealistic rendering, handling occluded regions based on articulation state.

Business Value

Enables faster and more efficient creation of complex 3D assets for virtual environments, simulations, and robotics, reducing manual modeling effort and improving realism.

Paper Metadata

Innovation Type

Framework/Methodology

Deployment Feasibility

Feasible: the method builds on established generative techniques (VAE, diffusion models) and addresses a key challenge in 3D content creation.

Limitations Addressed

Addresses the challenge of generating realistic 3D articulated objects with accurate geometry, articulation, and appearance, particularly handling occluded parts that become visible when objects articulate.

Technical Tags

3D object generation, articulated objects, structured latents, variational autoencoder (VAE), latent diffusion model, sparse voxel representation, articulation properties, Gaussian decoder, articulation-aware rendering, photorealistic synthesis

Research Topics

Generative Models, 3D Computer Vision, 3D Shape Synthesis, Deep Learning, Computer Graphics

Methods & Architectures

Joint modeling of geometry and articulation, Variational Autoencoder (VAE), Latent Diffusion Model, Articulation-aware Gaussian decoder, Conditioning appearance on articulation state

Applications & Tasks

3D Content Creation; Robotics; Virtual Reality; Augmented Reality; Game Development; 3D Shape Generation; Modeling Articulated Objects; Realistic Rendering; Synthesizing 3D models of articulated objects; Generating objects with accurate geometry, articulation, and appearance; Rendering objects in various articulated states

Related Fields

Computer Graphics, Computer Vision, Generative AI, Robotics, 3D Modeling

Keywords

3D generation, articulated objects, structured latents, VAE, diffusion model, Gaussian rendering, photorealism, geometry, articulation, appearance, 3D modeling, generative AI

Academic Context

#Generative Models, #3D Computer Vision, #3D Shape Synthesis, #Deep Learning, #Computer Graphics

Technology Stack

Programming Languages

Python

Commercial Potential

Potential Products

3D asset generation tools for game engines, Parametric 3D model libraries, Simulation environments for robotics

Target Industries

Gaming, Virtual Reality, Augmented Reality, Robotics, Manufacturing (Prototyping)

Use Case Examples

Generating diverse 3D models of furniture with functional drawers and doors for virtual showrooms. Creating articulated robot models for simulation and training. Populating virtual worlds with realistic, interactive 3D objects.

Competitive Edge

Offers a novel approach to generating articulated 3D objects by integrating geometry and articulation modeling within a generative framework, aiming for higher realism and control than previous methods.

Market Opportunity

Growing market for 3D content creation tools, driven by gaming, VR/AR, and the metaverse.

Revenue Models

Licensing of the generation technology, sale of generated 3D assets, development of specialized 3D modeling software.

Resource Requirements

Compute Needs

High: training the VAE and diffusion model, as well as rendering, requires significant GPU resources.

Data Requirements

Requires datasets of 3D articulated objects with associated part geometry, articulation parameters (joint types, axes, ranges), and potentially textures.
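As an illustration of what one annotation in such a dataset might carry, the sketch below flattens a single part's articulation parameters into a fixed-length numeric vector, the kind of input a learned encoder could consume alongside geometry. The record layout, the two vocabularies, and the `encode_joint` function are assumptions made for this example only; the paper's actual encoding is not described here.

```python
import math
from typing import Dict, List, Sequence

# Illustrative vocabularies (assumptions, not taken from the paper or its datasets).
JOINT_TYPES = ["fixed", "revolute", "prismatic"]
PART_CATEGORIES = ["base", "drawer", "door"]


def one_hot(value: str, vocab: Sequence[str]) -> List[float]:
    """Encode a categorical value as a one-hot list over the given vocabulary."""
    if value not in vocab:
        raise ValueError(f"unknown value {value!r}; expected one of {list(vocab)}")
    return [1.0 if v == value else 0.0 for v in vocab]


def encode_joint(record: Dict) -> List[float]:
    """Flatten one part's articulation record into a 14-dimensional feature vector.

    Layout: joint type (3) + unit axis (3) + origin (3) + range (2) + part category (3).
    """
    axis = [float(a) for a in record["axis"]]
    norm = math.sqrt(sum(a * a for a in axis))
    # A fixed part has no meaningful axis, so a zero axis is passed through unchanged.
    unit_axis = [a / norm for a in axis] if norm > 0.0 else [0.0, 0.0, 0.0]

    lo, hi = (float(x) for x in record["range"])
    if lo > hi:
        raise ValueError("range must satisfy min <= max")

    return (
        one_hot(record["joint_type"], JOINT_TYPES)
        + unit_axis
        + [float(x) for x in record["origin"]]
        + [lo, hi]
        + one_hot(record["part_category"], PART_CATEGORIES)
    )


# Usage: a drawer that slides up to 0.4 units along +z.
drawer_record = {
    "joint_type": "prismatic",
    "axis": (0.0, 0.0, 2.0),      # deliberately unnormalized
    "origin": (0.1, 0.2, 0.3),
    "range": (0.0, 0.4),
    "part_category": "drawer",
}
features = encode_joint(drawer_record)
```

Normalizing the axis and validating the range at encoding time keeps malformed annotations out of training data, which matters because the listed requirements depend on consistent joint parameters across objects.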

Deployment Constraints

Computational cost of generation and rendering, quality and diversity of generated assets, need for user control over articulation.

Scalability

Scalability depends on the efficiency of the VAE, diffusion model, and the Gaussian decoder. Generating high-resolution models can be computationally intensive.

Production Readiness

Maturity Level

Research/Development

Time to Market

2-4 years for refinement, optimization, and integration into professional 3D pipelines.

Patent Potential

Moderate: possible patents on the structured latent space representation, the articulation-aware decoder, or the joint modeling approach.
