Abstract
We propose ArtiLatent, a generative framework that synthesizes human-made 3D
objects with fine-grained geometry, accurate articulation, and realistic
appearance. Our approach jointly models part geometry and articulation dynamics
by embedding sparse voxel representations and associated articulation
properties, including joint type, axis, origin, range, and part category, into
a unified latent space via a variational autoencoder. A latent diffusion model
is then trained over this space to enable diverse yet physically plausible
sampling. To reconstruct photorealistic 3D shapes, we introduce an
articulation-aware Gaussian decoder that accounts for articulation-dependent
visibility changes (e.g., revealing the interior of a drawer when opened). By
conditioning appearance decoding on articulation state, our method assigns
plausible texture features to regions that are typically occluded in static
poses, significantly improving visual realism across articulation
configurations. Extensive experiments on furniture-like objects from the
PartNet-Mobility and ACD datasets demonstrate that ArtiLatent outperforms
existing approaches in geometric consistency and appearance fidelity. Our
framework provides a scalable solution for articulated 3D object synthesis and
manipulation.
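
To make the unified latent concrete, here is a minimal PyTorch sketch of how per-part voxel features and the listed articulation properties (joint type, axis, origin, range, part category) could be packed into a single VAE latent. Every name and dimension below (ArticulationParams, PartVAEEncoder, latent_dim, and so on) is an illustrative assumption, not the paper's actual interface.

```python
# Hypothetical sketch of the unified part latent; all names and sizes are assumed.
from dataclasses import dataclass
import torch
import torch.nn as nn

@dataclass
class ArticulationParams:
    joint_type: int           # assumed encoding: 0 = fixed, 1 = revolute, 2 = prismatic
    axis: torch.Tensor        # (3,) joint axis direction
    origin: torch.Tensor      # (3,) joint origin in the object frame
    limits: torch.Tensor      # (2,) motion range [low, high]
    part_category: int        # semantic part label

class PartVAEEncoder(nn.Module):
    """Maps per-part voxel features plus articulation properties to a Gaussian latent."""
    def __init__(self, voxel_feat_dim=256, num_joint_types=3, num_categories=32, latent_dim=64):
        super().__init__()
        self.joint_emb = nn.Embedding(num_joint_types, 16)
        self.cat_emb = nn.Embedding(num_categories, 16)
        in_dim = voxel_feat_dim + 16 + 16 + 3 + 3 + 2
        self.net = nn.Sequential(
            nn.Linear(in_dim, 256), nn.SiLU(), nn.Linear(256, 2 * latent_dim)
        )

    def forward(self, voxel_feat, art: ArticulationParams):
        # Concatenate discrete embeddings with the continuous joint parameters.
        cond = torch.cat([
            self.joint_emb(torch.tensor(art.joint_type)),
            self.cat_emb(torch.tensor(art.part_category)),
            art.axis, art.origin, art.limits,
        ])
        mu, logvar = self.net(torch.cat([voxel_feat, cond])).chunk(2)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterization
        return z, mu, logvar

# Usage: one part's voxel features plus a revolute joint with a ~90-degree range.
enc = PartVAEEncoder()
art = ArticulationParams(1, torch.tensor([0., 0., 1.]), torch.zeros(3),
                         torch.tensor([0.0, 1.57]), part_category=5)
z, mu, logvar = enc(torch.randn(256), art)
```

A latent diffusion model would then be trained over z (one latent per part), so that sampling yields geometry and articulation parameters jointly rather than predicting joints as a separate post hoc step.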
Authors (4)
Honghua Chen
Yushi Lan
Yongwei Chen
Xingang Pan
Submitted
October 24, 2025
Key Contributions
ArtiLatent generates articulated 3D objects by jointly modeling part geometry and articulation dynamics using structured latents within a VAE and latent diffusion model framework. It features an articulation-aware Gaussian decoder for photorealistic rendering, handling occluded regions based on articulation state.
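
A minimal sketch of the articulation-aware decoding idea follows, assuming the decoder takes a part latent plus a normalized joint state and regresses 3D Gaussian attributes. The layer sizes and the 14-channel Gaussian parameterization are assumptions for illustration, not the paper's specification.

```python
# Hypothetical sketch: conditioning Gaussian appearance decoding on joint state.
import torch
import torch.nn as nn

class ArticulationAwareGaussianDecoder(nn.Module):
    """Regresses 3D Gaussian attributes from a part latent, conditioned on the
    current joint state so interiors revealed by motion (e.g. an opened drawer)
    receive plausible appearance instead of undefined texture."""
    def __init__(self, latent_dim=64, num_gaussians=1024):
        super().__init__()
        self.num_gaussians = num_gaussians
        self.state_emb = nn.Sequential(nn.Linear(1, 32), nn.SiLU(), nn.Linear(32, 32))
        # Per Gaussian: 3 position + 3 scale + 4 rotation quaternion + 1 opacity + 3 color = 14
        self.head = nn.Sequential(
            nn.Linear(latent_dim + 32, 512), nn.SiLU(),
            nn.Linear(512, num_gaussians * 14),
        )

    def forward(self, z, joint_state):
        # joint_state: scalar in [0, 1], the joint's position within its motion range.
        cond = self.state_emb(joint_state.view(1, 1)).squeeze(0)
        params = self.head(torch.cat([z, cond])).view(self.num_gaussians, 14)
        pos, scale, rot, opacity, color = params.split([3, 3, 4, 1, 3], dim=-1)
        return pos, scale.exp(), rot, opacity.sigmoid(), color.sigmoid()

# Usage: decode the same latent at closed (0.0) and fully open (1.0) states.
dec = ArticulationAwareGaussianDecoder()
z = torch.randn(64)
closed = dec(z, torch.tensor(0.0))
opened = dec(z, torch.tensor(1.0))
```

Because the joint state enters the appearance head, the same latent can yield different opacities and colors at different opening states, which is the mechanism the contribution describes for handling normally occluded regions.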
Business Value
Enables faster and more efficient creation of complex 3D assets for virtual environments, simulations, and robotics, reducing manual modeling effort and improving realism.