
SMF: Template-free and Rig-free Animation Transfer using Kinetic Codes

Abstract

Animation retargeting applies a sparse motion description (e.g., keypoint sequences) to a character mesh to produce a semantically plausible and temporally coherent full-body mesh sequence. Existing approaches come with restrictions: they require access to template-based shape priors or artist-designed deformation rigs, suffer from limited generalization to unseen motions and/or shapes, or exhibit motion jitter. We propose Self-supervised Motion Fields (SMF), a self-supervised framework that is trained with only sparse motion representations, without requiring dataset-specific annotations, templates, or rigs. At the heart of our method are Kinetic Codes, a novel autoencoder-based sparse motion encoding that exposes a semantically rich latent space, simplifying large-scale training. Our architecture comprises dedicated spatial and temporal gradient predictors, which are jointly trained in an end-to-end fashion. The combined network, regularized by the Kinetic Codes' latent space, generalizes well across both unseen shapes and new motions. We evaluated our method on unseen motions sampled from AMASS, D4D, Mixamo, and raw monocular video for animation transfer on various characters with varying shapes and topology. We report a new SoTA on the AMASS dataset in the context of generalization to unseen motion. Code, weights, and supplementary material are available on the project webpage at https://motionfields.github.io/

Key Contributions

Self-supervised Motion Fields (SMF) is a template-free and rig-free framework for animation retargeting that uses Kinetic Codes, a novel autoencoder-based sparse motion encoding, to create a semantically rich latent space. This allows for large-scale training without dataset-specific annotations, enabling the generation of plausible and temporally coherent full-body mesh sequences from sparse motion descriptions while avoiding motion jitter.
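The core idea above (encoding sparse keypoint frames into a compact latent "kinetic code" via an autoencoder, trained self-supervised by reconstruction) can be sketched as a toy example. This is an illustrative assumption-laden sketch, not the paper's implementation: the frame counts, keypoint counts, latent dimension, and the plain linear encoder/decoder are all placeholders for the learned networks SMF actually uses.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions (not from the paper):
T, K, D = 16, 24, 3        # frames, keypoints per frame, xyz coordinates
latent_dim = 32            # size of the per-frame "kinetic code"

# A sparse motion description: one flattened keypoint vector per frame.
x = rng.normal(size=(T, K * D))

# Stand-in linear encoder/decoder weights (a real model would learn these).
W_enc = rng.normal(size=(K * D, latent_dim)) * 0.1
W_dec = rng.normal(size=(latent_dim, K * D)) * 0.1

z = np.tanh(x @ W_enc)     # encode: per-frame kinetic codes, shape (T, latent_dim)
x_hat = z @ W_dec          # decode: reconstructed sparse motion, shape (T, K * D)

# Self-supervised objective: reconstruct the sparse input itself,
# so no dataset-specific annotations, templates, or rigs are needed.
loss = float(np.mean((x - x_hat) ** 2))
print(z.shape, x_hat.shape, loss)
```

The point of the sketch is the training signal: because the target is the sparse input itself, the latent space can be learned at scale from motion data alone, which is what the paper credits for its generalization to unseen motions and shapes.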

Business Value

Significantly speeds up and democratizes the animation process for games, films, and virtual experiences by reducing the need for manual rigging and complex motion capture cleanup.