arxiv_cv | Research Paper | Researchers in generative AI, Animators, Game developers, Robotics engineers

InfiniDreamer: Arbitrarily Long Human Motion Generation via Segment Score Distillation

📄 Abstract

We present InfiniDreamer, a novel framework for arbitrarily long human motion generation. InfiniDreamer addresses the limitations of current motion generation methods, which are typically restricted to short sequences due to the lack of long motion training data. To achieve this, we first generate sub-motions corresponding to each textual description and then assemble them into a coarse, extended sequence using randomly initialized transition segments. We then introduce an optimization-based method called Segment Score Distillation (SSD) to refine the entire long motion sequence. SSD is designed to utilize an existing motion prior, which is trained only on short clips, in a training-free manner. Specifically, SSD iteratively refines overlapping short segments sampled from the coarsely extended long motion sequence, progressively aligning them with the pre-trained motion diffusion prior. This process ensures local coherence within each segment, while the refined transitions between segments maintain global consistency across the entire sequence. Extensive qualitative and quantitative experiments validate the superiority of our framework, showcasing its ability to generate coherent, contextually aware motion sequences of arbitrary length.
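The pipeline described in the abstract (assemble sub-motions with random transitions, then repeatedly refine sampled overlapping segments against a short-clip prior) can be illustrated with a toy sketch. Everything below is an assumption for illustration only: the function names, the parameters, and especially `toy_prior_denoise`, a moving-average smoother standing in for the pre-trained motion diffusion prior. The update rule simply pulls each segment toward a denoised estimate; it does not reproduce the paper's actual SSD objective.

```python
import numpy as np

def assemble_coarse(sub_motions, transition_len, rng):
    """Concatenate sub-motions, inserting a randomly initialized
    transition segment between each consecutive pair."""
    dim = sub_motions[0].shape[1]
    parts = []
    for i, motion in enumerate(sub_motions):
        if i > 0:
            parts.append(rng.standard_normal((transition_len, dim)))
        parts.append(motion)
    return np.concatenate(parts, axis=0)

def toy_prior_denoise(noisy_segment):
    """Hypothetical stand-in for a short-clip motion prior:
    a 5-frame moving average along the time axis."""
    kernel = np.ones(5) / 5.0
    padded = np.pad(noisy_segment, ((2, 2), (0, 0)), mode="edge")
    columns = [np.convolve(padded[:, d], kernel, mode="valid")
               for d in range(noisy_segment.shape[1])]
    return np.stack(columns, axis=1)

def refine_segments(motion, seg_len, steps, lr, noise_scale, rng):
    """Repeatedly sample a short window, perturb it, and move it
    toward the prior's denoised estimate (simplified update)."""
    motion = motion.copy()
    for _ in range(steps):
        start = rng.integers(0, len(motion) - seg_len + 1)
        segment = motion[start:start + seg_len]
        noisy = segment + noise_scale * rng.standard_normal(segment.shape)
        target = toy_prior_denoise(noisy)
        motion[start:start + seg_len] = segment + lr * (target - segment)
    return motion

def roughness(motion):
    """Mean squared frame-to-frame difference (lower is smoother)."""
    return float(np.mean(np.diff(motion, axis=0) ** 2))
```

Because windows are sampled at random start positions, they overlap, so frames near a transition are refined in the context of both neighbouring sub-motions. In this toy setting that overlap is what spreads local smoothing across the whole sequence.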

Authors (3)

Wenjie Zhuo

Fan Ma

Hehe Fan

Submitted

November 27, 2024

arXiv Category

cs.CV

Key Contributions

Introduces InfiniDreamer for generating arbitrarily long human motion sequences, overcoming the limitations of short sequence generation in current methods. It achieves this by assembling sub-motions and using Segment Score Distillation (SSD) to refine the entire sequence in a training-free manner, leveraging existing short-clip motion priors.

Business Value

Enables longer, more realistic character animation for entertainment and virtual environments, and potentially for training robotic agents, while reducing manual effort.

Paper Metadata

Innovation Type

Algorithmic

Deployment Feasibility

Feasible for offline generation tasks in animation pipelines. Real-time generation might be challenging due to the iterative refinement process.

Limitations Addressed

Current motion generation methods are restricted to short sequences due to a lack of long motion training data and difficulties in maintaining coherence over extended durations.

Technical Tags

human motion generation, long sequence generation, diffusion models, autoregressive models, segment score distillation, motion prior, training-free refinement

Research Topics

Generative AI, Computer Vision, Human Motion Synthesis, Deep Learning

Methods & Architectures

InfiniDreamer framework, Segment Score Distillation (SSD), Optimization-based refinement, Training-free approach, Diffusion models (as prior), Autoregressive transformer (implied)

Applications & Tasks

Animation, Gaming, Virtual Reality, Robotics, Sequence Generation, Data Augmentation, Generating arbitrarily long human motion sequences, Refining motion sequences

Related Fields

Computer Graphics, Machine Learning, Animation

Keywords

human motion generation, long sequence, InfiniDreamer, diffusion models, autoregressive, segment score distillation, animation, VR, gaming, training-free

Academic Context

#Generative AI, #Computer Vision, #Human Motion Synthesis, #Deep Learning

Commercial Potential

Potential Products

Motion generation tools for 3D animation software, Procedural content generation systems for games, Realistic avatar animation platforms

Target Industries

Gaming, Film & Animation, Virtual Reality, Metaverse, Robotics

Use Case Examples

Generating long, natural-looking walking or dancing sequences for virtual characters. Creating diverse motion data for training humanoid robots.

Competitive Edge

Addresses the specific challenge of long-sequence generation, offering a training-free refinement method that leverages existing models.

Market Opportunity

Growing market for digital content creation, VR/AR, and simulation.

Revenue Models

Software licensing, API access, service-based generation.

Resource Requirements

Compute Needs

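High, especially during the iterative refinement phase (SSD). The refinement itself is training-free and reuses an existing motion prior; training such a prior from scratch, if none is available, would also require significant compute.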

Data Requirements

The motion prior is trained on datasets of short human motion clips; no long-motion training data is required.

Deployment Constraints

Computational cost for generation, potential need for fine-tuning for specific styles or characters.

Scalability

Scales to longer sequences, but generation time increases with length.
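As a rough illustration of why cost grows with length, the count of overlapping windows needed to cover a sequence once can be computed directly. This is a back-of-the-envelope sketch under assumed values: `num_windows`, the fixed window length, and the fixed stride are all hypothetical, and the paper's actual segment sampling schedule is not specified here.

```python
import math

def num_windows(total_frames, window, stride):
    """Number of fixed-length windows, spaced `stride` frames apart,
    needed to cover `total_frames` frames at least once."""
    if total_frames <= window:
        return 1
    return math.ceil((total_frames - window) / stride) + 1
```

With a window of 100 frames and a stride of 50, a 200-frame sequence needs 3 windows and a 1000-frame sequence needs 19, so a single covering pass grows roughly linearly with sequence length.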

Production Readiness

Maturity Level

Research

Time to Market

1-3 years for integration into professional tools.

Patent Potential

Moderate, for the Segment Score Distillation technique and the overall framework.
