Abstract
A deep understanding of kinematic structures and movable components is
essential for enabling robots to manipulate objects and model their own
articulated forms. Such understanding is captured in articulated object models,
which are central to tasks such as physical simulation, motion planning, and
policy learning. However, creating these models, particularly for complex
systems like robots or objects with high degrees of freedom (DoF), remains a
significant challenge. Existing methods typically rely on motion sequences or
strong assumptions from hand-curated datasets, which hinders scalability. In
this paper, we introduce Kinematify, an automated framework that synthesizes
articulated objects directly from arbitrary RGB images or text prompts. Our
method addresses two core challenges: (i) inferring kinematic topologies for
high-DoF objects and (ii) estimating joint parameters from static geometry. To
achieve this, we combine Monte Carlo tree search (MCTS) for structural inference
with geometry-driven optimization for joint reasoning, producing physically
consistent and functionally valid descriptions. We evaluate Kinematify on
diverse inputs from both synthetic and real-world environments, demonstrating
improvements in registration and kinematic topology accuracy over prior work.
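The abstract describes a two-stage idea: search over candidate kinematic topologies, scoring each candidate with a geometry-driven joint-fitting objective. The sketch below illustrates that idea only in spirit; it is not the paper's method. It uses a simplified random-rollout search rather than full MCTS with UCT, a toy revolute-joint score based on point spread, and hypothetical names (Part, fit_revolute_joint, topology_score, mcts_topology_search) that do not come from the paper.

```python
# Hypothetical sketch: search over kinematic trees, scoring each candidate
# with a simple geometry-driven joint-fitting objective. Illustrative only;
# the rollout search stands in for the paper's MCTS, and the joint score is
# a toy proxy for geometry-driven joint optimization.

import math
import random
from dataclasses import dataclass


@dataclass
class Part:
    """A rigid part approximated by a set of 3D surface points."""
    name: str
    points: list  # list of (x, y, z) tuples


def fit_revolute_joint(parent: Part, child: Part):
    """Toy joint estimation: place a revolute axis at the child's centroid and
    score it by how compact the child's geometry is around that point.
    A real system would optimize axis direction and position against the
    static geometry of both parts."""
    n = len(child.points)
    cx = sum(p[0] for p in child.points) / n
    cy = sum(p[1] for p in child.points) / n
    cz = sum(p[2] for p in child.points) / n
    spread = sum(math.dist(p, (cx, cy, cz)) for p in child.points) / n
    # Lower spread -> higher (less negative) score in this toy objective.
    return (cx, cy, cz), -spread


def topology_score(parts, edges):
    """Score a candidate kinematic tree (list of (parent, child) index pairs)
    as the sum of per-joint fitting scores."""
    return sum(fit_revolute_joint(parts[p], parts[c])[1] for p, c in edges)


def mcts_topology_search(parts, root=0, n_rollouts=200, seed=0):
    """MCTS-flavoured rollout search: repeatedly grow random spanning trees
    over the parts and keep the best-scoring one."""
    rng = random.Random(seed)
    best_edges, best_score = None, -math.inf
    for _ in range(n_rollouts):
        attached = [root]
        remaining = [i for i in range(len(parts)) if i != root]
        rng.shuffle(remaining)
        edges = []
        for child in remaining:
            parent = rng.choice(attached)  # rollout policy: random parent
            edges.append((parent, child))
            attached.append(child)
        score = topology_score(parts, edges)
        if score > best_score:
            best_edges, best_score = edges, score
    return best_edges, best_score


if __name__ == "__main__":
    # Three toy parts, each a small cloud of random points offset along x.
    rng = random.Random(1)
    parts = [
        Part(f"part{i}", [(rng.random() + i, rng.random(), rng.random())
                          for _ in range(20)])
        for i in range(3)
    ]
    edges, score = mcts_topology_search(parts)
    print("best kinematic tree:", edges, "score:", round(score, 3))
```

In this toy setup the tree structure and the joint scores are decoupled, so the search mainly illustrates the control flow; in the described framework, the structural search and the geometry-driven joint optimization would interact, with joint plausibility guiding which topologies are explored.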
Authors (6)
Jiawei Wang
Dingyou Wang
Jiaming Hu
Qixuan Zhang
Jingyi Yu
Lan Xu
Submitted
November 3, 2025
Key Contributions
Kinematify introduces an automated framework for synthesizing articulated objects directly from RGB images or text prompts, overcoming the limitations of motion sequences and hand-curated datasets. It addresses the core challenges of inferring kinematic topologies for high-DoF objects and estimating joint parameters from static geometry, enabling more scalable and versatile modeling of complex objects for robotics and simulation.
Business Value
Enables faster and more cost-effective creation of 3D assets for robotics simulation, game development, and virtual/augmented reality applications, reducing manual effort and improving model fidelity.