Abstract
A deep understanding of kinematic structures and movable components is
essential for enabling robots to manipulate objects and model their own
articulated forms. Such understanding is captured in articulated object models,
which are central to tasks such as physical simulation, motion planning, and
policy learning. However, creating these models, particularly for complex
systems like robots or objects with high degrees of freedom (DoF), remains a
significant challenge. Existing methods typically rely on motion sequences or
strong assumptions from hand-curated datasets, which hinders scalability. In
this paper, we introduce Kinematify, an automated framework that synthesizes
articulated objects directly from arbitrary RGB images or text prompts. Our
method addresses two core challenges: (i) inferring kinematic topologies for
high-DoF objects and (ii) estimating joint parameters from static geometry. To
achieve this, we combine Monte Carlo tree search (MCTS) for structural inference
with geometry-driven optimization for joint reasoning, producing physically
consistent and functionally valid descriptions. We evaluate Kinematify on
diverse inputs from both synthetic and real-world environments, demonstrating
improvements in registration and kinematic topology accuracy over prior work.
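The abstract describes a two-stage idea: search over candidate kinematic topologies, scoring each candidate with a geometry-driven joint-fitting objective. The sketch below illustrates that idea only in spirit; it is not the paper's method. It uses a simplified random-rollout search rather than full MCTS with UCT, a toy revolute-joint score based on point spread, and hypothetical names (Part, fit_revolute_joint, topology_score, mcts_topology_search) that do not come from the paper.

```python
# Hypothetical sketch: search over kinematic trees, scoring each candidate
# with a simple geometry-driven joint-fitting objective. Illustrative only;
# the rollout search stands in for the paper's MCTS, and the joint score is
# a toy proxy for geometry-driven joint optimization.

import math
import random
from dataclasses import dataclass


@dataclass
class Part:
    """A rigid part approximated by a set of 3D surface points."""
    name: str
    points: list  # list of (x, y, z) tuples


def fit_revolute_joint(parent: Part, child: Part):
    """Toy joint estimation: place a revolute axis at the child's centroid and
    score it by how compact the child's geometry is around that point.
    A real system would optimize axis direction and position against the
    static geometry of both parts."""
    n = len(child.points)
    cx = sum(p[0] for p in child.points) / n
    cy = sum(p[1] for p in child.points) / n
    cz = sum(p[2] for p in child.points) / n
    spread = sum(math.dist(p, (cx, cy, cz)) for p in child.points) / n
    # Lower spread -> higher (less negative) score in this toy objective.
    return (cx, cy, cz), -spread


def topology_score(parts, edges):
    """Score a candidate kinematic tree (list of (parent, child) index pairs)
    as the sum of per-joint fitting scores."""
    return sum(fit_revolute_joint(parts[p], parts[c])[1] for p, c in edges)


def mcts_topology_search(parts, root=0, n_rollouts=200, seed=0):
    """MCTS-flavoured rollout search: repeatedly grow random spanning trees
    over the parts and keep the best-scoring one."""
    rng = random.Random(seed)
    best_edges, best_score = None, -math.inf
    for _ in range(n_rollouts):
        attached = [root]
        remaining = [i for i in range(len(parts)) if i != root]
        rng.shuffle(remaining)
        edges = []
        for child in remaining:
            parent = rng.choice(attached)  # rollout policy: random parent
            edges.append((parent, child))
            attached.append(child)
        score = topology_score(parts, edges)
        if score > best_score:
            best_edges, best_score = edges, score
    return best_edges, best_score


if __name__ == "__main__":
    # Three toy parts, each a small cloud of random points offset along x.
    rng = random.Random(1)
    parts = [
        Part(f"part{i}", [(rng.random() + i, rng.random(), rng.random())
                          for _ in range(20)])
        for i in range(3)
    ]
    edges, score = mcts_topology_search(parts)
    print("best kinematic tree:", edges, "score:", round(score, 3))
```

In this toy setup the tree structure and the joint scores are decoupled, so the search mainly illustrates the control flow; in the described framework, the structural search and the geometry-driven joint optimization would interact, with joint plausibility guiding which topologies are explored.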
Authors (6)
Jiawei Wang
Dingyou Wang
Jiaming Hu
Qixuan Zhang
Jingyi Yu
Lan Xu
Submitted
November 3, 2025
Key Contributions
Kinematify introduces an automated framework for synthesizing articulated objects directly from RGB images or text prompts, overcoming the limitations of motion sequences and hand-curated datasets. It addresses the core challenges of inferring kinematic topologies for high-DoF objects and estimating joint parameters from static geometry, enabling more scalable and versatile modeling of complex objects for robotics and simulation.
Business Value
Enables faster and more cost-effective creation of 3D assets for robotics simulation, game development, and virtual/augmented reality applications, reducing manual effort and improving model fidelity.