MoMaGen: Generating Demonstrations under Soft and Hard Constraints for Multi-Step Bimanual Mobile Manipulation

Abstract

Imitation learning from large-scale, diverse human demonstrations has proven effective for training robots, but collecting such data is costly and time-consuming. This challenge is amplified for multi-step bimanual mobile manipulation, where humans must teleoperate both a mobile base and two high-degree-of-freedom arms. Prior automated data generation frameworks have addressed static bimanual manipulation by augmenting a few human demonstrations in simulation, but they fall short for mobile settings due to two key challenges: (1) determining base placement to ensure reachability, and (2) positioning the camera to provide sufficient visibility for visuomotor policies. To address these issues, we introduce MoMaGen, which formulates data generation as a constrained optimization problem that enforces hard constraints (e.g., reachability) while balancing soft constraints (e.g., visibility during navigation). This formulation generalizes prior approaches and provides a principled foundation for future methods. We evaluate MoMaGen on four multi-step bimanual mobile manipulation tasks and show that it generates significantly more diverse datasets than existing methods. Leveraging this diversity, MoMaGen can train successful imitation learning policies from a single source demonstration, and these policies can be fine-tuned with as few as 40 real-world demonstrations to achieve deployment on physical robotic hardware. More details are available at our project page: momagen.github.io.
Authors (14): Chengshu Li, Mengdi Xu, Arpit Bahety, Hang Yin, Yunfan Jiang, Huang Huang, +8 more
Submitted: October 21, 2025
arXiv Category: cs.RO

Key Contributions

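MoMaGen is a framework that automatically generates diverse demonstrations for multi-step bimanual mobile manipulation from a small number of human source demonstrations, in some cases a single one. It formulates data generation as a constrained optimization problem that enforces hard constraints (e.g., reachability) while balancing soft constraints (e.g., visibility during navigation). This addresses two challenges specific to mobile settings: choosing a base placement that ensures reachability, and positioning the camera so that visuomotor policies have sufficient visibility. The formulation also generalizes prior data generation approaches, which were limited to static bimanual manipulation.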
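
The split between hard and soft constraints can be illustrated with a small sketch. The example below is not the authors' algorithm: it uses plain rejection sampling on a 2D plane as a stand-in for the paper's optimization, and every name and number in it (BasePose, MIN_REACH, MAX_REACH, HALF_FOV, sample_base_pose) is a hypothetical choice made for illustration. Candidate base poses that violate the hard constraint (the target is out of reach) are discarded, and among the remaining poses the one with the lowest soft cost (the target is far from the camera axis) is kept.

```python
import math
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class BasePose:
    x: float
    y: float
    yaw: float  # heading of the base and its camera, in radians


# Hypothetical parameters, chosen only for illustration.
MIN_REACH = 0.3              # metres; closer than this the base is too near
MAX_REACH = 0.9              # metres; farther than this the arm cannot reach
HALF_FOV = math.radians(30)  # half of the camera's horizontal field of view


def is_reachable(pose, target):
    """Hard constraint: the target must lie within the reach annulus."""
    dist = math.hypot(target[0] - pose.x, target[1] - pose.y)
    return MIN_REACH <= dist <= MAX_REACH


def visibility_cost(pose, target):
    """Soft constraint: angular offset of the target from the camera axis,
    scaled so that 1.0 corresponds to the edge of the field of view."""
    bearing = math.atan2(target[1] - pose.y, target[0] - pose.x)
    # Wrap the angle difference into [-pi, pi) before taking its magnitude.
    offset = abs((bearing - pose.yaw + math.pi) % (2 * math.pi) - math.pi)
    return offset / HALF_FOV


def sample_base_pose(target, n_samples=500, seed=0):
    """Sample poses near the target, reject those that break the hard
    constraint, and return the feasible pose with the lowest soft cost."""
    rng = random.Random(seed)
    best, best_cost = None, math.inf
    for _ in range(n_samples):
        pose = BasePose(
            x=target[0] + rng.uniform(-1.5, 1.5),
            y=target[1] + rng.uniform(-1.5, 1.5),
            yaw=rng.uniform(-math.pi, math.pi),
        )
        if not is_reachable(pose, target):
            continue  # hard constraints are never traded off
        cost = visibility_cost(pose, target)
        if cost < best_cost:
            best, best_cost = pose, cost
    return best, best_cost
```

Calling sample_base_pose((2.0, 1.0)) returns a pose that satisfies the reach check together with its visibility cost. Because each random seed yields a different feasible pose, a scheme of this kind also hints at how constraint-based generation can produce varied demonstrations from one source demonstration.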

Business Value

Reducing the data-collection bottleneck could accelerate the development and deployment of bimanual mobile manipulation systems, with potential applications in logistics, manufacturing, and service industries.