📄 Abstract
Imitation learning from large-scale, diverse human demonstrations has proven
effective for training robots, but collecting such data is costly and
time-consuming. This challenge is amplified for multi-step bimanual mobile
manipulation, where humans must teleoperate both a mobile base and two
high-degree-of-freedom arms. Prior automated data generation frameworks have
addressed static bimanual manipulation by augmenting a few human demonstrations
in simulation, but they fall short for mobile settings due to two key
challenges: (1) determining base placement to ensure reachability, and (2)
positioning the camera to provide sufficient visibility for visuomotor
policies. To address these issues, we introduce MoMaGen, which formulates data
generation as a constrained optimization problem that enforces hard constraints
(e.g., reachability) while balancing soft constraints (e.g., visibility during
navigation). This formulation generalizes prior approaches and provides a
principled foundation for future methods. We evaluate MoMaGen on four
multi-step bimanual mobile manipulation tasks and show that it generates
significantly more diverse datasets than existing methods. Leveraging this
diversity, MoMaGen can train successful imitation learning policies from a
single source demonstration, and these policies can be fine-tuned with as few
as 40 real-world demonstrations to achieve deployment on physical robotic
hardware. More details are available at our project page: momagen.github.io.
Authors (14)
Chengshu Li
Mengdi Xu
Arpit Bahety
Hang Yin
Yunfan Jiang
Huang Huang
+8 more
Submitted
October 21, 2025
Key Contributions
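MoMaGen is a framework that generates diverse demonstration data for multi-step bimanual mobile manipulation from a small number of human demonstrations, by formulating data generation as a constrained optimization problem. The formulation enforces hard constraints (e.g., reachability) while balancing soft constraints (e.g., visibility during navigation), which addresses the two challenges specific to mobile settings: choosing a base placement that keeps targets reachable, and positioning the camera so that visuomotor policies have sufficient visibility. It also generalizes prior data generation approaches, which targeted static bimanual manipulation.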
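The hard/soft constraint split can be illustrated with a toy base-pose sampler. This is a minimal sketch under stated assumptions, not MoMaGen's implementation: the function names, the fixed reach radius, the heading-based visibility cost, and the rejection-sampling strategy are all hypothetical choices made for illustration. Candidates that violate the hard constraint (reachability) are discarded, and the soft constraint (visibility) is minimized among the rest.

```python
import math
import random

# Toy illustration only: all names and parameters below are hypothetical,
# not taken from the MoMaGen paper or codebase.

def reachable(base_xy, target_xy, max_reach=0.8):
    """Hard constraint (toy): the target lies within an assumed arm reach."""
    return math.dist(base_xy, target_xy) <= max_reach

def visibility_cost(base_xy, base_yaw, target_xy):
    """Soft constraint (toy): absolute angle between the camera heading
    (assumed equal to the base yaw) and the direction to the target."""
    bearing = math.atan2(target_xy[1] - base_xy[1], target_xy[0] - base_xy[0])
    # Wrap the angular difference into [-pi, pi) before taking its magnitude.
    diff = (bearing - base_yaw + math.pi) % (2 * math.pi) - math.pi
    return abs(diff)

def sample_base_pose(target_xy, n_samples=200, seed=0):
    """Rejection-sample base poses (x, y, yaw) around the target: discard
    candidates that break the hard constraint, keep the lowest soft cost."""
    rng = random.Random(seed)
    best = None
    for _ in range(n_samples):
        x = target_xy[0] + rng.uniform(-1.5, 1.5)
        y = target_xy[1] + rng.uniform(-1.5, 1.5)
        yaw = rng.uniform(-math.pi, math.pi)
        if not reachable((x, y), target_xy):
            continue  # hard constraint violated: never accept this candidate
        cost = visibility_cost((x, y), yaw, target_xy)
        if best is None or cost < best[0]:
            best = (cost, (x, y, yaw))
    return None if best is None else best[1]

if __name__ == "__main__":
    print(sample_base_pose((2.0, 1.0)))
```

In this sketch a pose is never returned unless it satisfies reachability, whereas a poor viewing angle only raises the cost. That asymmetry is the distinction between hard and soft constraints that the formulation describes.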
Business Value
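Could accelerate the development and deployment of bimanual mobile manipulation robots by reducing the data collection bottleneck: policies can be trained from a single source demonstration and fine-tuned with as few as 40 real-world demonstrations. Potential application areas include logistics, manufacturing, and service industries.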