Abstract

Learned visuomotor policies are capable of performing increasingly complex
manipulation tasks. However, most of these policies are trained on data
collected from limited robot positions and camera viewpoints. This leads to
poor generalization to novel robot positions, which limits the use of these
policies on mobile platforms, especially for precise tasks like pressing
buttons or turning faucets. In this work, we formulate the policy mobilization
problem: find a mobile robot base pose in a novel environment that is in
distribution with respect to a manipulation policy trained on a limited set of
camera viewpoints. Unlike retraining the policy itself to be more robust to
unseen robot base pose initializations, policy mobilization decouples
navigation from manipulation and thus requires no additional
demonstrations. Crucially, this problem formulation complements existing
efforts to improve manipulation policy robustness to novel viewpoints and
remains compatible with them. We propose a novel approach for policy
mobilization that bridges navigation and manipulation by optimizing the robot's
base pose to align with an in-distribution base pose for a learned policy. Our
approach utilizes 3D Gaussian Splatting for novel view synthesis, a score
function to evaluate pose suitability, and sampling-based optimization to
identify optimal robot poses. To understand policy mobilization in more depth,
we also introduce the Mobi-$\pi$ framework, which includes: (1) metrics that
quantify the difficulty of mobilizing a given policy, (2) a suite of simulated
mobile manipulation tasks based on RoboCasa to evaluate policy mobilization,
and (3) visualization tools for analysis. In both our simulated task suite and
the real world, we show that our approach outperforms baselines, demonstrating
its effectiveness for policy mobilization.
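The abstract does not specify the sampler, so the following is only a minimal sketch of what the sampling-based base pose optimization could look like, using a cross-entropy-style loop as one plausible choice. Here `render_fn` (standing in for a 3D Gaussian Splatting novel-view renderer) and `score_fn` (the learned pose-suitability score) are hypothetical interfaces, and the SE(2) parameterization (x, y, yaw) is an assumption, not the paper's stated design.

```python
import numpy as np

def mobilize_base_pose(render_fn, score_fn, bounds,
                       n_samples=256, n_iters=5, elite_frac=0.1, seed=0):
    """Cross-entropy-style search for an in-distribution base pose.

    render_fn(pose) -> observation rendered from the candidate base pose
                       (hypothetical stand-in for a 3DGS renderer)
    score_fn(obs)   -> scalar; higher means the view looks more
                       in-distribution for the manipulation policy
    bounds          -> (low, high) arrays bounding (x, y, yaw)
    """
    rng = np.random.default_rng(seed)
    lo, hi = np.asarray(bounds[0], float), np.asarray(bounds[1], float)
    mean, std = (lo + hi) / 2.0, (hi - lo) / 2.0
    best_pose, best_score = None, -np.inf

    for _ in range(n_iters):
        # Sample candidate base poses and keep them inside the workspace.
        poses = np.clip(rng.normal(mean, std, size=(n_samples, 3)), lo, hi)
        # Score each candidate by rendering a view and evaluating it.
        scores = np.array([score_fn(render_fn(p)) for p in poses])
        # Refit the sampling distribution to the top-scoring elites.
        k = max(1, int(elite_frac * n_samples))
        elite = poses[np.argsort(scores)[-k:]]
        mean, std = elite.mean(axis=0), elite.std(axis=0) + 1e-6
        # Track the best pose seen so far.
        i = int(np.argmax(scores))
        if scores[i] > best_score:
            best_score, best_pose = float(scores[i]), poses[i].copy()

    return best_pose, best_score

if __name__ == "__main__":
    # Toy check with dummy stand-ins: the "image" is just the pose itself,
    # and the score peaks at a known target pose.
    target = np.array([1.0, 0.5, 0.3])
    render = lambda pose: pose
    score = lambda obs: -np.linalg.norm(obs - target)
    pose, s = mobilize_base_pose(render, score,
                                 ([0.0, 0.0, -np.pi], [2.0, 1.0, np.pi]))
    print("best pose:", pose, "score:", s)
```

A population-based sampler like this keeps renderer queries independent and easy to batch, which fits a setting where each candidate pose requires one novel-view render plus one score evaluation; the paper's actual optimizer, score network, and renderer details are in the full text.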