📄 Abstract
Vision-Language-Action models (VLAs) have demonstrated remarkable performance
on complex robotic manipulation tasks through imitation learning. However,
existing imitation learning datasets contain only successful trajectories and
lack failure or recovery data, especially for out-of-distribution (OOD) states
where the robot deviates from the main policy due to minor perturbations or
errors. As a result, VLA models struggle with states that deviate from the
training distribution. To this end, we propose RESample, an automated OOD data
augmentation framework based on exploratory sampling. Specifically, we first
leverage offline reinforcement learning to obtain an action-value network that
accurately identifies sub-optimal actions under the current manipulation
policy. We further sample potential OOD states from trajectories via rollout,
and design an exploratory sampling mechanism that adaptively incorporates these
action proxies into the training dataset to ensure efficiency. Subsequently,
our framework explicitly encourages the VLAs to recover from OOD states and
enhances their robustness against distributional shifts. We conduct extensive
experiments on the LIBERO benchmark as well as real-world robotic manipulation
tasks, demonstrating that RESample consistently improves the stability and
generalization ability of VLA models.
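
The pipeline described in the abstract (score actions with an action-value function, step into potential OOD states, then collect recovery data) can be illustrated with a toy sketch. This is not the authors' implementation. Everything here is a hypothetical stand-in: a one-dimensional state, a hand-written `q_value` in place of the learned action-value network, a greedy `policy` in place of a VLA, and made-up `margin`, `explore_prob`, and `horizon` parameters. The exploration probability is fixed, whereas the abstract describes an adaptive mechanism whose details are not given here.

```python
import random

# Hypothetical discrete action set for a 1-D toy task whose goal is state 0.
ACTIONS = [-1.0, -0.5, 0.0, 0.5, 1.0]


def q_value(state, action):
    # Stand-in for a learned action-value network (assumption):
    # actions that bring the state closer to 0 score higher.
    return -abs(state + action)


def policy(state):
    # Stand-in for the manipulation policy (assumption): greedy w.r.t. q_value.
    return max(ACTIONS, key=lambda a: q_value(state, a))


def suboptimal_actions(state, margin):
    # Actions whose value is worse than the policy action's by more than `margin`.
    best = q_value(state, policy(state))
    return [a for a in ACTIONS if best - q_value(state, a) > margin]


def augment(trajectory, margin=0.4, explore_prob=0.5, horizon=5, seed=0):
    # Returns extra (state, action) pairs recorded while recovering
    # from states reached by a deliberately sub-optimal action.
    rng = random.Random(seed)
    extra = []
    for state in trajectory:
        candidates = suboptimal_actions(state, margin)
        if not candidates or rng.random() > explore_prob:
            continue
        # One sub-optimal step leads to a potential OOD state.
        s = state + rng.choice(candidates)
        # Roll out the policy from there and keep the recovery steps.
        for _ in range(horizon):
            a = policy(s)
            extra.append((s, a))
            s = s + a
    return extra


if __name__ == "__main__":
    for s, a in augment([2.0, 1.0, 0.5], explore_prob=1.0):
        print(f"state={s:+.1f} action={a:+.1f}")
```

In this sketch the recovery labels come from the same greedy policy, which is only sensible because the toy `q_value` is exact. How recovery actions are obtained in RESample itself is not stated in the abstract.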
Authors (8)
Yuquan Xue
Guanxing Lu
Zhenyu Wu
Chuanrui Zhang
Bofang Jia
Zhengyi Gu
+2 more
Submitted
October 20, 2025
Key Contributions
Proposes RESample, an automated Out-of-Distribution (OOD) data augmentation framework for robotic manipulation that uses exploratory sampling. By leveraging offline RL to identify sub-optimal actions and adaptively incorporating sampled OOD states, RESample enhances the robustness of Vision-Language-Action models to deviations from the training distribution.
Business Value
Enables the development of more reliable and adaptable robots that can handle unexpected situations and recover from errors, crucial for safe and efficient deployment in real-world environments.