Fast-SmartWay is an end-to-end zero-shot Vision-and-Language Navigation (VLN) framework that eliminates the need for panoramic views and waypoint predictors, relying only on three frontal RGB-D images. An Uncertainty-Aware Reasoning module strengthens decision robustness, enabling multimodal large language models (MLLMs) to predict actions directly and achieve significantly improved performance in both simulated and real-robot environments.
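The decision loop described above can be sketched as follows. This is a minimal illustrative sketch, not the paper's actual implementation: the function names (`query_mllm`, `decide`), the discrete action set, and the confidence threshold are all assumptions introduced for clarity.

```python
# Hypothetical sketch: uncertainty-aware action selection for zero-shot
# VLN from three frontal RGB-D views. All names and values here are
# illustrative assumptions, not the framework's real API.
from dataclasses import dataclass
from typing import List

ACTIONS = ["move_forward", "turn_left", "turn_right", "stop"]

@dataclass
class ActionDecision:
    action: str
    confidence: float  # model's self-reported confidence in [0, 1]

def query_mllm(instruction: str, rgbd_views: List[bytes]) -> ActionDecision:
    """Placeholder for an MLLM call that maps an instruction plus three
    frontal RGB-D images directly to a discrete action."""
    # A real system would send the prompt and images to the model;
    # this stub returns a fixed answer so the sketch runs as-is.
    return ActionDecision(action="move_forward", confidence=0.9)

def decide(instruction: str, rgbd_views: List[bytes],
           threshold: float = 0.6) -> str:
    """Uncertainty-aware step: accept the predicted action only when the
    model is confident; otherwise take a cautious information-gathering
    action instead of committing to an uncertain move."""
    decision = query_mllm(instruction, rgbd_views)
    if decision.confidence >= threshold:
        return decision.action
    # Low confidence: rotate to re-observe the scene before deciding.
    return "turn_left"

print(decide("Walk to the kitchen", [b"", b"", b""]))
```

The key design point the module captures is that the model's uncertainty is consulted before an action is executed, so low-confidence predictions trigger further observation rather than a blind step.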
This enables more responsive and adaptable robots for tasks such as indoor navigation, delivery, and assistance, while reducing development complexity and improving the user experience.