Redirecting to original paper in 30 seconds...
Click below to go immediately or wait for automatic redirect
📄 Abstract
Abstract: Recent advancements in legged robot perceptive locomotion have shown
promising progress. However, terrain-aware humanoid locomotion remains largely
constrained to two paradigms: depth image-based end-to-end learning and
elevation map-based methods. The former suffers from limited training
efficiency and a significant sim-to-real gap in depth perception, while the
latter depends heavily on multiple vision sensors and localization systems,
resulting in latency and reduced robustness. To overcome these challenges, we
propose a novel framework that tightly integrates three key components: (1)
Terrain-Aware Locomotion Policy with a Blind Backbone, which leverages
pre-trained elevation map-based perception to guide reinforcement learning with
minimal visual input; (2) Multi-Modality Cross-Attention Transformer, which
reconstructs structured terrain representations from noisy depth images; (3)
Realistic Depth Images Synthetic Method, which employs self-occlusion-aware ray
casting and noise-aware modeling to synthesize realistic depth observations,
achieving over 30\% reduction in terrain reconstruction error. This combination
enables efficient policy training with limited data and hardware resources,
while preserving critical terrain features essential for generalization. We
validate our framework on a full-sized humanoid robot, demonstrating agile and
adaptive locomotion across diverse and challenging terrains.
Key Contributions
This paper proposes a novel framework for depth-only perceptive humanoid locomotion that integrates a terrain-aware locomotion policy with a blind backbone, a multi-modality cross-attention transformer for terrain reconstruction from noisy depth images, and a realistic depth image synthesis method. This approach aims to overcome the limitations of existing paradigms by improving training efficiency and reducing the sim-to-real gap in depth perception.
Business Value
Enables more robust and adaptable humanoid robots capable of navigating complex and unknown terrains, crucial for applications in disaster response, exploration, and domestic assistance.