📄 Abstract
State-of-the-art perceptive Reinforcement Learning controllers for legged
robots either (i) impose oscillator or IK-based gait priors that constrain the
action space, add bias to the policy optimization, and reduce adaptability
across robot morphologies, or (ii) operate "blind", struggling to anticipate
hind-leg terrain and remaining brittle to noise. In this paper, we
propose Phase-Guided Terrain Traversal (PGTT), a perception-aware deep-RL
approach that overcomes these limitations by enforcing gait structure purely
through reward shaping, thereby reducing inductive bias in policy learning
compared to oscillator/IK-conditioned action priors. PGTT encodes per-leg phase
as a cubic Hermite spline that adapts swing height to local heightmap
statistics and adds a swing-phase contact penalty, while the policy acts
directly in joint space, supporting morphology-agnostic deployment. Trained in
MuJoCo (MJX) on procedurally generated stair-like terrains with curriculum and
domain randomization, PGTT achieves the highest success under push disturbances
(median +7.5% vs. the next best method) and on discrete obstacles (+9%), with
comparable velocity tracking, and converges to an effective policy roughly 2x
faster than strong end-to-end baselines. We validate PGTT on a Unitree Go2
using a real-time LiDAR elevation-to-heightmap pipeline, and we report
preliminary results on ANYmal-C obtained with the same hyperparameters. These
findings indicate that terrain-adaptive, phase-guided reward shaping is a
simple and general mechanism for robust perceptive locomotion across platforms.
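To make the phase-guided reward idea concrete, here is a minimal Python sketch of a per-leg swing-height target built from a cubic Hermite spline, together with a swing-phase contact penalty. It is an illustration under stated assumptions, not the authors' implementation: the function names, the relief-based apex rule (`swing_apex`), the zero end tangents, the 50% swing fraction, and all numeric defaults are hypothetical and do not come from the abstract.

```python
def hermite(p0, p1, m0, m1, t):
    """Standard cubic Hermite interpolation between p0 and p1 for t in [0, 1]."""
    t2, t3 = t * t, t * t * t
    h00 = 2 * t3 - 3 * t2 + 1   # weight of start value
    h10 = t3 - 2 * t2 + t       # weight of start tangent
    h01 = -2 * t3 + 3 * t2      # weight of end value
    h11 = t3 - t2               # weight of end tangent
    return h00 * p0 + h10 * m0 + h01 * p1 + h11 * m1


def swing_apex(local_heights, base_clearance=0.05, gain=1.0):
    """Hypothetical adaptation rule: raise the apex with local terrain relief.

    local_heights: heightmap samples (metres) around the foot.
    """
    relief = max(local_heights) - min(local_heights)
    return base_clearance + gain * relief


def target_foot_height(phase, apex, swing_fraction=0.5):
    """Reference foot height for a leg phase in [0, 1).

    Assumption: swing occupies the first `swing_fraction` of the cycle and
    the foot rests at height 0 during stance. Two Hermite segments with zero
    end tangents lift the foot to `apex` and lower it again.
    """
    phase = phase % 1.0
    if phase >= swing_fraction:
        return 0.0                                    # stance
    s = phase / swing_fraction                        # normalised swing progress
    if s < 0.5:
        return hermite(0.0, apex, 0.0, 0.0, 2.0 * s)  # lift-off to apex
    return hermite(apex, 0.0, 0.0, 0.0, 2.0 * s - 1.0)  # apex to touchdown


def swing_contact_penalty(phase, in_contact, swing_fraction=0.5, weight=1.0):
    """Negative reward term when a foot touches the ground during swing."""
    in_swing = (phase % 1.0) < swing_fraction
    return -weight if (in_swing and in_contact) else 0.0
```

In a reward function, a term such as the squared error between the measured foot height and `target_foot_height(...)` could then be combined with `swing_contact_penalty(...)`, leaving the policy free to output joint-space actions directly.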
Authors (3)
Alexandros Ntagkas
Chairi Kiourt
Konstantinos Chatzilygeroudis
Submitted
October 21, 2025
Key Contributions
Proposes Phase-Guided Terrain Traversal (PGTT), a perception-aware deep-RL approach for legged locomotion that overcomes limitations of fixed gait priors and blind operation. PGTT enforces gait structure via reward shaping, reducing inductive bias, and adapts swing height to terrain statistics, enabling morphology-agnostic deployment.
Business Value
Enables the development of more versatile and reliable legged robots for exploration, inspection, and logistics in unstructured environments.