📄 Abstract
Model-free reinforcement learning (RL) has enabled adaptable and agile quadruped locomotion; however, policies often converge to a single gait, leading to suboptimal performance. Traditionally, Model Predictive Control (MPC) has been used extensively to obtain task-specific optimal policies, but it lacks the ability to adapt to varying environments. To address these limitations, we propose an optimization framework for real-time gait adaptation in a continuous gait space, combining the Model Predictive Path Integral (MPPI) algorithm with a Dreamer module to produce adaptive and optimal policies for quadruped locomotion. At each time step, MPPI jointly optimizes the actions and gait variables using a learned Dreamer reward that promotes velocity tracking, energy efficiency, stability, and smooth transitions, while penalizing abrupt gait changes. A learned value function is incorporated as a terminal reward, extending the formulation to an infinite-horizon planner. We evaluate our framework in simulation on the Unitree Go1, demonstrating an average reduction of up to 36.48% in energy consumption across varying target speeds, while maintaining accurate tracking and adaptive, task-appropriate gaits.
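For intuition, the planner described above maps onto a standard MPPI sampling loop with the learned components slotted in. The sketch below is illustrative only: `mppi_gait_step`, its parameter names, and the `dynamics`/`reward_fn`/`value_fn` callables are hypothetical stand-ins for the paper's Dreamer world model, learned reward, and value function, not the authors' implementation.

```python
import numpy as np

def mppi_gait_step(state, nominal, dynamics, reward_fn, value_fn,
                   num_samples=256, sigma=0.1, temperature=1.0):
    """One MPPI planning step over joint [action, gait-variable] sequences.

    `nominal` has shape (horizon, dim): the previous solution, warm-started.
    `dynamics`, `reward_fn`, `value_fn` stand in for the learned Dreamer
    world model, reward, and value function (hypothetical interfaces).
    """
    horizon, dim = nominal.shape
    # Sample perturbed candidate sequences around the nominal plan.
    noise = sigma * np.random.randn(num_samples, horizon, dim)
    candidates = nominal[None] + noise

    returns = np.zeros(num_samples)
    for k in range(num_samples):
        s = state
        for t in range(horizon):
            # Learned reward: tracking, efficiency, stability, and a
            # penalty on abrupt gait changes, per the abstract.
            returns[k] += reward_fn(s, candidates[k, t])
            s = dynamics(s, candidates[k, t])
        # Learned value function as terminal reward -> infinite-horizon planner.
        returns[k] += value_fn(s)

    # Path-integral (softmax) weighting of the sampled sequences.
    weights = np.exp((returns - returns.max()) / temperature)
    weights /= weights.sum()
    plan = (weights[:, None, None] * candidates).sum(axis=0)
    # Execute the first joint [action, gait] step; shift the rest as warm start.
    return plan[0], np.vstack([plan[1:], plan[-1:]])
```

In practice, rollouts of this kind would run batched inside the Dreamer latent space rather than step-by-step as written here, but the softmax weighting, terminal value bootstrap, and warm-started nominal sequence are the core structure the abstract describes.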
Authors (3)
Prakrut Kotecha
Ganga Nair B
Shishir Kolathaya
Submitted
October 23, 2025
Key Contributions
Proposes an optimization framework combining MPPI and Dreamer for real-time gait adaptation in quadrupeds, enabling policies to operate in a continuous gait space. The approach yields adaptive, optimal policies that balance velocity tracking, energy efficiency, and stability, addressing the limitations of both traditional RL (convergence to a single gait) and MPC (limited adaptability to varying environments).
Business Value
Enables more versatile and robust robotic platforms for tasks like exploration, inspection, and delivery in complex environments.