Redirecting to original paper in 30 seconds...
Click below to go immediately or wait for automatic redirect
π Abstract
Abstract: Autonomous vehicles (AV) offer a cost-effective solution for scientific
missions such as underwater tracking. Recently, reinforcement learning (RL) has
emerged as a powerful method for controlling AVs in complex marine
environments. However, scaling these techniques to a fleet--essential for
multi-target tracking or targets with rapid, unpredictable motion--presents
significant computational challenges. Multi-Agent Reinforcement Learning (MARL)
is notoriously sample-inefficient, and while high-fidelity simulators like
Gazebo's LRAUV provide 100x faster-than-real-time single-robot simulations,
they offer no significant speedup for multi-vehicle scenarios, making MARL
training impractical. To address these limitations, we propose an iterative
distillation method that transfers high-fidelity simulations into a simplified,
GPU-accelerated environment while preserving high-level dynamics. This approach
achieves up to a 30,000x speedup over Gazebo through parallelization, enabling
efficient training via end-to-end GPU acceleration. Additionally, we introduce
a novel Transformer-based architecture (TransfMAPPO) that learns multi-agent
policies invariant to the number of agents and targets, significantly improving
sample efficiency. Following large-scale curriculum learning conducted entirely
on GPU, we perform extensive evaluations in Gazebo, demonstrating that our
method maintains tracking errors below 5 meters over extended durations, even
in the presence of multiple fast-moving targets. This work bridges the gap
between large-scale MARL training and high-fidelity deployment, providing a
scalable framework for autonomous fleet control in real-world sea missions.
Authors (3)
Matteo Gallici
Ivan Masmitja
Mario MartΓn
Key Contributions
This paper proposes an iterative distillation method to scale Multi-Agent Reinforcement Learning (MARL) for underwater acoustic tracking using autonomous vehicles. By transferring high-fidelity simulations into a simplified, GPU-accelerated environment, the approach achieves up to a 30,000x speedup over traditional simulators, making MARL training practical for multi-vehicle scenarios.
Business Value
Enables cost-effective and efficient multi-vehicle operations for underwater missions, such as scientific exploration, resource surveying, and infrastructure inspection, leading to faster data acquisition and reduced operational costs.