Abstract
In this paper, we study the problem of reinforcement learning in multi-agent systems where communication among agents is limited. We develop a decentralized actor-critic learning framework in which each agent performs several local updates of its policy and value function, the latter approximated by a multi-layer neural network, before exchanging information with its neighbors. This local training strategy substantially reduces the communication burden while maintaining coordination across the network. We establish a finite-time convergence analysis for the algorithm under Markovian sampling. Specifically, to attain an $\varepsilon$-accurate stationary point, the sample complexity is of order $\mathcal{O}(\varepsilon^{-3})$ and the communication complexity is of order $\mathcal{O}(\varepsilon^{-1}\tau^{-1})$, where $\tau$ denotes the number of local training steps. We also show how the final error bound depends on the neural network's approximation quality. Numerical experiments in a cooperative control setting illustrate and validate the theoretical findings.
Authors (4)
Xiaoxing Ren
Nicola Bastianello
Thomas Parisini
Andreas A. Malikopoulos
Submitted
October 22, 2025
Key Contributions
This paper proposes a communication-efficient decentralized actor-critic algorithm for multi-agent reinforcement learning with limited communication. Each agent performs several local updates before exchanging information with its neighbors, significantly reducing the communication burden while maintaining coordination. A finite-time convergence analysis is provided, with sample complexity of $\mathcal{O}(\varepsilon^{-3})$ and communication complexity of $\mathcal{O}(\varepsilon^{-1}\tau^{-1})$, where $\tau$ is the number of local training steps per communication round.
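To make the local-updates-then-communicate pattern concrete, below is a minimal Python sketch of that loop under simplifying assumptions: a linear critic and a random-feature "environment" stand in for the paper's neural-network value function and cooperative control task, and the ring mixing matrix is an illustrative choice of communication graph, not the authors' setup.

```python
# Minimal sketch (not the authors' implementation): each agent runs tau local
# actor-critic updates on its own samples, then averages parameters with its
# graph neighbors in a single communication round.
import numpy as np

rng = np.random.default_rng(0)

N, d, tau = 4, 3, 5                      # agents, feature dimension, local steps per round
rounds = 20
alpha, beta, gamma = 1e-2, 1e-2, 0.95    # actor step, critic step, discount factor

# Ring communication graph with doubly stochastic mixing weights W (assumption)
W = np.zeros((N, N))
for i in range(N):
    W[i, i] = 0.5
    W[i, (i + 1) % N] = 0.25
    W[i, (i - 1) % N] = 0.25

theta = rng.normal(size=(N, d))   # per-agent policy parameters
w = rng.normal(size=(N, d))       # per-agent critic parameters

def sample_transition(rng, d):
    """Placeholder environment step: random features and reward."""
    phi, phi_next = rng.normal(size=d), rng.normal(size=d)
    reward = rng.normal()
    return phi, phi_next, reward

for r in range(rounds):
    # --- tau local actor-critic updates, no communication ---
    for i in range(N):
        for _ in range(tau):
            phi, phi_next, reward = sample_transition(rng, d)
            # TD error with a linear critic V(s) ~ w_i^T phi(s)
            td = reward + gamma * w[i] @ phi_next - w[i] @ phi
            w[i] += beta * td * phi            # critic step
            theta[i] += alpha * td * phi       # actor step (phi as stand-in score function)
    # --- one communication round: mix parameters with neighbors ---
    theta = W @ theta
    w = W @ w

print("spread of policy parameters across agents:", np.ptp(theta, axis=0))
```

Increasing `tau` reduces how often the mixing step `W @ theta` runs, which is the mechanism behind the $\mathcal{O}(\varepsilon^{-1}\tau^{-1})$ communication complexity, at the cost of letting agents' parameters drift further apart between rounds.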
Business Value
Enables efficient coordination and learning in large-scale multi-agent systems (e.g., fleets of robots, traffic control) where communication is a bottleneck, leading to more scalable and cost-effective AI solutions.