
Mean-Field Sampling for Cooperative Multi-Agent Reinforcement Learning

reinforcement-learning › multi-agent
📄 Abstract

Designing efficient algorithms for multi-agent reinforcement learning (MARL) is fundamentally challenging because the size of the joint state and action spaces grows exponentially in the number of agents. These difficulties are exacerbated when balancing sequential global decision-making with local agent interactions. In this work, we propose a new algorithm $\texttt{SUBSAMPLE-MFQ}$ ($\textbf{Subsample}$-$\textbf{M}$ean-$\textbf{F}$ield-$\textbf{Q}$-learning) and a decentralized randomized policy for a system with $n$ agents. For any $k\leq n$, our algorithm learns a policy for the system in time polynomial in $k$. We prove that this learned policy converges to the optimal policy on the order of $\tilde{O}(1/\sqrt{k})$ as the number of subsampled agents $k$ increases. In particular, this bound is independent of the number of agents $n$.
Authors (3)
Emile Anand
Ishani Karmarkar
Guannan Qu
Submitted
December 1, 2024
arXiv Category
cs.LG
arXiv PDF

Key Contributions

Introduces SUBSAMPLE-MFQ, a novel MARL algorithm that learns policies for a system of $n$ agents in time polynomial in $k$ (for any $k \leq n$), effectively decoupling learning complexity from the total number of agents. The learned policy provably converges to the optimal policy at rate $\tilde{O}(1/\sqrt{k})$ as $k$ increases, with a bound independent of $n$.
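The core idea above can be illustrated with a minimal sketch. This is not the authors' implementation; it assumes a simple tabular setting and shows how a Q-function conditioned on the empirical state distribution ("mean field") of $k$ uniformly subsampled agents, rather than on all $n$ agents' joint state, keeps the table size polynomial in $k$. The class and method names are hypothetical.

```python
import random
from collections import Counter, defaultdict

def mean_field(states, k, rng):
    """Empirical state distribution of k agents sampled without replacement.

    Returned as a sorted tuple of (state, fraction) pairs so it can be used
    as a hashable Q-table key. This is the 'subsample' step: the key space
    depends on k, not on the total number of agents n.
    """
    sample = rng.sample(states, k)
    counts = Counter(sample)
    return tuple(sorted((s, c / k) for s, c in counts.items()))

class SubsampledMFQ:
    """Hedged sketch of subsampled mean-field Q-learning (hypothetical API)."""

    def __init__(self, actions, k, alpha=0.1, gamma=0.9, seed=0):
        # Q is indexed by (local_state, mean_field_of_k_agents, action).
        self.q = defaultdict(float)
        self.actions = actions
        self.k = k
        self.alpha = alpha      # learning rate
        self.gamma = gamma      # discount factor
        self.rng = random.Random(seed)

    def update(self, local_state, all_states, action, reward,
               next_local_state, next_all_states):
        """Standard Q-learning update, with the joint state replaced by the
        subsampled mean field."""
        mf = mean_field(all_states, self.k, self.rng)
        next_mf = mean_field(next_all_states, self.k, self.rng)
        best_next = max(self.q[(next_local_state, next_mf, a)]
                        for a in self.actions)
        key = (local_state, mf, action)
        self.q[key] += self.alpha * (reward + self.gamma * best_next
                                     - self.q[key])

    def act(self, local_state, all_states, eps=0.1):
        """Decentralized epsilon-greedy policy: each agent acts from its own
        state plus the subsampled mean field."""
        if self.rng.random() < eps:
            return self.rng.choice(self.actions)
        mf = mean_field(all_states, self.k, self.rng)
        return max(self.actions, key=lambda a: self.q[(local_state, mf, a)])
```

Because each agent only ever conditions on $k$ sampled neighbors, the learned Q-table (and hence training time) scales with $k$, matching the paper's high-level claim; the randomness of the subsample is what introduces the $\tilde{O}(1/\sqrt{k})$ gap to the optimal policy.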

Business Value

Enables more efficient and scalable coordination of large fleets of autonomous agents (e.g., drones, robots) for tasks like logistics, exploration, or swarm control, reducing computational overhead.