Introduces SUBSAMPLE-MFQ, a multi-agent reinforcement learning (MARL) algorithm that learns a policy for a system of n agents in time polynomial in k, the number of subsampled agents (k <= n), decoupling learning complexity from the total number of agents. It comes with a theoretical guarantee that the learned policy converges to the optimal policy as k increases, independent of n.
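To make the subsampling idea concrete, here is a minimal Python sketch. It is not the SUBSAMPLE-MFQ algorithm from the paper. It only illustrates the underlying intuition, under the assumption that agents are summarized by an empirical distribution over a small discrete state space: a summary computed from k uniformly sampled agents approximates the summary over all n agents, and the per-step cost depends on k rather than n. All function names (empirical_distribution, subsampled_mean_field, total_variation) and the toy state space are hypothetical.

```python
import random
from collections import Counter


def empirical_distribution(states, num_states):
    """Return the fraction of agents in each state (a mean-field style summary)."""
    counts = Counter(states)
    total = len(states)
    return [counts.get(s, 0) / total for s in range(num_states)]


def subsampled_mean_field(states, k, num_states, rng):
    """Estimate the summary from k agents drawn uniformly without replacement."""
    if not 1 <= k <= len(states):
        raise ValueError("k must satisfy 1 <= k <= n")
    sample = rng.sample(states, k)
    return empirical_distribution(sample, num_states)


def total_variation(p, q):
    """Total variation distance between two distributions on the same support."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


# Toy setup (assumption): n agents, each in one of 3 discrete states.
rng = random.Random(0)
n, num_states = 10_000, 3
states = [rng.randrange(num_states) for _ in range(n)]
full = empirical_distribution(states, num_states)

# Compare the subsampled summary with the full one for growing k.
for k in (10, 100, 1000, n):
    approx = subsampled_mean_field(states, k, num_states, rng)
    print(k, round(total_variation(full, approx), 4))
```

With k = n the sample contains every agent, so the estimate matches the full distribution exactly. For smaller k the estimate is noisy but cheap to compute, which is the trade-off the summary above describes.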
Enables more efficient and scalable coordination of large fleets of autonomous agents (e.g., drones, robots) for tasks like logistics, exploration, or swarm control, reducing computational overhead.