Proposes a puppeteer-style paradigm for LLM multi-agent collaboration in which a centralized orchestrator dynamically directs the agents. The orchestrator is trained with reinforcement learning to adaptively sequence and prioritize agents, enabling flexible and evolvable collective reasoning that outperforms static collaboration structures on complex tasks while reducing computational cost.
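The orchestration loop described above can be sketched minimally: a central controller keeps a learnable preference per agent, samples the next agent from a softmax policy, and nudges preferences with a REINFORCE-style update after observing a task reward. All class and agent names below are illustrative assumptions, not the paper's actual implementation.

```python
import math
import random

class PuppeteerOrchestrator:
    """Toy sketch of a centralized orchestrator that learns which
    agent to invoke next. Illustrative only, not the paper's code."""

    def __init__(self, agents, lr=0.1):
        self.agents = agents                          # name -> callable(state) -> state
        self.prefs = {name: 0.0 for name in agents}   # learnable preferences
        self.lr = lr

    def _policy(self):
        # Softmax over preferences: probability of selecting each agent.
        z = max(self.prefs.values())
        exps = {n: math.exp(p - z) for n, p in self.prefs.items()}
        total = sum(exps.values())
        return {n: e / total for n, e in exps.items()}

    def run(self, state, steps=3):
        # Dynamically sequence agents for a fixed number of steps.
        trajectory, chosen_probs = [], []
        for _ in range(steps):
            probs = self._policy()
            names = list(probs)
            chosen = random.choices(names, weights=[probs[n] for n in names])[0]
            state = self.agents[chosen](state)
            trajectory.append(chosen)
            chosen_probs.append(probs[chosen])
        return state, trajectory, chosen_probs

    def update(self, trajectory, chosen_probs, reward):
        # Simplified REINFORCE step: raise preference for chosen agents
        # in proportion to reward; (1 - p) is the softmax log-prob
        # gradient for the chosen arm (other arms are left untouched
        # here for brevity).
        for name, p in zip(trajectory, chosen_probs):
            self.prefs[name] += self.lr * reward * (1.0 - p)

# Hypothetical usage: two toy agents that append their name to the state.
agents = {"plan": lambda s: s + ["plan"], "solve": lambda s: s + ["solve"]}
orch = PuppeteerOrchestrator(agents)
state, traj, probs = orch.run([], steps=3)
orch.update(traj, probs, reward=1.0)
```

Repeating run/update over many episodes would shift the policy toward agent orderings that earn higher reward, which is the adaptive-sequencing idea the summary describes.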
Enables more efficient and scalable deployment of LLM-based systems for complex tasks, potentially reducing operational costs and improving service quality in areas like customer support or complex data analysis.