Abstract: The advancement of large language models (LLMs) has enabled the construction
of multi-agent systems to solve complex tasks by dividing responsibilities
among specialized agents, such as a planning agent for subgoal generation and a
grounding agent for executing tool-use actions. Most existing methods fine-tune
these agents independently, leading to capability gaps among them and poor
coordination. To address this, we propose MOAT, a Multi-Agent Joint
Alignment Tuning framework that improves agent collaboration through iterative
alignment. MOAT alternates between two key stages: (1) Planning Agent
Alignment, which optimizes the planning agent to generate subgoal sequences
that better guide the grounding agent; and (2) Grounding Agent Improving, which
fine-tunes the grounding agent using diverse subgoal-action pairs generated by
the agent itself to enhance its generalization capability. Theoretical analysis
proves that MOAT ensures a non-decreasing and progressively convergent training
process. Experiments across six benchmarks demonstrate that MOAT outperforms
state-of-the-art baselines, achieving average improvements of 3.1% on held-in
tasks and 4.4% on held-out tasks.
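At a high level, MOAT alternates between aligning the planner against the current grounder and improving the grounder on self-generated subgoal-action pairs. The sketch below illustrates that alternating loop under stated assumptions: the `planner`/`grounder` interfaces and the `reward` and `fine_tune` helpers are hypothetical placeholders for illustration, not the paper's actual implementation.

```python
def moat_training(planner, grounder, tasks, reward, fine_tune, num_rounds=3):
    """Minimal sketch of MOAT-style alternating training (assumed interfaces)."""
    for _ in range(num_rounds):
        # Stage 1: Planning Agent Alignment.
        # Sample candidate subgoal sequences, score each by how well the
        # *current* grounder executes it, and update the planner toward
        # the highest-reward plans.
        planning_data = []
        for task in tasks:
            candidates = planner.generate_subgoals(task, num_samples=4)
            scored = [(c, reward(grounder.execute(task, c))) for c in candidates]
            planning_data.append(max(scored, key=lambda pair: pair[1]))
        planner = fine_tune(planner, planning_data)

        # Stage 2: Grounding Agent Improving.
        # Collect subgoal-action pairs produced by the grounder itself and
        # fine-tune it on the successful trajectories to improve generalization.
        grounding_data = []
        for task in tasks:
            subgoals = planner.generate_subgoals(task, num_samples=1)[0]
            trajectory = grounder.execute(task, subgoals)
            if reward(trajectory) > 0:
                grounding_data.append((subgoals, trajectory))
        grounder = fine_tune(grounder, grounding_data)

    return planner, grounder
```

The key design point the abstract emphasizes is the coupling: each stage is trained against the other agent's current behavior rather than in isolation, which is what the iterative alignment is meant to ensure.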