📄 Abstract
Building Task-Oriented Dialogue (TOD) systems that generalize across
different tasks remains a challenging problem. Data-driven approaches often
struggle to transfer effectively to unseen tasks. While recent schema-based TOD
frameworks improve generalization by decoupling task logic from language
understanding, their reliance on neural or generative models often obscures how
task schemas influence behaviour and hence impairs interpretability. In this
work, we introduce a novel framework, CoDial (Code for Dialogue), which
converts a TOD task schema, represented as a novel structured heterogeneous
graph, to programmatic LLM guardrailing code, such as NVIDIA's Colang, enabling
interpretable and efficient alignment of dialogue policies during inference. We
introduce two paradigms, $\text{CoDial}_{\text{free}}$ and
$\text{CoDial}_{\text{structured}}$, for generating LLM guardrails, and propose
a feedback mechanism that integrates human feedback to iteratively improve the
generated code. Empirically, CoDial achieves state-of-the-art (SOTA)
performance on the widely used STAR dataset and is on par with SOTA on the
MultiWOZ dataset, while also providing interpretability. We additionally
demonstrate CoDial's iterative improvement via manual and LLM-aided feedback,
making it a practical tool for expert-guided alignment of LLMs in high-stakes
domains.
Authors (5)
Radin Shayanfar
Chu Fei Luo
Rohan Bhambhoria
Samuel Dahan
Xiaodan Zhu
Key Contributions
Introduces CoDial, a framework that converts task-oriented dialogue schemas into programmatic LLM guardrailing code, such as NVIDIA's Colang. This enables interpretable and efficient alignment of dialogue policies during inference, improving generalization and allowing human feedback integration.
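The schema-to-code idea can be illustrated with a toy sketch. Everything below is an assumption made for illustration: the `Node` layout, the node names, and the emitted rule text are hypothetical, and the output is a Colang-like pseudo-format rather than valid Colang or the paper's actual generator. It only shows how a task graph might be walked and rendered as readable, editable rules.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Node:
    """One step of a hypothetical task schema (not the paper's format)."""
    name: str
    kind: str   # "ask" (request a slot from the user) or "act" (call an API)
    text: str   # bot utterance for "ask", API name for "act"
    next: list[str] = field(default_factory=list)  # successor node names


def schema_to_rules(nodes: dict[str, Node], start: str) -> str:
    """Walk the graph depth-first from `start`, emitting one rule block per node."""
    lines: list[str] = []
    seen: set[str] = set()
    stack = [start]
    while stack:
        name = stack.pop()
        if name in seen:
            continue  # each node is rendered once, even if reachable twice
        seen.add(name)
        node = nodes[name]
        lines.append(f"flow {node.name}")
        verb = "bot ask" if node.kind == "ask" else "execute"
        lines.append(f'  {verb} "{node.text}"')
        for succ in node.next:
            lines.append(f"  then {succ}")
        # Push in reverse so successors are visited in their listed order.
        stack.extend(reversed(node.next))
    return "\n".join(lines)


# Hypothetical three-step hotel booking schema.
schema = {
    "get_city": Node("get_city", "ask", "Which city?", ["get_date"]),
    "get_date": Node("get_date", "ask", "Which date?", ["book"]),
    "book": Node("book", "act", "book_hotel"),
}
print(schema_to_rules(schema, "get_city"))
```

Because the rendered rules are plain text, a domain expert could read and edit them directly, which is the kind of interpretability and feedback loop the summary describes.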
Business Value
Creates more trustworthy and controllable conversational AI systems, reducing development complexity and improving user experience in task-oriented applications.