LeCoDe: A Benchmark Dataset for Interactive Legal Consultation Dialogue Evaluation

Abstract

Legal consultation is essential for safeguarding individual rights and ensuring access to justice, yet it remains costly and inaccessible to many due to the shortage of legal professionals. While recent advances in Large Language Models (LLMs) offer a promising path toward scalable, low-cost legal assistance, current systems fall short in handling the interactive and knowledge-intensive nature of real-world consultations. To address these challenges, we introduce LeCoDe, a real-world multi-turn benchmark dataset comprising 3,696 legal consultation dialogues with 110,008 dialogue turns, designed to evaluate and improve LLMs' legal consultation capability. To construct LeCoDe, we collect live-streamed consultations from short-video platforms, yielding authentic multi-turn legal consultation dialogues, and rigorous annotation by legal experts further enriches the dataset with professional insights and expertise. We also propose a comprehensive evaluation framework that assesses LLMs' consultation capabilities along two dimensions, (1) clarification capability and (2) professional advice quality, using 12 metrics in total. Through extensive experiments on various general and domain-specific LLMs, our results reveal significant challenges in this task: even state-of-the-art models like GPT-4 achieve only 39.8% recall for clarification and a 59% overall score for advice quality, highlighting the complexity of professional consultation scenarios. Based on these findings, we further explore several strategies to enhance LLMs' legal consultation abilities. Our benchmark contributes to advancing research in legal-domain dialogue systems, particularly in simulating more realistic user-expert interactions.
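
The framework above reports a recall-style score for clarification behavior. As a rough illustration of how such a metric could be computed against expert-annotated clarification points, here is a minimal Python sketch; the function name, field layout, and keyword-matching rule are assumptions for illustration, not the paper's official evaluation code.

```python
# Minimal sketch (assumed, not LeCoDe's official code): recall of expert-annotated
# clarification points covered by a model's follow-up questions.

def clarification_recall(model_questions: list[str], gold_points: list[str]) -> float:
    """Fraction of gold clarification points touched by at least one model question.

    Matching here is a naive case-insensitive keyword check, used only to
    illustrate the shape of a recall-style metric.
    """
    if not gold_points:
        return 1.0  # nothing needed clarifying
    hit = sum(
        1 for point in gold_points
        if any(point.lower() in q.lower() for q in model_questions)
    )
    return hit / len(gold_points)


if __name__ == "__main__":
    questions = [
        "When did the employer terminate your contract?",
        "Do you have a written employment agreement?",
    ]
    gold = ["terminate", "written employment agreement", "severance"]
    print(f"clarification recall = {clarification_recall(questions, gold):.1%}")  # 66.7%
```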
Authors (9)
Weikang Yuan
Kaisong Song
Zhuoren Jiang
Junjie Cao
Yujie Zhang
Jun Lin
+3 more
Submitted: May 26, 2025
arXiv Category: cs.CL

Key Contributions

Introduces LeCoDe, a real-world multi-turn benchmark dataset for evaluating LLMs on legal consultation dialogues. The dataset comprises 3,696 dialogues with 110,008 turns, drawn from live-streamed consultations on short-video platforms and annotated by legal experts. LeCoDe targets the challenges LLMs face with the interactive, knowledge-intensive nature of legal consultations; a sketch of how such a dialogue record might be represented follows below.
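
To make the dataset description concrete, the sketch below shows one plausible way to represent a multi-turn consultation record and assemble prompt context for a model under evaluation. The field names (`speaker`, `turns`, `expert_annotations`) are illustrative assumptions, not LeCoDe's published schema.

```python
# Hypothetical record layout for a multi-turn legal consultation dialogue.
# Field names are assumptions for illustration, not LeCoDe's published schema.
from dataclasses import dataclass, field


@dataclass
class Turn:
    speaker: str  # e.g. "client" or "lawyer"
    text: str


@dataclass
class Dialogue:
    dialogue_id: str
    turns: list[Turn]
    expert_annotations: dict = field(default_factory=dict)  # e.g. clarification points, reference advice


def prompt_context(dialogue: Dialogue, last_k: int = 6) -> str:
    """Concatenate the last_k turns into a plain-text context for an LLM under test."""
    recent = dialogue.turns[-last_k:]
    return "\n".join(f"{t.speaker}: {t.text}" for t in recent)
```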

Business Value

Facilitates the development of more capable and accessible AI-powered legal assistance, potentially lowering costs and increasing access to justice for individuals and businesses.