📄 Abstract
Grammar refers to the system of rules that governs the structural
organization and the semantic relations among linguistic units such as
sentences, phrases, and words within a given language. In natural language
processing, there remains a notable scarcity of grammar-focused evaluation
protocols, a gap that is even more pronounced for low-resource languages.
Moreover, the extent to which large language models genuinely comprehend
grammatical structure, especially the mapping between syntactic structures and
meanings, remains under debate. To investigate this issue, we propose a Grammar
Book Guided evaluation pipeline, a systematic and generalizable four-stage
framework for grammar evaluation, and take Luxembourgish as a case study.
The results show a weak
positive correlation between translation performance and grammatical
understanding, indicating that strong translations do not necessarily imply
deep grammatical competence. Larger models perform well overall due to their
semantic strength but remain weak in morphology and syntax, struggling
particularly with Minimal Pair tasks, while strong reasoning ability offers a
promising way to enhance their grammatical understanding.
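
To make the correlation claim concrete, here is a minimal sketch of how a per-model correlation between translation performance and grammatical understanding could be computed. It assumes Pearson's r as the statistic, which the abstract does not specify, and every score below is a made-up placeholder rather than a result from the paper. The function name `pearson` and both score lists are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Return Pearson's correlation coefficient for two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical per-model scores (placeholders, NOT values from the paper).
translation_scores = [0.62, 0.71, 0.55, 0.80, 0.67]
grammar_scores = [0.48, 0.45, 0.50, 0.58, 0.41]

r = pearson(translation_scores, grammar_scores)
print(f"Pearson r = {r:.2f}")
```

A value of r near 0 would mean translation quality says little about grammar scores, while a value near 1 would mean the two move together. A rank-based statistic such as Spearman's rho would be a reasonable alternative if the scores are not on comparable scales.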
Authors (10)
Lujun Li
Yewei Song
Lama Sleem
Yiqun Wang
Yangjie Xu
Cedric Lothritz
+4 more
Submitted
October 28, 2025
Key Contributions
This paper introduces a Grammar Book Guided evaluation pipeline to systematically assess LLMs' comprehension of grammatical structure, particularly for low-resource languages like Luxembourgish. It investigates the extent to which LLMs genuinely grasp grammar, finding a weak positive correlation between translation performance and grammatical understanding, suggesting strong translations don't always imply deep grammatical competence.
Business Value
This work supports the development of more linguistically sophisticated LLMs, especially for under-represented languages. That can lead to better NLP tools for diverse linguistic communities and a deeper understanding of how AI systems acquire language.