arxiv_cl · Research paper · Audience: NLP researchers, Linguists, AI ethicists, Developers of multilingual NLP systems

Do Large Language Models Grasp The Grammar? Evidence from Grammar-Book-Guided Probing in Luxembourgish


📄 Abstract

Grammar refers to the system of rules that governs the structural organization and the semantic relations among linguistic units such as sentences, phrases, and words within a given language. In natural language processing, there remains a notable scarcity of grammar-focused evaluation protocols, a gap that is even more pronounced for low-resource languages. Moreover, the extent to which large language models genuinely comprehend grammatical structure, especially the mapping between syntactic structures and meanings, remains under debate. To investigate this issue, we propose a Grammar Book Guided evaluation pipeline, a systematic and generalizable framework for grammar evaluation consisting of four key stages, and take Luxembourgish as a case study. The results show a weak positive correlation between translation performance and grammatical understanding, indicating that strong translations do not necessarily imply deep grammatical competence. Larger models perform well overall due to their semantic strength but remain weak in morphology and syntax, struggling particularly with Minimal Pair tasks, while strong reasoning ability offers a promising way to enhance their grammatical understanding.
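The abstract mentions Minimal Pair tasks without describing how they are scored. As a rough illustration only, the sketch below assumes a common setup: each item pairs an acceptable sentence with a minimally different unacceptable one, and a model is counted as correct when it scores the acceptable sentence higher. The function names, the toy scorer, and the English example pairs are all hypothetical and are not taken from the paper.

```python
from typing import Callable, Iterable, Tuple

# (acceptable sentence, unacceptable sentence); the layout is an assumption
MinimalPair = Tuple[str, str]


def minimal_pair_accuracy(
    pairs: Iterable[MinimalPair],
    score: Callable[[str], float],
) -> float:
    """Fraction of pairs where the acceptable sentence gets the higher score."""
    pairs = list(pairs)
    if not pairs:
        raise ValueError("need at least one minimal pair")
    correct = sum(1 for good, bad in pairs if score(good) > score(bad))
    return correct / len(pairs)


def toy_score(sentence: str) -> float:
    """Hypothetical stand-in for a model score such as a sentence log-probability.

    It only penalises one hard-coded agreement error, so it misses the other.
    """
    return -1.0 if "cats sleeps" in sentence else 0.0


# Placeholder English pairs, used only to make the sketch runnable.
toy_pairs = [
    ("The cats sleep.", "The cats sleeps."),
    ("She sings.", "She sing."),
]

accuracy = minimal_pair_accuracy(toy_pairs, toy_score)
print(f"minimal-pair accuracy: {accuracy:.2f}")
```

In practice `toy_score` would be replaced by a real model-based scorer. Ties are counted as errors here, which is one possible design choice among several.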

Authors (10)

Lujun Li

Yewei Song

Lama Sleem

Yiqun Wang

Yangjie Xu

Cedric Lothritz

+4 more

Submitted

October 28, 2025

arXiv Category

cs.CL


Key Contributions

This paper introduces a Grammar Book Guided evaluation pipeline to systematically assess LLMs' comprehension of grammatical structure, particularly for low-resource languages like Luxembourgish. It investigates the extent to which LLMs genuinely grasp grammar, finding a weak positive correlation between translation performance and grammatical understanding, suggesting strong translations don't always imply deep grammatical competence.
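The reported weak positive correlation relates two per-model numbers: a translation score and a grammar score. The sketch below shows how such a correlation could be computed with a plain Pearson coefficient. The summary does not say which correlation measure the paper uses, so Pearson is an assumption, and the scores below are made-up placeholders, not results from the paper.

```python
import math
from typing import Sequence


def pearson_r(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-model scores (one entry per model); NOT data from the paper.
translation_scores = [0.2, 0.4, 0.6, 0.8]
grammar_scores = [0.5, 0.3, 0.6, 0.5]

r = pearson_r(translation_scores, grammar_scores)
print(f"Pearson r = {r:.3f}")
```

A small positive `r` on real scores would match the summary's reading: better translation only loosely tracks better grammatical understanding. A rank-based measure such as Spearman's coefficient would be a reasonable alternative when scores are on different scales.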

Business Value

Improves the development of more linguistically sophisticated LLMs, especially for under-represented languages. This can lead to better NLP tools for diverse linguistic communities and a deeper understanding of language acquisition in AI.

Paper Metadata

Innovation Type

Evaluation Framework/Methodology

Deployment Feasibility

High, as it's an evaluation framework that can be applied to existing models.

Limitations Addressed

Scarcity of grammar-focused evaluation protocols, especially for low-resource languages, and the debate on LLMs' true grammatical comprehension.

Performance Gains

Quantified the weak correlation between translation and grammatical understanding.

Provided a framework for grammar evaluation.

Technical Tags

Grammar evaluation, Low-resource languages, Grammar book guided probing, Syntactic structure, Semantic relations, Linguistic comprehension, Luxembourgish, Translation performance

Research Topics

Linguistic Competence of LLMs, Grammar Understanding, Low-Resource NLP, LLM Evaluation, Cross-lingual Transfer

Methods & Architectures

Grammar book guided evaluation pipeline, Probing tasks, Translation evaluation, Transformer-based LLMs

Applications & Tasks

Linguistics research, NLP model development, Low-resource language processing

Scarcity of grammar evaluation protocols, Limited understanding of LLM grammatical comprehension, Challenges in low-resource languages

Evaluating LLM grammatical understanding, Developing grammar-focused evaluation frameworks, Assessing LLMs for low-resource languages

Related Fields

Linguistics, Natural Language Processing, Computational Linguistics, Machine Learning Evaluation

Keywords

Large Language Models, LLM, Grammar, Evaluation, Linguistics, Low-resource languages, Luxembourgish, Syntactic structure, Semantic relations, Probing, Translation, Comprehension, NLP

Academic Context

#Linguistic Competence of LLMs, #Grammar Understanding, #Low-Resource NLP, #LLM Evaluation, #Cross-lingual Transfer

Commercial Potential

Potential Products

Grammar-aware LLM fine-tuning tools, Evaluation suites for linguistic capabilities

Target Industries

Education, Technology, Publishing

Use Case Examples

Assessing whether an LLM truly understands German sentence structure beyond surface-level translation.

Developing NLP tools for Luxembourgish speakers.

Competitive Edge

Provides a specialized, grammar-centric evaluation method, particularly valuable for low-resource languages where standard benchmarks are scarce.

Market Opportunity

Growing interest in understanding LLM linguistic capabilities and supporting low-resource languages.

Revenue Models

N/A (research focus)

Resource Requirements

Compute Needs

Moderate (requires running LLMs for evaluation)

Data Requirements

Grammar books, parallel corpora for Luxembourgish

Deployment Constraints

Requires expertise in linguistics and the target language.

Scalability

The framework itself is scalable; application depends on LLM inference speed.

Production Readiness

Maturity Level

Research

Time to Market

N/A (research focus)

Patent Potential

Low (evaluation methodology)
