Multilinguality Does not Make Sense: Investigating Factors Behind Zero-Shot Transfer in Sense-Aware Tasks

Abstract

Cross-lingual transfer is central to modern NLP, enabling models to perform tasks in languages different from those they were trained on. A common assumption is that training on more languages improves zero-shot transfer. We test this on sense-aware tasks (polysemy and lexical semantic change) and find that multilinguality is not necessary for effective transfer. Our large-scale analysis across 28 languages reveals that other factors, such as differences in pretraining and fine-tuning data and evaluation artifacts, better explain the perceived benefits of multilinguality. We also release fine-tuned models and provide empirical baselines to support future research. While focused on two sense-aware tasks, our findings offer broader insights into cross-lingual transfer, especially for low-resource languages.
Authors: Roksana Goworek, Haim Dubossarsky
Submitted: May 30, 2025
arXiv Category: cs.CL

Key Contributions

The paper challenges the assumption that multilinguality is necessary for effective zero-shot transfer in sense-aware tasks. Through a large-scale analysis across 28 languages, it identifies differences in pretraining and fine-tuning data, together with evaluation artifacts, as the more significant factors, offering a refined understanding of cross-lingual transfer, especially for low-resource languages.
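The zero-shot setup under study can be pictured as: fit a model (or a probe) on sense-labelled data in one language, then score it directly on another language it never saw during training. The sketch below is only an illustration of that protocol under assumed names; the `xlm-roberta-base` encoder, the WiC-style same-sense/different-sense framing, and the toy English and German pairs are placeholders, not the authors' released models, data, or exact method.

```python
# Illustrative zero-shot cross-lingual transfer sketch for a sense-aware
# (WiC-style) task: train a probe on source-language examples only, then
# evaluate it on a target language unseen during fine-tuning.
# Model name, sentences, and labels are placeholders for illustration.
import numpy as np
import torch
from sklearn.linear_model import LogisticRegression
from transformers import AutoModel, AutoTokenizer

MODEL_NAME = "xlm-roberta-base"  # assumed multilingual encoder
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
encoder = AutoModel.from_pretrained(MODEL_NAME)
encoder.eval()


def embed(sentence_pairs):
    """Mean-pooled embedding of each sentence pair (one vector per pair)."""
    feats = []
    with torch.no_grad():
        for s1, s2 in sentence_pairs:
            inputs = tokenizer(s1, s2, return_tensors="pt", truncation=True)
            hidden = encoder(**inputs).last_hidden_state        # (1, seq_len, dim)
            mask = inputs["attention_mask"].unsqueeze(-1)        # (1, seq_len, 1)
            pooled = (hidden * mask).sum(1) / mask.sum(1)        # (1, dim)
            feats.append(pooled.squeeze(0).numpy())
    return np.stack(feats)


# Hypothetical source-language (English) pairs: does the shared word keep
# the same sense in both sentences? 1 = same sense, 0 = different sense.
train_pairs = [
    ("He sat on the bank of the river.", "The boat drifted toward the bank."),
    ("He sat on the bank of the river.", "She deposited cash at the bank."),
]
train_labels = [1, 0]

# Hypothetical target-language (German) pairs, never used for training.
test_pairs = [
    ("Er saß am Ufer des Flusses.", "Das Boot trieb zum Ufer."),
    ("Er saß am Ufer des Flusses.", "Sie zahlte Geld bei der Bank ein."),
]
test_labels = [1, 0]

# Fit the probe on the source language only, then score it zero-shot on the
# target language; the gap between in-language and zero-shot scores is the
# transfer effect the paper analyzes across 28 languages.
probe = LogisticRegression(max_iter=1000).fit(embed(train_pairs), train_labels)
print("zero-shot target-language accuracy:", probe.score(embed(test_pairs), test_labels))
```

In a full experiment the same protocol would be repeated over many source/target language pairs, with the fine-tuning data, pretraining coverage, and evaluation set held fixed or varied to isolate their effects.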

Business Value

These findings can streamline the development of multilingual NLP models by directing effort toward factors more impactful than language count alone, potentially reducing training costs and improving performance for low-resource languages.