Abstract
Large language models exhibit uneven performance across languages, with substantial gaps between high- and low-resource languages. We present a framework for enhancing the monolingual capabilities of LLMs in underrepresented languages while preserving their general-purpose performance through targeted fine-tuning of language-specific subnetworks. Our approach identifies language-specific neurons using Language Activation Probability Entropy and fine-tunes only the weights associated with these neurons (a dedicated subnetwork) on target-language data. Experiments on Llama-3.1-8B and Mistral-Nemo-12B across 12 mid- and low-resource languages demonstrate that our method consistently outperforms full fine-tuning, FFN-only fine-tuning, LoRA adaptation, and random-subset fine-tuning baselines while updating at most 1% of model parameters. Beyond performance improvements, we observe more favorable training dynamics, stronger cross-lingual representational alignment, and systematic changes in weight updates. To facilitate future research, we release language-specific neuron identifications for over 100 languages as well as our adaptation pipeline, offering a cost-effective pathway for adapting state-of-the-art models to underrepresented languages.
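The neuron-selection step described above can be illustrated with a short sketch. The following is a minimal, hypothetical implementation of Language Activation Probability Entropy (LAPE) scoring, assuming per-language activation probabilities for each FFN neuron have already been estimated on held-out text; the function names and thresholds (`entropy_quantile`, `prob_threshold`) are illustrative and are not the paper's released pipeline or settings.

```python
# Minimal LAPE sketch (illustrative, not the authors' released pipeline).
# act_probs[l, j] = estimated probability that FFN neuron j activates (> 0)
# on text of language l, measured on held-out corpora.
import torch

def lape_scores(act_probs: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Return per-neuron entropy over languages; act_probs is [num_langs, num_neurons]."""
    # Normalize each neuron's activation probabilities into a distribution over languages.
    dist = act_probs / (act_probs.sum(dim=0, keepdim=True) + eps)
    # Low entropy => the neuron fires almost exclusively for one or a few languages.
    return -(dist * torch.log(dist + eps)).sum(dim=0)

def select_language_neurons(act_probs: torch.Tensor, lang_idx: int,
                            entropy_quantile: float = 0.01,
                            prob_threshold: float = 0.2) -> torch.Tensor:
    """Pick low-entropy neurons that also fire reliably for the target language."""
    entropy = lape_scores(act_probs)
    low_entropy = entropy <= torch.quantile(entropy, entropy_quantile)
    active_in_lang = act_probs[lang_idx] >= prob_threshold
    return torch.nonzero(low_entropy & active_in_lang, as_tuple=False).squeeze(-1)
```

Keeping only the lowest-entropy slice of neurons is what makes the resulting subnetwork sparse enough that at most roughly 1% of the model's parameters are touched during adaptation.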
Key Contributions
The paper proposes a framework for enhancing LLM performance in underrepresented languages by fine-tuning sparse subnetworks identified with Language Activation Probability Entropy. The method outperforms full fine-tuning and other parameter-efficient fine-tuning (PEFT) baselines while updating at most 1% of parameters, preserving general capabilities, and improving training dynamics and cross-lingual alignment.
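One simple way to realize "fine-tune only the weights associated with these neurons" is gradient masking on the FFN projections. The sketch below assumes a Hugging Face LLaMA/Mistral-style module layout (`model.model.layers[i].mlp` with `gate_proj`, `up_proj`, `down_proj`); how the released adaptation pipeline restricts the update may differ.

```python
# Sketch: restrict training to a language-specific FFN subnetwork via gradient masking.
# Assumes a LLaMA/Mistral-style MLP: gate_proj/up_proj weights are [d_ff, d_model],
# down_proj is [d_model, d_ff]; neuron_ids index the d_ff (intermediate) dimension.
import torch

def mask_mlp_gradients(mlp, neuron_ids: torch.Tensor) -> None:
    d_ff = mlp.gate_proj.weight.shape[0]
    keep = torch.zeros(d_ff, dtype=mlp.gate_proj.weight.dtype,
                       device=mlp.gate_proj.weight.device)
    keep[neuron_ids] = 1.0
    # gate_proj / up_proj: the neuron index is the output row.
    mlp.gate_proj.weight.register_hook(lambda g: g * keep.unsqueeze(1))
    mlp.up_proj.weight.register_hook(lambda g: g * keep.unsqueeze(1))
    # down_proj: the neuron index is the input column.
    mlp.down_proj.weight.register_hook(lambda g: g * keep.unsqueeze(0))

def restrict_training_to_subnetwork(model, neuron_ids_per_layer: dict) -> None:
    """Freeze everything, then re-enable only the FFN weights tied to selected neurons."""
    for p in model.parameters():
        p.requires_grad_(False)
    for layer_idx, neuron_ids in neuron_ids_per_layer.items():
        mlp = model.model.layers[layer_idx].mlp  # assumed HF LLaMA/Mistral layout
        for proj in (mlp.gate_proj, mlp.up_proj, mlp.down_proj):
            proj.weight.requires_grad_(True)
        mask_mlp_gradients(mlp, neuron_ids)
```

One design consideration with this masking approach: because the whole FFN weight matrices remain trainable tensors, an optimizer such as AdamW would still apply decoupled weight decay to the masked-out entries, so weight decay should be disabled (or the optimizer built only over the intended slices) if those entries are meant to stay exactly unchanged.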
Business Value
Enables the development of more equitable and globally accessible AI technologies by improving LLM performance for underrepresented languages, unlocking new markets and applications.