This paper re-examines the cross-lingual gap in large language models (LLMs) from a statistical perspective. It hypothesizes that variance in the model's responses, rather than divergence in latent representations, is the main cause of the gap. The paper formalizes the gap using a bias-variance decomposition and reports extensive experimental evidence supporting the hypothesis, offering a new framework for understanding and potentially mitigating cross-lingual performance differences.
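To make the bias-variance framing concrete, here is a minimal sketch of the standard decomposition of mean squared error into squared bias plus variance. It is not the paper's formalization: the scalar "responses", the Gaussian noise, the spreads 0.1 and 0.5, and the function name `bias_variance` are all assumptions chosen for illustration only.

```python
import random
import statistics


def bias_variance(responses, truth):
    """Split the mean squared error of sampled responses into bias^2 and variance."""
    mean_response = statistics.fmean(responses)
    bias_sq = (mean_response - truth) ** 2
    # Population variance around the empirical mean, so mse == bias_sq + variance.
    variance = statistics.pvariance(responses, mu=mean_response)
    mse = statistics.fmean((r - truth) ** 2 for r in responses)
    return bias_sq, variance, mse


rng = random.Random(0)
truth = 1.0

# Hypothetical data: both "languages" are centred on the same value,
# but the target-language responses are noisier (assumed spreads).
source = [rng.gauss(truth, 0.1) for _ in range(10_000)]
target = [rng.gauss(truth, 0.5) for _ in range(10_000)]

for name, responses in [("source", source), ("target", target)]:
    bias_sq, variance, mse = bias_variance(responses, truth)
    print(f"{name}: bias^2={bias_sq:.5f} variance={variance:.5f} mse={mse:.5f}")
```

In this toy setting the two languages have nearly the same bias, so almost all of the difference in error comes from the variance term. That is the kind of pattern the variance hypothesis predicts, though whether real LLM responses behave this way is what the paper's experiments address.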
A better statistical understanding of cross-lingual gaps can lead to more equitable and reliable LLM performance across languages, which is crucial for global applications and accessibility.