Quantifying Phonosemantic Iconicity Distributionally in 6 Languages

Abstract

Language is, as commonly theorized, largely arbitrary. Yet systematic relationships between phonetics and semantics have been observed in many specific cases. To what degree could those systematic relationships manifest themselves in large-scale quantitative investigations, both in previously identified and unidentified phenomena? This work undertakes a distributional approach to quantifying phonosemantic iconicity at scale across 6 diverse languages (English, Spanish, Hindi, Finnish, Turkish, and Tamil). In each language, we analyze the alignment of morphemes' phonetic and semantic similarity spaces with a suite of statistical measures, and discover an array of interpretable phonosemantic alignments not previously identified in the literature, along with crosslinguistic patterns. We also analyze 5 previously hypothesized phonosemantic alignments, finding support for some such alignments and mixed results for others.
Authors: George Flint, Kaustubh Kislay
Submitted: October 15, 2025
arXiv Category: cs.CL

Key Contributions

This work presents a distributional approach to measuring phonosemantic iconicity quantitatively across six languages (English, Spanish, Hindi, Finnish, Turkish, and Tamil). It reports interpretable phonosemantic alignments not previously identified in the literature, along with cross-linguistic patterns. It also tests five previously hypothesized alignments, finding support for some and mixed results for others.
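This summary does not say which statistical measures the authors use, so the following is only an illustrative sketch of what "alignment of phonetic and semantic similarity spaces" can mean in practice. It uses a Mantel-style permutation test, which is a stand-in technique and not necessarily the paper's method: compute pairwise distances between items in each space, correlate the two sets of distances, and compare the result against correlations obtained after shuffling the item labels. The function names and the toy vectors are hypothetical and do not come from the paper.

```python
import itertools
import math
import random


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def distance_matrix(vectors):
    """Pairwise Euclidean distances between all vectors."""
    return [[math.dist(a, b) for b in vectors] for a in vectors]


def pairwise(matrix, order):
    """Flatten the upper triangle of a distance matrix, with items relabeled by `order`."""
    pairs = itertools.combinations(range(len(order)), 2)
    return [matrix[order[i]][order[j]] for i, j in pairs]


def mantel_test(phon_vectors, sem_vectors, permutations=999, seed=0):
    """Correlate phonetic and semantic distances; estimate a one-sided p-value by permutation."""
    d_phon = distance_matrix(phon_vectors)
    d_sem = distance_matrix(sem_vectors)
    identity = list(range(len(phon_vectors)))
    phon_flat = pairwise(d_phon, identity)
    observed = pearson(phon_flat, pairwise(d_sem, identity))

    rng = random.Random(seed)
    at_least_as_large = 0
    for _ in range(permutations):
        shuffled = identity[:]
        rng.shuffle(shuffled)  # break the item-to-item pairing between the two spaces
        if pearson(phon_flat, pairwise(d_sem, shuffled)) >= observed:
            at_least_as_large += 1
    p_value = (at_least_as_large + 1) / (permutations + 1)
    return observed, p_value


# Made-up 2-D feature vectors for five hypothetical morphemes (illustration only).
phon = [[0.0, 1.0], [0.1, 0.9], [1.0, 0.0], [0.9, 0.2], [0.5, 0.5]]
sem = [[0.2, 0.8], [0.0, 1.0], [0.9, 0.1], [1.0, 0.3], [0.4, 0.6]]

r, p = mantel_test(phon, sem)
print(f"correlation of distances: {r:.3f}, permutation p-value: {p:.3f}")
```

A permutation test is a reasonable default here because pairwise distances are not independent observations, so the usual significance test for a correlation coefficient does not apply directly. With real data the vectors would come from a phonetic feature representation and a semantic embedding model, neither of which is specified in this summary.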

Business Value

A better understanding of systematic relationships between sound and meaning may inform the design of natural language processing systems, with potential applications in machine translation, speech recognition, and text generation.