Research Paper. Audience: music technologists, AI researchers in creative domains, musicians, music theorists

MuseTok: Symbolic Music Tokenization for Generation and Semantic Understanding

Abstract

Discrete representation learning has shown promising results across various domains, including generation and understanding in image, speech and language. Inspired by these advances, we propose MuseTok, a tokenization method for symbolic music, and investigate its effectiveness in both music generation and understanding tasks. MuseTok employs the residual vector quantized-variational autoencoder (RQ-VAE) on bar-wise music segments within a Transformer-based encoder-decoder framework, producing music codes that achieve high-fidelity music reconstruction and accurate understanding of music theory. For comprehensive evaluation, we apply MuseTok to music generation and semantic understanding tasks, including melody extraction, chord recognition, and emotion recognition. Models incorporating MuseTok outperform previous representation learning baselines in semantic understanding while maintaining comparable performance in content generation. Furthermore, qualitative analyses on MuseTok codes, using ground-truth categories and synthetic datasets, reveal that MuseTok effectively captures underlying musical concepts from large music collections.
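To make the residual quantization idea in the abstract concrete, here is a minimal sketch of how a stack of codebooks turns one continuous latent vector (for example, the encoding of a single bar) into a short sequence of discrete codes. This is not the MuseTok implementation: the function names `nearest`, `rq_encode`, and `rq_decode` are hypothetical, the codebooks are written by hand for illustration, and a real RQ-VAE would learn both the codebooks and the Transformer encoder-decoder around them.

```python
def nearest(codebook, vec):
    """Return the index of the codebook entry closest to vec (squared L2)."""
    best_idx, best_dist = 0, float("inf")
    for idx, entry in enumerate(codebook):
        dist = sum((a - b) ** 2 for a, b in zip(entry, vec))
        if dist < best_dist:
            best_idx, best_dist = idx, dist
    return best_idx


def rq_encode(latent, codebooks):
    """Quantize a latent vector level by level, emitting one code per level.

    Each level quantizes what the previous levels failed to explain
    (the residual), which is the core idea of residual quantization.
    """
    residual = list(latent)
    codes = []
    for codebook in codebooks:
        idx = nearest(codebook, residual)
        codes.append(idx)
        # Subtract the chosen entry; the remainder goes to the next level.
        residual = [r - c for r, c in zip(residual, codebook[idx])]
    return codes


def rq_decode(codes, codebooks):
    """Reconstruct the latent as the sum of the selected codebook entries."""
    out = [0.0] * len(codebooks[0][0])
    for idx, codebook in zip(codes, codebooks):
        out = [o + c for o, c in zip(out, codebook[idx])]
    return out


# Hand-written toy codebooks (assumption: two levels, two entries each).
codebooks = [
    [[0.0, 0.0], [1.0, 1.0]],    # coarse level
    [[0.0, 0.0], [0.5, -0.5]],   # finer level, applied to the residual
]
bar_latent = [1.4, 0.6]          # stand-in for an encoded bar

codes = rq_encode(bar_latent, codebooks)
print(codes)                        # [1, 1]
print(rq_decode(codes, codebooks))  # [1.5, 0.5]
```

In this toy case the coarse level picks `[1.0, 1.0]`, leaving a residual of roughly `[0.4, -0.4]`, which the second level approximates with `[0.5, -0.5]`. The pair of indices `[1, 1]` is the discrete code for the bar, and summing the selected entries gives the approximate reconstruction.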
Authors (7)
Jingyue Huang
Zachary Novack
Phillip Long
Yupeng Hou
Ke Chen
Taylor Berg-Kirkpatrick
+1 more
Submitted: October 18, 2025
arXiv Category: cs.SD

Key Contributions

MuseTok is a tokenization method for symbolic music that applies a residual vector quantized-variational autoencoder (RQ-VAE) to bar-wise music segments within a Transformer-based encoder-decoder framework. The resulting music codes support high-fidelity reconstruction. Models built on them outperform previous representation learning baselines on semantic understanding tasks (melody extraction, chord recognition, and emotion recognition) while performing comparably on music generation.

Business Value

Could enable more capable AI-assisted music tools for composers, producers, and listeners, since one tokenization serves both generation and analysis tasks such as chord and emotion recognition.