MMAO-Bench: MultiModal All in One Benchmark Reveals Compositional Law between Uni-modal and Omni-modal in OmniModels

📄 Abstract

Multimodal large language models have been progressing from uni-modal understanding toward unifying the visual, audio, and language modalities; such unified models are collectively termed omni models. However, the correlation between uni-modal and omni-modal capabilities remains unclear, and comprehensive evaluation is needed to drive the evolution of omni models' intelligence. In this work, we propose a novel, high-quality, and diverse omni-model benchmark, the MultiModal All in One Benchmark (MMAO-Bench), which effectively assesses both uni-modal and omni-modal understanding capabilities. The benchmark consists of 1880 human-curated samples across 44 task types and an innovative multi-step open-ended question type that better assesses complex reasoning tasks. Experimental results show a compositional law between cross-modal and uni-modal performance, with omni-modal capability manifesting as a bottleneck effect on weak models while exhibiting synergistic promotion on strong models.
Authors (9)
Chen Chen
ZeYang Hu
Fengjiao Chen
Liya Ma
Jiaxing Liu
Xiaoyu Li
+3 more
Submitted
October 21, 2025
arXiv Category
cs.CL

Key Contributions

This paper introduces MMAO-Bench, a benchmark of 1880 human-curated samples across 44 task types for evaluating omni models, i.e., multimodal large language models that unify visual, audio, and language understanding. It assesses both uni-modal and omni-modal understanding and reveals a compositional law between them: omni-modal capability acts as a bottleneck for weak models and shows synergistic promotion for strong models.

Business Value

Provides a standardized way to measure and compare the progress of multimodal AI, accelerating development and identifying areas for improvement in creating more capable and versatile AI systems.