Abstract

Pain is a multifaceted condition that affects a significant portion of the population. Accurate and reliable pain evaluation for those suffering is crucial to developing effective and advanced pain management protocols.
Automatic pain assessment systems provide continuous monitoring and support
decision-making processes, ultimately aiming to alleviate distress and prevent
functionality decline. This study introduces PainFormer, a vision foundation
model based on multi-task learning principles trained simultaneously on 14
tasks/datasets with a total of 10.9 million samples. Functioning as an
embedding extractor for various input modalities, the foundation model provides
feature representations to the Embedding-Mixer, a transformer-based module that
performs the final pain assessment. Extensive experiments employing behavioral
modalities - including RGB, synthetic thermal, and estimated depth videos - and
physiological modalities such as ECG, EMG, GSR, and fNIRS revealed that
PainFormer effectively extracts high-quality embeddings from diverse input
modalities. The proposed framework is evaluated on two pain datasets, BioVid
and AI4Pain, and directly compared to 75 different methodologies documented in
the literature. Experiments conducted in unimodal and multimodal settings
demonstrate state-of-the-art performance across modalities and pave the way
toward general-purpose models for automatic pain assessment. The foundation
model's architecture (code) and weights are available at:
https://github.com/GkikasStefanos/PainFormer.
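
To make the two-stage design described above concrete, the following is a minimal sketch of how a foundation-model backbone producing per-modality embeddings could feed a small transformer-based Embedding-Mixer for the final pain-assessment prediction. All module names, dimensions, and hyperparameters here are illustrative assumptions, not the released PainFormer implementation; the official architecture and weights are available in the repository linked above.

```python
# Illustrative sketch only: a frozen backbone yields one embedding per input
# modality, and a small transformer encoder ("Embedding-Mixer" in this sketch)
# fuses the modality tokens and classifies pain. Dimensions and layer counts
# are placeholders, not the paper's actual configuration.
import torch
import torch.nn as nn

class EmbeddingMixer(nn.Module):
    """Transformer encoder over per-modality embedding tokens (illustrative)."""
    def __init__(self, embed_dim=768, num_heads=8, num_layers=2, num_classes=2):
        super().__init__()
        layer = nn.TransformerEncoderLayer(
            d_model=embed_dim, nhead=num_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=num_layers)
        self.cls = nn.Parameter(torch.zeros(1, 1, embed_dim))  # learnable CLS token
        self.head = nn.Linear(embed_dim, num_classes)

    def forward(self, tokens):
        # tokens: (batch, num_modalities, embed_dim), one token per modality
        cls = self.cls.expand(tokens.size(0), -1, -1)
        x = torch.cat([cls, tokens], dim=1)
        x = self.encoder(x)
        return self.head(x[:, 0])  # classify from the CLS token

# Usage: stack the backbone's embeddings for the modalities mentioned in the
# abstract (e.g. RGB, thermal, depth, ECG, EMG, GSR, fNIRS) as tokens.
backbone_out = torch.randn(4, 7, 768)    # placeholder (batch, modalities, dim)
logits = EmbeddingMixer()(backbone_out)  # (4, num_classes)
```

In this sketch the backbone is treated purely as an embedding extractor, so the same mixer can consume behavioral and physiological modalities interchangeably, which mirrors the modality-agnostic role the abstract attributes to the foundation model.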