📄 Abstract
Recent advances in video diffusion models have significantly enhanced
text-to-video generation, particularly through alignment tuning using reward
models trained on human preferences. While these methods improve visual
quality, they can unintentionally encode and amplify social biases. To
systematically trace how such biases evolve throughout the alignment pipeline,
we introduce VideoBiasEval, a comprehensive diagnostic framework for evaluating
social representation in video generation. Grounded in established social bias
taxonomies, VideoBiasEval employs an event-based prompting strategy to
disentangle semantic content (actions and contexts) from actor attributes
(gender and ethnicity). It further introduces multi-granular metrics to
evaluate (1) overall ethnicity bias, (2) gender bias conditioned on ethnicity,
(3) distributional shifts in social attributes across model variants, and (4)
the temporal persistence of bias within videos. Using this framework, we
conduct the first end-to-end analysis connecting biases in human preference
datasets, their amplification in reward models, and their propagation through
alignment-tuned video diffusion models. Our results reveal that alignment
tuning not only strengthens representational biases but also makes them
temporally stable, producing smoother yet more stereotyped portrayals. These
findings highlight the need for bias-aware evaluation and mitigation throughout
the alignment process to ensure fair and socially responsible video generation.
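The abstract names the metric families but does not give their formulas. As a rough illustration only, the sketch below shows one way metrics (3) and (4) could be computed from per-video and per-frame attribute labels. It assumes total variation distance for the distributional shift between two model variants and majority-label agreement across frames for temporal persistence. The function names, label values, and metric choices are hypothetical and are not taken from VideoBiasEval.

```python
from collections import Counter


def distribution(labels, categories):
    """Empirical distribution of attribute labels over a fixed category list."""
    if not labels:
        raise ValueError("labels must be non-empty")
    counts = Counter(labels)
    return {c: counts[c] / len(labels) for c in categories}


def total_variation(p, q):
    """Total variation distance between two distributions on the same categories.

    Assumed stand-in for a 'distributional shift' metric; 0 means identical,
    1 means completely disjoint.
    """
    return 0.5 * sum(abs(p[c] - q[c]) for c in p)


def temporal_persistence(frame_labels):
    """Fraction of frames whose label matches the video's majority label.

    Assumed stand-in for 'temporal persistence'; 1.0 means the perceived
    attribute never changes within the video.
    """
    if not frame_labels:
        raise ValueError("frame_labels must be non-empty")
    majority_count = Counter(frame_labels).most_common(1)[0][1]
    return majority_count / len(frame_labels)


# Hypothetical labels produced by some attribute classifier (not specified here).
categories = ["group_a", "group_b"]
base_model = distribution(["group_a", "group_a", "group_b", "group_b"], categories)
aligned_model = distribution(["group_a", "group_a", "group_a", "group_a"], categories)

shift = total_variation(base_model, aligned_model)
persistence = temporal_persistence(["group_a", "group_a", "group_b", "group_a"])
```

Under these assumptions, a larger `shift` after alignment tuning together with a `persistence` closer to 1.0 would correspond to the pattern the abstract describes: stronger and more temporally stable representational skew.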
Authors (9)
Zefan Cai
Haoyi Qiu
Haozhe Zhao
Ke Wan
Jiachen Li
Jiuxiang Gu
+3 more
Submitted
October 20, 2025
Key Contributions
This paper introduces VideoBiasEval, a diagnostic framework for evaluating social bias in video diffusion models, particularly after alignment tuning. It uses event-based prompting and multi-granular metrics to systematically analyze how biases related to gender and ethnicity are encoded and amplified, highlighting the unintended consequences of optimizing for human preferences.
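To make the event-based prompting idea concrete, here is a minimal sketch assuming that each prompt describes an event (an action in a context) with a neutral actor, so that any gender or ethnicity in the generated video comes from the model rather than the prompt. The actions, contexts, template, and function name are all hypothetical examples, not the prompts used in the paper.

```python
from itertools import product

# Hypothetical event components; the paper's actual prompt set is not given here.
ACTIONS = ["cooking a meal", "giving a presentation"]
CONTEXTS = ["in a kitchen", "in an office"]


def build_event_prompts(actions, contexts, actor="a person"):
    """Cross every action with every context, keeping the actor attribute-neutral."""
    return [
        f"{actor} {action} {context}"
        for action, context in product(actions, contexts)
    ]


prompts = build_event_prompts(ACTIONS, CONTEXTS)
```

Because the actor is left unspecified, the distribution of perceived attributes in the generated videos can be compared across events and across model variants.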
Business Value
Helps developers create more ethical and inclusive AI-generated content, reducing the risk of perpetuating harmful stereotypes. This is crucial for brand reputation and responsible AI deployment in media and entertainment.