Redirecting to original paper in 30 seconds...
Click below to go immediately or wait for automatic redirect
📄 Abstract
Abstract: Medical artificial intelligence systems have achieved remarkable diagnostic
capabilities, yet they consistently exhibit performance disparities across
demographic groups, causing real-world harm to underrepresented populations.
While recent multimodal reasoning foundation models have advanced clinical
diagnosis through integrated analysis of diverse medical data, reasoning
trainings via reinforcement learning inherit and often amplify biases present
in training datasets dominated by majority populations. We introduce
Fairness-aware Group Relative Policy Optimization (FairGRPO), a hierarchical
reinforcement learning approach that promotes equitable learning across
heterogeneous clinical populations. FairGRPO employs adaptive importance
weighting of advantages based on representation, task difficulty, and data
source. To address the common issue of missing demographic labels in the
clinical domain, we further employ unsupervised clustering, which automatically
discovers latent demographic groups when labels are unavailable. Through
comprehensive experiments across 7 clinical diagnostic datasets spanning 5
clinical modalities across X-ray, CT scan, dermoscropy, mammography and
ultrasound, we demonstrate that FairGRPO reduces predictive parity by 27.2%
against all vanilla and bias mitigated RL baselines, while improving F1 score
by 12.49%. Furthermore, training dynamics analysis reveals that FairGRPO
progressively improves fairness throughout optimization, while baseline RL
methods exhibit deteriorating fairness as training progresses. Based on
FairGRPO, we release FairMedGemma-4B, a fairness-aware clinical VLLM that
achieves state-of-the-art performance while demonstrating significantly reduced
disparities across demographic groups.
Authors (4)
Shiqi Dai
Wei Dai
Jiaee Cheong
Paul Pu Liang
Submitted
October 22, 2025
Key Contributions
Introduces FairGRPO, a hierarchical reinforcement learning approach to promote equitable learning across heterogeneous clinical populations. FairGRPO uses adaptive importance weighting and unsupervised clustering to address performance disparities and bias in medical AI, particularly in scenarios with missing demographic labels.
Business Value
Enhances the trustworthiness and reliability of AI systems in healthcare, ensuring equitable outcomes for all patient populations and mitigating risks associated with biased decision-making.