📄 Abstract
Algorithmic bias in medical imaging can perpetuate health disparities, yet its causes remain poorly understood in segmentation tasks. While fairness has been extensively studied in classification, segmentation remains underexplored despite its clinical importance. In breast cancer segmentation, models exhibit significant performance disparities against younger patients, commonly attributed to physiological differences in breast density. We audit the MAMA-MIA dataset, establishing a quantitative baseline of age-related bias in its automated labels, and reveal a critical 'Biased Ruler' effect, where systematically flawed validation labels misrepresent a model's actual bias. However, whether this bias originates from lower-quality annotations (label bias) or from fundamentally more challenging image characteristics remains unclear. Through controlled experiments, we systematically refute the hypotheses that the bias stems from sensitivity to label quality or from a quantitative imbalance in case difficulty. Balancing training data by difficulty fails to mitigate the disparity, revealing that younger patients' cases are intrinsically harder to learn. We provide direct evidence that systemic bias is learned and amplified when training on biased, machine-generated labels, a critical finding for automated annotation pipelines. This work introduces a systematic framework for diagnosing algorithmic bias in medical segmentation and demonstrates that achieving fairness requires addressing qualitative distributional differences rather than merely balancing case counts.
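To make the auditing step concrete, here is a minimal sketch of an age-stratified segmentation audit. It assumes binary masks stored as NumPy arrays and a known age per case; the Dice metric, the `audit_by_age` helper, and the age cutoff of 50 are illustrative assumptions, not the paper's exact protocol. Note that if the reference masks are themselves machine-generated, the measured gap inherits their flaws (the 'Biased Ruler' effect), so the same audit should be repeated against expert labels where available.

```python
# Sketch of an age-stratified bias audit for segmentation models.
# Assumptions (not from the paper): binary NumPy masks, Dice as the
# performance metric, and a simple younger/older split at age 50.
import numpy as np

def dice(pred: np.ndarray, gt: np.ndarray, eps: float = 1e-8) -> float:
    """Dice overlap between two binary masks."""
    inter = np.logical_and(pred, gt).sum()
    return float((2.0 * inter + eps) / (pred.sum() + gt.sum() + eps))

def audit_by_age(cases, age_cutoff: int = 50) -> dict:
    """Group per-case Dice scores by age and report the disparity.

    `cases` is an iterable of (age, predicted_mask, reference_mask).
    If the reference masks are automated labels, the reported gap
    reflects the biased ruler, not necessarily the true model bias.
    """
    groups = {"younger": [], "older": []}
    for age, pred, gt in cases:
        key = "younger" if age < age_cutoff else "older"
        groups[key].append(dice(pred, gt))
    means = {k: float(np.mean(v)) for k, v in groups.items() if v}
    # Positive gap: the model performs worse on younger patients.
    means["gap"] = means.get("older", 0.0) - means.get("younger", 0.0)
    return means
```

Running the same audit twice, once against automated labels and once against expert labels, and comparing the two gaps is one simple way to detect the kind of validation-label bias the abstract describes.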
Authors (3)
Aditya Parikh
Sneha Das
Aasa Feragen
Submitted
November 1, 2025
Key Contributions
This paper investigates the sources of age-related disparities in medical segmentation, specifically in breast cancer segmentation, by auditing the MAMA-MIA dataset. It quantifies age-related bias in the dataset's automated labels, identifies the 'Biased Ruler' effect in which flawed validation labels misrepresent a model's true bias, and systematically refutes the hypotheses that label quality or case-count imbalance alone explain the disparity, pointing instead to qualitative distributional differences as the driver of algorithmic unfairness.
Business Value
Crucial for developing equitable AI healthcare solutions, ensuring that medical AI tools do not exacerbate existing health disparities and that they perform reliably across all patient demographics.