Abstract
While image-text foundation models have succeeded across diverse downstream
tasks, they still face challenges in the presence of spurious correlations
between the input and label. To address this issue, we propose a simple
three-step approach, Project-Probe-Aggregate (PPA), that enables
parameter-efficient fine-tuning of foundation models without relying on group
annotations. Building upon the failure-based debiasing scheme, PPA improves
its two key components: minority-sample identification and the robust
training algorithm. Specifically, we first train biased classifiers by
projecting image features onto the nullspace of class proxies obtained from
text encoders. Next, we infer group labels using the biased classifier and
probe group targets with prior correction. Finally, we aggregate the group
weights of each class to produce the debiased classifier. Our theoretical
analysis shows that PPA enhances minority-group identification and is Bayes
optimal for minimizing the balanced group error, thereby mitigating spurious
correlations. Extensive experiments confirm the effectiveness of PPA: it
outperforms the state of the art in average worst-group accuracy while tuning
fewer than 0.01% of model parameters and requiring no group labels during
training.
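
To make the three steps concrete, below is a minimal NumPy sketch on synthetic features. The least-squares probes, the two-group (majority/minority) split inferred from biased-probe failures, and the exact form of the prior correction are illustrative assumptions under this reading of the abstract, not the authors' implementation.

```python
# A minimal sketch of PPA's Project-Probe-Aggregate steps on synthetic
# data. All dimensions, probes, and the prior-correction form here are
# illustrative assumptions, not the paper's exact training recipe.
import numpy as np

rng = np.random.default_rng(0)
n, d, num_classes = 512, 64, 3

# Stand-ins for frozen-encoder outputs: image features, class labels,
# and text-encoder class proxies (one embedding per class prompt).
feats = rng.normal(size=(n, d))
y = rng.integers(num_classes, size=n)
T = rng.normal(size=(num_classes, d))

# --- Step 1: Project ---------------------------------------------------
# Remove the class-proxy directions so a probe trained on the projected
# features can only exploit class-irrelevant (spurious) structure.
_, _, Vt = np.linalg.svd(T, full_matrices=False)
P = np.eye(d) - Vt.T @ Vt                 # projector onto nullspace of T
feats_null = feats @ P

Y = np.eye(num_classes)[y]                # one-hot class targets
W_biased, *_ = np.linalg.lstsq(feats_null, Y, rcond=None)

# --- Step 2: Probe -----------------------------------------------------
# Failure-based pseudo-groups: samples the biased probe misclassifies
# are treated as minority-group candidates.
biased_pred = (feats_null @ W_biased).argmax(axis=1)
g = (biased_pred != y).astype(int)        # 0 = majority-like, 1 = minority-like

# Probe the flattened (class, group) targets; fold a prior correction
# into the bias by subtracting log group priors (a logit-adjustment-style
# correction -- the paper's exact form may differ).
targets = 2 * y + g                       # (class, group) -> single index
Yg = np.eye(2 * num_classes)[targets]
X = np.hstack([feats, np.ones((n, 1))])   # append a bias feature
W_group, *_ = np.linalg.lstsq(X, Yg, rcond=None)
prior = np.bincount(targets, minlength=2 * num_classes) / n
W_group[-1] -= np.log(prior + 1e-12)      # prior correction on the bias

# --- Step 3: Aggregate ---------------------------------------------------
# Average each class's two group-specific weight vectors into a single
# debiased classifier head.
W_debiased = W_group.reshape(d + 1, num_classes, 2).mean(axis=2)
debiased_pred = (X @ W_debiased).argmax(axis=1)
print("train accuracy of debiased head:", (debiased_pred == y).mean())
```

Averaging group weights within each class (rather than keeping group-specific heads) is what removes the dependence on the spurious group at test time; the log-prior shift keeps the minority-group probes from being dominated by the majority group during training.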