Abstract
Data-Free Robustness Distillation (DFRD) aims to transfer robustness from a teacher model to a student model without access to the training data. While existing methods focus on overall robustness, they overlook the issue of robust fairness, leading to severe disparities in robustness across categories. In this paper, we identify two key problems: (1) a student model distilled with data of equal class proportions behaves very differently across categories; and (2) the student model's robustness is not stable across different attack targets. To bridge these gaps, we present the first Fairness-Enhanced data-free Robustness Distillation (FERD) framework, which adjusts both the proportion and the distribution of adversarial examples. For the proportion, FERD adopts a robustness-guided class reweighting strategy that synthesizes more samples for the less robust categories, thereby improving their robustness. For the distribution, FERD generates complementary data samples for advanced robustness distillation. It first generates Fairness-Aware Examples (FAEs) by enforcing a uniformity constraint on feature-level predictions, which suppresses the dominance of class-specific non-robust features and provides a more balanced representation across all categories. FERD then constructs Uniform-Target Adversarial Examples (UTAEs) from the FAEs by applying a uniform target-class constraint that avoids biased attack directions, distributing the attack targets across all categories and preventing overfitting to specific vulnerable categories. Extensive experiments on three public datasets show that FERD achieves state-of-the-art worst-class robustness under all evaluated adversarial attacks (e.g., worst-class robustness under FGSM and AutoAttack improves by 15.1% and 6.4%, respectively, with MobileNet-V2 on CIFAR-10), demonstrating superior performance in both robustness and fairness.