📄 Abstract
This paper presents a generation-based debiasing framework for object
detection. Prior debiasing methods are often limited by the representation
diversity of the available samples, while naive generative augmentation tends to
preserve the very biases it is meant to remove. Moreover, our analysis reveals that simply generating
more data for rare classes is suboptimal due to two core issues: i) instance
frequency is an incomplete proxy for the true data needs of a model, and ii)
current layout-to-image synthesis lacks the fidelity and control to generate
high-quality, complex scenes. To overcome this, we introduce the representation
score (RS) to diagnose representational gaps beyond mere frequency, guiding the
creation of new, unbiased layouts. To ensure high-quality synthesis, we replace
ambiguous text prompts with a precise visual blueprint and employ a generative
alignment strategy, which fosters communication between the detector and
generator. Our method significantly narrows the performance gap for
underrepresented object groups, e.g., improving large/rare instances by 4.4/3.6
mAP over the baseline, and surpassing prior L2I synthesis models by 15.9 mAP
for layout accuracy in generated images.
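The abstract does not spell out how the representation score (RS) is computed. As a rough, hypothetical illustration of the idea of diagnosing gaps "beyond mere frequency", the sketch below ranks classes by a score that blends instance frequency with per-class detection quality (AP). The function names, the `alpha` weighting, and the use of AP as the quality signal are assumptions made for illustration, not the paper's actual formulation.

```python
import numpy as np

def representation_score(freq_per_class, ap_per_class, alpha=0.5):
    """Hypothetical representation-style score: low values flag classes that are
    under-represented either by raw frequency or by detector performance.
    This illustrates the idea described in the abstract, not the paper's formula."""
    # Normalize instance frequency to [0, 1].
    freq = np.asarray(freq_per_class, dtype=float)
    freq_norm = freq / freq.max()
    # Per-class AP is already in [0, 1] for COCO-style metrics.
    ap = np.asarray(ap_per_class, dtype=float)
    # Blend frequency and performance: frequency alone is an incomplete proxy,
    # so the score also reflects how well the detector handles each class.
    return alpha * freq_norm + (1.0 - alpha) * ap

def classes_to_augment(freq_per_class, ap_per_class, k=5):
    """Return indices of the k classes with the lowest score, i.e. the
    candidates for targeted layout generation."""
    rs = representation_score(freq_per_class, ap_per_class)
    return np.argsort(rs)[:k]

# Toy example: class 2 is both rare and poorly detected, so it ranks first.
freq = [12000, 800, 150, 9000]
ap = [0.62, 0.41, 0.18, 0.55]
print(classes_to_augment(freq, ap, k=2))  # -> [2 1]
```

In the paper, this RS-guided diagnosis then drives the creation of new, unbiased layouts that condition the blueprint-based generator.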
Authors (7)
Xinhao Cai
Liulei Li
Gensheng Pei
Tao Chen
Jinshan Pan
Yazhou Yao
+1 more
Submitted
October 21, 2025
Key Contributions
Proposes a generation-based debiasing framework for object detection that uses a 'representation score' to guide data synthesis beyond simple instance frequency. It employs visual blueprints and generative alignment for higher-fidelity scene generation, overcoming the limitations of naive augmentation and text-prompted synthesis.
Business Value
Leads to more robust and fair object detection systems, crucial for applications where under-detection of certain objects (e.g., pedestrians, specific types of vehicles) can have serious consequences. Improves reliability in diverse real-world scenarios.