📄 Abstract
The scarcity of data in domains such as medicine, industry, and
autonomous driving leads to model overfitting and dataset imbalance, thus
hindering effective detection and segmentation performance. Existing studies
employ generative models to synthesize more training samples to mitigate
data scarcity. However, these synthetic samples are repetitive or simplistic
and fail to provide "crucial information" that targets the downstream model's
weaknesses. Additionally, these methods typically require separate training for
different objects, leading to computational inefficiencies. To address these
issues, we propose Crucial-Diff, a domain-agnostic framework designed to
synthesize crucial samples. Our method integrates two key modules. The Scene
Agnostic Feature Extractor (SAFE) utilizes a unified feature extractor to
capture target information. The Weakness Aware Sample Miner (WASM) generates
hard-to-detect samples using feedback from the detection results of the downstream
model, which are then fused with the output of the SAFE module. Together, our
Crucial-Diff framework generates diverse, high-quality training data, achieving
a pixel-level AP of 83.63% and an F1-MAX of 78.12% on MVTec. On the polyp dataset,
Crucial-Diff reaches an mIoU of 81.64% and an mDice of 87.69%. Code is publicly
available at https://github.com/JJessicaYao/Crucial-diff.
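For readers unfamiliar with the segmentation metrics quoted above, the following is a minimal sketch of how IoU and Dice can be computed for binary masks and averaged over a dataset. It assumes the common convention that mIoU and mDice are per-image means; the abstract does not state which averaging the authors use, and the function names `iou_and_dice` and `mean_iou_and_dice` are hypothetical, not taken from the Crucial-Diff repository.

```python
def iou_and_dice(pred, target):
    """Return (IoU, Dice) for two binary masks given as nested lists of 0/1."""
    p = [bool(v) for row in pred for v in row]
    t = [bool(v) for row in target for v in row]
    if len(p) != len(t):
        raise ValueError("masks must contain the same number of pixels")

    intersection = sum(1 for a, b in zip(p, t) if a and b)
    pred_area = sum(p)
    target_area = sum(t)
    union = pred_area + target_area - intersection

    # Assumption: two empty masks count as a perfect match.
    if union == 0:
        return 1.0, 1.0

    iou = intersection / union
    dice = 2 * intersection / (pred_area + target_area)
    return iou, dice


def mean_iou_and_dice(pairs):
    """Average IoU and Dice over an iterable of (pred, target) mask pairs."""
    scores = [iou_and_dice(pred, target) for pred, target in pairs]
    if not scores:
        raise ValueError("need at least one mask pair")
    n = len(scores)
    return sum(s[0] for s in scores) / n, sum(s[1] for s in scores) / n


if __name__ == "__main__":
    pred = [[1, 1], [0, 0]]
    target = [[1, 0], [0, 0]]
    print(mean_iou_and_dice([(pred, target), (target, target)]))
```

Dice is never lower than IoU on the same mask pair, which is consistent with the reported mDice exceeding the reported mIoU.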
Key Contributions
The paper introduces Crucial-Diff, a domain-agnostic framework that uses diffusion models to synthesize "crucial" training samples that specifically target a downstream model's weaknesses, rather than providing only generic augmentation. It combines a Scene Agnostic Feature Extractor (SAFE) with a Weakness Aware Sample Miner (WASM) for efficient, targeted sample generation.
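The idea of targeting a downstream model's weaknesses can be illustrated with a deliberately simplified stand-in: rank candidate samples by how well the downstream model handles them and keep the worst-handled ones. This is not the WASM module itself, whose mechanism the summary does not detail; the function `mine_hard_samples`, the `detector_score` callback, and the example scores are all hypothetical.

```python
def mine_hard_samples(candidates, detector_score, keep):
    """Return the `keep` candidates with the lowest downstream score.

    `detector_score` is a hypothetical callback mapping a candidate to a
    quality score for the downstream model (higher means easier to detect).
    """
    if keep < 0:
        raise ValueError("keep must be non-negative")

    # Pair each candidate with its score; the index breaks ties stably.
    scored = [(detector_score(c), i, c) for i, c in enumerate(candidates)]
    scored.sort(key=lambda item: (item[0], item[1]))
    return [c for _, _, c in scored[:keep]]


if __name__ == "__main__":
    # Illustrative scores only; not results from the paper.
    scores = {"sample_a": 0.9, "sample_b": 0.2, "sample_c": 0.5}
    hard = mine_hard_samples(list(scores), scores.get, keep=2)
    print(hard)
```

A filter like this only selects among samples that already exist, whereas the summary describes Crucial-Diff as using detector feedback during generation.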
Business Value
Can improve the performance of AI models in data-scarce domains such as medical imaging and autonomous driving by generating more effective training data. This may lead to more reliable and accurate AI systems and lower development cost and time.