Source: arxiv_cv · Research Paper

Crucial-Diff: A Unified Diffusion Model for Crucial Image and Annotation Synthesis in Data-scarce Scenarios

Abstract

The scarcity of data in various scenarios, such as medicine, industry, and autonomous driving, leads to model overfitting and dataset imbalance, thus hindering effective detection and segmentation performance. Existing studies employ generative models to synthesize more training samples to mitigate data scarcity. However, these synthetic samples are repetitive or simplistic and fail to provide "crucial information" that targets the downstream model's weaknesses. Additionally, these methods typically require separate training for different objects, leading to computational inefficiencies. To address these issues, we propose Crucial-Diff, a domain-agnostic framework designed to synthesize crucial samples. Our method integrates two key modules. The Scene Agnostic Feature Extractor (SAFE) utilizes a unified feature extractor to capture target information. The Weakness Aware Sample Miner (WASM) generates hard-to-detect samples using feedback from the detection results of the downstream model, which are then fused with the output of the SAFE module. Together, our Crucial-Diff framework generates diverse, high-quality training data, achieving a pixel-level AP of 83.63% and an F1-MAX of 78.12% on MVTec. On the polyp dataset, Crucial-Diff reaches an mIoU of 81.64% and an mDice of 87.69%. Code is publicly available at https://github.com/JJessicaYao/Crucial-diff.
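For readers unfamiliar with the segmentation metrics quoted above, the following is a minimal sketch of how IoU and Dice are commonly computed for binary masks and then averaged into mIoU and mDice. The function names, the flat 0/1 mask representation, the per-image averaging, and the empty-mask convention are assumptions made for illustration; the paper's own evaluation code may differ.

```python
def iou_and_dice(pred, target):
    """Return (IoU, Dice) for two binary masks given as flat sequences of 0/1."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    pred_area = sum(1 for p in pred if p)
    target_area = sum(1 for t in target if t)
    union = pred_area + target_area - intersection
    if union == 0:
        # Both masks are empty: counted as perfect agreement (an assumed convention).
        return 1.0, 1.0
    iou = intersection / union
    dice = 2 * intersection / (pred_area + target_area)
    return iou, dice


def mean_iou_and_dice(pairs):
    """Average IoU and Dice over (pred, target) pairs, one pair per image (assumed)."""
    scores = [iou_and_dice(pred, target) for pred, target in pairs]
    n = len(scores)
    return sum(s[0] for s in scores) / n, sum(s[1] for s in scores) / n
```

For a single image the two metrics are related by Dice = 2·IoU / (1 + IoU), so Dice is never lower than IoU, which is consistent with the reported mDice exceeding the reported mIoU.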

Key Contributions

Introduces Crucial-Diff, a domain-agnostic framework using diffusion models to synthesize 'crucial' training samples that specifically target a downstream model's weaknesses, rather than just generic augmentation. It employs a Scene Agnostic Feature Extractor (SAFE) and a Weakness Aware Sample Miner (WASM) for efficient and targeted sample generation.
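The idea of targeting a downstream model's weaknesses can be illustrated with a toy sketch. This is not the paper's WASM, which steers the diffusion process itself using detection feedback; it is a simpler, swapped-in technique that ranks already-generated candidates by how badly a downstream model segments them and keeps the hardest ones. The names `select_hard_samples` and `predict`, and the use of 1 − Dice as a difficulty score, are hypothetical choices for illustration.

```python
def dice(pred, target):
    """Dice overlap between two flat binary masks (1.0 when both are empty)."""
    inter = sum(1 for p, t in zip(pred, target) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    return 1.0 if total == 0 else 2 * inter / total


def select_hard_samples(candidates, predict, k):
    """Keep the k candidates the downstream model handles worst.

    candidates: list of (image, mask) pairs, e.g. synthetic samples.
    predict:    hypothetical callable mapping an image to a predicted mask.
    """
    scored = []
    for idx, (image, mask) in enumerate(candidates):
        difficulty = 1.0 - dice(predict(image), mask)  # higher means harder
        scored.append((difficulty, idx))
    # Hardest first; ties are broken by original order for determinism.
    scored.sort(key=lambda s: (-s[0], s[1]))
    return [candidates[idx] for _, idx in scored[:k]]
```

A filter like this only selects among existing samples, whereas the summary above describes Crucial-Diff as generating hard samples directly, which avoids spending generation effort on candidates that would be discarded.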

Business Value

Can improve the performance of AI models in data-scarce domains such as medical imaging and autonomous driving by generating training data that targets the downstream model's weaknesses. This may yield more reliable and accurate AI systems while reducing development cost and time.