Abstract
As a computer vision task, automatic object segmentation remains challenging
in specialized image domains without massive labeled data, such as synthetic
aperture sonar, remote sensing, and biomedical imaging. In any domain,
obtaining pixel-wise segmentation masks is expensive. In this work, we propose
a method for training a masking network to perform binary object segmentation
using weak supervision in the form of image-wise presence or absence of an
object of interest, which provides less information but may be obtained more
quickly from manual or automatic labeling. A key step in our method is that the
segmented objects can be placed into background-only images to create realistic
images of the objects with counterfactual backgrounds. To create a contrast
between the original and counterfactual background images, we propose to first
cluster the background-only images and then, during learning, create
counterfactual images that blend objects segmented from their original source
backgrounds into backgrounds chosen from a targeted cluster. One term in the
training loss is the divergence between these counterfactual images and the
real object images with backgrounds of the target cluster. The other term is a
supervised loss for background-only images. While an adversarial critic could
provide the divergence, we use sample-based divergences. We conduct experiments
on side-scan and synthetic aperture sonar in which our approach succeeds
compared to previous unsupervised segmentation baselines that were only tested
on natural images. Furthermore, to show generality we extend our experiments to
natural images, obtaining reasonable performance with our method that avoids
pretrained networks, generative networks, and adversarial critics. The code for
this work can be found on GitHub at https://github.com/bakerhassan/WSOS.
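The counterfactual-image step described in the abstract can be illustrated with a small sketch. It assumes the usual alpha-compositing form, where each output pixel is mask * object image + (1 - mask) * background. The abstract does not state the exact blending rule, so this form, the function name `composite`, and the toy 2x2 images are assumptions made for illustration, not the authors' implementation.

```python
from typing import List

# A grayscale image stored as rows of pixel intensities in [0, 1].
Image = List[List[float]]


def composite(obj_img: Image, mask: Image, background: Image) -> Image:
    """Place the masked object from obj_img onto a background-only image.

    Assumed blending rule (not confirmed by the abstract):
        out = mask * obj_img + (1 - mask) * background
    A mask value of 1 keeps the object pixel, 0 keeps the background pixel.
    """
    if not (len(obj_img) == len(mask) == len(background)):
        raise ValueError("images and mask must have the same height")
    out: Image = []
    for row_x, row_m, row_b in zip(obj_img, mask, background):
        if not (len(row_x) == len(row_m) == len(row_b)):
            raise ValueError("images and mask must have the same width")
        out.append([m * x + (1.0 - m) * b
                    for x, m, b in zip(row_x, row_m, row_b)])
    return out


# Toy example: the object occupies the left column of a 2x2 image.
obj_img = [[0.9, 0.2], [0.8, 0.1]]
mask = [[1.0, 0.0], [1.0, 0.0]]
background = [[0.5, 0.5], [0.4, 0.4]]  # drawn from the target cluster

counterfactual = composite(obj_img, mask, background)
```

In training, the mask would come from the masking network rather than being fixed by hand. An all-zero mask returns the background unchanged, which matches the intuition behind the supervised loss term on background-only images.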
Key Contributions
This work proposes a weakly supervised method for binary object segmentation that uses image-wise presence/absence labels, which are cheaper to obtain than pixel-wise masks. It introduces 'Background Conditional Divergence': segmented objects are placed onto background-only images from a targeted background cluster, and the masking network is trained by comparing these counterfactual images with real object images whose backgrounds belong to the same cluster.
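As a rough illustration of a sample-based divergence computed per background cluster, the sketch below uses the energy distance between two sets of flattened images and averages it over clusters. The summary does not say which sample-based divergence the authors use, so the choice of energy distance, the function names, and the dictionary layout keyed by cluster id are all assumptions.

```python
import math
from typing import Dict, List, Sequence

Vector = Sequence[float]  # one flattened image


def _mean_pairwise_distance(a: List[Vector], b: List[Vector]) -> float:
    """Average Euclidean distance over all pairs (u, v), u in a and v in b."""
    return sum(math.dist(u, v) for u in a for v in b) / (len(a) * len(b))


def energy_distance(x: List[Vector], y: List[Vector]) -> float:
    """Energy distance estimate: 2*E|X-Y| - E|X-X'| - E|Y-Y'|."""
    return (2.0 * _mean_pairwise_distance(x, y)
            - _mean_pairwise_distance(x, x)
            - _mean_pairwise_distance(y, y))


def background_conditional_divergence(
    counterfactual_by_cluster: Dict[int, List[Vector]],
    real_by_cluster: Dict[int, List[Vector]],
) -> float:
    """Average the divergence over background clusters (hypothetical helper).

    For each cluster id, counterfactual images composited onto backgrounds
    from that cluster are compared with real object images whose backgrounds
    fall in the same cluster.
    """
    clusters = sorted(set(counterfactual_by_cluster) & set(real_by_cluster))
    if not clusters:
        raise ValueError("no shared background clusters to compare")
    total = sum(energy_distance(counterfactual_by_cluster[c], real_by_cluster[c])
                for c in clusters)
    return total / len(clusters)


# Toy data: two clusters, each holding two flattened 2-pixel "images".
real = {0: [[0.0, 0.0], [1.0, 0.0]], 1: [[5.0, 0.0], [6.0, 0.0]]}
matched = {0: [[0.0, 0.0], [1.0, 0.0]], 1: [[5.0, 0.0], [6.0, 0.0]]}
mismatched = {0: [[5.0, 0.0], [6.0, 0.0]], 1: [[0.0, 0.0], [1.0, 0.0]]}

d_matched = background_conditional_divergence(matched, real)
d_mismatched = background_conditional_divergence(mismatched, real)
```

The divergence is zero when the counterfactual and real samples coincide within every cluster and grows when they differ. A practical training loop would need a differentiable tensor implementation rather than this pure-Python version.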
Business Value
Reduces the cost and effort required for image annotation in specialized fields like medical imaging and remote sensing, enabling faster development and deployment of segmentation solutions.