Abstract: Image segmentation is a powerful computer vision technique for scene
understanding. However, real-world deployment is stymied by the need for
high-quality, meticulously labeled datasets. Synthetic data provides
high-quality labels while reducing the need for manual data collection and
annotation. Yet deep neural networks trained on synthetic data often suffer from
the synthetic-to-real (Syn2Real) domain gap, leading to poor performance in
real-world deployments. To mitigate this gap in image segmentation, we propose RAFT, a
novel framework for adapting image segmentation models using minimal labeled
real-world data through data and feature augmentations, as well as active
learning. To validate RAFT, we perform experiments on the synthetic-to-real
"SYNTHIA->Cityscapes" and "GTAV->Cityscapes" benchmarks, where RAFT surpasses
the previous state of the art, HALO. SYNTHIA->Cityscapes sees an mIoU*
improvement of 2.1% upon domain adaptation, reaching 79.9%, and GTAV->Cityscapes
sees an mIoU improvement of 0.4%, reaching 78.2%. Furthermore, we test our approach
on the real-to-real "Cityscapes->ACDC" benchmark and again surpass HALO,
with an mIoU gain of 1.3% upon adaptation, reaching 73.2%. Finally, we examine the
effect of the allocated annotation budget and various components of RAFT upon
the final transfer mIoU.