Redirecting to original paper in 30 seconds...
Click below to go immediately or wait for automatic redirect
📄 Abstract
Abstract: In dynamic environments, unfamiliar objects and distribution shifts are often
encountered, which challenge the generalization abilities of the deployed
trained models. This work addresses Incremental Test Time Adaptation of Vision
Language Models, tackling scenarios where unseen classes and unseen domains
continuously appear during testing. Unlike traditional Test Time Adaptation
approaches, where the test stream comes only from a predefined set of classes,
our framework allows models to adapt simultaneously to both covariate and label
shifts, actively incorporating new classes as they emerge. Towards this goal,
we establish a new benchmark for ITTA, integrating single image TTA methods for
VLMs with active labeling techniques that query an oracle for samples
potentially representing unseen classes during test time. We propose a
segmentation assisted active labeling module, termed SegAssist, which is
training free and repurposes the segmentation capabilities of VLMs to refine
active sample selection, prioritizing samples likely to belong to unseen
classes. Extensive experiments on several benchmark datasets demonstrate the
potential of SegAssist to enhance the performance of VLMs in real world
scenarios, where continuous adaptation to emerging data is essential.
Project-page:https://manogna-s.github.io/segassist/