Abstract
Pre-trained vision-language models (VLMs) have recently advanced out-of-distribution
(OOD) detection. However, existing CLIP-based methods often focus on learning
OOD-related knowledge to improve detection, and consequently either generalize
poorly or rely on large-scale external auxiliary datasets. In this study, instead
of delving into intricate OOD-related knowledge, we propose an innovative
CLIP-based framework built on Forced prompt leArning (FA), designed to make full
use of in-distribution (ID) knowledge and thereby boost OOD detection. Our key
insight is to learn a prompt (i.e., a forced prompt) that carries richer and more
diversified descriptions of the ID classes beyond the textual semantics of the
class labels. Specifically, the forced prompt promotes better discernment of ID
images by enforcing greater semantic similarity between ID images and itself.
Moreover, we introduce a forced coefficient that encourages the forced prompt to
learn more comprehensive and nuanced descriptions of the ID classes. As a result,
FA achieves notable improvements in OOD detection even when trained without any
external auxiliary dataset, while keeping the same number of trainable parameters
as CoOp. Extensive empirical evaluations confirm that our method consistently
outperforms current state-of-the-art methods. Code is
available at https://github.com/0xFAFA/FA.
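
The abstract does not spell out the training objective, but the description suggests a CoOp-style classification loss augmented with a similarity term between ID images and the forced prompt, weighted by the forced coefficient. The sketch below illustrates one way such an objective could be wired up; the function name `fa_style_loss`, the per-class layout of the forced prompt features, and the exact combination of loss terms are illustrative assumptions, not the authors' formulation (see the linked repository for the actual implementation).

```python
import torch
import torch.nn.functional as F

def fa_style_loss(image_feats, class_text_feats, forced_text_feats,
                  labels, forced_coefficient=0.5, temperature=0.01):
    """Hypothetical FA-style training objective (illustrative only).

    image_feats:       (B, D) L2-normalized CLIP image embeddings
    class_text_feats:  (C, D) embeddings of learnable class prompts (as in CoOp)
    forced_text_feats: (C, D) embeddings of the forced prompt, rendered per
                       class (assumed layout; the paper may organize it differently)
    """
    # Standard CoOp-style cross-entropy over image/class-prompt similarities.
    logits = image_feats @ class_text_feats.t() / temperature
    ce = F.cross_entropy(logits, labels)

    # "Forcing" term (assumed form): pull each ID image toward the forced
    # prompt of its ground-truth class, so the prompt learns richer ID
    # descriptions than the bare class labels provide.
    forced_sim = (image_feats * forced_text_feats[labels]).sum(dim=-1)
    forced_term = (1.0 - forced_sim).mean()

    # The forced coefficient controls how strongly similarity is "forced".
    return ce + forced_coefficient * forced_term

# Toy usage with random, normalized features.
B, C, D = 8, 10, 512
img = F.normalize(torch.randn(B, D), dim=-1)
cls_txt = F.normalize(torch.randn(C, D), dim=-1)
frc_txt = F.normalize(torch.randn(C, D), dim=-1)
y = torch.randint(0, C, (B,))
print(fa_style_loss(img, cls_txt, frc_txt, y))
```

At inference, OOD scoring in CLIP-based methods typically thresholds a statistic of the image-prompt similarities (e.g., the maximum softmax score); the abstract does not specify FA's scoring rule, so only the training-side intuition above is grounded in the text.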