📄 Abstract
Open-vocabulary object detection (OvOD) is set to revolutionize security
screening by enabling systems to recognize any item in X-ray scans. However,
developing effective OvOD models for X-ray imaging presents unique challenges
due to data scarcity and the modality gap that prevents direct adoption of
RGB-based solutions. To overcome these limitations, we propose RAXO, a
training-free framework that repurposes off-the-shelf RGB OvOD detectors for
robust X-ray detection. RAXO builds high-quality X-ray class descriptors using
a dual-source retrieval strategy. It gathers relevant RGB images from the web
and enriches them via a novel X-ray material transfer mechanism, eliminating
the need for labeled databases. These visual descriptors replace text-based
classification in OvOD, leveraging intra-modal feature distances for robust
detection. Extensive experiments demonstrate that RAXO consistently improves
OvOD performance, providing an average mAP increase of up to 17.0 points over
base detectors. To further support research in this emerging field, we also
introduce DET-COMPASS, a new benchmark featuring bounding box annotations for
over 300 object categories, enabling large-scale evaluation of OvOD in X-ray.
Code and dataset available at: https://github.com/PAGF188/RAXO.
Authors (8)
Pablo Garcia-Fernandez
Lorenzo Vaquero
Mingxuan Liu
Feng Xue
Daniel Cores
Nicu Sebe
+2 more
Key Contributions
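This paper proposes RAXO, a training-free framework that adapts off-the-shelf RGB open-vocabulary object detection (OvOD) models to X-ray imaging. RAXO builds X-ray class descriptors with a dual-source retrieval strategy: it gathers relevant RGB images from the web and enriches them through an X-ray material transfer mechanism. These visual descriptors replace the detector's text-based classification, so detection needs neither X-ray-specific training nor a labeled database. The paper also introduces DET-COMPASS, a benchmark with bounding box annotations for over 300 object categories.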
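The idea of replacing text-based classification with visual descriptors compared by intra-modal feature distance can be illustrated with a minimal sketch. This is not the authors' implementation: the prototype averaging, the cosine-similarity scoring, the function names (`build_descriptor`, `classify_region`), and the three-dimensional toy feature vectors are all assumptions made for illustration. In practice, the features would come from the detector's visual encoder.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0:
        return list(vec)
    return [x / norm for x in vec]


def build_descriptor(exemplar_features):
    """Hypothetical descriptor: mean of L2-normalized exemplar features.

    Averaging into a single prototype is an assumption of this sketch,
    not a detail confirmed by the abstract.
    """
    normalized = [l2_normalize(f) for f in exemplar_features]
    dim = len(normalized[0])
    mean = [sum(f[i] for f in normalized) / len(normalized) for i in range(dim)]
    return l2_normalize(mean)


def classify_region(region_feature, descriptors):
    """Score a detected region against each class descriptor.

    Uses cosine similarity between visual features (an assumed choice of
    intra-modal distance) instead of matching against text embeddings.
    """
    query = l2_normalize(region_feature)
    scores = {
        name: sum(a * b for a, b in zip(query, proto))
        for name, proto in descriptors.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


# Toy features standing in for visual-encoder embeddings of exemplar images.
descriptors = {
    "scissors": build_descriptor([[0.9, 0.1, 0.0], [0.8, 0.2, 0.1]]),
    "laptop": build_descriptor([[0.1, 0.9, 0.2], [0.0, 0.8, 0.3]]),
}

label, scores = classify_region([0.85, 0.15, 0.05], descriptors)
print(label, scores)
```

Because both the region feature and the class descriptors live in the same visual feature space, the comparison avoids the text-to-image modality gap that the abstract identifies as a problem for X-ray inputs.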
Business Value
RAXO could let security-screening systems identify a wider range of items in X-ray scans without retraining detectors on X-ray data, which reduces data-collection and annotation costs while broadening the set of objects that can be flagged.