Abstract: Hierarchical visual localization methods achieve state-of-the-art accuracy
but require substantial memory as they need to store all database images.
Direct 2D-3D matching requires significantly less memory but suffers from lower
accuracy due to the larger and more ambiguous search space. We address this
ambiguity by fusing local and global descriptors using a weighted average
operator. This operator rearranges the local descriptor space so that
geographically nearby local descriptors are closer in the feature space
according to the global descriptors. This reduces the number of irrelevant
competing descriptors, especially geographically distant ones, and thus
increases the likelihood of a correct match. We consistently improve the
accuracy over local-only systems, and we achieve performance close to
hierarchical methods while using 43% less memory and running 1.6 times faster.
Extensive experiments on four challenging datasets -- Cambridge Landmarks,
Aachen Day/Night, RobotCar Seasons, and Extended CMU Seasons -- demonstrate
that, for the first time, direct matching algorithms can benefit from global
descriptors without compromising computational efficiency. Our code is
available at
https://github.com/sontung/descriptor-disambiguation.
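
The fusion step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract states only that local and global descriptors are fused with a weighted average, so the function name `fuse_descriptors`, the `weight` parameter, the L2 normalization, and the assumption that both descriptor types share the same dimensionality are all hypothetical choices made for illustration.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit L2 norm along `axis` (eps guards against division by zero)."""
    norm = np.linalg.norm(x, axis=axis, keepdims=True)
    return x / np.maximum(norm, eps)


def fuse_descriptors(local_desc, global_desc, weight=0.5):
    """Weighted average of each local descriptor with its image's global descriptor.

    local_desc:  (N, D) local descriptors extracted from one image.
    global_desc: (D,)   global descriptor of that same image.
    weight:      share given to the global descriptor (hypothetical parameter).

    Assumption: both descriptor types already have dimension D. In practice
    they may differ, and some projection would be needed first.
    """
    local_desc = l2_normalize(np.asarray(local_desc, dtype=np.float64))
    global_desc = l2_normalize(np.asarray(global_desc, dtype=np.float64))
    fused = (1.0 - weight) * local_desc + weight * global_desc[None, :]
    return l2_normalize(fused)


# Toy example: one local appearance that occurs in two different places.
rng = np.random.default_rng(0)
dim = 8
shared_local = rng.normal(size=(1, dim))  # identical local descriptor in both images
global_a = rng.normal(size=dim)           # global descriptor of image A
global_b = rng.normal(size=dim)           # global descriptor of a distant image B

fused_a = fuse_descriptors(shared_local, global_a)
fused_b = fuse_descriptors(shared_local, global_b)

# Before fusion the two local descriptors are identical (distance 0);
# after fusion they are separated by their differing global context.
dist_after = float(np.linalg.norm(fused_a - fused_b))
print(f"distance after fusion: {dist_after:.3f}")
```

In this toy setting, two visually identical local features from different images stop being exact duplicates once each is averaged with its own image's global descriptor, which is the disambiguation effect the abstract describes. Setting `weight=0` recovers the plain (normalized) local descriptors.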