📄 Abstract
Finding correspondences between semantically similar points across images and
object instances is one of the long-standing challenges in computer vision. While
large pre-trained vision models have recently been demonstrated as effective
priors for semantic matching, they still suffer from ambiguities for symmetric
objects or repeated object parts. We propose improving semantic correspondence
estimation through 3D-aware pseudo-labeling. Specifically, we train an adapter
to refine off-the-shelf features using pseudo-labels obtained via 3D-aware
chaining, and we filter out wrong labels through relaxed cyclic consistency and 3D
spherical prototype mapping constraints. While reducing the need for
dataset-specific annotations compared to prior work, we establish a new
state of the art on SPair-71k, with an absolute gain of over 4% overall and of
over 7% compared to methods with similar supervision requirements. The
generality of our proposed approach simplifies the extension of training to
other data sources, which we demonstrate in our experiments.
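
To make the cyclic-consistency filtering step more concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the abstract does not define "relaxed", so the sketch assumes it means that a match from image A to image B is kept when matching back from B to A lands within a pixel radius of the starting point, rather than exactly on it. The function names `nearest` and `relaxed_cycle_filter`, the dot-product similarity, and the `radius` parameter are all hypothetical choices made for illustration. The 3D-aware chaining and spherical prototype constraints are not covered here.

```python
import math


def nearest(query, feats):
    """Return the index of the feature most similar to `query` (dot product)."""
    sims = [sum(q * f for q, f in zip(query, feat)) for feat in feats]
    return max(range(len(feats)), key=sims.__getitem__)


def relaxed_cycle_filter(feats_a, pos_a, feats_b, radius):
    """Keep A->B matches whose B->A return lands within `radius` of the start.

    feats_a, feats_b: lists of feature vectors (one per keypoint or patch).
    pos_a: 2D positions of the features in image A.
    radius: tolerance in the units of pos_a (assumed meaning of "relaxed";
            radius=0 reduces to strict cycle consistency).
    """
    kept = []
    for i, feat in enumerate(feats_a):
        j = nearest(feat, feats_b)             # forward match A -> B
        i_back = nearest(feats_b[j], feats_a)  # backward match B -> A
        if math.dist(pos_a[i], pos_a[i_back]) <= radius:
            kept.append((i, j))                # accepted as a pseudo-label
    return kept
```

With `radius=0` only matches that return exactly to their starting point survive. A larger radius also keeps matches that return to a nearby point, which retains more pseudo-labels at the cost of admitting some noisier ones.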