📄 Abstract
We present a new generalizable NeRF method that directly generalizes to new,
unseen scenes and performs novel view synthesis with as few as two source
views. The key to our approach is explicitly modeled correspondence matching
information, which provides a geometry prior for predicting NeRF color and
density in volume rendering. The explicit correspondence matching is
quantified by the cosine similarity between image features sampled at the 2D
projections of a 3D point on different views, which provides reliable cues
about the surface geometry. Unlike previous methods that extract image
features independently for each view, we model cross-view interactions via
Transformer cross-attention, which substantially improves feature matching
quality. Our method achieves state-of-the-art results across different
evaluation settings, and our experiments show a strong correlation between
the learned cosine feature similarity and the volume density, demonstrating
the effectiveness of the proposed explicit matching cue. Code and model are
available on our project page:
https://donydchen.github.io/matchnerf
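
To make the core idea concrete, below is a minimal PyTorch sketch of the matching cue described above: project a 3D point into two source views, bilinearly sample each view's feature map at the projections, and take the cosine similarity as a surface cue. This is an illustrative reconstruction, not the authors' implementation; the function names, the two-view setup, and the camera conventions (3x3 intrinsics `K`, 4x4 world-to-camera matrices) are all assumptions.

```python
import torch
import torch.nn.functional as F

def project_points(points, K, w2c):
    """Project 3D world points (N, 3) into pixel coordinates (N, 2)
    using a 3x3 intrinsic matrix K and a 4x4 world-to-camera matrix.
    (Hypothetical helper; conventions assumed, not from the paper.)"""
    homo = torch.cat([points, torch.ones_like(points[:, :1])], dim=-1)  # (N, 4)
    cam = (w2c @ homo.T).T[:, :3]                                       # (N, 3)
    pix = (K @ cam.T).T                                                 # (N, 3)
    return pix[:, :2] / pix[:, 2:3].clamp(min=1e-6)                     # perspective divide

def sample_features(feat_map, pix, img_hw):
    """Bilinearly sample a (C, H, W) feature map at pixel locations (N, 2)."""
    h, w = img_hw
    # Normalize pixel coordinates to [-1, 1] as required by grid_sample.
    grid = torch.stack([pix[:, 0] / (w - 1), pix[:, 1] / (h - 1)], dim=-1) * 2 - 1
    grid = grid.view(1, -1, 1, 2)                                       # (1, N, 1, 2)
    sampled = F.grid_sample(feat_map[None], grid, align_corners=True)   # (1, C, N, 1)
    return sampled[0, :, :, 0].T                                        # (N, C)

def matching_cue(points, feat0, feat1, K0, K1, w2c0, w2c1, img_hw):
    """Cosine similarity between per-view features sampled at the 2D
    projections of the same 3D points. High similarity suggests the point
    lies on a surface consistently observed in both views."""
    f0 = sample_features(feat0, project_points(points, K0, w2c0), img_hw)
    f1 = sample_features(feat1, project_points(points, K1, w2c1), img_hw)
    return F.cosine_similarity(f0, f1, dim=-1)                          # (N,)
```

In the full method, the feature maps would first interact through Transformer cross-attention before sampling, and the resulting similarity would be fed, alongside other inputs, to the network predicting color and density; the sketch only isolates the explicit matching step.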