Abstract: Constrained by the low-rank bottleneck inherent in attention mechanisms,
current stereo matching transformers suffer from limited nonlinear
expressivity, which renders their feature representations sensitive to
challenging conditions such as reflections. To overcome this difficulty, we
present the Hadamard Attention Recurrent Stereo Transformer (HART). HART
includes a novel attention mechanism that incorporates the following
components: 1) The Dense Attention Kernel (DAK) maps the attention weight
distribution into a high-dimensional space over $(0, +\infty)$. By removing the
upper bound constraint on attention weights, DAK enables more flexible modeling
of complex feature interactions, which in turn reduces feature collinearity. 2) The
Multi Kernel & Order Interaction (MKOI) module extends the attention mechanism
by unifying semantic and spatial knowledge learning. This integration improves
the ability of HART to learn features in binocular images. Experimental results
demonstrate the effectiveness of HART. In reflective areas, HART ranked 1st
on the KITTI 2012 benchmark among all published methods at the time of
submission. Code is available at https://github.com/ZYangChen/HART.
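To make the DAK idea more concrete, here is a minimal PyTorch sketch of an attention block whose weights pass through an element-wise exponential and therefore lie in $(0, +\infty)$ instead of being capped by a softmax. The Hadamard (element-wise) query-key interaction, the module name, and the tensor shapes are assumptions made for this illustration, not the authors' released implementation.

```python
import torch
import torch.nn as nn

class HadamardExpAttention(nn.Module):
    """Sketch: element-wise (Hadamard) query-key interaction followed by an
    exponential kernel, so each attention weight lies in (0, +inf) rather
    than being bounded by a softmax. Illustrative only, not HART's code."""

    def __init__(self, dim):
        super().__init__()
        self.to_q = nn.Linear(dim, dim)
        self.to_k = nn.Linear(dim, dim)
        self.to_v = nn.Linear(dim, dim)
        self.proj = nn.Linear(dim, dim)

    def forward(self, x):                  # x: (batch, tokens, dim)
        q, k, v = self.to_q(x), self.to_k(x), self.to_v(x)
        # Hadamard interaction keeps per-channel structure and avoids the
        # N x N weight matrix of standard dot-product attention.
        weights = torch.exp(q * k)         # unbounded above: values in (0, +inf)
        return self.proj(weights * v)      # element-wise modulation of the values

# Example usage with assumed shapes:
# y = HadamardExpAttention(128)(torch.randn(2, 1024, 128))
```

The point of the sketch is only the kernel choice: because no normalization forces the weights to sum to one, no single weight is pinned below an upper bound, which is the property the abstract attributes to DAK.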
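The abstract does not spell out the MKOI design, so the sketch below is only one plausible reading of a "multi kernel, multi order" interaction: channel groups filtered by depth-wise convolutions of several kernel sizes (spatial knowledge), then fused with a gating tensor through an element-wise product (a higher-order interaction). The class name, kernel sizes, grouping, and fusion scheme are all assumptions for illustration, not the published MKOI module.

```python
import torch
import torch.nn as nn

class MultiKernelOrderInteraction(nn.Module):
    """Sketch of a multi-kernel, multi-order interaction: depth-wise
    convolutions of several kernel sizes capture spatial context at
    different scales, and an element-wise product with a gating tensor
    adds a second-order interaction. Illustrative assumptions only."""

    def __init__(self, dim, kernel_sizes=(1, 3, 5, 7)):
        super().__init__()
        assert dim % len(kernel_sizes) == 0
        g = dim // len(kernel_sizes)
        self.branches = nn.ModuleList(
            nn.Conv2d(g, g, k, padding=k // 2, groups=g) for k in kernel_sizes
        )
        self.proj = nn.Conv2d(dim, dim, 1)

    def forward(self, x, gate):            # x, gate: (batch, dim, H, W)
        chunks = x.chunk(len(self.branches), dim=1)
        spatial = torch.cat([b(c) for b, c in zip(self.branches, chunks)], dim=1)
        # Second-order (Hadamard) interaction between the spatially filtered
        # features and a gating tensor, e.g. an attention output.
        return self.proj(spatial * gate)
```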