HAD: Hierarchical Asymmetric Distillation to Bridge Spatio-Temporal Gaps in Event-Based Object Tracking

Abstract

RGB cameras excel at capturing rich texture details with high spatial resolution, whereas event cameras offer exceptional temporal resolution and a high dynamic range (HDR). Leveraging their complementary strengths can substantially enhance object tracking under challenging conditions, such as high-speed motion, HDR environments, and dynamic background interference. However, a significant spatio-temporal asymmetry exists between these two modalities due to their fundamentally different imaging mechanisms, hindering effective multi-modal integration. To address this issue, we propose Hierarchical Asymmetric Distillation (HAD), a multi-modal knowledge distillation framework that explicitly models and mitigates spatio-temporal asymmetries. Specifically, HAD proposes a hierarchical alignment strategy that minimizes information loss while maintaining the student network's computational efficiency and parameter compactness. Extensive experiments demonstrate that HAD consistently outperforms state-of-the-art methods, and comprehensive ablation studies further validate the effectiveness and necessity of each designed component. The code will be released soon.
Authors: Yao Deng, Xian Zhong, Wenxuan Liu, Zhaofei Yu, Jingling Yuan, Tiejun Huang
Submitted: October 22, 2025
arXiv Category: cs.CV

Key Contributions

HAD is a novel multi-modal knowledge distillation framework designed to mitigate spatio-temporal asymmetries between RGB and event cameras for object tracking. It employs a hierarchical alignment strategy to minimize information loss while maintaining efficiency, enabling better fusion of complementary sensor strengths.
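
For readers unfamiliar with knowledge distillation, the general idea of aligning a student with a teacher at more than one level can be sketched as follows. This is a generic illustration, not the HAD method: the abstract does not specify the loss terms, so the two levels used here (a mean-squared error on intermediate features and a temperature-softened KL divergence on outputs), the function names, and the weights `feat_weight` and `resp_weight` are all assumptions made for the example.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax with temperature scaling."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def mse(a, b):
    """Mean squared error between two equal-length feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def distillation_loss(teacher_feats, student_feats,
                      teacher_logits, student_logits,
                      temperature=2.0, feat_weight=1.0, resp_weight=1.0):
    """Hypothetical two-level distillation loss (illustrative only).

    teacher_feats / student_feats: lists of feature vectors, one per
    network stage, assumed already projected to matching sizes.
    """
    # Feature-level term: average MSE across the paired stages.
    feat_loss = sum(
        mse(t, s) for t, s in zip(teacher_feats, student_feats)
    ) / len(teacher_feats)

    # Response-level term: KL between temperature-softened outputs,
    # scaled by T^2 as is conventional in distillation.
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    resp_loss = (temperature ** 2) * kl_divergence(p_teacher, p_student)

    return feat_weight * feat_loss + resp_weight * resp_loss
```

In this sketch the loss is zero when the student reproduces the teacher's features and outputs exactly, and grows as either level diverges; a real tracker would compute these terms on tensors inside a deep learning framework rather than on Python lists.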

Business Value

By combining RGB and event-camera data more effectively, HAD aims to make object tracking more robust and accurate under high-speed motion and HDR conditions, which could benefit autonomous systems and surveillance applications.