Vision Transformer for Robust Occluded Person Reidentification in Complex Surveillance Scenes

Abstract

Person re-identification (ReID) in surveillance is challenged by occlusion, viewpoint distortion, and poor image quality. Most existing methods either rely on complex modules or perform well only on clear frontal images. We propose Sh-ViT (Shuffling Vision Transformer), a lightweight and robust model for occluded person ReID. Built on ViT-Base, Sh-ViT introduces three components: first, a Shuffle module in the final Transformer layer that breaks spatial correlations and enhances robustness to occlusion and blur; second, scenario-adapted augmentation (geometric transforms, erasing, blur, and color adjustment) that simulates surveillance conditions; and third, DeiT-based knowledge distillation that improves learning with limited labels. To support real-world evaluation, we construct the MyTT dataset, containing over 10,000 pedestrians and 30,000+ images from base station inspections, with frequent equipment occlusion and camera variations. Experiments show that Sh-ViT achieves 83.2% Rank-1 and 80.1% mAP on MyTT, outperforming CNN and ViT baselines, and 94.6% Rank-1 and 87.5% mAP on Market1501, surpassing state-of-the-art methods. In summary, Sh-ViT improves robustness to occlusion and blur without external modules, offering a practical solution for surveillance-based personnel monitoring.
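
The abstract reports results as Rank-1 and mAP. As a rough illustration of what those two numbers measure, the sketch below ranks a gallery by feature distance for each query and scores the ranking. It is a simplified assumption, not the paper's evaluation code: the function names are hypothetical, squared Euclidean distance is assumed, and the same-camera filtering that standard ReID protocols apply is omitted.

```python
from typing import Dict, List, Sequence, Tuple

Sample = Tuple[str, Sequence[float]]  # (identity label, feature vector)


def euclidean_sq(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def evaluate_reid(query: List[Sample], gallery: List[Sample]) -> Dict[str, float]:
    """Return Rank-1 accuracy and mean average precision (mAP).

    Simplification (assumption): no same-camera filtering is applied.
    """
    rank1_hits = 0
    average_precisions = []
    for q_id, q_feat in query:
        # Rank gallery entries from nearest to farthest from the query.
        ranked = sorted(gallery, key=lambda item: euclidean_sq(q_feat, item[1]))
        matches = [g_id == q_id for g_id, _ in ranked]
        if not any(matches):
            continue  # a query with no true match in the gallery is skipped
        # Rank-1: is the single nearest gallery entry the same identity?
        rank1_hits += int(matches[0])
        # Average precision: mean of precision@k taken at each true match.
        hits = 0
        precisions = []
        for rank, is_match in enumerate(matches, start=1):
            if is_match:
                hits += 1
                precisions.append(hits / rank)
        average_precisions.append(sum(precisions) / len(precisions))
    if not average_precisions:
        raise ValueError("no query has a matching identity in the gallery")
    n = len(average_precisions)
    return {"rank1": rank1_hits / n, "mAP": sum(average_precisions) / n}


if __name__ == "__main__":
    query = [("a", [0.0, 0.0]), ("b", [1.0, 1.0])]
    gallery = [("a", [0.1, 0.0]), ("b", [0.9, 1.0]),
               ("a", [2.0, 2.0]), ("b", [0.2, 0.1])]
    print(evaluate_reid(query, gallery))
```

Rank-1 only checks the top result, while mAP rewards placing every correct gallery image near the top, which is why the two figures can differ noticeably on the same dataset.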
Authors (8)
Bo Li
Duyuan Zheng
Xinyang Liu
Qingwen Li
Hong Li
Hongyan Cui
+2 more
Submitted
October 31, 2025
arXiv Category
cs.CV

Key Contributions

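Sh-ViT is a lightweight and robust Vision Transformer model for occluded person re-identification, built on ViT-Base. It adds a Shuffle module in the final Transformer layer to improve robustness to occlusion and blur, scenario-adapted augmentation to simulate surveillance conditions, and DeiT-based knowledge distillation to improve learning with limited labels. The authors also construct the MyTT dataset from base station inspections. Sh-ViT reaches 83.2% Rank-1 and 80.1% mAP on MyTT, and 94.6% Rank-1 and 87.5% mAP on Market1501.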
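
The summary does not spell out how the Shuffle module works. One plausible reading, shown here purely as an assumption, is that the patch tokens entering the final Transformer layer are randomly permuted while the class token stays in place, so the layer cannot rely on fixed spatial positions. The function name and the token layout (class token at index 0) are hypothetical, and the paper's actual module may group or shift patches differently.

```python
import random
from typing import List, Optional, Sequence

Token = Sequence[float]  # one embedding vector


def shuffle_patch_tokens(
    tokens: List[Token], rng: Optional[random.Random] = None
) -> List[Token]:
    """Return a new list with the patch tokens randomly permuted.

    Assumes tokens[0] is the class token, which keeps its position.
    The input list is not modified.
    """
    if not tokens:
        return []
    rng = rng or random.Random()
    cls_token = tokens[0]
    patches = list(tokens[1:])  # copy so the caller's list is untouched
    rng.shuffle(patches)        # in-place permutation of the copy
    return [cls_token] + patches


if __name__ == "__main__":
    # Five toy tokens: one class token followed by four patch tokens.
    tokens = [[0.0], [1.0], [2.0], [3.0], [4.0]]
    print(shuffle_patch_tokens(tokens, random.Random(0)))
```

Passing a seeded `random.Random` keeps the permutation reproducible in experiments. A shuffle like this would normally be applied only during training.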

Business Value

Enables surveillance systems to identify individuals reliably under occlusion, blur, and camera variation, which supports security and operational efficiency in settings such as base station inspections.