SD-ReID: View-aware Stable Diffusion for Aerial-Ground Person Re-Identification

Abstract

Aerial-Ground Person Re-IDentification (AG-ReID) aims to retrieve specific persons across cameras with different viewpoints. Previous works focus on designing discriminative models that maintain identity consistency despite drastic changes in camera viewpoint. The core idea behind these methods is quite natural, but designing a view-robust model is a very challenging task. Moreover, they overlook the contribution of view-specific features in enhancing the model's ability to represent persons. To address these issues, we propose a novel generative framework named SD-ReID for AG-ReID, which leverages generative models to mimic the feature distribution of different views while extracting robust identity representations. More specifically, we first train a ViT-based model to extract person representations along with controllable conditions, including identity and view conditions. We then fine-tune the Stable Diffusion (SD) model to enhance person representations guided by these controllable conditions. Furthermore, we introduce the View-Refined Decoder (VRD) to bridge the gap between instance-level and global-level features. Finally, both person representations and all-view features are employed to retrieve target persons. Extensive experiments on five AG-ReID benchmarks (i.e., CARGO, AG-ReIDv1, AG-ReIDv2, LAGPeR and G2APS-ReID) demonstrate the effectiveness of our proposed method. The source code will be available.
Authors (5)
Yuhao Wang
Xiang Hu
Lixin Wang
Pingping Zhang
Huchuan Lu
Submitted: April 13, 2025
arXiv Category: cs.CV

Key Contributions

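The paper proposes SD-ReID, a generative framework for Aerial-Ground Person Re-Identification (AG-ReID) that uses Stable Diffusion (SD) to mimic the feature distributions of different views. It tackles viewpoint variation in two stages: a ViT-based model is first trained to extract person representations together with identity and view conditions, and SD is then fine-tuned to enhance those representations under the guidance of these conditions. A View-Refined Decoder (VRD) bridges the gap between instance-level and global-level features, and retrieval uses both the person representations and the all-view features.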
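
The abstract states that both person representations and all-view features are used to retrieve target persons, without specifying how they are combined. The following is a minimal sketch of one plausible reading, not the authors' implementation: it assumes the features are fused by concatenation and that gallery entries are ranked by cosine similarity. The function names (`l2_normalize`, `fuse`, `rank_gallery`), the toy vectors, and the identifiers `id_a` and `id_b` are all hypothetical.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def fuse(person_repr, view_feats):
    """Concatenate the person representation with each per-view feature.

    Concatenation is an assumption made for illustration; the abstract
    only says that both kinds of features are used for retrieval.
    """
    fused = list(person_repr)
    for feat in view_feats:
        fused.extend(feat)
    return l2_normalize(fused)


def rank_gallery(query, gallery):
    """Return gallery ids sorted by descending cosine similarity.

    Inputs are already unit length, so the dot product equals the cosine.
    """
    scores = {
        gid: sum(q * g for q, g in zip(query, vec))
        for gid, vec in gallery.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Toy data: a 2-D person representation plus two hypothetical view
# features (for example, one aerial and one ground), each also 2-D.
query = fuse([1.0, 0.0], [[0.9, 0.1], [0.8, 0.2]])
gallery = {
    "id_a": fuse([0.9, 0.1], [[1.0, 0.0], [0.7, 0.3]]),
    "id_b": fuse([0.0, 1.0], [[0.1, 0.9], [0.2, 0.8]]),
}
ranking = rank_gallery(query, gallery)
print(ranking)
```

In this toy setup the query is closest to `id_a`, so it is ranked first. In a real system the vectors would come from the trained ViT-based model and the fine-tuned SD branch rather than being written by hand.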

Business Value

SD-ReID could improve the accuracy and robustness of person re-identification in security and surveillance systems that combine aerial and ground-based cameras, making it easier to track and identify the same individual across very different viewpoints.