📄 Abstract
Aerial-Ground Person Re-IDentification (AG-ReID) aims to retrieve specific
persons across cameras with different viewpoints. Previous works focus on
designing discriminative models to maintain the identity consistency despite
drastic changes in camera viewpoints. The core idea behind these methods is
quite natural, but designing a view-robust model is a very challenging task.
Moreover, they overlook the contribution of view-specific features in enhancing
the model's ability to represent persons. To address these issues, we propose a
novel generative framework named SD-ReID for AG-ReID, which leverages
generative models to mimic the feature distribution of different views while
extracting robust identity representations. More specifically, we first train a
ViT-based model to extract person representations along with controllable
conditions, including identity and view conditions. We then fine-tune the
Stable Diffusion (SD) model to enhance person representations guided by these
controllable conditions. Furthermore, we introduce the View-Refined Decoder
(VRD) to bridge the gap between instance-level and global-level features.
Finally, both person representations and all-view features are employed to
retrieve target persons. Extensive experiments on five AG-ReID benchmarks
(i.e., CARGO, AG-ReIDv1, AG-ReIDv2, LAGPeR and G2APS-ReID) demonstrate the
effectiveness of our proposed method. The source code will be available.
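
The final retrieval step described above, in which both person representations and all-view features are used to match a query against a gallery, could look roughly like the sketch below. The abstract does not say how the two kinds of features are fused or which distance is used, so concatenation followed by cosine similarity is an assumption made here for illustration only. The helper names (`combine`, `rank_gallery`) and the toy vectors are hypothetical and are not taken from the SD-ReID code.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (left unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def combine(person_feat, view_feats):
    """Assumed fusion: concatenate the person representation with each
    per-view feature, then L2-normalize. The paper may fuse differently."""
    combined = list(person_feat)
    for feat in view_feats:
        combined.extend(feat)
    return l2_normalize(combined)


def rank_gallery(query, gallery):
    """Return gallery indices sorted from most to least similar to the query.
    With unit-length vectors, the dot product equals cosine similarity."""
    scores = [sum(q * g for q, g in zip(query, item)) for item in gallery]
    return sorted(range(len(gallery)), key=lambda i: scores[i], reverse=True)


# Toy data: a 2-D person representation plus two 2-D view features
# (e.g. one aerial, one ground) per image. Values are made up.
query = combine([1.0, 0.0], [[0.9, 0.1], [0.8, 0.2]])
gallery = [
    combine([0.0, 1.0], [[0.1, 0.9], [0.2, 0.8]]),  # different identity
    combine([1.0, 0.1], [[0.9, 0.2], [0.7, 0.3]]),  # close to the query
]
ranking = rank_gallery(query, gallery)
print(ranking)
```

In this toy setup the second gallery entry is ranked first because its fused descriptor points in nearly the same direction as the query's.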
Authors (5)
Yuhao Wang
Xiang Hu
Lixin Wang
Pingping Zhang
Huchuan Lu
Key Contributions
Proposes SD-ReID, a generative framework for Aerial-Ground Person Re-Identification (AG-ReID) that leverages Stable Diffusion (SD) to mimic feature distributions across different views. It addresses viewpoint variation in three steps: a ViT-based model extracts person representations together with identity and view conditions, SD is fine-tuned to enhance those representations under the guidance of these conditions, and a View-Refined Decoder (VRD) bridges the gap between instance-level and global-level features.
Business Value
Improves the accuracy and robustness of person identification systems used in security and surveillance, particularly in scenarios involving aerial and ground-based cameras, enabling better tracking and identification of individuals.