CM-DASN: Visible-Infrared Cross-Modality Person Re-Identification via Dynamic Attention Selection Network

Yuxin Li, Hu Lu, Tingting Qin, Juanjuan Tu, Shengli Wu

Research output: Contribution to journal › Article › peer-review

Abstract

Cross-modality person re-identification between RGB and infrared (IR) images is challenging due to the substantial discrepancy between the two modalities. Existing approaches often focus on learning either modality-specific or modality-shared features; however, overemphasizing the former can hinder cross-modality matching, whereas the latter are more beneficial to this task. To address this challenge, we propose CM-DASN (Cross-Modality Dynamic Attention Selection Network), a novel approach based on dynamic attention optimization. The core of our method is the Dynamic Attention Selection Module (DASM), which adaptively selects the most effective combination of attention heads in the later stages of training, thereby balancing the learning of modality-shared and modality-specific features. We employ a softmax score-based feature selection mechanism to extract and enhance the most discriminative cross-modality feature representations. By alternating supervised learning of high-scoring modality-shared and modality-specific features in the later training stages, the model focuses on learning highly discriminative modality-shared features while retaining beneficial modality-specific information. Furthermore, we design a multi-stage, multi-scale cross-modality feature alignment strategy that aligns features of different scales in a phased, progressive manner, capturing both global structure and local detail and thereby improving cross-modality person re-identification performance. Our method achieves higher cross-modality matching accuracy with only minimal increases in model parameters and computation time. Extensive experiments on the SYSU-MM01 and RegDB datasets validate the effectiveness of the proposed framework, demonstrating that it outperforms most existing state-of-the-art approaches. The source code is available at https://github.com/hulu88/CM_DASN.
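The abstract describes the softmax score-based head selection only at a high level. The following minimal PyTorch sketch illustrates one plausible reading of that mechanism: score each attention head with a softmax over learnable logits and keep only the top-k highest-scoring heads. All names here (DASMHeadSelector, head_logits, top_k) are illustrative assumptions, not the authors' implementation; the actual code is in the linked repository.

    # Hypothetical sketch (an assumption, not the authors' code) of softmax
    # score-based attention-head selection as described in the abstract.
    import torch
    import torch.nn as nn

    class DASMHeadSelector(nn.Module):
        """Scores attention heads with a softmax and keeps only the top-k,
        so later training stages can emphasize the most discriminative
        combination of heads (one possible reading of DASM)."""

        def __init__(self, num_heads: int, top_k: int):
            super().__init__()
            self.top_k = top_k
            # One learnable logit per attention head.
            self.head_logits = nn.Parameter(torch.zeros(num_heads))

        def forward(self, head_outputs: torch.Tensor) -> torch.Tensor:
            # head_outputs: (batch, num_heads, seq_len, head_dim)
            scores = torch.softmax(self.head_logits, dim=0)        # (num_heads,)
            keep = torch.topk(scores, self.top_k).indices          # (top_k,)
            mask = torch.zeros_like(scores).scatter(0, keep, 1.0)  # hard top-k
            # Selected heads keep their softmax weight; the rest are zeroed.
            return head_outputs * (scores * mask).view(1, -1, 1, 1)

    # Example: keep the 6 highest-scoring of 12 heads on ViT-style tokens.
    selector = DASMHeadSelector(num_heads=12, top_k=6)
    tokens = torch.randn(2, 12, 197, 64)  # (batch, heads, tokens, head_dim)
    out = selector(tokens)                # same shape, unselected heads zeroed

Under this reading, the hard top-k mask routes gradients only through the selected heads' scores; a softer variant would drop the mask and weight all heads by their softmax scores.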

Original language: English
Article number: 138
Pages (from-to): 1-14
Number of pages: 14
Journal: Multimedia Systems
Volume: 31
Issue number: 2
Early online date: 5 Mar 2025
DOIs
Publication status: Published (in print/issue) - 5 Mar 2025

Bibliographical note

© The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature 2025.

Keywords

  • Person re-identification
  • Visible-infrared
  • Cross-modality
  • Vision Transformer
