
Dual-Driven Cross-Modal Contrastive Hashing Retrieval Network Via Structural Feature and Semantic Information

  • Cheng Huang
  • Wenzhe Liu
  • Jinghua Wang
  • Jinrong Cui*
  • Jie Wen
  • *Corresponding author for this work
  • Fujian Polytechnic of Water Conservancy and Electric Power
  • Huzhou University
  • Harbin Institute of Technology
  • Guangdong Provincial Key Laboratory of Novel Security Intelligence Technologies
  • South China Agricultural University

Research output: Contribution to journal › Article › peer-review

Abstract

Contrastive cross-modal hashing retrieval networks are widely recognized for their strong performance in binary hash code learning. However, three issues remain worth further investigation: (1) how to capture the structural features among intra-modal data and use them efficiently in subsequent hash code representation learning; (2) how to promote intra-modal learning and enhance the robustness of the resulting intra-modal features, which are as important as the inter-modal features; (3) how to effectively harness semantic information to guide the hash code learning process. In response to these issues, this paper proposes a method called Dual-Driven Cross-Modal Contrastive Hashing Retrieval Network via Structural Feature and Semantic Information (DDSS), which consists of three components. First, DDSS extracts visual-modal and textual-modal features via Contrastive Language-Image Pre-training (CLIP) and takes them as the input for cross-modal hashing retrieval. Second, DDSS uses a Dual Branch Feature Learning Module to learn both structural features and self-attention features. Through intra-modal and inter-modal feature contrastive learning, DDSS promotes information consistency across modalities and eliminates low-quality private features within a single modality. Third, DDSS uses a Dual Path Instance Hashing Module to guide hash code representation learning through instance-level and semantic-level contrastive learning. The experimental results demonstrate that DDSS outperforms benchmark methods in the cross-modal hashing retrieval field. The experimental source code can be accessed through the following link: https://github.com/hcpaper/DDSS.
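
The general idea behind contrastive cross-modal hashing can be illustrated with a minimal sketch. This is not the DDSS implementation (see the linked repository for that). It assumes random vectors in place of CLIP features and a fixed random projection in place of learned hashing layers, and every function name here (`info_nce`, `to_hash_codes`, `hamming_rank`) is hypothetical. It shows only a symmetric inter-modal contrastive loss, sign binarization into hash codes, and Hamming-distance ranking.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products are cosine similarities."""
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)


def info_nce(a, b, temperature=0.1):
    """Symmetric InfoNCE loss; row i of `a` and row i of `b` form a positive pair."""
    a, b = l2_normalize(a), l2_normalize(b)
    logits = a @ b.T / temperature

    def diagonal_cross_entropy(scores):
        # Numerically stable log-softmax over each row, then pick the positives.
        scores = scores - scores.max(axis=1, keepdims=True)
        log_prob = scores - np.log(np.exp(scores).sum(axis=1, keepdims=True))
        return -np.mean(np.diag(log_prob))

    return 0.5 * (diagonal_cross_entropy(logits) + diagonal_cross_entropy(logits.T))


def to_hash_codes(features, projection):
    """Project features and binarize to {-1, +1} codes (assumed stand-in for a hash layer)."""
    return np.where(features @ projection >= 0, 1, -1).astype(np.int8)


def hamming_rank(query_code, db_codes):
    """Rank database codes by Hamming distance to the query (closest first)."""
    n_bits = db_codes.shape[1]
    dots = db_codes.astype(np.int32) @ query_code.astype(np.int32)
    dist = (n_bits - dots) // 2  # for +/-1 codes: distance = (bits - dot) / 2
    return np.argsort(dist, kind="stable"), dist


# Toy data: paired "image" and "text" features sharing a common signal.
rng = np.random.default_rng(0)
shared = rng.normal(size=(8, 32))
image_feats = shared + 0.05 * rng.normal(size=shared.shape)
text_feats = shared + 0.05 * rng.normal(size=shared.shape)
projection = rng.normal(size=(32, 16))  # 16-bit codes

aligned_loss = info_nce(image_feats, text_feats)
mismatched_loss = info_nce(image_feats, text_feats[::-1])

image_codes = to_hash_codes(image_feats, projection)
text_codes = to_hash_codes(text_feats, projection)
order, dist = hamming_rank(text_codes[0], image_codes)  # text query, image database
```

In this toy setup the loss is lower when the pairs are correctly aligned than when the text rows are reversed, which is the signal a contrastive objective uses to pull matching image and text representations together before binarization.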

Original language: English
Article number: 103252
Journal: Information Fusion
Volume: 123
DOIs
State: Published - Nov 2025
Externally published: Yes

Keywords

  • Contrastive learning
  • Cross-modal learning
  • Graph learning
  • Hashing retrieval
