Contrastive Label Correlation Enhanced Unified Hashing Encoder for Cross-modal Retrieval

  • Hongfa Wu
  • Lisai Zhang
  • Qingcai Chen*
  • Yimeng Deng
  • Joanna Siebert
  • Yunpeng Han
  • Zhonghua Li
  • Dejiang Kong
  • Zhao Cao
  • *Corresponding author for this work
  • Harbin Institute of Technology Shenzhen
  • Peng Cheng Laboratory
  • Guangdong Provincial Key Laboratory of Novel Security Intelligence Technologies
  • Huawei Technologies Co., Ltd.

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Cross-modal hashing (CMH) is widely used in multimedia retrieval applications for its low storage cost and fast indexing speed. Thanks to the success of deep learning, cross-modal hashing has made significant progress with high-quality deep features. However, the modality gap remains a crucial bottleneck for existing cross-modal hashing methods: the commonly used convolutional neural network and bag-of-words encoders are each tailored to the priors of a single modality, which limits a model's ability to learn semantic representations in a cross-modal space. To overcome this modality heterogeneity, we propose a shared transformer encoder (UniHash) that unifies cross-modal hashing in a single semantic space. Alongside it, we design a contrastive label correlation learning (CLC) loss that uses category labels as a bridge between modalities to improve representation quality. Moreover, we take advantage of the multi-hot label space and propose a negative label generation (NegLG) strategy to obtain richer and more uniformly distributed negative labels for contrast. Extensive experiments on three benchmarks verify the advantage of the proposed method: UniHash significantly outperforms state-of-the-art cross-modal hashing methods, establishing a new baseline for cross-modal hashing research. Code is released at github.com/idealwhite/Unihash.
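The low storage cost and fast lookup that the abstract attributes to hashing can be illustrated with a minimal sketch. It is not the UniHash implementation: it assumes, as is common in hashing work, that real-valued encoder outputs are binarized by their sign and that retrieval ranks database items by Hamming distance to the query code. The function names (`binarize`, `hamming`, `rank_by_hamming`) and the 4-dimensional feature vectors are hypothetical, chosen only for illustration.

```python
from typing import List, Sequence


def binarize(features: Sequence[float]) -> int:
    """Pack the sign pattern of a feature vector into an integer hash code.

    Assumption: non-negative values map to bit 1, negative values to bit 0.
    """
    code = 0
    for value in features:
        code = (code << 1) | (1 if value >= 0 else 0)
    return code


def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two hash codes differ."""
    return bin(a ^ b).count("1")


def rank_by_hamming(query_code: int, database_codes: Sequence[int]) -> List[int]:
    """Return database indices ordered from nearest to farthest code."""
    return sorted(
        range(len(database_codes)),
        key=lambda i: hamming(query_code, database_codes[i]),
    )


# Hypothetical features: one text query and three images, assumed to come
# from encoders that already map both modalities into one shared space.
text_query = binarize([0.9, -0.2, 0.4, -0.7])
image_db = [
    binarize(v)
    for v in [
        [0.8, -0.1, 0.3, -0.5],   # same sign pattern as the query
        [-0.6, 0.2, 0.4, -0.9],   # differs from the query in two bits
        [-0.3, 0.7, -0.2, 0.1],   # differs from the query in all four bits
    ]
]
ranking = rank_by_hamming(text_query, image_db)
print(ranking)
```

Each item is stored as a handful of bits and compared with an XOR and a bit count, which is where the storage and speed benefits come from. The sketch also shows why the modality gap matters: the ranking is only meaningful if text and image codes live in the same semantic space.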

Original language: English
Title of host publication: CIKM 2022 - Proceedings of the 31st ACM International Conference on Information and Knowledge Management
Publisher: Association for Computing Machinery
Pages: 2158-2168
Number of pages: 11
ISBN (Electronic): 9781450392365
State: Published - 17 Oct 2022
Externally published: Yes
Event: 31st ACM International Conference on Information and Knowledge Management, CIKM 2022 - Atlanta, United States
Duration: 17 Oct 2022 to 21 Oct 2022

Publication series

Name: International Conference on Information and Knowledge Management, Proceedings
ISSN (Print): 2155-0751

Conference

Conference: 31st ACM International Conference on Information and Knowledge Management, CIKM 2022
Country/Territory: United States
City: Atlanta
Period: 17/10/22 to 21/10/22

Keywords

  • contrastive learning
  • cross-modal hashing
  • cross-modal retrieval
  • vision and language
