
Image-Text Retrieval via Contrastive Learning with Auxiliary Generative Features and Support-set Regularization

  • Lei Zhang
  • Min Yang*
  • Chengming Li
  • Ruifeng Xu
  • *Corresponding author for this work
  • Shenzhen Institute of Advanced Technology
  • Sun Yat-Sen University
  • Harbin Institute of Technology Shenzhen

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

In this paper, we bridge the heterogeneity gap between modalities and improve image-text retrieval by combining contrastive learning with auxiliary image-to-text and text-to-image generative features. Concretely, the contrastive objective pulls aligned image-text pairs together and pushes unaligned pairs apart from both inter- and intra-modality perspectives, using cross-modal retrieval features together with the auxiliary generative features. In addition, we devise a support-set regularization term that further improves contrastive learning by constraining the distance between each image or text and the cross-modal support-set information of its semantic category. To evaluate the proposed method, we conduct experiments on three benchmark datasets (MIRFLICKR-25K, NUS-WIDE, and MS COCO). Experimental results show that our model significantly outperforms strong baselines for cross-modal image-text retrieval. For reproducibility, we release the code and data publicly at https://github.com/Hambaobao/CRCGS.
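
The two ingredients named in the abstract can be illustrated with a minimal sketch. It is not the authors' formulation: it assumes a symmetric InfoNCE-style loss over cosine similarities for the inter-modality contrastive part, and it assumes the "support set" of an item is simply the centroid of the other modality's features that share its category label. The intra-modality terms and the generative features are omitted, and every name here (`contrastive_loss`, `support_set_penalty`, `temperature`) is hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def contrastive_loss(image_feats, text_feats, temperature=0.1):
    """Symmetric InfoNCE-style loss; index i in both lists is an aligned pair.

    Assumption: this stands in for the paper's inter-modality objective only.
    """
    n = len(image_feats)
    sims = [[cosine(img, txt) / temperature for txt in text_feats]
            for img in image_feats]
    loss = 0.0
    for i in range(n):
        row = sims[i]                          # image i against every text
        col = [sims[j][i] for j in range(n)]   # text i against every image
        for logits in (row, col):
            m = max(logits)                    # stabilise the log-sum-exp
            log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
            loss += log_sum - sims[i][i]       # cross-entropy with target i
    return loss / (2 * n)

def support_set_penalty(feats, other_feats, labels):
    """Mean squared distance from each item to its cross-modal category centroid.

    Assumption: the centroid is a simplified stand-in for the support set.
    """
    total = 0.0
    for f, y in zip(feats, labels):
        support = [g for g, z in zip(other_feats, labels) if z == y]
        centroid = [sum(c) / len(support) for c in zip(*support)]
        total += sum((a - b) ** 2 for a, b in zip(f, centroid))
    return total / len(feats)

# Toy features: two images, two texts, one category per pair.
images = [[1.0, 0.0], [0.0, 1.0]]
texts_aligned = [[0.9, 0.1], [0.1, 0.9]]
texts_swapped = [[0.1, 0.9], [0.9, 0.1]]
labels = [0, 1]

aligned = contrastive_loss(images, texts_aligned)
swapped = contrastive_loss(images, texts_swapped)
penalty = support_set_penalty(images, texts_aligned, labels)
```

On this toy data the loss for correctly aligned pairs is lower than for swapped pairs, which is the behaviour a contrastive objective is meant to reward. A combined objective could add the penalty to the loss with a weighting coefficient, though the weighting actually used in the paper is not stated in the abstract.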

Original language: English
Title of host publication: SIGIR 2022 - Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval
Publisher: Association for Computing Machinery, Inc
Pages: 1938-1943
Number of pages: 6
ISBN (Electronic): 9781450387323
DOIs
State: Published - 7 Jul 2022
Externally published: Yes
Event: 45th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2022 - Madrid, Spain
Duration: 11 Jul 2022 to 15 Jul 2022

Publication series

Name: SIGIR 2022 - Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval

Conference

Conference: 45th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2022
Country/Territory: Spain
City: Madrid
Period: 11/07/22 to 15/07/22

Keywords

  • contrastive learning
  • cross-modal image-text retrieval
  • generative features
  • support-set regularization
