
IIT-GAT: Instance-level image transformation via unsupervised generative attention networks with disentangled representations

  • Mingwen Shao*
  • Youcai Zhang
  • Yuan Fan
  • Wangmeng Zuo
  • Deyu Meng
  • *Corresponding author for this work
  • China University of Petroleum (East China)
  • School of Computer Science and Technology, Harbin Institute of Technology
  • Xi'an Jiaotong University

Research output: Contribution to journal › Article › peer-review

Abstract

Image-to-image translation is an important research field in computer vision and is closely associated with Generative Adversarial Networks (GANs) and dual learning. However, existing methods mainly translate the whole image from the source domain to the target domain, so they cannot perform instance-level image-to-image translation, and the translation results in the target domain cannot be controlled. In this paper, an instance-level image-to-image translation network (IIT-GAT) is proposed, which includes an attention module and a feature-encoder module. The attention module guides the model to focus on the instances of interest and to generate instance masks, which help to separate the instance from the background of an image. The feature-encoder module embeds the images into two different spaces: a domain-invariant content space and a domain-specific attribute space. The content features and attribute features of different images are fed to the generator simultaneously to improve the controllability of image-to-image translation. To this end, we introduce a local self-reconstruction loss that encourages the network to learn the style features of target instances. Overall, our method not only improves the quality of instance-level image-to-image translation, but also increases its controllability. Extensive experiments are conducted on multiple datasets to validate the effectiveness of the proposed framework, and the results show that our method performs better than previous methods.
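
The abstract gives no formulas, so the following is only a minimal sketch of how an instance mask could separate instance from background and how a reconstruction loss could be restricted to the instance region. It is not the authors' implementation. The per-pixel blend `mask * translated + (1 - mask) * source`, the mask-weighted L1 form of the loss, and the names `composite` and `local_l1` are all assumptions made for illustration.

```python
from typing import List

Image = List[List[float]]  # grayscale H x W image, values in [0, 1]


def composite(mask: Image, translated: Image, source: Image) -> Image:
    """Assumed blend: mask * translated + (1 - mask) * source, per pixel."""
    return [
        [m * t + (1.0 - m) * s for m, t, s in zip(m_row, t_row, s_row)]
        for m_row, t_row, s_row in zip(mask, translated, source)
    ]


def local_l1(mask: Image, reconstructed: Image, target: Image) -> float:
    """Assumed local loss: mask-weighted mean absolute error over the instance."""
    weighted_error = sum(
        m * abs(r - t)
        for m_row, r_row, t_row in zip(mask, reconstructed, target)
        for m, r, t in zip(m_row, r_row, t_row)
    )
    mask_area = sum(m for row in mask for m in row)
    # An empty mask contributes no loss instead of dividing by zero.
    return weighted_error / mask_area if mask_area > 0 else 0.0


# Toy 2 x 2 example: the left column is the instance, the right column is background.
mask = [[1.0, 0.0], [1.0, 0.0]]
source = [[0.2, 0.4], [0.6, 0.8]]
translated = [[0.9, 0.9], [0.9, 0.9]]

output = composite(mask, translated, source)
loss = local_l1(mask, output, source)
```

In this toy case the background column of `output` is copied unchanged from `source`, and `loss` measures the difference only over the two masked instance pixels. A real model would use learned soft masks and multi-channel tensors in a deep-learning framework.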

Original language: English
Article number: 107122
Journal: Knowledge-Based Systems
Volume: 225
State: Published - 5 Aug 2021
Externally published: Yes

Keywords

  • Attention mechanism
  • Disentangled representation
  • Generative adversarial networks
  • Image-to-image translation
