
Multimodal dialog system: Generating responses via adaptive decoders

  • Liqiang Nie*
  • Wenjie Wang
  • Richang Hong
  • Meng Wang
  • Qi Tian
  • *Corresponding author for this work
  • Shandong University
  • Hefei University of Technology
  • Huawei Technologies Co., Ltd.

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

Abstract

Building on textual dialog systems, multimodal dialog systems have recently attracted increasing attention, especially in the retail domain. Despite their commercial value, multimodal dialog systems still face the following challenges: 1) automatically generating the right responses in appropriate medium forms; 2) jointly considering the visual cues and the side information while selecting product images; and 3) guiding the response generation with multi-faceted and heterogeneous knowledge. To address these issues, we present a Multimodal diAloG system with adaptIve deCoders, MAGIC for short. Specifically, MAGIC first judges the response type and the corresponding medium form by understanding the intention of the given multimodal context. It then employs adaptive decoders to generate the desired responses: a simple recurrent neural network (RNN) generates general responses, a knowledge-aware RNN decoder encodes the multiform domain knowledge to enrich the response, and a multimodal response decoder incorporates an image recommendation model that jointly considers the textual attributes and the visual images via a neural model optimized with a max-margin loss. We compare MAGIC with existing methods on a benchmark dataset. Experimental results demonstrate that MAGIC outperforms the existing methods and achieves state-of-the-art performance.
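The max-margin objective mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the scoring function, the margin value, or how negatives are sampled, so the cosine scoring, the default margin of 1.0, and the names `cosine` and `max_margin_loss` below are assumptions made for illustration only, not the MAGIC implementation.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def max_margin_loss(context, positive, negatives, margin=1.0):
    """Average hinge loss over negative images.

    Each term is max(0, margin - s(context, positive) + s(context, negative)),
    so the loss is zero once the positive image outscores every negative
    by at least `margin`. Cosine scoring is an assumption of this sketch.
    """
    pos_score = cosine(context, positive)
    terms = [
        max(0.0, margin - pos_score + cosine(context, neg))
        for neg in negatives
    ]
    return sum(terms) / len(terms)


# Hypothetical 2-d embeddings: a context vector, the correct product image,
# and one incorrect product image.
context = [1.0, 0.0]
good_image = [1.0, 0.0]
bad_image = [0.0, 1.0]
loss = max_margin_loss(context, good_image, [bad_image])
```

In this toy case the positive image scores 1 and the negative scores 0, so the margin is exactly satisfied and the loss is zero; swapping the two images would produce a positive loss that pushes the model to rank the correct image higher.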

Original language: English
Title of host publication: MM 2019 - Proceedings of the 27th ACM International Conference on Multimedia
Publisher: Association for Computing Machinery, Inc
Pages: 1098-1106
Number of pages: 9
ISBN (Electronic): 9781450368896
State: Published - 15 Oct 2019
Externally published: Yes
Event: 27th ACM International Conference on Multimedia, MM 2019 - Nice, France
Duration: 21 Oct 2019 to 25 Oct 2019

Publication series

Name: MM 2019 - Proceedings of the 27th ACM International Conference on Multimedia

Conference

Conference: 27th ACM International Conference on Multimedia, MM 2019
Country/Territory: France
City: Nice
Period: 21/10/19 to 25/10/19

Keywords

  • Adaptive Decoders
  • Multiform Knowledge-aware Decoder
  • Multimodal Dialog Systems
