Multimodal matching transformer for live commenting

  • Harbin Institute of Technology
  • Microsoft USA

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

Abstract

Automatic live commenting aims to provide real-time comments on videos for viewers. It encourages user engagement on online video sites and is also a good benchmark for video-to-text generation. Recent work on this task adopts encoder-decoder models to generate comments. However, these methods do not explicitly model the interaction between videos and comments, so they tend to generate popular comments that are often irrelevant to the videos. In this work, we aim to improve the relevance of live comments to videos by modeling the cross-modal interactions among different modalities. To this end, we propose a multimodal matching transformer to capture the relationships among comments, vision, and audio. The proposed model is based on the transformer framework and iteratively learns attention-aware representations for each modality. We evaluate the model on a publicly available live commenting dataset. Experiments show that the multimodal matching transformer outperforms state-of-the-art methods.

Original language: English
Title of host publication: ECAI 2020 - 24th European Conference on Artificial Intelligence, including 10th Conference on Prestigious Applications of Artificial Intelligence, PAIS 2020 - Proceedings
Editors: Giuseppe De Giacomo, Alejandro Catala, Bistra Dilkina, Michela Milano, Senen Barro, Alberto Bugarin, Jerome Lang
Publisher: IOS Press BV
Pages: 1998-2005
Number of pages: 8
ISBN (Electronic): 9781643681009
State: Published - 24 Aug 2020
Event: 24th European Conference on Artificial Intelligence, ECAI 2020, including 10th Conference on Prestigious Applications of Artificial Intelligence, PAIS 2020 - Santiago de Compostela, Online, Spain
Duration: 29 Aug 2020 to 8 Sep 2020

Publication series

Name: Frontiers in Artificial Intelligence and Applications
Volume: 325
ISSN (Print): 0922-6389
ISSN (Electronic): 1879-8314

Conference

Conference: 24th European Conference on Artificial Intelligence, ECAI 2020, including 10th Conference on Prestigious Applications of Artificial Intelligence, PAIS 2020
Country/Territory: Spain
City: Santiago de Compostela, Online
Period: 29/08/20 to 8/09/20
