
Unlocking the Power of Multimodal Learning for Emotion Recognition in Conversation

  • Yunxiao Wang
  • Meng Liu*
  • Zhe Li
  • Yupeng Hu*
  • Xin Luo
  • Liqiang Nie
  • *Corresponding author for this work
  • Shandong University
  • Shandong Jianzhu University
  • Harbin Institute of Technology Shenzhen

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Emotion recognition in conversation aims to identify the emotions underlying each utterance, and it has great potential in various domains. Human perception of emotions relies on multiple modalities, such as language, vocal tonality, and facial expressions. While many studies have incorporated multimodal information to enhance emotion recognition, the performance of multimodal models often plateaus when additional modalities are added. We demonstrate through experiments that the main reason for this plateau is an imbalanced assignment of gradients across modalities. To address this issue, we propose fine-grained adaptive gradient modulation, a plug-in approach to rebalance the gradients of modalities. Experimental results show that our method improves the performance of all baseline models and outperforms existing plug-in methods.
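
To make the idea of rebalancing gradients across modalities concrete, here is a minimal sketch in plain Python. It is not the paper's fine-grained adaptive gradient modulation, whose details the abstract does not give. It only illustrates one generic heuristic: damp the gradient of any modality whose gradient norm exceeds the average across modalities. The function names, the `alpha` parameter, the tanh-based damping rule, and the toy gradient values are all assumptions made for illustration.

```python
import math


def modulation_coefficients(grad_norms, alpha=1.0):
    """Return a scaling factor in (0, 1] per modality (hypothetical heuristic)."""
    mean_norm = sum(grad_norms.values()) / len(grad_norms)
    coeffs = {}
    for name, norm in grad_norms.items():
        ratio = norm / mean_norm if mean_norm > 0 else 1.0
        if ratio > 1.0:
            # Dominant modality: shrink its gradient; a larger ratio shrinks more.
            coeffs[name] = 1.0 - math.tanh(alpha * (ratio - 1.0))
        else:
            # At or below the average norm: leave the gradient unchanged.
            coeffs[name] = 1.0
    return coeffs


def apply_modulation(grads, coeffs):
    """Scale every gradient component of a modality by that modality's factor."""
    return {name: [g * coeffs[name] for g in vec] for name, vec in grads.items()}


# Toy gradients (made-up numbers): the text branch dominates the other two.
grads = {"text": [3.0, 4.0], "audio": [0.3, 0.4], "visual": [0.6, 0.8]}
norms = {name: math.sqrt(sum(g * g for g in vec)) for name, vec in grads.items()}

coeffs = modulation_coefficients(norms)
balanced = apply_modulation(grads, coeffs)
```

In this toy setup only the text gradient is scaled down, while the audio and visual gradients pass through unchanged. In a real training loop the same scaling would be applied to each modality encoder's parameter gradients between the backward pass and the optimizer step.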

Original language: English
Title of host publication: MM 2023 - Proceedings of the 31st ACM International Conference on Multimedia
Publisher: Association for Computing Machinery, Inc
Pages: 5947-5955
Number of pages: 9
ISBN (Electronic): 9798400701085
State: Published - 27 Oct 2023
Externally published: Yes
Event: 31st ACM International Conference on Multimedia, MM 2023 - Ottawa, Canada
Duration: 29 Oct 2023 → 3 Nov 2023

Publication series

Name: MM 2023 - Proceedings of the 31st ACM International Conference on Multimedia

Conference

Conference: 31st ACM International Conference on Multimedia, MM 2023
Country/Territory: Canada
City: Ottawa
Period: 29/10/23 → 3/11/23

Keywords

  • emotion recognition in conversation
  • fine-grained adaptive gradient modulation
  • multimodal balanced learning
