
Multimodal Blockwise Transformer for Robust Sentiment Recognition

  • Harbin Institute of Technology
  • Xi'an Jiaotong University

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-reviewed

Abstract

The MER-NOISE track challenges participants to classify emotions from multimodal data, specifically audio and visual, with added noise. In this paper, we present a solution for the NOISE track of the MER2024 competition, which focuses on the robustness of emotion recognition in noisy environments. We propose a novel Multimodal Blockwise Transformer (MBT) architecture, which integrates visual, auditory, and textual features to improve emotion classification accuracy. Our approach includes several key innovations: the MBT network structure, the TIE module for weighted encoder input, and momentum contrast. Additionally, we employ diverse data augmentation methods, both conventional and novel, and introduce a confidence-based decision-level fusion strategy to enhance model performance. In the MER2024 NOISE track, our solution achieved a Weighted Average F-score (WAF) of 0.8365, securing third place. This result demonstrates the effectiveness and robustness of our approach in handling noisy data for emotion recognition tasks.

Original language: English
Title of host publication: MRAC 2024 - Proceedings of the 2nd International Workshop on Multimodal and Responsible Affective Computing
Publisher: Association for Computing Machinery, Inc
Pages: 88-92
Number of pages: 5
ISBN (Electronic): 9798400712036
State: Published - 28 Oct 2024
Event: 2nd International Workshop on Multimodal and Responsible Affective Computing, MRAC 2024 - Melbourne, Australia
Duration: 28 Oct 2024 to 1 Nov 2024

Publication series

Name: MRAC 2024 - Proceedings of the 2nd International Workshop on Multimodal and Responsible Affective Computing

Conference

Conference: 2nd International Workshop on Multimodal and Responsible Affective Computing, MRAC 2024
Country/Territory: Australia
City: Melbourne
Period: 28/10/24 to 1/11/24

Keywords

  • modality robustness
  • multimodal fusion
  • multimodal sentiment analysis
