Angle-Optimized Partial Disentanglement for Multimodal Emotion Recognition in Conversation

  • Xinyi Che
  • Wenbo Wang
  • Yuanbo Hou
  • Mingjie Xie
  • Qijun Zhao
  • Jian Guan*
  • *Corresponding author for this work
  • College of Computer Science
  • Faculty of Computing, Harbin Institute of Technology
  • University of Oxford
  • Beihang University
  • Harbin Engineering University

Research output: Contribution to journal, Article, peer-review

Abstract

Multimodal Emotion Recognition in Conversation (MERC) aims to enhance emotion understanding by integrating complementary cues from text, audio, and visual modalities. Existing approaches primarily emphasize cross-modal shared features while overlooking modality-specific cues such as micro-expressions, prosodic variations, and sarcasm. Although prior multimodal emotion recognition (MER) methods attempt to disentangle shared and modality-specific features, most impose rigid orthogonality, which assumes a fixed geometric relationship between the two. In MERC, however, the interaction between shared and modality-specific features is inherently context-dependent and may be complementary or conflicting. To address this limitation, we propose Angle-Optimized Feature Learning (AO-FL), which achieves partial disentanglement via adaptive angular optimization. AO-FL aligns shared features across modalities and adaptively regulates the angular relationship between shared and specific features within each modality to balance distinctiveness and complementarity. An Angle-Scale Refinement (ASR) module further performs angle-guided scaling and contextual enhancement for improved fusion. Experiments on IEMOCAP and MELD demonstrate state-of-the-art performance. Because it operates at the feature-fusion level, AO-FL introduces only lightweight geometric regulation with negligible computational overhead. Moreover, its effectiveness is validated across different tasks and diverse encoder architectures, demonstrating strong generalization beyond specific backbone or task settings.
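
The contrast the abstract draws between rigid orthogonality and a regulated angle can be illustrated with a small sketch. This is not the authors' implementation: the function names, the squared-cosine penalty forms, and the fixed `target_deg` parameter are all assumptions made for illustration, and the abstract indicates that AO-FL adapts the angle rather than fixing it.

```python
import math

def cosine(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def orthogonality_penalty(shared, specific):
    """Rigid constraint (illustrative): zero only when the angle is 90 degrees."""
    return cosine(shared, specific) ** 2

def target_angle_penalty(shared, specific, target_deg):
    """Hypothetical relaxed constraint: zero when the angle equals target_deg.

    target_deg is a fixed input here purely for illustration.
    """
    target_cos = math.cos(math.radians(target_deg))
    return (cosine(shared, specific) - target_cos) ** 2

# Toy 2-D features: `specific` sits 45 degrees away from `shared`.
shared = [1.0, 0.0]
specific = [1.0, 1.0]

rigid = orthogonality_penalty(shared, specific)        # penalizes the 45-degree pair
relaxed = target_angle_penalty(shared, specific, 45.0)  # accepts it
```

Under these assumptions, the rigid penalty pushes every shared/specific pair toward 90 degrees, while the target-angle form tolerates partial overlap between the two feature vectors.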

Original language: English
Journal: IEEE Transactions on Affective Computing
State: Accepted/In press - 2026
Externally published: Yes

Keywords

  • Adaptive angle optimization
  • angle-scale refinement
  • feature partial disentanglement
  • multimodal emotion recognition in conversation
