TY - GEN
T1 - Unlocking the Power of Multimodal Learning for Emotion Recognition in Conversation
AU - Wang, Yunxiao
AU - Liu, Meng
AU - Li, Zhe
AU - Hu, Yupeng
AU - Luo, Xin
AU - Nie, Liqiang
N1 - Publisher Copyright: © 2023 ACM.
PY - 2023/10/27
Y1 - 2023/10/27
N2 - Emotion recognition in conversation aims to identify the emotions underlying each utterance, and it has great potential in various domains. Human perception of emotions relies on multiple modalities, such as language, vocal tonality, and facial expressions. While many studies have incorporated multimodal information to enhance emotion recognition, the performance of multimodal models often plateaus when additional modalities are added. We demonstrate through experiments that the main reason for this plateau is an imbalanced assignment of gradients across modalities. To address this issue, we propose fine-grained adaptive gradient modulation, a plug-in approach to rebalance the gradients of modalities. Experimental results show that our method improves the performance of all baseline models and outperforms existing plug-in methods.
KW - emotion recognition in conversation
KW - fine-grained adaptive gradient modulation
KW - multimodal balanced learning
UR - https://www.scopus.com/pages/publications/85179548364
U2 - 10.1145/3581783.3613846
DO - 10.1145/3581783.3613846
M3 - Conference contribution
AN - SCOPUS:85179548364
T3 - MM 2023 - Proceedings of the 31st ACM International Conference on Multimedia
SP - 5947
EP - 5955
BT - MM 2023 - Proceedings of the 31st ACM International Conference on Multimedia
PB - Association for Computing Machinery, Inc
T2 - 31st ACM International Conference on Multimedia, MM 2023
Y2 - 29 October 2023 through 3 November 2023
ER -