TY - GEN
T1 - UniMSE
T2 - 2022 Conference on Empirical Methods in Natural Language Processing, EMNLP 2022
AU - Hu, Guimin
AU - Lin, Ting En
AU - Zhao, Yi
AU - Lu, Guangming
AU - Wu, Yuchuan
AU - Li, Yongbin
N1 - Publisher Copyright:
© 2022 Association for Computational Linguistics.
PY - 2022
Y1 - 2022
N2 - Multimodal sentiment analysis (MSA) and emotion recognition in conversation (ERC) are key research topics for computers to understand human behaviors. From a psychological perspective, emotions are the expression of affect or feelings during a short period, while sentiments are formed and held for a longer period. However, most existing works study sentiment and emotion separately and do not fully exploit the complementary knowledge behind the two. In this paper, we propose a multimodal sentiment knowledge-sharing framework (UniMSE) that unifies MSA and ERC tasks from features, labels, and models. We perform modality fusion at the syntactic and semantic levels and introduce contrastive learning between modalities and samples to better capture the difference and consistency between sentiments and emotions. Experiments on four public benchmark datasets, MOSI, MOSEI, MELD, and IEMOCAP, demonstrate the effectiveness of the proposed method and achieve consistent improvements compared with state-of-the-art methods.
AB - Multimodal sentiment analysis (MSA) and emotion recognition in conversation (ERC) are key research topics for computers to understand human behaviors. From a psychological perspective, emotions are the expression of affect or feelings during a short period, while sentiments are formed and held for a longer period. However, most existing works study sentiment and emotion separately and do not fully exploit the complementary knowledge behind the two. In this paper, we propose a multimodal sentiment knowledge-sharing framework (UniMSE) that unifies MSA and ERC tasks from features, labels, and models. We perform modality fusion at the syntactic and semantic levels and introduce contrastive learning between modalities and samples to better capture the difference and consistency between sentiments and emotions. Experiments on four public benchmark datasets, MOSI, MOSEI, MELD, and IEMOCAP, demonstrate the effectiveness of the proposed method and achieve consistent improvements compared with state-of-the-art methods.
UR - https://www.scopus.com/pages/publications/85149435574
U2 - 10.18653/v1/2022.emnlp-main.534
DO - 10.18653/v1/2022.emnlp-main.534
M3 - 会议稿件
AN - SCOPUS:85149435574
T3 - Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, EMNLP 2022
SP - 7837
EP - 7851
BT - Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, EMNLP 2022
A2 - Goldberg, Yoav
A2 - Kozareva, Zornitsa
A2 - Zhang, Yue
PB - Association for Computational Linguistics (ACL)
Y2 - 7 December 2022 through 11 December 2022
ER -