TY - GEN
T1 - Multi-View Contrastive Parsing Network for Emotion Recognition in Multi-Party Conversations
AU - Xie, Yunhe
AU - Sun, Chengjie
AU - Liu, Bingquan
AU - Ji, Zhenzhou
N1 - Publisher Copyright: © 2024 IEEE.
PY - 2024
Y1 - 2024
AB - Recent Emotion Recognition in Conversation (ERC) works significantly outperform large language models, represented by ChatGPT, in the dyadic conversation environment by introducing knowledge and adjusting training strategies. However, Multi-Party Conversations (MPCs) are more complex due to their multi-thread nature, low information density, and general long-range dependencies. In addition, previous studies have overlooked the phenomenon of utterance polysemy. To address these challenges, this paper proposes a Multi-View Contrastive Parsing Network (MuVCPN). Specifically, we first parse the entire conversation and extract emotion-related cues from independent sub-conversation views. Then, we update the utterance distance based on the parsing results and use a discourse structure-aware self-attention mechanism to capture the conversational information flow from the global view. At the same time, we adopt supervised contrastive learning to group utterances from the same sub-conversation together. Extensive experiments on four benchmarks show that the proposed MuVCPN model outperforms baseline models on the ERC task. Additionally, experimental results indicate that utilizing different views and sub-conversation level contrastive learning can improve performance in the MPCs environment.
KW - contrastive learning
KW - emotion recognition in conversation
KW - multi-party conversation
KW - multi-view
UR - https://www.scopus.com/pages/publications/85205024195
U2 - 10.1109/IJCNN60899.2024.10650690
DO - 10.1109/IJCNN60899.2024.10650690
M3 - Conference contribution
AN - SCOPUS:85205024195
T3 - Proceedings of the International Joint Conference on Neural Networks
BT - 2024 International Joint Conference on Neural Networks, IJCNN 2024 - Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2024 International Joint Conference on Neural Networks, IJCNN 2024
Y2 - 30 June 2024 through 5 July 2024
ER -