TY - GEN
T1 - Multi-Modal Sarcasm Detection with Interactive In-Modal and Cross-Modal Graphs
AU - Liang, Bin
AU - Lou, Chenwei
AU - Li, Xiang
AU - Gui, Lin
AU - Yang, Min
AU - Xu, Ruifeng
N1 - Publisher Copyright:
© 2021 ACM.
PY - 2021/10/17
Y1 - 2021/10/17
N2 - Sarcasm is a peculiar form and sophisticated linguistic act to express the incongruity of someone's implied sentiment expression, which is a pervasive phenomenon in social media platforms. Compared with sarcasm detection purely on texts, multi-modal sarcasm detection is more adapted to the rapidly growing social media platforms, where people are interested in creating multi-modal messages. When focusing on the multi-modal sarcasm detection for tweets consisting of texts and images on Twitter, the significant clue of improving the performance of multi-modal sarcasm detection evolves into how to determine the incongruity relations between texts and images. In this paper, we investigate multi-modal sarcasm detection from a novel perspective, so as to determine the sentiment inconsistencies within a certain modality and across different modalities by constructing heterogeneous in-modal and cross-modal graphs (InCrossMGs) for each multi-modal example. Based on it, we explore an interactive graph convolution network (GCN) structure to jointly and interactively learn the incongruity relations of in-modal and cross-modal graphs for determining the significant clues in sarcasm detection. Experimental results demonstrate that our proposed model achieves state-of-the-art performance in multi-modal sarcasm detection.
AB - Sarcasm is a peculiar form and sophisticated linguistic act to express the incongruity of someone's implied sentiment expression, which is a pervasive phenomenon in social media platforms. Compared with sarcasm detection purely on texts, multi-modal sarcasm detection is more adapted to the rapidly growing social media platforms, where people are interested in creating multi-modal messages. When focusing on the multi-modal sarcasm detection for tweets consisting of texts and images on Twitter, the significant clue of improving the performance of multi-modal sarcasm detection evolves into how to determine the incongruity relations between texts and images. In this paper, we investigate multi-modal sarcasm detection from a novel perspective, so as to determine the sentiment inconsistencies within a certain modality and across different modalities by constructing heterogeneous in-modal and cross-modal graphs (InCrossMGs) for each multi-modal example. Based on it, we explore an interactive graph convolution network (GCN) structure to jointly and interactively learn the incongruity relations of in-modal and cross-modal graphs for determining the significant clues in sarcasm detection. Experimental results demonstrate that our proposed model achieves state-of-the-art performance in multi-modal sarcasm detection.
KW - graph networks
KW - multi-modal sarcasm detection
KW - sarcasm detection
UR - https://www.scopus.com/pages/publications/85119365226
U2 - 10.1145/3474085.3475190
DO - 10.1145/3474085.3475190
M3 - 会议稿件
AN - SCOPUS:85119365226
T3 - MM 2021 - Proceedings of the 29th ACM International Conference on Multimedia
SP - 4707
EP - 4715
BT - MM 2021 - Proceedings of the 29th ACM International Conference on Multimedia
PB - Association for Computing Machinery, Inc
T2 - 29th ACM International Conference on Multimedia, MM 2021
Y2 - 20 October 2021 through 24 October 2021
ER -