LLM-Driven Multimodal Opinion Expression Identification

Research output: Contribution to journal › Conference article › peer-review

Abstract

Opinion Expression Identification (OEI) is essential in NLP for applications ranging from voice assistants to depression diagnosis. This study extends OEI to encompass multimodal inputs, underlining the significance of auditory cues in delivering emotional subtleties beyond the capabilities of text. We introduce a novel multimodal OEI (MOEI) task that integrates text and speech to mirror real-world scenarios. Utilizing the CMU-MOSEI and IEMOCAP datasets, we construct the CI-MOEI dataset. Additionally, Text-to-Speech (TTS) technology is applied to the MPQA dataset to obtain the CIM-OEI dataset. We design a template for the OEI task that takes full advantage of the generative power of large language models (LLMs). Building on this, we propose an LLM-driven method, STOEI, which combines the speech and text modalities to identify opinion expressions. Our experiments demonstrate that MOEI significantly improves performance, and our method outperforms existing methods by 9.20%, achieving state-of-the-art (SOTA) results.
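
The listing does not include the paper's code, and the actual STOEI template and its speech-feature fusion are not reproduced here. As a rough, text-only illustration of how a generative LLM prompt template for span-style OEI might be built and parsed, consider the following Python sketch. Every identifier (PROMPT_TEMPLATE, build_oei_prompt, parse_spans) and the <op> tag format are hypothetical assumptions, not taken from the paper.

    import re

    # Hypothetical prompt template: the LLM is asked to mark opinion
    # expressions in-line so its generative output can be parsed back
    # into character spans. Not the paper's actual template.
    PROMPT_TEMPLATE = (
        "Identify every opinion expression in the sentence below.\n"
        "Wrap each opinion expression in <op> ... </op> tags and return "
        "the full sentence.\n\n"
        "Sentence: {sentence}\n"
        "Answer: "
    )

    def build_oei_prompt(sentence: str) -> str:
        """Fill the template with one input sentence."""
        return PROMPT_TEMPLATE.format(sentence=sentence)

    def parse_spans(generated: str) -> list[tuple[int, int, str]]:
        """Recover (start, end, text) spans from a tagged generation.

        Offsets are computed against the de-tagged text, so they align
        with the original sentence when the model copies it faithfully.
        """
        spans = []
        cursor = 0
        # The capture group keeps the <op>...</op> chunks in the split.
        for part in re.split(r"(<op>.*?</op>)", generated):
            if part.startswith("<op>") and part.endswith("</op>"):
                text = part[len("<op>"):-len("</op>")]
                spans.append((cursor, cursor + len(text), text))
                cursor += len(text)
            else:
                cursor += len(part)
        return spans

    if __name__ == "__main__":
        prompt = build_oei_prompt(
            "I absolutely love this phone, but the battery is awful."
        )
        # A faithful model output might look like this:
        fake_output = ("I <op>absolutely love</op> this phone, but the "
                       "battery <op>is awful</op>.")
        print(parse_spans(fake_output))
        # [(2, 17, 'absolutely love'), (46, 54, 'is awful')]

In a multimodal setup along the lines the abstract describes, speech features would additionally be fed to the model alongside the text; that fusion step is beyond what this text-only sketch shows.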

Original language: English
Pages (from-to): 2930-2934
Number of pages: 5
Journal: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
State: Published - 2024
Externally published: Yes
Event: 25th Interspeech Conference 2024, Kos Island, Greece
Duration: 1 Sep 2024 – 5 Sep 2024

Keywords

  • large language model
  • multimodal opinion expression identification
  • speech modal
