
Large language model enhanced multimodal fake news detection with masked feature reconstruction

  • Jie Li
  • Guoying Sun*
  • Zhaoxin Zhang
  • *Corresponding author for this work
  • Inner Mongolia Normal University China
  • Macau University of Science and Technology
  • Harbin Institute of Technology
  • Faculty of Computing, Harbin Institute of Technology

Research output: Contribution to journal › Article › peer-review

Abstract

Multimodal fake news detection aims to identify deceptive information by jointly analyzing textual and visual content. Despite extensive research, cross-modal semantic inconsistency and dataset imbalance remain open problems. To address them, we introduce a Large Language Model Enhanced Fake News Detection with Masked Feature Reconstruction (LLMMFR) model. First, two LLM-based prompt templates are constructed to expand the semantic representation of images: one generates a Description Enhanced Document (DED), which provides an objective visual fact baseline independent of the original text, and the other generates a Reasoning Enhanced Document (RED), which simulates fact-checker reasoning by constructing alternative narratives and identifying contradictions. Next, a domain feature alignment method is designed by introducing a weighted Frobenius norm into Maximum Mean Discrepancy; it quantifies the semantic distribution divergence between modalities while emphasizing discriminative feature dimensions through a learnable weight matrix. To alleviate data imbalance, a dual-attention mechanism dynamically assesses modality importance, and historical information is integrated via residual connections to retain the features of low-frequency samples. Furthermore, a mask enhanced classifier with progressive learnable masking and a hybrid encoder-decoder architecture dynamically filters and enhances discriminative features. Experimental results on the GossipCop, Weibo and PolitiFact datasets show that LLMMFR improves Accuracy by at least 0.022, 0.035 and 0.034, respectively, effectively alleviating both cross-modal inconsistency and data imbalance. Although LLM generation and the hybrid architecture add computational overhead, the performance gains justify the cost in high-stakes scenarios. The main code is available at https://github.com/sgysgwayityou/LLMMFR.
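
As a rough illustration of the domain feature alignment idea, the sketch below computes a dimension-weighted discrepancy between the mean text feature and the mean image feature. It is a simplified stand-in (a linear-kernel Maximum Mean Discrepancy with fixed per-dimension weights), not the LLMMFR formulation: the abstract does not give the exact loss, and the function names, the toy features, and the fixed weights used in place of a learnable weight matrix are all assumptions made for this example.

```python
def column_means(rows):
    """Mean of each feature dimension over a batch of feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def weighted_mean_discrepancy(text_feats, image_feats, weights):
    """Weighted squared distance between the two modality means.

    Hypothetical helper: with a linear kernel, squared MMD reduces to the
    squared distance between the sample means. Each dimension is scaled by
    a non-negative weight so that some dimensions count more than others.
    In a trained model the weights would be learned; here they are fixed.
    """
    mu_text = column_means(text_feats)
    mu_image = column_means(image_feats)
    return sum(
        w * (t - v) ** 2
        for w, t, v in zip(weights, mu_text, mu_image)
    )


# Toy 2-dimensional features for two samples per modality (made-up values).
text_feats = [[1.0, 0.0], [3.0, 0.0]]
image_feats = [[0.0, 0.0], [0.0, 0.0]]

uniform = weighted_mean_discrepancy(text_feats, image_feats, [1.0, 1.0])
downweighted = weighted_mean_discrepancy(text_feats, image_feats, [0.5, 1.0])
```

With uniform weights this is the plain squared distance between the modality means; lowering the weight of a dimension reduces that dimension's contribution to the alignment penalty, which is the effect the learnable weights are described as providing.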

Original language: English
Article number: 132850
Journal: Expert Systems with Applications
Volume: 327
DOIs
State: Published - 25 Sep 2026
Externally published: Yes

Keywords

  • Description enhanced document
  • Domain feature alignment
  • Dual-attention mechanism
  • Mask enhanced classifier
  • Multimodal fake news detection
  • Reasoning enhanced document
