
Robust Benchmark for Propagandist Text Detection and Mining High-Quality Data

  • Pir Noman Ahmad*
  • Yuanchao Liu
  • Gauhar Ali
  • Mudasir Ahmad Wani*
  • Mohammed ElAffendi
  • *Corresponding author for this work
  • School of Computer Science and Technology, Harbin Institute of Technology
  • Prince Sultan University (PSU)

Research output: Contribution to journal, Article, peer-reviewed

Abstract

Over the past ten years, social media, fake news, and a variety of propaganda strategies have all contributed to an increase in misinformation online. Because high-quality data are scarce, existing datasets are poorly suited to training deep-learning models, which hinders reliable identification of propaganda. We approach the problem with natural language processing and build a deep-learning system that automatically identifies propaganda in news items. To help the research community identify propaganda in news text, this study proposes the propaganda texts (ProText) library. Truthfulness labels are assigned to ProText repositories after manual and automatic verification with fact-checking methods. The study also proposes a fine-tuned Robustly Optimized BERT Pre-training Approach (RoBERTa) model with word embeddings for multi-label, multi-class text classification. Through experiments and comparative analysis, we address critical issues and work together toward answers. The approach achieved accuracies of 90%, 75%, 68%, and 65% on ProText, PTC, TSHP-17, and Qprop, respectively. The big-data approach, particularly with deep-learning models, can help compensate for unsatisfactory big data in a novel text classification strategy. We urge collaboration to encourage researchers to acquire and exchange datasets and to develop a standard for organizing, labeling, and fact-checking them.
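The multi-label classification step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the label names, the 0.5 threshold, and the example logits are all assumptions made for illustration, and the logits merely stand in for the output of a fine-tuned encoder such as RoBERTa. The sketch shows one common way to turn per-label scores into a label set (an independent sigmoid per label, then thresholding) and to score predictions by exact-match accuracy.

```python
import math

# Hypothetical technique labels, for illustration only (not taken from the paper).
LABELS = ["loaded_language", "name_calling", "exaggeration"]


def sigmoid(x: float) -> float:
    """Map a raw logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def decode_multilabel(logits, threshold=0.5):
    """Turn one row of per-label logits into a set of predicted labels.

    Each label is decided independently, so a text may receive
    zero, one, or several labels. The threshold is an assumption.
    """
    return {label for label, z in zip(LABELS, logits) if sigmoid(z) >= threshold}


def subset_accuracy(predicted, gold):
    """Fraction of examples whose predicted label set equals the gold set exactly."""
    matches = sum(1 for p, g in zip(predicted, gold) if p == g)
    return matches / len(gold)


# Made-up logits standing in for the output of a fine-tuned encoder.
example_logits = [
    [2.0, -1.0, 0.3],
    [-3.0, -2.0, -0.5],
]
# Made-up gold annotations for the same two texts.
example_gold = [
    {"loaded_language", "exaggeration"},
    {"name_calling"},
]

predictions = [decode_multilabel(row) for row in example_logits]
score = subset_accuracy(predictions, example_gold)
```

Exact-match accuracy is a strict choice, since a prediction counts only if every label is right; the abstract does not say which accuracy definition the reported figures use.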

Original language: English
Article number: 2668
Journal: Mathematics
Volume: 11
Issue number: 12
State: Published - Jun 2023
Externally published: Yes

Keywords

  • ProText
  • big data
  • fact-check
  • misinformation
  • propaganda
  • social media
