
AdaptiveWordBug: Generating adversarial texts with an adaptive scoring strategy against deep learning classifiers

  • Yunting Zhang
  • Lin Ye*
  • Baisong Li
  • Hongli Zhang
  • *Corresponding author for this work
  • Harbin Institute of Technology
  • Antiy Labs

Research output: Contribution to journal › Article › peer-review

Abstract

Deep learning models demonstrate vulnerability to textual adversarial attacks. Research on adversarial text generation methods contributes to the subsequent design of corresponding countermeasures. Current word-level adversarial text generation methods are typically designed in the framework based on word importance. In this framework, we need to score the importance of each word in a text with a scoring method and subsequently perturb these words in descending order of importance. However, current approaches typically employ a single model-dependent method to score the word importance during the scoring process. This scoring strategy often struggles to select important words comprehensively and accurately and may even fail when faced with some texts. To address this issue, we propose a black-box adversarial text generation method for text classification tasks in the framework based on word importance, named AdaptiveWordBug. AdaptiveWordBug introduces a new scoring strategy, Adaptive Scoring Strategy (ASS), which combines three model-dependent scoring approaches and one model-independent approach. Simultaneously, an adaptive parameter is assigned to each scoring method. Each parameter can be automatically adjusted for different texts. This scoring strategy has two advantages. On the one hand, it can comprehensively and accurately identify important words in any text, greatly enhancing the effectiveness of the generated adversarial texts. On the other hand, each scoring method in this strategy can be easily integrated or removed as a component. This allows simple adjustment of the scoring strategy for different target models, resulting in good suitability for various target models. In experiments conducted on Chinese text classification datasets, we employ the proposed AdaptiveWordBug to attack Chinese BERT and ChatGPT. The results demonstrate that, compared to baseline methods, AdaptiveWordBug exhibits superior attack effectiveness.
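The word-importance framework described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how the adaptive parameters are computed or which scoring methods are used, so the weights here are simply passed in, and `toy_confidence`, `deletion_score`, `length_score`, and the min-max normalisation are all hypothetical stand-ins. The sketch only shows the general shape: several scorers, one weight per scorer, and a perturbation order sorted by descending combined importance.

```python
from typing import Callable, Dict, List

# A scorer maps a tokenised text to one importance score per word.
Scorer = Callable[[List[str]], List[float]]

# Hypothetical stand-in for a black-box classifier's confidence in the
# original label; a real attack would query the target model instead.
POSITIVE = {"great", "excellent"}


def toy_confidence(words: List[str]) -> float:
    if not words:
        return 0.0
    return sum(w in POSITIVE for w in words) / len(words)


def deletion_score(words: List[str]) -> List[float]:
    """Model-dependent example: confidence drop when each word is removed."""
    base = toy_confidence(words)
    return [base - toy_confidence(words[:i] + words[i + 1:])
            for i in range(len(words))]


def length_score(words: List[str]) -> List[float]:
    """Model-independent example (assumed): longer words score higher."""
    return [float(len(w)) for w in words]


def min_max(scores: List[float]) -> List[float]:
    """Rescale scores to [0, 1] so different scorers are comparable."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0] * len(scores)
    return [(s - lo) / (hi - lo) for s in scores]


def combined_importance(words: List[str],
                        scorers: Dict[str, Scorer],
                        weights: Dict[str, float]) -> List[float]:
    """Weighted sum of normalised scores; weights are supplied by the caller."""
    total = [0.0] * len(words)
    for name, scorer in scorers.items():
        for i, s in enumerate(min_max(scorer(words))):
            total[i] += weights[name] * s
    return total


def perturbation_order(words: List[str],
                       scorers: Dict[str, Scorer],
                       weights: Dict[str, float]) -> List[int]:
    """Word indices in descending order of combined importance."""
    scores = combined_importance(words, scorers, weights)
    return sorted(range(len(words)), key=lambda i: scores[i], reverse=True)


if __name__ == "__main__":
    text = ["the", "film", "was", "great"]
    scorers = {"deletion": deletion_score, "length": length_score}
    weights = {"deletion": 0.7, "length": 0.3}  # fixed here, not adaptive
    print(perturbation_order(text, scorers, weights))
```

Keeping the scorers in a dictionary mirrors the abstract's point that each scoring method can be integrated or removed as a component: dropping an entry from `scorers` changes the strategy without touching the rest of the code.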

Original language: English
Article number: 108262
Journal: Neural Networks
Volume: 195
State: Published - Mar 2026
Externally published: Yes

Keywords

  • Adversarial example
  • Deep neural network
  • Scoring method
  • Text classification
  • Textual adversarial attack
