Abstract
Deep learning models are vulnerable to textual adversarial attacks. Studying how adversarial texts are generated informs the design of corresponding countermeasures. Current word-level adversarial text generation methods typically follow a framework based on word importance. In this framework, a scoring method rates the importance of each word in a text, and the words are then perturbed in descending order of importance. However, current approaches typically rely on a single model-dependent method to score word importance. Such a scoring strategy often struggles to select important words comprehensively and accurately, and may fail outright on some texts. To address this issue, we propose AdaptiveWordBug, a black-box adversarial text generation method for text classification tasks built on the word-importance framework. AdaptiveWordBug introduces a new scoring strategy, the Adaptive Scoring Strategy (ASS), which combines three model-dependent scoring approaches with one model-independent approach. Each scoring method is assigned an adaptive parameter that is adjusted automatically for each text. This scoring strategy has two advantages. First, it identifies important words comprehensively and accurately in any text, which improves the effectiveness of the generated adversarial texts. Second, each scoring method can be integrated or removed as a component, so the strategy can be adjusted simply to suit different target models. In experiments on Chinese text classification datasets, we use AdaptiveWordBug to attack Chinese BERT and ChatGPT. The results show that AdaptiveWordBug achieves superior attack effectiveness compared with baseline methods.
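The word-importance framework described in the abstract can be illustrated with a minimal Python sketch. The abstract does not specify the four scoring methods or how the adaptive parameters are computed, so everything below is an assumption made for illustration: `toy_classifier` is a hypothetical stand-in for a black-box target model, `delete_one_scores` and `stopword_scores` are example scorers (one model-dependent, one model-independent), and `adaptive_weights` uses an assumed rule that weights each scorer by the spread of its scores on the current text. It is not the authors' ASS.

```python
from typing import Callable, List, Sequence

Scorer = Callable[[Sequence[str]], List[float]]

# Toy resources, purely illustrative.
POSITIVE_WORDS = {"great", "good", "excellent"}
STOPWORDS = {"the", "a", "is", "was", "and", "of"}


def toy_classifier(words: Sequence[str]) -> float:
    """Hypothetical black-box model: returns P(positive) for a tokenized text."""
    hits = sum(w in POSITIVE_WORDS for w in words)
    return hits / (hits + 1)


def delete_one_scores(words: Sequence[str]) -> List[float]:
    """Model-dependent scorer: confidence drop when each word is deleted."""
    base = toy_classifier(words)
    return [
        base - toy_classifier(list(words[:i]) + list(words[i + 1:]))
        for i in range(len(words))
    ]


def stopword_scores(words: Sequence[str]) -> List[float]:
    """Model-independent scorer: content words score 1, stopwords 0."""
    return [0.0 if w in STOPWORDS else 1.0 for w in words]


def normalize(scores: Sequence[float]) -> List[float]:
    """Min-max scale to [0, 1]; a flat score vector maps to all zeros."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0] * len(scores)
    return [(s - lo) / (hi - lo) for s in scores]


def adaptive_weights(all_scores: Sequence[Sequence[float]]) -> List[float]:
    """Assumed rule (not from the paper): weight each scorer by the spread of
    its raw scores on this text, so a scorer that cannot separate the words
    receives weight 0."""
    spreads = [max(s) - min(s) for s in all_scores]
    total = sum(spreads)
    if total == 0:
        return [1.0 / len(spreads)] * len(spreads)
    return [sp / total for sp in spreads]


def rank_words(words: Sequence[str], scorers: Sequence[Scorer]) -> List[int]:
    """Return word indices in descending order of combined importance."""
    all_scores = [scorer(words) for scorer in scorers]
    weights = adaptive_weights(all_scores)
    normed = [normalize(s) for s in all_scores]
    combined = [
        sum(w * n[i] for w, n in zip(weights, normed))
        for i in range(len(words))
    ]
    return sorted(range(len(words)), key=lambda i: combined[i], reverse=True)


scorers = [delete_one_scores, stopword_scores]
ranking = rank_words(["the", "movie", "was", "great"], scorers)
```

In this sketch, a text such as `["the", "plot", "was", "dull"]` gives the delete-one scorer nothing to work with, because the toy model's output does not change when any word is removed. Its weight drops to zero and the model-independent scorer alone determines the ranking, which mirrors the failure case the abstract attributes to single-method scoring. Adding or removing a scorer only changes the `scorers` list.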
| Original language | English |
|---|---|
| Article number | 108262 |
| Journal | Neural Networks |
| Volume | 195 |
| DOIs | |
| State | Published - Mar 2026 |
| Externally published | Yes |
Keywords
- Adversarial example
- Deep neural network
- Scoring method
- Text classification
- Textual adversarial attack
Title
AdaptiveWordBug: Generating adversarial texts with an adaptive scoring strategy against deep learning classifiers