
基于掩码语言模型的中文 BERT 攻击方法

Translated title of the contribution: Chinese BERT Attack Method Based on Masked Language Model
  • Yun Ting Zhang
  • Lin Ye*
  • Hao Lin Tang
  • Hong Li Zhang
  • Shang Li
  • *Corresponding author for this work
  • Harbin Institute of Technology

Research output: Contribution to journal › Article › peer-review

Abstract

Adversarial texts are malicious samples that cause deep learning classifiers to make errors. An adversary creates an adversarial text that deceives the target model by adding subtle perturbations, imperceptible to humans, to the original text. Studying adversarial text generation methods helps evaluate the robustness of deep neural networks and contributes to subsequent improvements in model robustness. Among current adversarial text generation methods designed for Chinese text, few target the robust Chinese BERT model. For Chinese text classification tasks, this study proposes an attack method against Chinese BERT, called Chinese BERT Tricker. The method adopts a character-level word importance scoring method, important Chinese character positioning, to locate important words. In addition, a word-level perturbation method for Chinese, based on the masked language model and comprising two types of strategies, is designed to replace the important words. Experimental results show that, for text classification tasks, the proposed method reduces the classification accuracy of the Chinese BERT model to less than 40% on two real datasets, and it outperforms the baseline methods on multiple attack performance metrics.
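The abstract does not give the scoring formula, so the following is only a minimal sketch of one common way to score character importance: mask each character in turn and measure how much the classifier's confidence in the true label drops. The names `char_importance` and `toy_predict_proba` are hypothetical, and the toy classifier is a stand-in for a Chinese BERT classifier, not the authors' implementation.

```python
from typing import Callable, List, Tuple


def char_importance(
    text: str,
    true_label: int,
    predict_proba: Callable[[str], List[float]],
    mask_token: str = "[MASK]",
) -> List[Tuple[int, float]]:
    """Rank character positions by the confidence drop caused by masking them.

    Assumption: importance = P(true_label | text) - P(true_label | masked text).
    The paper's actual scoring function may differ.
    """
    base = predict_proba(text)[true_label]
    scores = []
    for i in range(len(text)):
        # Replace the i-th character with the mask token.
        masked = text[:i] + mask_token + text[i + 1:]
        drop = base - predict_proba(masked)[true_label]
        scores.append((i, drop))
    # Largest drop first; sorted() is stable, so ties keep positional order.
    return sorted(scores, key=lambda s: s[1], reverse=True)


def toy_predict_proba(text: str) -> List[float]:
    """Hypothetical stand-in for a sentiment classifier.

    The positive-class probability grows with the number of "好" characters.
    """
    pos = min(1.0, 0.2 + 0.4 * text.count("好"))
    return [1.0 - pos, pos]


ranking = char_importance("这部电影很好", true_label=1, predict_proba=toy_predict_proba)
top_position, top_drop = ranking[0]
print(top_position, round(top_drop, 2))
```

In a full attack, the highest-ranked positions would then be handed to a masked language model to propose replacements; that step is omitted here because the abstract does not describe the two replacement strategies in detail.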

Original language: Chinese (Traditional)
Pages (from-to): 3392-3409
Number of pages: 18
Journal: Ruan Jian Xue Bao/Journal of Software
Volume: 35
Issue number: 7
State: Published - 2024
Externally published: Yes
