
Adversarial Perturbation Prediction for Real-Time Protection of Speech Privacy

  • Harbin Institute of Technology
  • Shenzhen MSU-BIT University

Research output: Contribution to journal, Article, peer-reviewed

Abstract

The widespread collection and analysis of private speech signals have become increasingly prevalent, raising significant privacy concerns. To protect speech signals from unauthorized analysis, adversarial attack methods for deceiving speaker recognition models have been proposed. While a few of these methods are specifically designed for real-time protection of speech signals, they introduce significant delays that can severely impact speech communication when applied to streaming speech data. In this paper, we present a novel approach that aims to offer real-time protection for speech signals without delays. By utilizing observed data only, we generate initial adversarial seed perturbations and refine them to obtain the necessary adversarial perturbations predicted for adjacent unobserved signals. This refinement process is conducted via a proposed model called PAPG. On the basis of perturbation prediction, we develop a streaming audio processing framework that generates perturbations in synchronization with the playback of the original signal, effectively eliminating delays. The experimental results demonstrate that under the proposed attack, the average Top-1 accuracy of various advanced speaker recognition methods is reduced by 89%, and the average equal error rate (EER) increases to 36%. Remarkably, these results are achieved without delays while maintaining superior perceptual quality.
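The streaming idea in the abstract (perturbations for not-yet-observed audio are predicted from already-observed audio, so each frame can be played back without waiting) can be illustrated with a minimal sketch. This is not the PAPG model: `placeholder_predictor`, `stream_protect`, the frame length, and the bound `eps` are all hypothetical names and values chosen for illustration, and the predictor is a trivial stand-in that only demonstrates the causal data flow.

```python
import numpy as np


def placeholder_predictor(observed: np.ndarray, frame_len: int, eps: float) -> np.ndarray:
    """Hypothetical stand-in for a perturbation predictor (NOT PAPG).

    Uses only already-observed samples to produce a perturbation for the
    next, not-yet-observed frame.
    """
    if observed.size < frame_len:
        # Nothing observed yet: emit no perturbation for the first frame.
        return np.zeros(frame_len)
    last_frame = observed[-frame_len:]
    # Illustrative rule only: a bounded perturbation derived from the last frame.
    return eps * np.sign(last_frame)


def stream_protect(signal: np.ndarray, frame_len: int = 160, eps: float = 0.002) -> np.ndarray:
    """Add a predicted perturbation to each frame using past samples only."""
    n_frames = len(signal) // frame_len
    out = np.empty(n_frames * frame_len)
    for i in range(n_frames):
        start = i * frame_len
        observed = signal[:start]  # strictly past samples
        pert = placeholder_predictor(observed, frame_len, eps)
        frame = signal[start:start + frame_len]
        # The perturbation is ready before the frame arrives, so no delay is added.
        out[start:start + frame_len] = np.clip(frame + pert, -1.0, 1.0)
    return out
```

Because each frame's perturbation depends only on earlier samples, changing later audio never alters output that has already been played; a real system would replace the placeholder with a trained predictor.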

Original language: English
Pages (from-to): 8701-8716
Number of pages: 16
Journal: IEEE Transactions on Information Forensics and Security
Volume: 19
State: Published - 2024

Keywords

  • Speaker recognition
  • adversarial machine learning
  • real-time attack
