Abstract
The collection and analysis of private speech signals have become increasingly prevalent, raising significant privacy concerns. To protect speech signals from unauthorized analysis, adversarial attack methods that deceive speaker recognition models have been proposed. Although a few of these methods are designed specifically for real-time protection, they introduce significant delays that can severely impair speech communication when applied to streaming speech data. In this paper, we present a novel approach that aims to protect speech signals in real time without delays. Using only observed data, we generate initial adversarial seed perturbations and refine them into the adversarial perturbations predicted for the adjacent, not-yet-observed signals. This refinement is performed by a proposed model called PAPG. Building on perturbation prediction, we develop a streaming audio processing framework that generates perturbations in synchronization with the playback of the original signal, effectively eliminating delays. Experimental results demonstrate that, under the proposed attack, the average Top-1 accuracy of various advanced speaker recognition methods is reduced by 89%, and the average equal error rate (EER) increases to 36%. Remarkably, these results are achieved without delays while maintaining superior perceptual quality.
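The scheduling idea in the abstract (compute the perturbation for the next chunk from the chunk already observed, so that it is ready when that chunk arrives) can be illustrated with a minimal sketch. This is not the authors' implementation: `placeholder_predictor` is a hypothetical stand-in for the PAPG model and produces bounded noise that is not adversarial, and the names `protect_stream`, `epsilon`, and the choice to pass the first chunk through unmodified are assumptions made for illustration only.

```python
from typing import Callable, Iterable, Iterator, List, Optional

Chunk = List[float]  # one block of audio samples in [-1.0, 1.0]


def placeholder_predictor(observed: Chunk, epsilon: float) -> Chunk:
    """Hypothetical stand-in for the paper's PAPG model (assumption).

    Returns a perturbation bounded by epsilon, derived only from the
    observed chunk. It is NOT adversarial; it only shows the data flow.
    """
    return [-epsilon if x >= 0 else epsilon for x in observed]


def clip(x: float) -> float:
    """Keep a sample inside the valid amplitude range."""
    return max(-1.0, min(1.0, x))


def protect_stream(
    chunks: Iterable[Chunk],
    predictor: Callable[[Chunk, float], Chunk],
    epsilon: float = 0.01,
) -> Iterator[Chunk]:
    """Add to each chunk a perturbation predicted from the previous chunk."""
    predicted: Optional[Chunk] = None
    for chunk in chunks:
        if predicted is None or len(predicted) != len(chunk):
            # Nothing has been observed yet (assumption of this sketch):
            # the chunk is emitted unmodified.
            out = list(chunk)
        else:
            # The perturbation was prepared before this chunk arrived,
            # so applying it is a cheap element-wise addition.
            out = [clip(x + d) for x, d in zip(chunk, predicted)]
        # Use the chunk just observed to prepare the next perturbation.
        predicted = predictor(chunk, epsilon)
        yield out


if __name__ == "__main__":
    stream = [[0.5, -0.5], [0.2, -0.2], [0.0, 0.9]]
    for protected in protect_stream(stream, placeholder_predictor, epsilon=0.1):
        print(protected)
```

In a real pipeline the predictor call would run concurrently with playback of the current chunk; the sequential loop here only shows that the perturbation applied to a chunk never depends on that chunk's own samples.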
| Field | Value |
|---|---|
| Original language | English |
| Pages (from-to) | 8701-8716 |
| Number of pages | 16 |
| Journal | IEEE Transactions on Information Forensics and Security |
| Volume | 19 |
| State | Published - 2024 |
Keywords
- Speaker recognition
- adversarial machine learning
- real-time attack
Title: Adversarial Perturbation Prediction for Real-Time Protection of Speech Privacy