Abstract
In recent years, the use of millimetre wave radio signals for speech recognition has developed rapidly. However, recognition accuracy in this field has been undermined by the absence of high-frequency components, which results from the material vibration constraints of fully viewed indoor objects. This paper presents a new solution to the Chinese digit speech recognition problem that reconstructs the high-frequency harmonic and non-harmonic components from the radio signals received by millimetre wave radar sensors. A time–frequency analysis converts the phase variations extracted from the radar I/Q signals into spectrograms. An improved threshold strategy then enhances the harmonic components on the spectrogram, and a CycleGAN-based network recovers the non-harmonic components. An evaluation experiment was performed with a 77-GHz frequency-modulated continuous-wave radar sensor, using the induced vibrations of aluminium foil, glass, and anti-static bags to recognise spoken standard Chinese digits (0–9). The F1 score in the speech recognition experiment reached 96.6%, with a micro-average accuracy exceeding 98.3%. These results show that the proposed method can improve recognition accuracy by generating finer signatures from radio signals.
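The first stage described in the abstract, turning radar I/Q phase variations into a spectrogram, can be illustrated with a minimal sketch. This is not the paper's implementation: the synthetic signal, the sampling rate `fs`, the vibration frequency `vib_hz`, the frame settings, and the helper names `phase_from_iq` and `spectrogram` are all assumptions chosen for illustration, and the threshold strategy and CycleGAN stages are not covered.

```python
import numpy as np


def phase_from_iq(iq):
    """Return the unwrapped phase of complex I/Q samples with the mean removed."""
    phase = np.unwrap(np.angle(iq))
    return phase - phase.mean()


def spectrogram(x, fs, frame_len=256, hop=64):
    """Magnitude spectrogram from Hann-windowed frames (a basic STFT)."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack(
        [x[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    mag = np.abs(np.fft.rfft(frames, axis=1)).T  # shape: (freq_bins, n_frames)
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / fs)
    return freqs, mag


# Synthetic stand-in for a radar return (assumed values, not measured data):
# a surface vibrating at vib_hz modulates the phase of the received signal.
fs = 4000.0                      # assumed slow-time sampling rate in Hz
t = np.arange(0, 1.0, 1.0 / fs)  # one second of samples
vib_hz = 250.0                   # assumed vibration frequency in Hz
iq = np.exp(1j * (0.5 * np.sin(2 * np.pi * vib_hz * t) + 0.3))

phase = phase_from_iq(iq)
freqs, mag = spectrogram(phase, fs)

# The strongest frequency bin, averaged over time, should sit at vib_hz.
peak_hz = freqs[mag.mean(axis=1).argmax()]
print(f"dominant vibration frequency: {peak_hz:.1f} Hz")
```

With these settings the frequency resolution is `fs / frame_len`, so the assumed vibration frequency falls exactly on a bin. Real recordings would need range-bin selection and noise handling before this step.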
| Original language | English |
|---|---|
| Article number | e70000 |
| Journal | IET Radar, Sonar and Navigation |
| Volume | 19 |
| Issue number | 1 |
| DOIs | |
| State | Published - 1 Jan 2025 |
Keywords
- acoustic signal processing
- millimetre wave radar
- neural nets
Fingerprint
Dive into the research topics of 'WaveMic: Speech recognition of Chinese digit numbers from radio signals'. Together they form a unique fingerprint.