TY - GEN
T1 - Speech Enhancement Model for High Sampling Rate Speech Datasets Based on Multi-branch Time Convolutional Network
AU - Zhang, Zehua
AU - Wang, Mingjiang
AU - Zhuang, Xuyi
N1 - Publisher Copyright: © 2021 IEEE.
PY - 2021
Y1 - 2021
N2 - This paper proposes a speech enhancement method for high-sampling-rate speech based on a multi-branch time convolutional network (TCN). The most important parameter in traditional speech enhancement algorithms is the prior signal-to-noise ratio (SNR). The Deep Xi framework is used to estimate the prior SNR, and the multi-branch TCN maps the amplitude spectrum of noisy speech to the prior SNR. The proposed multi-branch TCN captures context information better and has a smaller model size. In addition, in the waveform reconstruction stage, the weighted Euclidean distortion measure is used to correct the gain function. Experimental results on a speech dataset with a 48 kHz sampling rate show that the proposed strategy achieves superior performance.
KW - Minimum mean square error
KW - Multi-branch time convolutional network
KW - Prior signal-to-noise ratio
KW - Speech enhancement
UR - https://www.scopus.com/pages/publications/85125198967
U2 - 10.1109/ICSIP52628.2021.9688870
DO - 10.1109/ICSIP52628.2021.9688870
M3 - Conference contribution
AN - SCOPUS:85125198967
T3 - 2021 6th International Conference on Signal and Image Processing, ICSIP 2021
SP - 836
EP - 840
BT - 2021 6th International Conference on Signal and Image Processing, ICSIP 2021
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 6th International Conference on Signal and Image Processing, ICSIP 2021
Y2 - 22 October 2021 through 24 October 2021
ER -