
Speech Enhancement Model for High Sampling Rate Speech Datasets Based on Multi-branch Time Convolutional Network

  • Zehua Zhang*
  • Mingjiang Wang
  • Xuyi Zhuang
  • *Corresponding author for this work
  • Harbin Institute of Technology Shenzhen

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

In this paper, a high sampling rate speech enhancement method based on a multi-branch time convolutional network (TCN) is proposed. The most important parameter in traditional speech enhancement algorithms is the prior signal-to-noise ratio (SNR). The Deep Xi framework is used to estimate the prior SNR, and a multi-branch TCN is proposed to map the amplitude spectrum of noisy speech to the prior SNR. The proposed multi-branch TCN captures context information better while keeping the model size smaller. In addition, in the waveform reconstruction stage, the weighted Euclidean distortion measure is used to correct the gain function. Experimental results on a speech dataset with a 48 kHz sampling rate show that the proposed strategy achieves superior performance.
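To illustrate why the prior SNR is the central quantity here, the following is a minimal sketch, not the authors' implementation. It assumes the mapping commonly described for Deep Xi, in which the prior SNR in dB is compressed to the interval (0, 1) with a Gaussian cumulative distribution function so that a network can predict it. The statistics MU_DB and SIGMA_DB are hypothetical placeholders, and the Wiener gain is used only as a simple stand-in for the weighted Euclidean gain named in the abstract.

```python
import math
from statistics import NormalDist

# Hypothetical statistics of the prior SNR in dB for one frequency bin.
# These are placeholders, not values from the paper; in practice they
# would be estimated from training data.
MU_DB = 0.0
SIGMA_DB = 10.0


def map_prior_snr(xi, mu_db=MU_DB, sigma_db=SIGMA_DB):
    """Compress a linear prior SNR into (0, 1) via a Gaussian CDF of its dB value."""
    xi_db = 10.0 * math.log10(xi)
    return NormalDist(mu_db, sigma_db).cdf(xi_db)


def unmap_prior_snr(xi_bar, mu_db=MU_DB, sigma_db=SIGMA_DB, eps=1e-12):
    """Invert the mapping: recover a linear prior SNR from a value in (0, 1)."""
    # Clamp so that the inverse CDF is always defined.
    xi_bar = min(max(xi_bar, eps), 1.0 - eps)
    xi_db = NormalDist(mu_db, sigma_db).inv_cdf(xi_bar)
    return 10.0 ** (xi_db / 10.0)


def wiener_gain(xi):
    """Wiener gain from the prior SNR (a stand-in, not the paper's gain function)."""
    return xi / (1.0 + xi)


if __name__ == "__main__":
    # Pretend these are network outputs for three frequency bins.
    for xi_bar in (0.1, 0.5, 0.9):
        xi = unmap_prior_snr(xi_bar)
        print(f"mapped={xi_bar:.1f}  prior SNR={xi:.3f}  gain={wiener_gain(xi):.3f}")
```

In such a pipeline the gain would be applied to the noisy amplitude spectrum, bin by bin, before the waveform is reconstructed; a different gain function can be swapped in without changing the SNR estimation step.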

Original language: English
Title of host publication: 2021 6th International Conference on Signal and Image Processing, ICSIP 2021
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 836-840
Number of pages: 5
ISBN (Electronic): 9780738133737
State: Published - 2021
Externally published: Yes
Event: 6th International Conference on Signal and Image Processing, ICSIP 2021 - Nanjing, China
Duration: 22 Oct 2021 → 24 Oct 2021

Publication series

Name: 2021 6th International Conference on Signal and Image Processing, ICSIP 2021

Conference

Conference: 6th International Conference on Signal and Image Processing, ICSIP 2021
Country/Territory: China
City: Nanjing
Period: 22/10/21 → 24/10/21

Keywords

  • Minimum mean square error
  • Multi-branch time convolutional network
  • Prior signal-to-noise ratio
  • Speech enhancement
