The HW-TSC’s Speech-to-Speech Translation System for IWSLT 2023

  • Wang Minghan
  • Li Yinglu
  • Guo Jiaxin
  • Li Zongyao
  • Shang Hengchao
  • Wei Daimeng
  • Su Chang
  • Zhang Min
  • Tao Shimin
  • Yang Hao

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

Abstract

This paper describes our work on the IWSLT 2023 Speech-to-Speech task. Our proposed cascaded system consists of an ensemble of Conformer- and S2T-Transformer-based ASR models, a Transformer-based MT model, and a diffusion-based TTS model. Our primary focus in this competition was to investigate the modeling ability of diffusion models for TTS in high-resource scenarios and the role of TTS in the overall S2S task. To this end, we propose DTS, an end-to-end diffusion-based TTS model that takes raw text as input and generates a waveform by iteratively denoising pure Gaussian noise. Compared to previous TTS models, the speech generated by DTS is more natural and performs better in code-switching scenarios. Because DTS is trained end-to-end, its training process is relatively straightforward. Our experiments demonstrate that DTS outperforms other TTS models on the GigaS2S benchmark and also yields a positive gain for the entire S2S system.
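
The abstract describes generating a waveform by iteratively denoising pure Gaussian noise. As an illustration only, the following is a minimal sketch of a generic DDPM-style reverse sampling loop; it is not the authors' DTS implementation. The linear noise schedule, the function names (`make_schedule`, `sample_waveform`, `make_oracle`), and the oracle noise predictor are all assumptions introduced here. In a real system, a trained network conditioned on the input text would take the place of the oracle.

```python
import math
import random


def make_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linear beta schedule (an assumed choice, not taken from the paper)."""
    step = (beta_end - beta_start) / max(num_steps - 1, 1)
    betas = [beta_start + i * step for i in range(num_steps)]
    alphas = [1.0 - b for b in betas]
    alpha_bars, running = [], 1.0
    for a in alphas:
        running *= a  # cumulative product of alphas
        alpha_bars.append(running)
    return betas, alphas, alpha_bars


def sample_waveform(eps_model, text, length, num_steps=50, seed=0):
    """Start from Gaussian noise and denoise step by step (DDPM-style)."""
    rng = random.Random(seed)
    betas, alphas, alpha_bars = make_schedule(num_steps)
    x = [rng.gauss(0.0, 1.0) for _ in range(length)]  # pure Gaussian noise
    for t in reversed(range(num_steps)):
        eps_hat = eps_model(x, t, text)  # predicted noise at step t
        coef = betas[t] / math.sqrt(1.0 - alpha_bars[t])
        scale = 1.0 / math.sqrt(alphas[t])
        sigma = math.sqrt(betas[t]) if t > 0 else 0.0  # no noise on last step
        x = [scale * (xi - coef * ei) + sigma * rng.gauss(0.0, 1.0)
             for xi, ei in zip(x, eps_hat)]
    return x


def make_oracle(target, num_steps):
    """Hypothetical stand-in for a trained model: it knows the clean signal."""
    _, _, alpha_bars = make_schedule(num_steps)

    def eps_model(x, t, text):
        a = math.sqrt(alpha_bars[t])
        s = math.sqrt(1.0 - alpha_bars[t])
        # Noise implied by x_t = a * x_0 + s * eps, solved for eps.
        return [(xi - a * yi) / s for xi, yi in zip(x, target)]

    return eps_model


# Toy usage: the "waveform" is one period of a sine wave.
target = [math.sin(2.0 * math.pi * i / 64) for i in range(64)]
waveform = sample_waveform(make_oracle(target, 50), "hello", 64, num_steps=50)
```

Because the oracle predicts the noise exactly, the loop recovers the target signal; with a learned predictor the output would only approximate it, and the number of steps and the schedule would need tuning.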

Original language: English
Title of host publication: 20th International Conference on Spoken Language Translation, IWSLT 2023 - Proceedings of the Conference
Editors: Elizabeth Salesky, Marcello Federico, Marine Carpuat
Publisher: Association for Computational Linguistics
Pages: 277-282
Number of pages: 6
ISBN (Electronic): 9781959429845
State: Published - 2023
Externally published: Yes
Event: 20th International Conference on Spoken Language Translation, IWSLT 2023 - Hybrid, Toronto, Canada
Duration: 13 Jul 2023 to 14 Jul 2023

Conference

Conference: 20th International Conference on Spoken Language Translation, IWSLT 2023
Country/Territory: Canada
City: Hybrid, Toronto
Period: 13/07/23 to 14/07/23
