Self-Supervised Solution to the Control Problem of Articulatory Synthesis

  • Paul K. Krug
  • Peter Birkholz
  • Branislav Gerazov
  • Daniel R. van Niekerk
  • Anqi Xu
  • Yi Xu

Research output: Contribution to journal > Conference article > peer-review

Abstract

Given an articulatory-to-acoustic forward model, it is a priori unknown how its motor control must be operated to achieve a desired acoustic result. This control problem is a fundamental issue of articulatory speech synthesis and the cradle of acoustic-to-articulatory inversion, a discipline that attempts to address the issue with a variety of methods. This work presents an end-to-end solution to the articulatory control problem, in which synthetic motor trajectories of Monte-Carlo-generated artificial speech are linked to input modalities (such as natural speech recordings or phoneme sequence input) via speaker-independent latent representations of a vector-quantized variational autoencoder. The proposed method is self-supervised and thus, in principle, independent of the synthesizer and the speaker model.

Original language: English
Pages (from-to): 4329-4333
Number of pages: 5
Journal: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
Volume: 2023-August
State: Published - 2023
Externally published: Yes
Event: 24th Annual Conference of the International Speech Communication Association, Interspeech 2023 - Dublin, Ireland
Duration: 20 Aug 2023 - 24 Aug 2023

Keywords

  • Acoustic-to-articulatory inversion
  • VQ-VAE
