Abstract
Given an articulatory-to-acoustic forward model, it is a priori unknown how its motor control must be operated to achieve a desired acoustic result. This control problem is a fundamental issue of articulatory speech synthesis and the cradle of acoustic-to-articulatory inversion, a discipline which attempts to address the issue by the means of various methods. This work presents an end-to-end solution to the articulatory control problem, in which synthetic motor trajectories of Monte-Carlo-generated artificial speech are linked to input modalities (such as natural speech recordings or phoneme sequence input) via speaker-independent latent representations of a vector-quantized variational autoencoder. The proposed method is self-supervised and thus, in principle, synthesizer and speaker model independent.
| Original language | English |
|---|---|
| Pages (from-to) | 4329-4333 |
| Number of pages | 5 |
| Journal | Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH |
| Volume | 2023-August |
| DOIs | |
| State | Published - 2023 |
| Externally published | Yes |
| Event | 24th Annual conference of the International Speech Communication Association, Interspeech 2023 - Dublin, Ireland Duration: 20 Aug 2023 → 24 Aug 2023 |
Keywords
- Acoustic-to-articulatory inversion
- VQ-VAE
Fingerprint
Dive into the research topics of 'Self-Supervised Solution to the Control Problem of Articulatory Synthesis'. Together they form a unique fingerprint.Cite this
- APA
- Author
- BIBTEX
- Harvard
- Standard
- RIS
- Vancouver