A study on the emotional prosody model building based on small-scale emotional data and large-scale neutral data

  • Yanqiu Shao*
  • Zhifang Sui
  • Jiqing Han
  • Zhiwei Wang
  • *Corresponding author for this work
  • Peking University
  • School of Computer Science and Technology, Harbin Institute of Technology

Research output: Contribution to journal, Article, peer-review

Abstract

Building an emotional prosody model is essential for emotional speech synthesis. A serious practical problem, however, is that far less emotional data is available than neutral data. In this paper, a corpus covering three emotions (happiness, anger and sadness) is built. The parameters that affect emotional prosody are analyzed, and an emotional prosody model based on a neural network is built. Because the emotional corpus is small, training the prosody model suffers from over-fitting caused by data sparsity. To exploit large-scale neutral data to improve the quality of the emotional prosody model, three methods are proposed: a mixed corpus, data fusion based on the least-squares algorithm, and a multistage network. All three methods amplify the impact of the emotional corpus, so the prediction of the emotional parameters improves to some extent in each case. In particular, the multistage network uses the output of the neutral model as one input of the network, which is equivalent to enlarging the feature space and strengthening the role of the emotional input features. The results show that the multistage network performs best of the three methods.
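
The multistage idea described in the abstract (feeding the neutral model's output into the emotional model as an extra input) can be illustrated with a minimal sketch. The abstract does not give the network architecture, the input features, or the training procedure, so everything below is an assumption: the data are synthetic, the names `fit_net`, `predict_net` and `predict_emotional` are hypothetical, and each "network" is a stand-in with a fixed random tanh hidden layer and a least-squares readout rather than the authors' trained neural network.

```python
import numpy as np

def fit_net(X, y, hidden, seed):
    """Stand-in network: fixed random tanh hidden layer, least-squares readout."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(X.shape[1], hidden))   # random input-to-hidden weights
    b = rng.normal(size=hidden)                 # random hidden biases
    H = np.tanh(X @ W + b)
    Hb = np.hstack([H, np.ones((X.shape[0], 1))])  # append bias column
    beta, *_ = np.linalg.lstsq(Hb, y, rcond=None)  # fit only the readout
    return W, b, beta

def predict_net(model, X):
    W, b, beta = model
    H = np.tanh(X @ W + b)
    return np.hstack([H, np.ones((X.shape[0], 1))]) @ beta

# Synthetic stand-in data (assumption): 4 context features -> 1 prosodic parameter.
rng = np.random.default_rng(0)
true_w = np.array([1.0, -0.5, 0.3, 0.8])
X_neutral = rng.normal(size=(5000, 4))           # large neutral corpus
y_neutral = np.tanh(X_neutral @ true_w) + rng.normal(scale=0.1, size=5000)
X_emotional = rng.normal(size=(40, 4))           # small emotional corpus
y_emotional = 1.3 * np.tanh(X_emotional @ true_w) + 0.5 \
    + rng.normal(scale=0.1, size=40)

# Stage 1: neutral model trained on the large neutral corpus.
neutral_model = fit_net(X_neutral, y_neutral, hidden=32, seed=1)

# Stage 2: emotional model whose input is the original features plus
# the neutral model's prediction, which enlarges the feature space by one.
neutral_pred = predict_net(neutral_model, X_emotional)
X_stage2 = np.hstack([X_emotional, neutral_pred[:, None]])
emotional_model = fit_net(X_stage2, y_emotional, hidden=8, seed=2)

def predict_emotional(X):
    """Run both stages: neutral prediction first, then the emotional model."""
    stage1 = predict_net(neutral_model, X)
    return predict_net(emotional_model, np.hstack([X, stage1[:, None]]))
```

Calling `predict_emotional` on a new feature matrix with 4 columns returns one predicted value per row. The small second-stage hidden size is only a guess at how one might limit over-fitting on 40 samples; it is not taken from the paper.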

Original language: English
Pages (from-to): 1624-1631
Number of pages: 8
Journal: Jisuanji Yanjiu yu Fazhan/Computer Research and Development
Volume: 44
Issue number: 9
State: Published - Sep 2007
Externally published: Yes

Keywords

  • Data fusion
  • Data sparsity
  • Emotional speech synthesis
  • Over-fitting
  • Prosody model
