
Speech emotion recognition using multi-granularity feature fusion through auditory cognitive mechanism

  • Harbin Institute of Technology
  • Shenzhen Academy of Aerospace Technology

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

Abstract

This paper addresses three problems in discrete speech emotion recognition: single-granularity feature extraction, loss of temporal information, and inefficient use of frame-level features. First, we explore a preliminary cognitive mechanism of auditory emotion through cognitive experiments. Inspired by this mechanism, we then propose a multi-granularity fusion feature extraction method for discrete emotional speech signals. The method extracts features at three granularities: short-term dynamic features at the frame level, dynamic features at the segment level, and long-term static features at the global level. Finally, we use an LSTM network to classify emotions based on the long-term and short-term characteristics of the fused features. We conduct experiments on the discrete emotion dataset CHEAVD (CASIA Chinese Emotional Audio-Visual Database), released by the Institute of Automation, Chinese Academy of Sciences, and achieve an improvement in recognition rate, increasing the MAP by 6.48%.
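
The three granularities named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the actual features, segment length, or fusion rule, so the function name `fuse_multi_granularity`, the `segment_len` parameter, the choice of first-order differences as "dynamic" features, and the mean/standard-deviation global statistics are all assumptions made here for illustration. The input is assumed to be a matrix of per-frame acoustic features of shape (T, D).

```python
import numpy as np


def fuse_multi_granularity(frames: np.ndarray, segment_len: int = 10) -> np.ndarray:
    """Fuse frame-, segment- and global-level features (illustrative only).

    frames: array of shape (T, D), one D-dimensional vector per frame.
    Returns an array of shape (T, 4 * D).
    """
    if frames.ndim != 2 or frames.shape[0] == 0:
        raise ValueError("frames must be a non-empty (T, D) array")
    if segment_len < 1:
        raise ValueError("segment_len must be at least 1")
    n_frames = frames.shape[0]

    # Frame granularity (assumed): first-order differences between
    # consecutive frames; the first row is zero by construction.
    frame_delta = np.diff(frames, axis=0, prepend=frames[:1])

    # Segment granularity (assumed): mean of each block of segment_len
    # frames, then differences between consecutive segment means.
    segment_ids = np.arange(n_frames) // segment_len
    n_segments = int(segment_ids[-1]) + 1
    segment_mean = np.stack(
        [frames[segment_ids == s].mean(axis=0) for s in range(n_segments)]
    )
    segment_delta = np.diff(segment_mean, axis=0, prepend=segment_mean[:1])
    # Repeat each segment's vector so every frame carries its segment summary.
    segment_feat = segment_delta[segment_ids]

    # Global granularity (assumed): utterance-level mean and standard
    # deviation, copied onto every frame.
    global_feat = np.concatenate([frames.mean(axis=0), frames.std(axis=0)])
    global_tiled = np.tile(global_feat, (n_frames, 1))

    # Fusion by concatenation along the feature axis.
    return np.concatenate([frame_delta, segment_feat, global_tiled], axis=1)
```

The result keeps one row per frame, so it could be passed as a sequence to a recurrent classifier such as an LSTM. Concatenation is only one possible fusion choice.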

Original language: English
Title of host publication: Cognitive Computing – ICCC 2019 - 3rd International Conference, Held as Part of the Services Conference Federation, SCF 2019, Proceedings
Editors: Ruifeng Xu, Jianzong Wang, Liang-Jie Zhang
Publisher: Springer Verlag
Pages: 117-131
Number of pages: 15
ISBN (Print): 9783030234065
State: Published - 2019
Event: 3rd International Conference on Cognitive Computing, ICCC 2019, held as part of the Services Conference Federation, SCF 2019 - San Diego, United States
Duration: 25 Jun 2019 → 30 Jun 2019

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 11518 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 3rd International Conference on Cognitive Computing, ICCC 2019, held as part of the Services Conference Federation, SCF 2019
Country/Territory: United States
City: San Diego
Period: 25/06/19 → 30/06/19

Keywords

  • Auditory cognitive mechanism
  • CNN-LSTM
  • Multi-granularity feature fusion
  • Speech emotion recognition
