Deep Convolutional Neural Network with Transfer Learning for Environmental Sound Classification

  • Harbin Institute of Technology Weihai
  • Beiyang Electric Group Co. Ltd.

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Environmental sound classification (ESC) is an important problem, but high-accuracy ESC has remained challenging because datasets are scarce. In this paper, we propose a new convolutional neural network (CNN) model that uses transfer learning for the ESC task. First, we represent each sound as an RGB image in which the red channel holds the Log-Mel spectrogram, the green channel holds the scalogram, and the blue channel holds the Mel-frequency cepstral coefficients (MFCCs). Second, we train a CNN architecture based on the Xception model, which has shown strong performance on the JFT dataset. Test results show that the proposed approach achieves better ESC accuracy.
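
The channel layout described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the three feature maps (Log-Mel spectrogram, scalogram, MFCC) have already been computed as 2-D arrays by some audio library, and it only shows one plausible way to bring them to a common size and stack them as red, green and blue. The helper names `to_uint8`, `resize_nearest` and `stack_rgb`, the 128 x 128 target size, the min-max scaling and the random placeholder inputs are all assumptions made for illustration.

```python
import numpy as np


def to_uint8(feature: np.ndarray) -> np.ndarray:
    """Min-max scale a 2-D feature map to integers in 0..255 (assumed scaling)."""
    feature = feature.astype(np.float64)
    lo, hi = feature.min(), feature.max()
    if hi == lo:  # constant map: avoid division by zero
        return np.zeros(feature.shape, dtype=np.uint8)
    return np.round(255.0 * (feature - lo) / (hi - lo)).astype(np.uint8)


def resize_nearest(feature: np.ndarray, shape: tuple) -> np.ndarray:
    """Nearest-neighbour resize so every channel shares one height and width."""
    rows = np.arange(shape[0]) * feature.shape[0] // shape[0]
    cols = np.arange(shape[1]) * feature.shape[1] // shape[1]
    return feature[np.ix_(rows, cols)]


def stack_rgb(log_mel, scalogram, mfcc, shape=(128, 128)) -> np.ndarray:
    """Stack the three maps as R (Log-Mel), G (scalogram), B (MFCC)."""
    channels = [
        to_uint8(resize_nearest(np.asarray(f), shape))
        for f in (log_mel, scalogram, mfcc)
    ]
    return np.stack(channels, axis=-1)


# Placeholder inputs standing in for real features (hypothetical shapes:
# 128 mel bands, 64 wavelet scales, 40 MFCCs, 431 time frames each).
rng = np.random.default_rng(0)
image = stack_rgb(
    rng.normal(size=(128, 431)),
    rng.normal(size=(64, 431)),
    rng.normal(size=(40, 431)),
)
print(image.shape, image.dtype)
```

An array of this form has the three-channel layout that image CNNs such as Xception expect, which is what makes transfer learning from an image-pretrained network possible. The input resolution and preprocessing used in the paper are not stated in the abstract.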

Original language: English
Title of host publication: 2021 International Conference on Computer, Control and Robotics, ICCCR 2021
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 242-245
Number of pages: 4
ISBN (Electronic): 9781728190358
State: Published - 8 Jan 2021
Externally published: Yes
Event: 2021 International Conference on Computer, Control and Robotics, ICCCR 2021 - Shanghai, China
Duration: 8 Jan 2021 to 10 Jan 2021

Publication series

Name: 2021 International Conference on Computer, Control and Robotics, ICCCR 2021

Conference

Conference: 2021 International Conference on Computer, Control and Robotics, ICCCR 2021
Country/Territory: China
City: Shanghai
Period: 8/01/21 to 10/01/21

Keywords

  • CNN
  • ESC-50
  • Log-Mel spectrogram
  • MFCC
  • Xception
  • environmental sound classification
  • scalogram
  • transfer learning
