Abstract
Humans have a remarkable ability to focus on a single sound source in a multi-speaker environment. Auditory spatial attention detection (ASAD) aims to identify the direction of the speech source a person is attending to from their brain signals, with potential applications in enhancing hearing aids, improving communication systems, and advancing brain-computer interface (BCI) technologies. Most prior studies formulated the problem as binary classification; real-world scenarios, however, are much more complex. Our study explores the feasibility of detecting auditory attention among 10 competing speakers. To address the needs of low-resource computing equipment, we further propose a novel approach using a binary temporal convolutional neural network (B-TCNN) for multi-class ASAD tasks. This approach effectively reduces memory consumption and accelerates inference. Experimental results show that the B-TCNN achieves an average accuracy of 93.8% with only 33K parameters, using 1-second decision windows on a 10-class ASAD dataset. The proposed network significantly outperforms other competitive models, offering a lightweight and efficient solution for multi-class ASAD tasks.
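The memory and speed claims in the abstract can be made concrete with a rough sketch. The abstract does not describe how the B-TCNN binarizes its weights, so the code below assumes plain sign binarization and an XNOR-popcount dot product, both common in binary neural networks generally. It also assumes, for the footprint arithmetic only, that "33K" means 33,000 parameters and that every parameter is stored in one bit. All function names are hypothetical; this is not the authors' implementation.

```python
# Illustrative sketch only: generic binary-network building blocks,
# not the B-TCNN from the paper. All names here are hypothetical.

def binarize(weights):
    """Map each real-valued weight to +1 or -1 by its sign (0 maps to +1)."""
    return [1 if w >= 0 else -1 for w in weights]

def pack_bits(signs):
    """Pack a list of +1/-1 values into one int, one bit per value (+1 -> bit set)."""
    word = 0
    for i, s in enumerate(signs):
        if s == 1:
            word |= 1 << i
    return word

def binary_dot(a_bits, b_bits, n):
    """Dot product of two packed +1/-1 vectors of length n via XNOR and popcount."""
    mask = (1 << n) - 1
    agree = bin(~(a_bits ^ b_bits) & mask).count("1")  # positions with equal signs
    return 2 * agree - n  # agreements minus disagreements

def footprint_bytes(num_params, bits_per_param):
    """Storage needed for num_params parameters, rounded up to whole bytes."""
    return (num_params * bits_per_param + 7) // 8

# Assumption: "33K parameters" is read as 33,000, all stored at the same width.
NUM_PARAMS = 33_000
full_precision_bytes = footprint_bytes(NUM_PARAMS, 32)  # 32-bit floats
binary_bytes = footprint_bytes(NUM_PARAMS, 1)           # 1-bit weights

# One tap of a temporal convolution reduces to a dot product like this one.
kernel = binarize([0.3, -1.2, 0.0, 2.5])
window = binarize([-0.7, -0.1, 0.4, 0.9])
response = binary_dot(pack_bits(kernel), pack_bits(window), len(kernel))
```

Under these assumptions, 33,000 parameters take 132,000 bytes at 32 bits each and 4,125 bytes at 1 bit each. Practical binary networks often keep some layers at higher precision, so the real saving for the B-TCNN may be smaller than this 32x upper bound.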
| Original language | English |
|---|---|
| Title of host publication | 2024 14th International Symposium on Chinese Spoken Language Processing, ISCSLP 2024 |
| Editors | Yanmin Qian, Qin Jin, Zhijian Ou, Zhenhua Ling, Zhiyong Wu, Ya Li, Lei Xie, Jianhua Tao |
| Publisher | Institute of Electrical and Electronics Engineers Inc. |
| Pages | 46-50 |
| Number of pages | 5 |
| ISBN (Electronic) | 9798331516826 |
| DOIs | |
| State | Published - 2024 |
| Externally published | Yes |
| Event | 14th International Symposium on Chinese Spoken Language Processing, ISCSLP 2024 - Beijing, China. Duration: 7 Nov 2024 → 10 Nov 2024 |
Publication series
| Name | 2024 14th International Symposium on Chinese Spoken Language Processing, ISCSLP 2024 |
|---|---|
Conference
| Conference | 14th International Symposium on Chinese Spoken Language Processing, ISCSLP 2024 |
|---|---|
| Country/Territory | China |
| City | Beijing |
| Period | 7 Nov 2024 → 10 Nov 2024 |
UN SDGs
This output contributes to the following UN Sustainable Development Goals (SDGs)
- SDG 3: Good Health and Well-being
Keywords
- Auditory attention
- binary neural network
- brain-computer interface
- cocktail party
- electroencephalography
Title
| Title | Binary-Temporal Convolutional Neural Network for Multi-Class Auditory Spatial Attention Detection |
|---|---|