Abstract
A mismatch between training and testing environments greatly degrades speaker recognition performance. Although many robust techniques have been proposed, speaker recognition under mismatched conditions remains a challenge. To address this problem, we propose a sparse-based auditory model as the front end of a speaker recognition system, designed to simulate the auditory processing of the speech signal. To this end, we use a narrow-band filter bank instead of the widely used wide-band filter bank to simulate the basilar membrane filter bank, use sparse representation to approximate the basilar membrane coding strategy, and incorporate the frequency-selectivity enhancement mechanism between the tectorial membrane and the basilar membrane through a practical engineering approximation. Our preliminary experimental results indicate that, compared with the standard Mel-frequency cepstral coefficient (MFCC) approach, the sparse-based auditory model consistently improves the robustness of speaker recognition under mismatched conditions.
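To make the idea of a sparse representation over narrow-band components concrete, here is a minimal sketch. It is not the authors' model. The dictionary of Hann-windowed cosines (`make_dictionary`), the use of matching pursuit as the sparse solver, and every parameter value (frame length, number of atoms, sample rate, frequency range) are assumptions chosen only for illustration. The abstract does not specify any of them.

```python
import numpy as np


def make_dictionary(frame_len=256, n_atoms=64, sample_rate=8000.0):
    """Build unit-norm, Hann-windowed cosine atoms.

    Hypothetical stand-in for a bank of narrow-band filters; the
    center frequencies are linearly spaced, which is an assumption.
    """
    t = np.arange(frame_len) / sample_rate
    window = np.hanning(frame_len)
    freqs = np.linspace(100.0, 3900.0, n_atoms)
    atoms = window * np.cos(2.0 * np.pi * freqs[:, None] * t[None, :])
    atoms /= np.linalg.norm(atoms, axis=1, keepdims=True)
    return atoms  # shape: (n_atoms, frame_len)


def matching_pursuit(frame, atoms, n_nonzero=5):
    """Greedy sparse coding: pick the best-correlated atom each step."""
    residual = np.asarray(frame, dtype=float).copy()
    code = np.zeros(atoms.shape[0])
    for _ in range(n_nonzero):
        corr = atoms @ residual            # correlation with every atom
        k = int(np.argmax(np.abs(corr)))   # index of the best-matching atom
        code[k] += corr[k]                 # accumulate its coefficient
        residual -= corr[k] * atoms[k]     # remove its contribution
    return code, residual


if __name__ == "__main__":
    atoms = make_dictionary()
    # Synthetic frame made of two atoms (illustrative data, not speech).
    frame = 2.0 * atoms[3] + 0.5 * atoms[40]
    code, residual = matching_pursuit(frame, atoms)
    print("active atoms:", np.flatnonzero(code))
    print("residual energy:", float(residual @ residual))
```

In a front end of this general kind, the sparse code of each frame (rather than the raw filter-bank energies) would be passed on to feature extraction. How the paper maps the code to features is not stated in the abstract.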
| Original language | English |
|---|---|
| Article number | 1250015 |
| Journal | International Journal of Pattern Recognition and Artificial Intelligence |
| Volume | 26 |
| Issue number | 7 |
| State | Published - Nov 2012 |
| Externally published | Yes |
Keywords
- sparse representation
- robust feature
- selectivity gain
- speaker recognition
Article title: Sparse-based auditory model for robust speaker recognition