TY - GEN
T1 - CDMA
T2 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2022
AU - Li, Jianchen
AU - Han, Jiqing
AU - Song, Hongwei
N1 - Publisher Copyright: © 2022 IEEE
PY - 2022
Y1 - 2022
AB - To solve the domain shift problem in speaker verification, one effective domain adaptation approach is to learn domain-invariant embeddings by aligning the source and target distributions in the embedding space. However, this approach can be problematic when the source and target domains have disjoint speaker label spaces, because the embedding distributions of different speakers cannot be aligned. In this paper, we propose a Cross-domain Distance Metric Adaptation (CDMA) approach to alleviate the domain shift in the distance metric space, where the source and target domains share the same classes, i.e., within-speaker and between-speaker. Specifically, the two target pairwise distance distributions are aligned with the source pairwise distance distributions and further separated to learn a domain-invariant metric, which is more suitable for speaker verification based on metric learning. Experiments indicate that CDMA significantly outperforms the approach proposed in the embedding space.
KW - Speaker verification
KW - open-set domain adaptation
KW - pairwise distance distributions
UR - https://www.scopus.com/pages/publications/85134029570
U2 - 10.1109/ICASSP43922.2022.9747907
DO - 10.1109/ICASSP43922.2022.9747907
M3 - Conference contribution
AN - SCOPUS:85134029570
T3 - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
SP - 7197
EP - 7201
BT - 2022 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2022 - Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 22 May 2022 through 27 May 2022
ER -