TY - GEN
T1 - Less Learn Shortcut
T2 - 32nd International Joint Conference on Artificial Intelligence, IJCAI 2023
AU - Du, Yanrui
AU - Yan, Jing
AU - Chen, Yan
AU - Liu, Jing
AU - Zhao, Sendong
AU - She, Qiaoqiao
AU - Wu, Hua
AU - Wang, Haifeng
AU - Qin, Bing
N1 - Publisher Copyright:
© 2023 International Joint Conferences on Artificial Intelligence. All rights reserved.
PY - 2023
Y1 - 2023
N2 - Recent research has revealed that deep neural networks often take dataset biases as a shortcut to make decisions rather than understand tasks, leading to failures in real-world applications. In this study, we focus on the spurious correlation between word features and labels that models learn from the biased data distribution of training data. In particular, we define the word highly co-occurring with a specific label as biased word, and the example containing biased word as biased example. Our analysis shows that biased examples are easier for models to learn, while at the time of prediction, biased words make a significantly higher contribution to the models' predictions, and models tend to assign predicted labels over-relying on the spurious correlation between words and labels. To mitigate models' over-reliance on the shortcut (i.e. spurious correlation), we propose a training strategy Less-Learn-Shortcut (LLS): our strategy quantifies the biased degree of the biased examples and down-weights them accordingly. Experimental results on Question Matching, Natural Language Inference and Sentiment Analysis tasks show that LLS is a task-agnostic strategy and can improve the model performance on adversarial data while maintaining good performance on in-domain data.
AB - Recent research has revealed that deep neural networks often take dataset biases as a shortcut to make decisions rather than understand tasks, leading to failures in real-world applications. In this study, we focus on the spurious correlation between word features and labels that models learn from the biased data distribution of training data. In particular, we define the word highly co-occurring with a specific label as biased word, and the example containing biased word as biased example. Our analysis shows that biased examples are easier for models to learn, while at the time of prediction, biased words make a significantly higher contribution to the models' predictions, and models tend to assign predicted labels over-relying on the spurious correlation between words and labels. To mitigate models' over-reliance on the shortcut (i.e. spurious correlation), we propose a training strategy Less-Learn-Shortcut (LLS): our strategy quantifies the biased degree of the biased examples and down-weights them accordingly. Experimental results on Question Matching, Natural Language Inference and Sentiment Analysis tasks show that LLS is a task-agnostic strategy and can improve the model performance on adversarial data while maintaining good performance on in-domain data.
UR - https://www.scopus.com/pages/publications/85170369481
U2 - 10.24963/ijcai.2023/560
DO - 10.24963/ijcai.2023/560
M3 - 会议稿件
AN - SCOPUS:85170369481
T3 - IJCAI International Joint Conference on Artificial Intelligence
SP - 5039
EP - 5048
BT - Proceedings of the 32nd International Joint Conference on Artificial Intelligence, IJCAI 2023
A2 - Elkind, Edith
PB - International Joint Conferences on Artificial Intelligence
Y2 - 19 August 2023 through 25 August 2023
ER -