TY - GEN
T1 - Biomedical Named Entity Recognition Model Based on Knowledge Distillation
AU - Han, Rong
AU - Zheng, Dequan
AU - Yu, Feng
AU - Li, Yannan
AU - Hu, Jia
N1 - Publisher Copyright:
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024.
PY - 2023
Y1 - 2023
N2 - As the literature in the biomedical field continues to grow, the recognition of biomedical named entities becomes increasingly important. BioBERT, a pre-trained language model created specifically for the biomedical domain, dramatically improves biomedical named entity recognition performance compared with the earlier BERT model. However, the model is large, with more than 110 million parameters, so it is time-consuming to run and requires significant computational resources. Consequently, this study proposes a knowledge distillation strategy in which a student model with a simple structure learns from a teacher model with a complex structure in order to improve recognition performance. The BioBERT model is used as the teacher and a BiLSTM model as the student; by experimentally comparing different weighting factors, the best distillation effect is obtained when the weighting factor α = 0.3. With this setting, the F1 score of the distilled student model improves by 0.29% compared with the original model.
AB - As the literature in the biomedical field continues to grow, the recognition of biomedical named entities becomes increasingly important. BioBERT, a pre-trained language model created specifically for the biomedical domain, dramatically improves biomedical named entity recognition performance compared with the earlier BERT model. However, the model is large, with more than 110 million parameters, so it is time-consuming to run and requires significant computational resources. Consequently, this study proposes a knowledge distillation strategy in which a student model with a simple structure learns from a teacher model with a complex structure in order to improve recognition performance. The BioBERT model is used as the teacher and a BiLSTM model as the student; by experimentally comparing different weighting factors, the best distillation effect is obtained when the weighting factor α = 0.3. With this setting, the F1 score of the distilled student model improves by 0.29% compared with the original model.
KW - Knowledge distillation
KW - Named entity recognition
KW - Pre-trained language models
UR - https://www.scopus.com/pages/publications/85208948886
U2 - 10.1007/978-981-97-3980-6_38
DO - 10.1007/978-981-97-3980-6_38
M3 - Conference contribution
AN - SCOPUS:85208948886
SN - 9789819739790
T3 - Smart Innovation, Systems and Technologies
SP - 443
EP - 451
BT - Business Intelligence and Information Technology - Proceedings of BIIT 2023
A2 - Hassanien, Aboul Ella
A2 - Zheng, Dequan
A2 - Zhao, Zhijie
A2 - Fan, Zhipeng
PB - Springer Science and Business Media Deutschland GmbH
T2 - International Conference on Business Intelligence and Information Technology, BIIT 2023
Y2 - 17 December 2023 through 18 December 2023
ER -