
Harnessing the Power of Prompt Experts: Efficient Knowledge Distillation for Enhanced Language Understanding

  • Xv Meng
  • Jun Rao
  • Shuhan Qi*
  • Lei Wang
  • Jing Xiao
  • Xuan Wang

*Corresponding author for this work

  • Harbin Institute of Technology Shenzhen
  • Alibaba Group Holding Ltd.
  • Ping An Technology

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

Abstract

Enhanced with machine learning, language understanding enables computers to not only comprehend but also learn from human language, thereby augmenting the capabilities of various NLP applications in AI. Multi-teacher distillation is a prominent method for knowledge transfer in language understanding, leveraging multiple teacher models to train a single student model. However, this approach incurs significant time and storage costs for training and inference with multiple teachers. To address these issues, we introduce PEE-KD, a simple yet effective framework that generates supervision for training a student model from a single language model. We implemented a language model with multiple prompts as the teacher model in multi-teacher distillation, achieving lightweight training and inference. Additionally, we propose an uncertainty-based method to enhance the robustness and accuracy of multiple prompts during training, along with a selector module to improve the inference speed of multi-teacher models. Experiments on NLU and NER tasks demonstrate that PEE-KD improves accuracy by up to 1.8% and efficiency by up to 140% compared to existing methods. Logit visualization comparisons between teacher and student models further validate the effectiveness of our approach. Our code and data are available at https://anonymous.4open.science/r/PEEKD-DF50/.
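
The abstract does not spell out how the prompt-based teachers are combined, so the following is only a minimal sketch of the general idea, not the authors' PEE-KD implementation. It assumes that each prompt applied to the same language model yields one logit vector, that a prompt's uncertainty is measured by the entropy of its softened distribution, and that lower-entropy prompts receive larger weights. The function names, the `exp(-entropy)` weighting, and the temperature value are all hypothetical choices made for illustration; the selector module mentioned in the abstract is not modeled.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy (in nats) of a probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def combine_prompt_teachers(prompt_logits, temperature=2.0):
    """Merge per-prompt teacher distributions into one soft target.

    Assumption (not from the paper): each prompt is weighted by
    exp(-entropy), so more confident prompts contribute more.
    """
    dists = [softmax(logits, temperature) for logits in prompt_logits]
    raw = [math.exp(-entropy(d)) for d in dists]
    norm = sum(raw)
    weights = [r / norm for r in raw]
    n_classes = len(dists[0])
    target = [
        sum(w * d[c] for w, d in zip(weights, dists))
        for c in range(n_classes)
    ]
    return target, weights


def distillation_loss(student_logits, target, temperature=2.0):
    """KL divergence from the combined teacher target to the student."""
    student = softmax(student_logits, temperature)
    return sum(
        t * math.log(t / s) for t, s in zip(target, student) if t > 0.0
    )


if __name__ == "__main__":
    # Two hypothetical prompts over three classes: one confident, one vague.
    teacher_logits = [[4.0, 0.0, 0.0], [0.1, 0.0, 0.0]]
    target, weights = combine_prompt_teachers(teacher_logits)
    print("prompt weights:", weights)
    print("soft target:", target)
    print("student loss:", distillation_loss([0.0, 0.0, 0.0], target))
```

Because every "teacher" here is the same model queried with a different prompt, only one set of model weights would need to be stored, which is the source of the time and storage savings the abstract describes.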

Original language: English
Title of host publication: Machine Learning and Knowledge Discovery in Databases. Research Track and Demo Track - European Conference, ECML PKDD 2024, Proceedings
Editors: Albert Bifet, Povilas Daniušis, Jesse Davis, Tomas Krilavičius, Meelis Kull, Eirini Ntoutsi, Kai Puolamäki, Indrė Žliobaitė
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 218-234
Number of pages: 17
ISBN (Print): 9783031703706
State: Published - 2024
Externally published: Yes
Event: European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, ECML PKDD 2024 - Vilnius, Lithuania
Duration: 9 Sep 2024 to 13 Sep 2024

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 14948 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, ECML PKDD 2024
Country/Territory: Lithuania
City: Vilnius
Period: 9/09/24 to 13/09/24

Keywords

  • Deep learning
  • Multi-teacher knowledge distillation
  • Prompt tuning
