Dual Pseudo Supervision for Semi-Supervised Text Classification with a Reliable Teacher

  • Shujie Li
  • Min Yang*
  • Chengming Li
  • Ruifeng Xu
  • *Corresponding author for this work
  • University of Science and Technology of China
  • Shenzhen Institute of Advanced Technology
  • Sun Yat-Sen University
  • Harbin Institute of Technology Shenzhen

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

In this paper, we study semi-supervised text classification (SSTC) by exploiting both labeled and extra unlabeled data. One of the most popular SSTC techniques is pseudo-labeling, which assigns pseudo labels to unlabeled data via a teacher classifier trained on labeled data. The pseudo-labeled data are then used to train a student classifier. However, when the pseudo labels are inaccurate, the student classifier learns from inaccurate data and may perform even worse than the teacher. To mitigate this issue, we propose a simple yet efficient pseudo-labeling framework called Dual Pseudo Supervision (DPS), which exploits the feedback signal from the student to guide the teacher to generate better pseudo labels. In particular, we alternately update the student based on the pseudo-labeled data annotated by the teacher and optimize the teacher based on the student's performance via meta learning. In addition, we design a consistency regularization term to further improve the stability of the teacher. With these two strategies, the learned reliable teacher can provide more accurate pseudo labels to the student and thus improve the overall performance of text classification. We conduct extensive experiments on three benchmark datasets (i.e., AG News, Yelp and Yahoo) to verify the effectiveness of our DPS method. Experimental results show that our approach achieves substantially better performance than strong competitors. For reproducibility, we will release the code and data of this paper publicly at https://github.com/GRIT621/DPS.
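
The alternating update described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation (see the repository linked above for that); it is a minimal illustration under several assumptions that do not come from the abstract: both models are binary logistic regressors on small feature vectors, the student's feedback is approximated by the drop in its loss on one labeled example after a single pseudo-labeled update, and the consistency term is a squared difference between the teacher's predictions on an input and on a Gaussian-noised copy. All function names and hyperparameters (`student_step`, `teacher_step`, `lr`, `lam`, `noise`) are hypothetical.

```python
import math
import random


def predict(w, x):
    """Probability of class 1 under a logistic model with weights w."""
    return 1.0 / (1.0 + math.exp(-sum(wi * xi for wi, xi in zip(w, x))))


def bce(p, y):
    """Binary cross-entropy of probability p against label y (0 or 1)."""
    eps = 1e-12
    return -(y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps))


def grad_bce(w, x, y):
    """Gradient of bce(predict(w, x), y) with respect to w."""
    err = predict(w, x) - y
    return [err * xi for xi in x]


def sgd(w, grad, lr):
    """One plain gradient-descent step."""
    return [wi - lr * gi for wi, gi in zip(w, grad)]


def student_step(student, x_u, pseudo, x_l, y_l, lr=0.1):
    """Update the student on one pseudo-labeled example.

    Returns the new student and a feedback value (assumed form): the
    decrease of the student's loss on a labeled example. It is positive
    when the pseudo label helped and negative when it hurt.
    """
    loss_before = bce(predict(student, x_l), y_l)
    new_student = sgd(student, grad_bce(student, x_u, pseudo), lr)
    loss_after = bce(predict(new_student, x_l), y_l)
    return new_student, loss_before - loss_after


def teacher_step(teacher, feedback, x_u, pseudo, x_l, y_l, rng,
                 lr=0.1, lam=1.0, noise=0.1):
    """Update the teacher from three gradients (all assumed forms):
    feedback-weighted pseudo-label loss, supervised loss, and a
    consistency penalty between clean and perturbed predictions."""
    g_fb = [feedback * g for g in grad_bce(teacher, x_u, pseudo)]
    g_sup = grad_bce(teacher, x_l, y_l)
    p_clean = predict(teacher, x_u)  # treated as a fixed target
    x_noisy = [xi + rng.gauss(0.0, noise) for xi in x_u]
    p_noisy = predict(teacher, x_noisy)
    # Gradient of (p_noisy - p_clean)^2 with respect to the teacher weights.
    g_cons = [2.0 * (p_noisy - p_clean) * p_noisy * (1.0 - p_noisy) * xi
              for xi in x_noisy]
    total = [a + b + lam * c for a, b, c in zip(g_fb, g_sup, g_cons)]
    return sgd(teacher, total, lr)


def train(labeled, unlabeled, steps=200, seed=0):
    """Alternate student and teacher updates on toy data."""
    rng = random.Random(seed)
    dim = len(labeled[0][0])
    teacher, student = [0.0] * dim, [0.0] * dim
    for _ in range(steps):
        x_l, y_l = rng.choice(labeled)
        x_u = rng.choice(unlabeled)
        # The teacher samples a hard pseudo label for the unlabeled example.
        pseudo = 1 if rng.random() < predict(teacher, x_u) else 0
        student, feedback = student_step(student, x_u, pseudo, x_l, y_l)
        teacher = teacher_step(teacher, feedback, x_u, pseudo, x_l, y_l, rng)
    return teacher, student


if __name__ == "__main__":
    # Toy data: the second feature is a constant bias input.
    labeled = [([1.0, 1.0], 1), ([-1.0, 1.0], 0)]
    unlabeled = [[0.8, 1.0], [-0.9, 1.0], [1.2, 1.0], [-1.1, 1.0]]
    teacher, student = train(labeled, unlabeled)
    print("teacher weights:", teacher)
    print("student weights:", student)
```

The sign convention is the point of the sketch: when a pseudo label lowers the student's labeled loss, the feedback is positive and the teacher's step raises the probability it assigns to that label; when the label hurts the student, the step goes the other way. A real text classifier would replace the logistic models with neural encoders and the Gaussian noise with a text-appropriate perturbation.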

Original language: English
Title of host publication: SIGIR 2022 - Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval
Publisher: Association for Computing Machinery, Inc
Pages: 2513-2518
Number of pages: 6
ISBN (Electronic): 9781450387323
State: Published - 7 Jul 2022
Externally published: Yes
Event: 45th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2022 - Madrid, Spain
Duration: 11 Jul 2022 to 15 Jul 2022

Publication series

Name: SIGIR 2022 - Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval

Conference

Conference: 45th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2022
Country/Territory: Spain
City: Madrid
Period: 11/07/22 to 15/07/22

Keywords

  • consistency regularization
  • meta learning
  • pseudo labeling
  • semi-supervised text classification
