
Reserved Self-training: A Semi-supervised Sentiment Classification Method for Chinese Microblogs

  • Zhiguang Liu
  • Xishuang Dong
  • Yi Guan*
  • Jinfeng Yang
  • *Corresponding author for this work
  • School of Computer Science and Technology, Harbin Institute of Technology

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

The imbalanced sentiment distribution of microblogs causes binary classifiers to perform poorly on the minority class. To address this problem, we present a semi-supervised method for sentiment classification of Chinese microblogs. The method is similar to self-training, except that a set of labeled samples is reserved for computing confidence scores; samples whose score falls below a predefined confidence threshold are incorporated into the training set for retraining. This allows the classifier to improve its performance on minority-class samples. Experiments on the NLP&CC2012 Chinese microblog evaluation data set show that reserved self-training outperforms the best run by 2.06% in macro-averaged and 2.30% in micro-averaged F-measure.
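The loop described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say which classifier is used or how the confidence score is computed from the reserved samples. The sketch assumes a toy nearest-centroid classifier over one-dimensional features and a hypothetical, distance-based score (distance from an unlabeled sample to the nearest reserved sample carrying the predicted label), so that a score below the threshold selects the sample for retraining. The names `train_centroids`, `predict`, `score`, and `reserved_self_training` are all invented for this example.

```python
from statistics import mean


def train_centroids(samples):
    """Fit a toy nearest-centroid classifier: one mean feature value per label."""
    by_label = {}
    for x, y in samples:
        by_label.setdefault(y, []).append(x)
    return {y: mean(xs) for y, xs in by_label.items()}


def predict(centroids, x):
    """Return the label whose centroid is closest to x."""
    return min(centroids, key=lambda y: abs(x - centroids[y]))


def score(x, label, reserved):
    """Hypothetical score (an assumption, not from the paper): distance from x
    to the nearest reserved labeled sample that carries the predicted label."""
    dists = [abs(x - rx) for rx, ry in reserved if ry == label]
    return min(dists) if dists else float("inf")


def reserved_self_training(train, reserved, unlabeled, threshold, max_rounds=10):
    """Self-training in which a reserved labeled set drives sample selection.

    Each round: train on the current training set, label the unlabeled pool,
    and move every sample whose score is below `threshold` into the training
    set with its predicted label. Stop when no sample is selected.
    """
    train = list(train)
    pool = list(unlabeled)
    for _ in range(max_rounds):
        model = train_centroids(train)
        selected, remaining = [], []
        for x in pool:
            label = predict(model, x)
            if score(x, label, reserved) < threshold:
                selected.append((x, label))
            else:
                remaining.append(x)
        if not selected:
            break
        train.extend(selected)
        pool = remaining
    return train_centroids(train), train, pool


if __name__ == "__main__":
    # Imbalanced toy data: "pos" is the minority class in the initial training set.
    train = [(0.0, "neg"), (1.0, "neg"), (10.0, "pos")]
    reserved = [(0.5, "neg"), (9.0, "pos"), (11.0, "pos")]
    unlabeled = [8.8, 9.2, 0.4, 5.0]
    model, final_train, leftover = reserved_self_training(
        train, reserved, unlabeled, threshold=0.5
    )
    print(model, final_train, leftover)
```

In this toy run the reserved set never enters the training data; it only decides which pseudo-labeled samples are admitted, which is the one structural difference from plain self-training that the abstract states.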

Original language: English
Title of host publication: 6th International Joint Conference on Natural Language Processing, IJCNLP 2013 - Proceedings of the Main Conference
Editors: Ruslan Mitkov, Jong C. Park
Publisher: Asian Federation of Natural Language Processing
Pages: 455-462
Number of pages: 8
ISBN (Electronic): 9784990734800
State: Published - 2013
Externally published: Yes
Event: 6th International Joint Conference on Natural Language Processing, IJCNLP 2013 - Nagoya, Japan
Duration: 14 Oct 2013 → …

Publication series

Name: 6th International Joint Conference on Natural Language Processing, IJCNLP 2013 - Proceedings of the Main Conference

Conference

Conference: 6th International Joint Conference on Natural Language Processing, IJCNLP 2013
Country/Territory: Japan
City: Nagoya
Period: 14/10/13 → …
