A refinement approach to handling model misfit in semi-supervised learning

  • Hanjing Su*
  • Ling Chen
  • Yunming Ye
  • Zhaocai Sun
  • Qingyao Wu

*Corresponding author for this work

  • Harbin Institute of Technology Shenzhen
  • University of Technology Sydney

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Semi-supervised learning has been a focus of machine learning and data mining research in the past few years. Various algorithms and techniques have been proposed, from generative models to graph-based algorithms. In this work, we focus on cluster-and-label approaches to semi-supervised classification. Existing cluster-and-label algorithms rest on underlying models and/or assumptions: when the data fit the model well, classification accuracy is high; otherwise, it is low. In this paper, we propose a refinement approach to address this model misfit problem in semi-supervised classification. We show that the cluster-and-label technique itself does not need to change in order to become more flexible. Instead, we propose successive refinement clustering of the dataset to correct the model misfit. A series of experiments on UCI benchmark data sets shows that the proposed approach outperforms existing cluster-and-label algorithms, as well as traditional semi-supervised classification techniques including Self-training and Tri-training.
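
The abstract does not give the algorithm's details, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes k-means as the clustering step, majority voting as the labelling step, and a simple refinement rule: any cluster that still contains conflicting labels is clustered again. The function names (`kmeans`, `cluster_and_label`), the parameters `k` and `max_depth`, and the toy data are all hypothetical.

```python
from collections import Counter
from math import dist  # Python 3.8+


def kmeans(points, k, iters=20):
    """Plain Lloyd's k-means with deterministic farthest-point initialisation."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Next centroid: the point farthest from its nearest existing centroid.
        centroids.append(
            max(points, key=lambda p: min(dist(p, c) for c in centroids))
        )
    assign = [0] * len(points)
    for _ in range(iters):
        assign = [min(range(k), key=lambda j: dist(p, centroids[j])) for p in points]
        for j in range(k):
            members = [p for p, a in zip(points, assign) if a == j]
            if members:
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return assign


def cluster_and_label(points, labels, k=2, max_depth=3, depth=0):
    """Predict a label for every point; labels[i] is a class or None (unlabelled).

    max_depth=1 is a single cluster-then-label pass. Larger values re-cluster
    (refine) any cluster whose labelled members disagree. This refinement rule
    is an assumption made for illustration.
    """
    known = [l for l in labels if l is not None]
    majority = Counter(known).most_common(1)[0][0] if known else None
    # Stop when the cluster is pure or unlabelled, too small, or deep enough.
    if len(set(known)) <= 1 or depth == max_depth or len(points) <= k:
        return [majority] * len(points)
    assign = kmeans(points, k)
    out = [None] * len(points)
    for j in set(assign):
        idx = [i for i, a in enumerate(assign) if a == j]
        sub = cluster_and_label(
            [points[i] for i in idx], [labels[i] for i in idx], k, max_depth, depth + 1
        )
        for i, l in zip(idx, sub):
            # A sub-cluster with no labelled points inherits the parent's majority.
            out[i] = l if l is not None else majority
    return out


# Toy data: three groups, but only k=2 clusters, so the model misfits the data.
points = [(0, 0), (1, 0), (0, 1), (5, 0), (6, 0), (5, 1), (20, 0), (21, 0), (20, 1)]
labels = ["x", "x", None, "y", None, None, "x", None, None]

single_pass = cluster_and_label(points, labels, k=2, max_depth=1)
refined = cluster_and_label(points, labels, k=2, max_depth=3)
print(single_pass)  # the middle group is absorbed into class "x"
print(refined)      # the middle group is recovered as class "y"
```

On this toy data a single pass with `k=2` merges the first two groups and labels both with the majority class, while the refined run splits the mixed cluster once more and separates them. The clustering routine is unchanged between the two runs; only the number of refinement passes differs.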

Original language: English
Title of host publication: Advanced Data Mining and Applications - 6th International Conference, ADMA 2010, Proceedings
Pages: 75-86
Number of pages: 12
Edition: PART 2
State: Published - 2010
Externally published: Yes
Event: 6th International Conference on Advanced Data Mining and Applications, ADMA 2010 - Chongqing, China
Duration: 19 Nov 2010 to 21 Nov 2010

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Number: PART 2
Volume: 6441 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 6th International Conference on Advanced Data Mining and Applications, ADMA 2010
Country/Territory: China
City: Chongqing
Period: 19/11/10 to 21/11/10

Keywords

  • Semi-supervised learning
  • classification
  • model misfit
