C_CART: An instance confidence-based decision tree algorithm for classification

  • Shuang Yu
  • Xiongfei Li
  • Hancheng Wang
  • Xiaoli Zhang*
  • Shiping Chen
  • *Corresponding author for this work
  • Ministry of Education of the People's Republic of China
  • College of Computer Science and Technology
  • CSIRO
  • Nanjing University

Research output: Contribution to journal › Article › peer-review

Abstract

In classification, the decision tree is a common model because of its simple structure and ease of interpretation. Most decision tree algorithms assume that all instances in a dataset have the same degree of confidence, so they apply the same generation and pruning strategies to all training instances. In fact, instances with a greater degree of confidence are more useful than those with a lower degree of confidence in the same dataset. Therefore, instances should be treated differently, according to their confidence degrees, when training classifiers. In this paper, we investigate the impact and significance of the degree of confidence of instances on the classification performance of decision tree algorithms, taking the classification and regression tree (CART) algorithm as an example. First, the degree of confidence of instances is quantified from a statistical perspective. Then, a modified CART algorithm named C_CART is proposed by introducing the confidence of instances into the generation and pruning processes of the CART algorithm. Finally, we conduct experiments to evaluate the performance of the C_CART algorithm. The experimental results show that C_CART can significantly improve generalization performance and, to a certain extent, avoid the over-fitting problem.
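
The abstract does not say how confidence is quantified or how it enters tree generation and pruning. As a rough illustration only, the sketch below assumes each instance already carries a confidence value in (0, 1] and uses it as a weight in the Gini impurity that CART minimises when choosing a split. The names `weighted_gini` and `best_threshold` are hypothetical, and this is not the C_CART procedure from the paper.

```python
from collections import defaultdict


def weighted_gini(labels, weights):
    """Gini impurity in which each instance counts in proportion to its weight.

    Assumption: `weights` holds per-instance confidence values; the paper's
    actual definition of confidence is not given in the abstract.
    """
    total = sum(weights)
    if total == 0:
        return 0.0
    mass = defaultdict(float)
    for y, w in zip(labels, weights):
        mass[y] += w
    return 1.0 - sum((m / total) ** 2 for m in mass.values())


def best_threshold(values, labels, weights):
    """Return (threshold, impurity) for the binary split `x <= threshold`
    with the lowest confidence-weighted Gini impurity on one numeric feature.

    Returns (None, parent_impurity) if no split improves on the parent node.
    """
    total = sum(weights)
    best = (None, weighted_gini(labels, weights))
    # The largest value is skipped because it would leave the right side empty.
    for t in sorted(set(values))[:-1]:
        left = [i for i, v in enumerate(values) if v <= t]
        right = [i for i, v in enumerate(values) if v > t]
        w_left = sum(weights[i] for i in left)
        w_right = sum(weights[i] for i in right)
        g_left = weighted_gini([labels[i] for i in left],
                               [weights[i] for i in left])
        g_right = weighted_gini([labels[i] for i in right],
                                [weights[i] for i in right])
        score = (w_left * g_left + w_right * g_right) / total
        if score < best[1]:
            best = (t, score)
    return best
```

With all weights equal, this reduces to the ordinary Gini criterion. Lowering the weight of an instance reduces its influence on which split is chosen, which is one plausible reading of treating instances differently according to their confidence.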

Original language: English
Pages (from-to): 929-948
Number of pages: 20
Journal: Intelligent Data Analysis
Volume: 25
Issue number: 4
DOIs
State: Published - 2021
Externally published: Yes

Keywords

  • CART algorithm
  • Degree of confidence
  • classification
  • generalization
  • machine learning
