
Random forest using tree selection method to classify unbalanced data

  • Baoxun Xu*
  • Yunming Ye
  • Qiang Wang
  • Junjie Li
  • Xiaojun Chen
  • *Corresponding author for this work
  • Harbin Institute of Technology Shenzhen
  • Shenzhen Institute of Advanced Technology

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Random forest is a popular classification algorithm used to build ensemble models of decision tree classifiers. However, owing to the complexity of unbalanced data distributions in high-dimensional space, a random forest may include poorly performing trees that lead to wrong classifications. This paper proposes an improved random forest algorithm with tree selection methods, designed specifically for analyzing unbalanced data. The novel tree selection methods make the random forest framework well suited to classifying unbalanced data. Experimental results on unbalanced datasets with diverse characteristics demonstrate that the proposed method can generate a random forest model with higher performance than the random forests generated by Breiman's method.
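
The abstract does not state which criterion the tree selection methods use, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes that each tree is scored on its own out-of-bag samples with balanced accuracy (mean per-class recall) and that only the top-scoring trees are kept. Decision stumps stand in for full decision trees to keep the example short, and every function name here is hypothetical.

```python
import random

def stump_predict(stump, row):
    # A stump is (feature index, threshold, polarity); polarity 1 means
    # "predict class 1 when the feature value exceeds the threshold".
    f, t, pol = stump
    return 1 if (row[f] > t) == bool(pol) else 0

def train_stump(X, y, features):
    # Exhaustively pick the single split with the fewest training errors.
    best, best_err = None, None
    for f in features:
        for t in sorted({row[f] for row in X}):
            for pol in (0, 1):
                stump = (f, t, pol)
                err = sum(stump_predict(stump, r) != label for r, label in zip(X, y))
                if best_err is None or err < best_err:
                    best, best_err = stump, err
    return best

def balanced_accuracy(y_true, y_pred):
    # Mean recall over the classes present in y_true (assumed selection metric).
    recalls = []
    for c in set(y_true):
        idx = [i for i, label in enumerate(y_true) if label == c]
        recalls.append(sum(y_pred[i] == c for i in idx) / len(idx))
    return sum(recalls) / len(recalls) if recalls else 0.0

def build_forest(X, y, n_trees, n_features, rng):
    # Bagging with a random feature subset per tree; each tree is scored
    # on the samples left out of its bootstrap sample (out-of-bag).
    n = len(X)
    forest = []
    for _ in range(n_trees):
        in_bag = [rng.randrange(n) for _ in range(n)]
        oob = sorted(set(range(n)) - set(in_bag))
        feats = rng.sample(range(len(X[0])), n_features)
        stump = train_stump([X[i] for i in in_bag], [y[i] for i in in_bag], feats)
        score = balanced_accuracy([y[i] for i in oob],
                                  [stump_predict(stump, X[i]) for i in oob])
        forest.append((stump, score))
    return forest

def select_trees(forest, keep):
    # Assumed selection rule: keep the `keep` trees with the best OOB score.
    return sorted(forest, key=lambda pair: pair[1], reverse=True)[:keep]

def forest_predict(trees, row):
    # Majority vote over the selected trees.
    votes = sum(stump_predict(stump, row) for stump, _ in trees)
    return 1 if 2 * votes > len(trees) else 0

# Synthetic unbalanced data: 90 majority samples, 10 minority samples.
# Feature 0 is informative, feature 1 is noise.
rng = random.Random(0)
X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(90)]
X += [[rng.gauss(3, 1), rng.gauss(0, 1)] for _ in range(10)]
y = [0] * 90 + [1] * 10

forest = build_forest(X, y, n_trees=25, n_features=1, rng=rng)
selected = select_trees(forest, keep=10)
```

Balanced accuracy is used here because plain accuracy rewards a tree that always predicts the majority class. Other criteria (for example AUC, or a margin computed on in-of-bag samples) would slot into `build_forest` in the same way.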

Original language: English
Title of host publication: Fourth International Conference on Digital Image Processing, ICDIP 2012
State: Published - 2012
Externally published: Yes
Event: 4th International Conference on Digital Image Processing, ICDIP 2012 - Kuala Lumpur, Malaysia
Duration: 7 Apr 2012 to 8 Apr 2012

Publication series

Name: Proceedings of SPIE - The International Society for Optical Engineering
Volume: 8334
ISSN (Print): 0277-786X

Conference

Conference: 4th International Conference on Digital Image Processing, ICDIP 2012
Country/Territory: Malaysia
City: Kuala Lumpur
Period: 7/04/12 to 8/04/12

Keywords

  • Random forest
  • decision tree
  • in-of-bag
  • out-of-bag
