
基于安全样本筛选的不平衡数据抽样方法

Translated title of the contribution: Safe sample screening based sampling method for imbalanced data
  • Hongbo Shi*
  • Yanxin Liu
  • Suqin Ji
  • *Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

Undersampling can discard valuable information, and the synthetic minority oversampling technique (SMOTE) can aggravate the overlap between the majority class and the minority class. This paper proposes Screening_SMOTE, a sampling method that combines undersampling based on safe sample screening with SMOTE. First, safe screening rules identify and discard some of the non-informative instances and noise instances in the majority class. Then, minority class instances generated by SMOTE are added to the screened dataset. Because the screening step removes only non-informative and noisy majority instances, informative instances are retained and class overlap is relieved. Experimental results show that Screening_SMOTE is effective at rebalancing imbalanced datasets, especially high-dimensional ones.
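
The two-stage pipeline described in the abstract can be sketched roughly as follows. This is only an illustration, not the authors' implementation: the abstract does not state the safe screening rules, so `screen_majority` is a hypothetical stand-in that uses a nearest-neighbour heuristic (majority points surrounded by minority points are treated as noise, and majority points far from the minority class as non-informative). The function names, the `keep_fraction` parameter, and the toy data are all assumptions made for this sketch. The `smote` function follows the usual SMOTE idea of interpolating between a minority instance and one of its nearest minority neighbours, and needs at least two minority instances.

```python
import math
import random


def k_nearest(point, candidates, k):
    """Return the k candidates closest to `point` (Euclidean), excluding `point` itself."""
    others = [c for c in candidates if c is not point]
    return sorted(others, key=lambda c: math.dist(point, c))[:k]


def screen_majority(majority, minority, k=3, keep_fraction=0.5):
    """Hypothetical stand-in for safe sample screening (NOT the paper's rules).

    1. Drop majority points whose k nearest neighbours are mostly minority (noise).
    2. Keep only the fraction of the remainder closest to the minority class,
       treating far-away majority points as non-informative.
    """
    labelled = [(p, 0) for p in majority] + [(p, 1) for p in minority]
    kept = []
    for p in majority:
        neighbours = sorted(
            (q for q in labelled if q[0] is not p),
            key=lambda q: math.dist(p, q[0]),
        )[:k]
        minority_votes = sum(label for _, label in neighbours)
        if minority_votes * 2 <= k:  # not dominated by minority neighbours
            kept.append(p)
    kept.sort(key=lambda p: min(math.dist(p, m) for m in minority))
    n_keep = max(1, math.ceil(keep_fraction * len(kept)))
    return kept[:n_keep]


def smote(minority, n_new, k=3, rng=None):
    """Generate n_new synthetic minority points by interpolating between neighbours."""
    rng = rng or random.Random(0)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbour = rng.choice(k_nearest(base, minority, k))
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbour)))
    return synthetic


def screening_smote(majority, minority, k=3, keep_fraction=0.5, rng=None):
    """Screen the majority class, then oversample the minority class up to its size."""
    screened = screen_majority(majority, minority, k, keep_fraction)
    n_new = max(0, len(screened) - len(minority))
    return screened, minority + smote(minority, n_new, k, rng)


# Toy 2-D data (assumed for illustration): 40 majority points, 6 minority points.
rng = random.Random(42)
majority = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(40)]
minority = [(rng.gauss(2.5, 0.5), rng.gauss(2.5, 0.5)) for _ in range(6)]
maj_out, min_out = screening_smote(majority, minority, rng=rng)
print(len(majority), len(minority), "->", len(maj_out), len(min_out))
```

The order of the two stages matters in this sketch: the majority class is screened first, so the number of synthetic minority instances is chosen relative to the reduced majority class rather than the original one.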

Original language: Chinese (Traditional)
Pages (from-to): 545-556
Number of pages: 12
Journal: Moshi Shibie yu Rengong Zhineng/Pattern Recognition and Artificial Intelligence
Volume: 32
Issue number: 6
State: Published - 1 Jun 2019
Externally published: Yes
