Abstract
Undersampling may discard valuable information, and the synthetic minority oversampling technique (SMOTE) may aggravate the overlap between the majority class and the minority class. This paper proposes Screening_SMOTE, a sampling method that combines undersampling based on safe sample screening with SMOTE. The undersampling step uses safe screening rules to identify and discard some of the non-informative instances and noise instances in the majority class. The minority class instances generated by SMOTE are then added to the screened dataset. Because the screening-based undersampling discards noise instances in the majority class while avoiding the loss of informative instances, it relieves class overlap. The experimental results show that Screening_SMOTE rebalances imbalanced datasets effectively, especially high-dimensional ones.
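The two-stage flow described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the safe screening rules are not given here, so the screening result is passed in as a hypothetical `keep_mask` (one boolean per majority instance), and the choice to oversample until the two classes are the same size is an assumption. The function names `smote` and `screening_smote`, and the parameters `k` and `seed`, are illustrative only. The SMOTE step follows the standard idea of interpolating between a minority instance and one of its nearest minority neighbours.

```python
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic minority points by neighbour interpolation."""
    if n_new <= 0:
        return []
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority instances")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority points by squared Euclidean distance.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


def screening_smote(majority, minority, keep_mask, k=5, seed=0):
    """Drop screened-out majority points, then oversample the minority class.

    keep_mask is a hypothetical stand-in for the output of the safe
    screening rules. Balancing to equal class sizes is an assumption.
    """
    kept = [x for x, keep in zip(majority, keep_mask) if keep]
    n_new = max(0, len(kept) - len(minority))
    return kept, minority + smote(minority, n_new, k=k, seed=seed)
```

For example, calling `screening_smote(majority, minority, keep_mask)` with a mask that rejects one noisy majority point returns the reduced majority set together with a minority set grown to the same size.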
| Translated title of the contribution | Safe sample screening based sampling method for imbalanced data |
|---|---|
| Original language | Chinese (Traditional) |
| Pages (from-to) | 545-556 |
| Number of pages | 12 |
| Journal | Moshi Shibie yu Rengong Zhineng/Pattern Recognition and Artificial Intelligence |
| Volume | 32 |
| Issue number | 6 |
| DOIs | |
| State | Published - 1 Jun 2019 |
| Externally published | Yes |