An Empirical Study on Software Defect Prediction Using Over-Sampling by SMOTE

  • Harbin Institute of Technology
  • Kim Il Sung University

Research output: Contribution to journal › Article › peer-review

Abstract

Software defect prediction suffers from class imbalance, and addressing that imbalance is important for improving prediction performance. SMOTE is a useful over-sampling method for addressing class imbalance. In this paper, we study several problems faced in software defect prediction using the SMOTE algorithm. We perform experiments to investigate how two parameters, the percentage of appended minority-class samples and the number of nearest neighbors, influence prediction performance, and we compare the performance of classifiers. We use a paired t-test to test the statistical significance of the results. We also introduce the concepts of effectiveness and ineffectiveness of over-sampling, together with evaluation criteria for deciding whether an over-sampling is effective, and we use them to evaluate the results. The results show that both parameters influence prediction performance and that over-sampling by SMOTE is effective for several classifiers.
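
To make the two parameters studied in the abstract concrete, the following is a minimal sketch of SMOTE-style over-sampling in plain Python. It is not the authors' implementation: the function name `smote`, the parameters `percentage`, `k`, and `seed`, and the toy data are assumptions chosen for illustration. Each synthetic sample is placed at a random point on the line segment between a minority-class sample and one of its `k` nearest minority-class neighbors.

```python
import math
import random


def smote(minority, percentage=100, k=5, seed=0):
    """Return synthetic minority samples (simplified SMOTE sketch).

    minority   : list of numeric feature vectors from the minority class
    percentage : number of synthetic samples, as a percentage of len(minority)
    k          : number of nearest neighbors considered for each sample
    seed       : seed for the random generator (hypothetical convenience)
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbors than other samples
    n_new = round(len(minority) * percentage / 100)

    synthetic = []
    for i in range(n_new):
        idx = i % len(minority)  # cycle through the minority samples
        base = minority[idx]
        # k nearest minority neighbors of `base` by Euclidean distance
        others = [p for j, p in enumerate(minority) if j != idx]
        neighbors = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbor = rng.choice(neighbors)
        # interpolate between the sample and the chosen neighbor
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbor)])
    return synthetic


# Toy minority class (made-up feature vectors, e.g. two code metrics)
defective = [[1.0, 2.0], [1.5, 1.8], [2.0, 2.2], [1.2, 2.5]]
new_samples = smote(defective, percentage=200, k=3)
print(len(new_samples))  # 8 synthetic samples for 4 originals at 200%
```

Raising `percentage` appends more synthetic minority samples, while `k` controls how far from each original sample the interpolation partners may lie; these are the two settings whose influence the study examines.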

Original language: English
Pages (from-to): 811-830
Number of pages: 20
Journal: International Journal of Software Engineering and Knowledge Engineering
Volume: 28
Issue number: 6
State: Published - 1 Jun 2018

Keywords

  • SMOTE
  • Software defect prediction
  • class-imbalance
  • fault prediction
  • over-sampling
