TY - GEN
T1 - Empirical study on software bug prediction
AU - Rizwan, Syed
AU - Tiantian, Wang
AU - Xiaohong, Su
AU - Salahuddin,
N1 - Publisher Copyright:
Copyright 2010 ACM.
PY - 2017/12/28
Y1 - 2017/12/28
N2 - Software defect prediction is a vital research direction in software engineering field. Software defect prediction predicts whether software errors are present in the software by using machine learning analysis on software metrics. It can help software developers to improve the quality of the software. Software defect prediction is usually a binary classification problem, which relies on software metrics and the use of classifiers. There have been many research efforts to improve accuracy in software defect prediction using a variety of classifiers and data preprocessing techniques. However, the "classic classifier validity" and "data preprocessing techniques can enhance the functionality of software defect prediction" has not yet been answered explicitly. Therefore, it is necessary to conduct an empirical analysis to compare these studies. In software defect prediction, the category of interest is a defective module, and the number of defective modules is much less than that of a non-defective module in data. This leads to a category of imbalance problem that reduces the accuracy of the prediction. Therefore, the problem of imbalance is a key problem that needs to be solved in software defect prediction. In this paper, we proposed an experimental model and used the NASA MDP data set to analyze the software defect prediction. Five research questions were defined and analyzed experimentally. In addition to experimental analysis, this paper focuses on the improvement of SMOTE. SMOTE ASMO algorithm has been proposed to overcome the shortcomings of SMOTE.
AB - Software defect prediction is a vital research direction in software engineering field. Software defect prediction predicts whether software errors are present in the software by using machine learning analysis on software metrics. It can help software developers to improve the quality of the software. Software defect prediction is usually a binary classification problem, which relies on software metrics and the use of classifiers. There have been many research efforts to improve accuracy in software defect prediction using a variety of classifiers and data preprocessing techniques. However, the "classic classifier validity" and "data preprocessing techniques can enhance the functionality of software defect prediction" has not yet been answered explicitly. Therefore, it is necessary to conduct an empirical analysis to compare these studies. In software defect prediction, the category of interest is a defective module, and the number of defective modules is much less than that of a non-defective module in data. This leads to a category of imbalance problem that reduces the accuracy of the prediction. Therefore, the problem of imbalance is a key problem that needs to be solved in software defect prediction. In this paper, we proposed an experimental model and used the NASA MDP data set to analyze the software defect prediction. Five research questions were defined and analyzed experimentally. In addition to experimental analysis, this paper focuses on the improvement of SMOTE. SMOTE ASMO algorithm has been proposed to overcome the shortcomings of SMOTE.
KW - Classification
KW - Data preprocessing
KW - Defect prediction
KW - SMOTE
UR - https://www.scopus.com/pages/publications/85045942717
U2 - 10.1145/3178212.3178221
DO - 10.1145/3178212.3178221
M3 - Conference contribution
AN - SCOPUS:85045942717
T3 - ACM International Conference Proceeding Series
SP - 55
EP - 59
BT - Proceedings of the 2017 International Conference on Software and e-Business, ICSEB 2017
PB - Association for Computing Machinery
T2 - 2017 International Conference on Software and e-Business, ICSEB 2017
Y2 - 28 December 2017 through 30 December 2017
ER -