Abstract
As software grows in scale and complexity, bugs become inevitable. Security-related bugs are easily exploited by malicious users to launch attacks and can cause great damage. During software development and maintenance, bug tracking systems such as Bugzilla are commonly used to record and track bugs in the form of bug reports. Automatically identifying security bug reports (SBRs) in these systems helps developers fix the corresponding bugs quickly. Recently, methods that combine text mining and machine learning for security bug report detection have attracted much attention. However, owing to the small sample size and complex characteristics of security-related bug reports, most previous machine learning approaches struggle to capture deep semantic information from the textual fields of bug reports. In addition, previous approaches filter noisy bug reports from datasets using text mining models that do not consider semantic information, which limits further improvement in the prediction performance of the trained model. To address these problems, this paper proposes a novel framework for predicting unknown security bug reports that combines semantic-based noise filtering with deep learning. First, it uses word embedding to obtain a dense, low-dimensional vector representation of every word in the corpus. Second, it applies the proposed Filtering Semantically Deviating Outlier NSBRs (FSDON) method to filter out the non-security bug reports (NSBRs) that are highly similar to SBRs. Finally, it builds predictive models for SBR detection based on different deep learning networks (LSTM, GRU, TextCNN, and Multi-scale DCNN). The method is evaluated on 5 different datasets, and the experimental results show that it improves the g-measure by 8.26% on average compared with state-of-the-art methods. Overall, the proposed method yields the best performance on all datasets, regardless of scale.
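The abstract does not spell out how FSDON works, so the following is only a minimal sketch of the general idea of semantic noise filtering, not the authors' method. It assumes each report is a list of tokens, represents a report by the mean of its word vectors, and drops NSBRs whose cosine similarity to the centroid of the SBRs reaches a threshold. The function names, the centroid-based rule, and the `threshold` value are all hypothetical. The `g_measure` helper uses the definition common in the SBR prediction literature, the harmonic mean of recall and (1 - false positive rate).

```python
import math


def mean_vector(tokens, embeddings):
    """Average the embedding vectors of the known tokens in one report."""
    vectors = [embeddings[t] for t in tokens if t in embeddings]
    if not vectors:
        return None
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def filter_noisy_nsbrs(sbrs, nsbrs, embeddings, threshold=0.9):
    """Keep only NSBRs that are not too similar to the SBR centroid.

    Hypothetical rule for illustration: an NSBR is treated as noise when
    its mean vector has cosine similarity >= threshold with the centroid
    of all SBR mean vectors.
    """
    sbr_vecs = [mean_vector(r, embeddings) for r in sbrs]
    sbr_vecs = [v for v in sbr_vecs if v is not None]
    if not sbr_vecs:
        return list(nsbrs)
    dim = len(sbr_vecs[0])
    centroid = [sum(v[i] for v in sbr_vecs) / len(sbr_vecs) for i in range(dim)]

    kept = []
    for report in nsbrs:
        vec = mean_vector(report, embeddings)
        # Reports with no known words cannot be scored, so they are kept.
        if vec is None or cosine(vec, centroid) < threshold:
            kept.append(report)
    return kept


def g_measure(recall, false_positive_rate):
    """Harmonic mean of recall and (1 - false positive rate)."""
    specificity = 1.0 - false_positive_rate
    if recall + specificity == 0:
        return 0.0
    return 2 * recall * specificity / (recall + specificity)
```

In a full pipeline, the filtered NSBRs and the SBRs would then be used to train the classifier; the embedding lookup here is a plain dictionary standing in for a trained word embedding model.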
| Translated title of the contribution | Security Bug Report Detection Via Noise Filtering and Deep Learning |
|---|---|
| Original language | Chinese (Traditional) |
| Pages (from-to) | 1794-1813 |
| Number of pages | 20 |
| Journal | Jisuanji Xuebao/Chinese Journal of Computers |
| Volume | 45 |
| Issue number | 8 |
| State | Published - Aug 2022 |
| Externally published | Yes |