Abstract
Correlated quality metrics extracted from a source code repository
can be used to build a model that automatically predicts defects in a software system. The extracted metrics typically yield highly
imbalanced data, since the number of defective instances in a good-quality software system should be far smaller than the number of normal instances.
Selecting the most discriminating features also significantly improves
the robustness and accuracy of a prediction model. The contribution of this paper is therefore twofold. First, it selects the most discriminating features
that help to accurately predict a defect in a software component. Second,
cost-sensitive logistic regression and decision tree ensemble-based prediction
models are applied to the selected features to predict
defects in a software component. The proposed models are compared with
the most recent schemes in the literature in terms of accuracy, area under the curve (AUC), and recall. The models are evaluated on 11 datasets, and
the results and analysis show that the proposed
prediction models outperform the schemes in the literature.
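
The idea of cost-sensitive learning on imbalanced defect data can be illustrated with a minimal sketch. The abstract does not state the cost scheme, the feature selection method, or the datasets, so everything below is an assumption made for illustration: the inverse-frequency class weights, the plain gradient-descent logistic regression, the function names, and the toy data (a single made-up code metric with 8 normal and 2 defective instances). It is not the authors' implementation.

```python
import math

def balanced_class_weights(y):
    """Assumed cost scheme: weight each class by n / (n_classes * class_count)."""
    n = len(y)
    counts = {label: y.count(label) for label in set(y)}
    return {label: n / (len(counts) * c) for label, c in counts.items()}

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_weighted_logistic(X, y, class_weight, lr=0.1, epochs=500):
    """Batch gradient descent on the class-weighted log loss; returns (w, b)."""
    n_features = len(X[0])
    w = [0.0] * n_features
    b = 0.0
    total = sum(class_weight[label] for label in y)  # normalises the gradient
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for xi, yi in zip(X, y):
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)
            err = class_weight[yi] * (p - yi)  # misclassifying a defect costs more
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / total for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / total
    return w, b

def predict(X, w, b, threshold=0.5):
    """Label an instance as defective (1) when its probability reaches the threshold."""
    return [1 if sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) >= threshold else 0
            for xi in X]

def recall(y_true, y_pred):
    """Fraction of truly defective instances that were predicted as defective."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn) if (tp + fn) else 0.0

# Toy, invented data: one hypothetical metric per component, 1 = defective.
X = [[0.1], [0.2], [0.3], [0.4], [0.5], [0.6], [0.7], [0.8], [0.9], [1.0]]
y = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]

uniform = {0: 1.0, 1: 1.0}
balanced = balanced_class_weights(y)

for name, weights in (("uniform", uniform), ("cost-sensitive", balanced)):
    w, b = fit_weighted_logistic(X, y, weights)
    print(name, "recall:", recall(y, predict(X, w, b)))
```

Raising the weight of the minority (defective) class makes each missed defect contribute more to the loss, which is why recall is reported alongside accuracy and AUC on imbalanced data. A decision tree ensemble could be made cost-sensitive in a similar way by weighting samples, but that is not shown here.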
| Original language | English |
|---|---|
| Article number | 11 |
| Number of pages | 18 |
| Journal | Automated Software Engineering |
| Volume | 28 |
| Early online date | 12 Jul 2021 |
| DOIs | |
| Publication status | Published online - 12 Jul 2021 |
Bibliographical note
Funding Information: This research is supported by the BTIIC (BT Ireland Innovation Centre) project, funded by BT and Invest Northern Ireland.
Publisher Copyright:
© 2021, The Author(s).
UN SDGs
This output contributes to the following UN Sustainable Development Goals (SDGs)
- SDG 9 Industry, Innovation, and Infrastructure
Keywords
- AUC
- Cost-sensitivity
- Discriminating features
- Machine learning models
- Recall
- Software bugs/defects
Fingerprint
Dive into the research topics of 'Discriminating features-based cost-sensitive approach for software defect prediction'. Together they form a unique fingerprint.
Research output
- 32 Citations
- 1 Conference contribution
- An Event-level Clustering Framework for Process Mining Using Common Sequential Rules
  Muhammad Tariq, Z., Charles, D. K., McClean, S. I., McChesney, I. & Taylor, P., 4 Nov 2021, iCETiC ’21 Proceedings. Springer Nature, Vol. 395, p. 1-14 (Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering).
  Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review