Discriminating features-based cost-sensitive approach for software defect prediction

Research output: Contribution to journal › Article › peer-review



Correlated quality metrics extracted from a source code repository can be used to build a model that automatically predicts defects in a software system. The extracted metrics typically yield highly imbalanced data, since in a good-quality software system the number of defective instances should be far smaller than the number of normal instances. Selecting the best discriminating features also significantly improves the robustness and accuracy of a prediction model. The contribution of this paper is therefore twofold. First, it selects the best discriminating features that help in accurately predicting a defect in a software component. Second, cost-sensitive logistic regression and decision tree ensemble-based prediction models are applied to the best discriminating features to precisely predict a defect in a software component. The proposed models are compared with the most recent schemes in the literature in terms of accuracy, area under the curve (AUC), and recall. The models are evaluated on 11 datasets, and the results and analysis show that the proposed prediction models outperform the schemes in the literature.
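
To make the cost-sensitive idea concrete, the following is a minimal sketch, not the paper's implementation. It assumes a plain logistic regression trained by gradient descent, in which errors on defective instances are weighted more heavily than errors on normal ones. The function names, the cost values (`cost_fn`, `cost_fp`), and the synthetic two-metric dataset are all hypothetical and chosen only for illustration.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_cost_sensitive_logreg(X, y, cost_fn=9.0, cost_fp=1.0, lr=0.1, epochs=500):
    """Fit logistic regression with per-class misclassification costs.

    cost_fn weights the loss on defective instances (label 1), cost_fp the
    loss on normal instances (label 0). Both values are illustrative
    assumptions, not taken from the paper.
    """
    n, n_features = len(X), len(X[0])
    w, b = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for xi, yi in zip(X, y):
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)
            # Gradient of the weighted log-loss for this instance.
            err = (cost_fn if yi == 1 else cost_fp) * (p - yi)
            for j in range(n_features):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(X, w, b, threshold=0.5):
    return [
        1 if sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) >= threshold else 0
        for xi in X
    ]


def recall(y_true, y_pred):
    # Fraction of truly defective instances that were flagged as defective.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn) if (tp + fn) else 0.0


def make_toy_data(n=400, defect_rate=0.1, seed=0):
    # Synthetic imbalanced data: two hypothetical metrics whose means are
    # shifted upward for defective instances.
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = 1 if rng.random() < defect_rate else 0
        shift = 1.0 if label else 0.0
        X.append([rng.gauss(shift, 1.0), rng.gauss(shift, 1.0)])
        y.append(label)
    return X, y


X, y = make_toy_data()
w_plain, b_plain = fit_cost_sensitive_logreg(X, y, cost_fn=1.0)
w_cost, b_cost = fit_cost_sensitive_logreg(X, y, cost_fn=9.0)
print("recall, equal costs:    ", recall(y, predict(X, w_plain, b_plain)))
print("recall, cost-sensitive: ", recall(y, predict(X, w_cost, b_cost)))
```

In this sketch the only change from ordinary logistic regression is the per-class weight on the gradient. Raising the cost of a missed defect shifts the decision boundary toward the minority class, which usually trades some accuracy on normal instances for higher recall on defects.
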
Original language: English
Journal: Automated software engineering
Issue number: 11
Early online date: 12 Jul 2021
Publication status: E-pub ahead of print - 12 Jul 2021