TML: A Transformer-Based Meta-Learning Framework for Cross-Project Software Defect Prediction

Himanshu Bandhu, Aftab Ali, Sally McClean, Hanif Ullah, Mamun Abu-Tair, Adam Ziolkowski, Joost Noppen

Research output: Working paper, Preprint

Abstract

Identifying software defects early is crucial for enhancing software quality and reducing costs. Traditional Within-Project Defect Prediction (WPDP) methods rely on historical project-specific data, limiting their effectiveness when such data is unavailable. Cross-Project Defect Prediction (CPDP) offers a solution by leveraging defect data from different projects, but challenges arise due to the diverse nature of data distributions across projects. This paper presents a novel framework, TML (Transformer-based Meta-Learning), designed to improve CPDP performance by addressing these challenges. TML integrates transformer-based encoder networks for feature extraction, adversarial domain adaptation to align data distributions, and meta-learning to enhance generalization across projects. Additionally, it incorporates ensemble learning and Bayesian optimization to improve model robustness and predictive accuracy. The framework is evaluated on 16 datasets from four major software repositories (AEEM, NASA, Promise, JIRA). Experimental results demonstrate that TML significantly outperforms existing CPDP methods such as ENTL, EGW, and EMKCA in key performance metrics including Precision, Recall, F1-score, G-Mean, and AUC. The results consistently demonstrate the robustness of the TML framework, establishing it as a promising approach for early defect detection in diverse software development environments.
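The evaluation metrics named in the abstract can be illustrated with a minimal sketch. The abstract does not give definitions, so the code below assumes the conventions commonly used in defect prediction: binary labels with 1 meaning defective, G-Mean taken as the geometric mean of recall and specificity, and AUC computed as the probability that a randomly chosen defective module is scored above a randomly chosen clean one. The function names are hypothetical and are not taken from the TML implementation.

```python
from math import sqrt


def confusion_counts(y_true, y_pred):
    """Count TP, FP, TN, FN for binary labels (1 = defective, 0 = clean)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def defect_metrics(y_true, y_pred):
    """Precision, Recall, F1 and G-Mean from hard predictions.

    Assumption: G-Mean = sqrt(recall * specificity). Undefined ratios
    (zero denominators) are reported as 0.0.
    """
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    g_mean = sqrt(recall * specificity)
    return {"precision": precision, "recall": recall,
            "f1": f1, "g_mean": g_mean}


def auc(y_true, scores):
    """AUC via pairwise comparison of defective vs. clean scores.

    Ties count as half a win. Quadratic in the number of samples, which
    is acceptable for an illustration but not for large datasets.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one sample of each class")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

G-Mean is often reported alongside F1 in this setting because defect datasets tend to be imbalanced, and it penalizes a model that scores well on one class while neglecting the other.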
Original language: English
Publication status: Published online - 15 Nov 2024
