Heterogeneous Cross-Project Defect Prediction using Encoder and Transfer Learning

Radowanul Haque, Joost Noppen

Research output: Contribution to journal › Article › peer-review

Abstract

Heterogeneous cross-project defect prediction (HCPDP) aims to predict defects in a new software project using defect data from earlier projects, where the source and target projects differ in some of their metrics. Most existing methods capture only linear relationships among software defect features and datasets. In addition, these methods rely on multiple defect datasets from different projects as source datasets. In this paper, we propose a novel method called heterogeneous cross-project defect prediction using encoder and transfer learning (ETL). ETL uses encoders to extract the important features from the source and target datasets. To minimize negative transfer during transfer learning, we use an augmented dataset that contains the source dataset and pseudo-labels. We also train the model on very limited data. To evaluate the ETL approach, we used 16 datasets from four publicly available software defect projects. We compared the proposed method with four HCPDP methods, namely EGW, HDP_KS, CTKCCA and EMKCA, and with one within-project defect prediction (WPDP) method from the existing literature. On average, the proposed method outperforms the baseline methods in terms of PD, PF, F1-score, G-mean and AUC.
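
The abstract names its evaluation metrics without defining them. As a reading aid, the sketch below computes four of them from binary predictions, using the definitions common in the defect prediction literature: PD as recall on the defective class, PF as the false-alarm rate, and G-mean as the square root of PD times (1 - PF). These definitions, and the function name `defect_metrics`, are assumptions made for illustration and are not taken from the paper. AUC is left out because it needs predicted scores, not hard labels.

```python
import math


def defect_metrics(y_true, y_pred):
    """Return PD, PF, F1 and G-mean for binary labels (1 = defective).

    Definitions are the commonly used ones (an assumption, not quoted
    from the paper):
      PD     = TP / (TP + FN)   probability of detection (recall)
      PF     = FP / (FP + TN)   probability of false alarm
      F1     = harmonic mean of precision and PD
      G-mean = sqrt(PD * (1 - PF))
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)

    # Guard every ratio against an empty denominator.
    pd = tp / (tp + fn) if (tp + fn) else 0.0
    pf = fp / (fp + tn) if (fp + tn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    f1 = 2 * precision * pd / (precision + pd) if (precision + pd) else 0.0
    g_mean = math.sqrt(pd * (1.0 - pf))

    return {"PD": pd, "PF": pf, "F1": f1, "G-mean": g_mean}


# Toy labels for illustration only; these are not results from the paper.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]
metrics = defect_metrics(y_true, y_pred)
```

Higher PD, F1 and G-mean are better, while a lower PF is better, so "outperforms in terms of PF" means fewer false alarms.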

Original language: English
Pages (from-to): 409-419
Number of pages: 12
Journal: IEEE Access
Volume: 12
Issue number: Early Access
Early online date: 14 Dec 2023
Publication status: Published online - 14 Dec 2023

Bibliographical note

Publisher Copyright: Authors

Keywords

  • Software defect prediction
  • Software engineering
  • Transfer learning
  • Measurement
  • Adaptation models
  • Software defect
  • Predictive models
  • Feature extraction
  • Data models
  • Software
