Abstract
The growing prevalence of software vulnerabilities has increased the need for effective detection methods, particularly in cross-project settings where domain differences create significant challenges. Existing vulnerability detection models often struggle to generalise across projects due to variations in coding styles, feature distributions, and the absence of labelled target data. This paper presents ZSVulD, a zero-shot, cross-project vulnerability detection framework designed to operate without target-domain labels. ZSVulD uses domain-agnostic CodeBERT embeddings to capture both syntactic and semantic features of source code, enabling knowledge transfer between projects. The framework applies an iterative pseudo-labelling process in which a neural network and XGBoost classifier collaboratively refine predictions for the target domain. Feature alignment is incorporated as a diagnostic technique to assess and visualise distributional differences between source and target datasets. Experiments on the Devign and REVEAL datasets show that ZSVulD achieves higher recall, F1, and F2 scores compared to existing methods, with an emphasis on reducing false negatives. These findings indicate that ZSVulD can support automated vulnerability detection pipelines, contributing to more reliable security assessments across different software projects.
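The abstract reports F1 and F2 scores and stresses reducing false negatives. As a minimal sketch of why F2 suits that emphasis, the standard F-beta definition can be computed from confusion counts. The function name `f_beta` and the two sets of counts below are hypothetical illustrations, not figures or code from ZSVulD.

```python
def f_beta(tp: int, fp: int, fn: int, beta: float = 1.0) -> float:
    """Standard F-beta score from confusion counts.

    beta > 1 weights recall more heavily than precision, so missed
    vulnerabilities (false negatives) cost more than false alarms.
    """
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical detectors with the same number of errors, distributed differently.
few_misses = dict(tp=60, fp=30, fn=10)   # more false alarms, fewer missed bugs
many_misses = dict(tp=60, fp=10, fn=30)  # fewer false alarms, more missed bugs

for name, counts in (("few_misses", few_misses), ("many_misses", many_misses)):
    print(name,
          "F1 =", round(f_beta(**counts, beta=1.0), 3),
          "F2 =", round(f_beta(**counts, beta=2.0), 3))
```

In this illustration both detectors have the same F1, but F2 ranks the one with fewer false negatives higher, which is the behaviour a recall-oriented evaluation is meant to reward.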
| Original language | English |
|---|---|
| Article number | 3 |
| Pages (from-to) | 1-27 |
| Number of pages | 27 |
| Journal | Empirical Software Engineering |
| Volume | 31 |
| Issue number | 1 |
| Early online date | 29 Oct 2025 |
| Publication status | Published online - 29 Oct 2025 |
Bibliographical note
Publisher Copyright: © The Author(s) 2025.
Data Availability Statement
To support full transparency and reproducibility, we have released the implementation at: https://github.com/Radowan98/ZSVulD.
Funding
This research did not receive any external funding.
UN SDGs
This output contributes to the following UN Sustainable Development Goals (SDGs)
- SDG 9 Industry, Innovation, and Infrastructure
Keywords
- LLMs
- Vulnerability Detection
- Code Classification
- Zero-shot learning
Fingerprint
Dive into the research topics of 'A zero-shot framework for cross-project vulnerability detection in source code'. Together they form a unique fingerprint.