A fusion approach of YOLOv8 and CNN-Transformer for End-to-End road anomaly detection

  • Sarfaraz Abdul Sattar Natha
  • Mohammad Siraj
  • Saif A Alsaif
  • Fahad Farooq
  • Admali Shah
  • Maqsood Mahmud

Research output: Contribution to journal › Article › peer-review

Abstract

Surveillance cameras are widely used in both the private and public sectors for security and monitoring, and closed-circuit television (CCTV) systems generate large amounts of video data that cannot be monitored manually around the clock. Manual analysis is time-consuming and inefficient, so there is a growing need for automated surveillance systems that can recognize and classify anomalies. Anomaly detection (AD), which identifies events that deviate from normal patterns, remains one of the most challenging problems in this area. Recurrent neural networks (RNNs) are slow and have difficulty identifying road anomalies that occur across multiple frames, whereas convolutional neural networks (CNNs) are limited in extracting temporal features from objects and generally disregard background noise in video frames. This study presents a new background-removal framework that discards irrelevant background elements during object recognition. The framework preserves temporal and spatial information across frames and combines YOLOv8 with a spatial-temporal adaptive fusion method in an end-to-end model, built on a CNN encoder and a Transformer decoder, for parallel video analysis. The proposed method was tested on the UCF Crime dataset and a custom Road Anomaly Dataset (RAD), achieving an accuracy of 89.90% on UCF Crime and 98.28% on RAD.
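The abstract does not say how the background is removed once objects have been detected. As a minimal sketch of one plausible reading, the Python snippet below zeroes every pixel outside a set of detection boxes and stacks the masked frames into a clip. The function names `mask_background` and `mask_clip`, the `(x1, y1, x2, y2)` pixel box format, and the choice to zero the background are all assumptions made for illustration; they are not taken from the paper, and no detector is called here.

```python
import numpy as np


def mask_background(frame, boxes):
    """Zero out every pixel that lies outside all detection boxes.

    frame: H x W x C array.
    boxes: iterable of (x1, y1, x2, y2) in pixels, with x2 and y2 exclusive.
           This format is an assumed stand-in for detector output.
    """
    height, width = frame.shape[:2]
    keep = np.zeros((height, width), dtype=bool)
    for x1, y1, x2, y2 in boxes:
        # Clip to the frame so out-of-range boxes do not wrap around.
        x1, x2 = max(0, int(x1)), min(width, int(x2))
        y1, y2 = max(0, int(y1)), min(height, int(y2))
        if x2 > x1 and y2 > y1:
            keep[y1:y2, x1:x2] = True
    # Broadcast the 2-D mask over the channel axis.
    return frame * keep[..., None].astype(frame.dtype)


def mask_clip(frames, boxes_per_frame):
    """Mask each frame with its own boxes and stack into (T, H, W, C)."""
    masked = [mask_background(f, b) for f, b in zip(frames, boxes_per_frame)]
    return np.stack(masked)
```

A clip masked this way keeps its frame order, so a downstream spatio-temporal model would still see the temporal structure while the pixels outside detected objects are suppressed.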
Original language: English
Article number: 45341
Pages (from-to): 1-14
Number of pages: 14
Journal: Scientific Reports
Volume: 15
Issue number: 1
Early online date: 25 Nov 2025
DOIs
Publication status: Published (in print/issue) - 30 Dec 2025

Bibliographical note

© 2025. The Author(s).

Funding

This work was supported by the Ongoing Research Funding program (ORF-2025-893), King Saud University, Riyadh, Saudi Arabia.

Keywords

  • Smart transportation system
  • CNN
  • Deep learning
  • Transfer Learning
  • YOLO
  • Road anomaly detection
