Abstract
Object detection is a key component of computer vision research, allowing a system to determine the location and type of objects within a given scene. YOLOv5 is a modern object detection model that retains the advantages of the original YOLO implementation while being built from scratch in Python. In this paper, BiFPN-YOLO is proposed, featuring clear improvements over the existing range of YOLOv5 object detection models. These include replacing the traditional Path-Aggregation Network (PANet) with a higher-performing Bi-Directional Feature Pyramid Network (BiFPN), which required complex adaptation from its original implementation to function with YOLOv5, and exploring a replacement for the standard Swish activation function by evaluating its performance against a number of other activation functions. The proposed model showcases state-of-the-art performance when benchmarked against well-known datasets: compared to the base YOLOv5 model, Mean Average Precision (mAP) is improved by 3.1 % on the German Traffic Sign Detection Benchmark (GTSDB) and by 2 % on the RoboFEI@Home dataset. Performance was also improved on MSCOCO by 1.1 % and on a custom subset of the OpenImagesV6 dataset by 2.4 %.
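The two components named in the abstract can be illustrated with a minimal sketch. It assumes the "fast normalized fusion" rule from the original BiFPN formulation (EfficientDet) and the usual definition of Swish as x · sigmoid(x); it is not the authors' YOLOv5 adaptation, whose details are not given here. The function names, the flat-list feature representation, and the example values are hypothetical and chosen only for illustration.

```python
import math


def swish(x: float) -> float:
    """Swish (SiLU) activation, x * sigmoid(x), written to avoid overflow."""
    if x >= 0:
        return x / (1.0 + math.exp(-x))
    e = math.exp(x)  # safe for negative x: exp(x) lies in (0, 1)
    return x * e / (1.0 + e)


def fast_normalized_fusion(features, weights, eps=1e-4):
    """Fuse equally sized feature vectors with non-negative scalar weights.

    Assumed rule (EfficientDet-style BiFPN, not taken from this paper):
        out = sum_i(w_i * f_i) / (eps + sum_i(w_i)),  with w_i = max(0, w_i)
    In a real network each w_i would be a learnable parameter.
    """
    if not features or len(features) != len(weights):
        raise ValueError("need one weight per input feature")
    length = len(features[0])
    if any(len(f) != length for f in features):
        raise ValueError("all features must have the same length")
    w = [max(0.0, wi) for wi in weights]  # ReLU keeps the weights non-negative
    norm = eps + sum(w)
    return [
        sum(wi * f[k] for wi, f in zip(w, features)) / norm
        for k in range(length)
    ]


# Hypothetical example: fuse a lateral input with a top-down feature,
# then apply the activation element-wise.
p_in = [1.0, 2.0, 3.0]
p_td = [3.0, 2.0, 1.0]
fused = fast_normalized_fusion([p_in, p_td], weights=[1.0, 1.0])
activated = [swish(v) for v in fused]
```

With equal weights the fusion reduces to an (almost exact) average of the inputs; unequal learned weights let the network favour one resolution over another, which is the property that distinguishes this scheme from the unweighted aggregation used in PANet.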
Original language | English |
---|---|
Article number | 111209 |
Number of pages | 15 |
Journal | Pattern Recognition |
Volume | 160 |
Issue number | 111209 |
Early online date | 18 Nov 2024 |
Publication status | Published online - 18 Nov 2024 |
Bibliographical note
Publisher Copyright: © 2024 The Author(s)
Data Access Statement
Data will be made available on request.
Keywords
- Object detection
- Computer vision
- Neural networks
- YOLO