Object detection is an important aspect of computer vision research, involving determining the location and class of objects within a scene. For an object detection system to run in real time, it is vital to minimise the computational cost while maintaining an acceptably high accuracy. In a Convolutional Neural Network (CNN), accuracy and computational cost are directly correlated: both rise as the number of layers increases. Activation functions play a key role in a CNN, introducing nonlinearity that helps balance computational cost and accuracy. In this paper, a series of improvements is proposed to the state-of-the-art one-stage real-time object detection model, YOLOv5, to enhance its overall performance. The validity of replacing the current activation function in YOLOv5, Swish, with a variety of alternative activation functions was investigated, with the aim of improving the accuracy and lowering the computational costs associated with visual object detection. This research demonstrates the improvements in accuracy and performance that are achievable by selecting a suitable activation function for YOLOv5, such as ACON, FReLU or Hardswish. The improved YOLOv5 model was verified using transfer learning on the German Traffic Sign Detection Benchmark (GTSDB), achieving state-of-the-art performance.
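For readers unfamiliar with the activation functions named in the abstract, the following is a minimal scalar sketch of three of them using their commonly published definitions. It is not taken from the paper or from the YOLOv5 code base; the function names, the default parameter values and the choice of the ACON-C variant are assumptions made for illustration. FReLU is omitted because it depends on a spatial (depthwise convolution) term that cannot be expressed on a single scalar.

```python
import math


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def swish(x: float) -> float:
    """Swish (also called SiLU): x * sigmoid(x)."""
    return x * sigmoid(x)


def hardswish(x: float) -> float:
    """Hardswish: x * ReLU6(x + 3) / 6, a piecewise approximation of Swish
    that avoids evaluating an exponential."""
    relu6 = min(max(x + 3.0, 0.0), 6.0)
    return x * relu6 / 6.0


def acon_c(x: float, p1: float = 1.0, p2: float = 0.0, beta: float = 1.0) -> float:
    """ACON-C: (p1 - p2) * x * sigmoid(beta * (p1 - p2) * x) + p2 * x.
    p1, p2 and beta are learnable in a network; the defaults here are
    illustrative assumptions that reduce the function to Swish."""
    d = (p1 - p2) * x
    return d * sigmoid(beta * d) + p2 * x


if __name__ == "__main__":
    for v in (-4.0, -1.0, 0.0, 1.0, 4.0):
        print(v, swish(v), hardswish(v), acon_c(v))
```

Hardswish replaces the sigmoid with a clipped linear term, which is one reason it is often considered when computational cost matters; whether that trade-off holds for a given model is an empirical question.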
|Number of pages||13|
|Publication status||Published (in print/issue) - 1 Jun 2022|
|Event||International conference on pattern recognition and artificial intelligence - Paris, France|
|Duration||1 Jun 2022 → 3 Jun 2022|
Bibliographical note
Publisher Copyright:
© 2022, Springer Nature Switzerland AG.
- Activation function
- Deep learning
- Object detection