Comparative Analysis of Deep Learning Architectures for Traffic Sign Detection

Abstract:

Traffic sign detection is a critical computer vision task for driver assistance and autonomous vehicles. In this paper, we present a comparative analysis of deep learning architectures for traffic sign detection, focusing on state-of-the-art convolutional neural network (CNN) models. We implement the YOLOv3 object detection model and evaluate its performance on a public road sign dataset containing 15 sign classes. We analyze detection accuracy and speed, and we compare YOLOv3’s results with those of other detection architectures reported in the literature, including two-stage region-based CNNs and other single-stage detectors. Our YOLOv3 model achieves high detection accuracy, with a mean average precision above 95% at an Intersection-over-Union threshold of 0.5 on the test set, while operating in real time. We find that YOLOv3 outperforms earlier single-shot detectors (such as SSD) in accuracy and offers competitive performance relative to more computationally intensive two-stage detectors (e.g., Faster R-CNN), which achieve higher precision at the cost of speed. These results demonstrate that modern one-stage CNN detectors offer an excellent trade-off between accuracy and efficiency for traffic sign detection. We conclude that deep learning-based approaches, and YOLOv3 in particular, are well suited for accurate, real-time traffic sign detection, although additional data and model refinements could further improve performance on challenging sign classes.
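For reference, the evaluation criterion cited above follows the standard Pascal-VOC-style definition (a conventional formulation, not a protocol stated by this paper): a predicted bounding box B_p matches a ground-truth box B_g when their Intersection-over-Union exceeds the threshold,

\mathrm{IoU}(B_p, B_g) = \frac{|B_p \cap B_g|}{|B_p \cup B_g|} \geq 0.5,

and the mean average precision is the mean of the per-class average precision over the C = 15 sign classes, \mathrm{mAP} = \frac{1}{C} \sum_{c=1}^{C} \mathrm{AP}_c.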