Authors:
Diogo Lopes da Silva ¹ and António Ramires Fernandes ²
Affiliations:
¹ Universidade do Minho, Braga, Portugal; ² Algoritmi Centre/Department of Informatics, Universidade do Minho, Braga, Portugal
Keyword(s):
Synthetic Training Sets, Traffic Sign Classification Repositories, Convolutional Neural Networks.
Abstract:
Current traffic sign image repositories for classification suffer from a scarcity of samples because compiling and labelling images is mainly a manual process. Researchers therefore resort to alternative approaches to deal with this issue, such as increasing the architectural complexity of the model or performing data augmentation. A third approach is the use of synthetic data. This work addresses the data shortage by proposing a pipeline that builds a synthetic repository, introducing previously unused image operators. Three use cases for synthetic data are explored: as a standalone training set, merged with real data, and ensembling. The first option provides results that not only clearly surpass any previous attempt at using synthetic data for traffic sign recognition but also, encouragingly, bring the obtained accuracies closer to those achieved with real images. Merging real and synthetic data in a single dataset further improves those results. Due to the different nature of the datasets involved, ensembling provides a boost in accuracy. Overall, we obtained results on three different datasets that surpass previous state-of-the-art results: GTSRB (99.85%), BTSC (99.76%), and rMASTIF (99.84%). Finally, cross-testing amongst the three datasets hints that our synthetic datasets have the potential to provide better generalization ability than real data.