2 RELATED WORK
A traffic sign recognition system using traditional computer vision pre-processing and a simple neural network for an autonomous navigation robot is presented in (Moura et al., 2014). The project was developed to compete in the Portuguese RoboCup Open Autonomous Driving Competition. It relied on computer vision software with pictogram extraction for detection and a feed-forward neural network for traffic sign classification. For most signs, both algorithms achieved 100% precision. Traffic light recognition reached an accuracy above 96%, whereas traffic sign accuracy ranged between 52% and 88.2%.
A different approach, using an end-to-end machine learning solution for traffic sign recognition, is presented in (Qian et al., 2016), where CNNs are used without pre-processing. Instead of using a CNN as a feature extractor and a multilayer perceptron (MLP) as a classifier, max-pooling positions (MPPs) are proposed as a practical discriminative feature for predicting category labels.
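As an aside, such pooling positions are straightforward to obtain in modern frameworks. The following is a minimal PyTorch sketch (an assumption; (Qian et al., 2016) does not necessarily use this framework) showing how the positions of the pooling maxima, rather than the pooled values, can be read out as a feature vector:

import torch
import torch.nn.functional as F

# Toy feature map standing in for the output of a CNN's last conv layer.
feature_map = torch.randn(1, 16, 8, 8)  # (batch, channels, height, width)

# With return_indices=True, max-pooling also returns the flat position of
# each maximum; the MPP idea uses these positions, not the pooled values
# themselves, as the discriminative feature for classification.
pooled, positions = F.max_pool2d(feature_map, kernel_size=2,
                                 return_indices=True)

mpp_vector = positions.flatten(start_dim=1).float()  # one MPP vector per image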
3 PROBLEM DEFINITION
The first proposed task is part of the autonomous driving competition held at the RoboCup Portuguese Open (Sociedade Portuguesa de Robótica, 2019). This competition simulates, in a controlled and scaled way, some of the problems that arise when working on autonomous driving. It consists of a track with two lanes and two curves arranged so that the cars can drive around the track continuously. The track has vertical traffic signs, traffic lights, two different parking spaces, and traffic cones for temporary lanes and obstacles (Figure 2). The challenge considered in this work is the "Vertical traffic signs detection challenge".
Figure 2: Environment of the Autonomous Driving
Competition from the RoboCup Portuguese Open.
The second proposed task is similar to the first one, also covering traffic sign and light detection and recognition, and differs only in the environment: it is implemented on a real car driving on public roads. This system must detect a broader range of traffic signs, farther away from the car, and under varying weather and lighting conditions.
4 METHODOLOGIES
To test YOLOv3 and YOLOv3_tiny in both environments (Autonomous Driving Competition and Public Roads), it is essential to parameterise the detection goals. This section describes all the relevant information regarding the two environments.
In the RoboCup Portuguese Open autonomous driving competition, besides detecting which sign is present and its location relative to the robot, another implemented feature has the car adjust its actions and movement in real time according to the traffic signs and lights. The results are shown both in simulation and in the real world. The autonomous driving competition requires the correct identification of six traffic lights and twelve traffic signs. In addition, a further set of twelve traffic signs was selected to increase the variety of signs and demonstrate YOLOv3's capability on more extensive sets of signs. The new signs were chosen for their direct interference with the robot's movement, whether to stop, turn in a direction, or increase or decrease speed. Figure 3 shows all the traffic signs created: the top twelve are the ones in the competition rulebook, and the bottom twelve are the ones added.
Figure 3: Selected traffic signs for the RoboCup Portuguese
Open Autonomous Driving Competition environment.
The traffic lights in the competition differ from those on public roads, since they are not the traditional red, yellow, and green lights that tell the driver whether to move. Instead, these traffic lights provide additional information on the actions the robot must take: they display information forcing the robot to turn left, turn right, go forward, park, stop, or finish the round. Figure 4 shows, on the left, how the traffic light is placed on the competition track and, on the right, the six different traffic lights.
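As a purely illustrative sketch of how the detected classes of Figure 3 and Figure 4 can be turned into behaviour, consider a simple mapping from class labels to robot commands; the label and command names below are hypothetical, not the competition code:

# Hypothetical label-to-command table; the real class names depend on how
# the signs (Figure 3) and lights (Figure 4) were labelled during training.
ACTIONS = {
    "sign_stop": "stop",
    "sign_turn_left": "turn_left",
    "sign_speed_limit": "decrease_speed",
    "sign_end_speed_limit": "increase_speed",
    "light_left": "turn_left",
    "light_right": "turn_right",
    "light_forward": "go_forward",
    "light_park": "park",
    "light_stop": "stop",
    "light_finish": "finish_round",
}

def command_for(label: str, default: str = "keep_lane") -> str:
    """Map a detected class label to the robot command it should trigger."""
    return ACTIONS.get(label, default)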
To compete in the autonomous driving challenge, a robotic agent must go around the track and overcome a set of challenges. YOLOv3 was implemented on a car-like four-wheel drive robot equipped with an RGB camera. The input from the camera is processed by the YOLOv3 network to detect the traffic signs and lights.
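As a rough sketch of this camera-to-detector pipeline, the following assumes OpenCV's DNN module and hypothetical yolov3_signs.cfg and yolov3_signs.weights files produced by training; it is illustrative, not the competition code:

import cv2
import numpy as np

# Hypothetical file names; the actual .cfg/.weights come from training
# YOLOv3 on the competition sign and light classes.
net = cv2.dnn.readNetFromDarknet("yolov3_signs.cfg", "yolov3_signs.weights")
out_layers = net.getUnconnectedOutLayersNames()

cap = cv2.VideoCapture(0)  # the robot's RGB camera
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    # YOLOv3 takes a square, 1/255-scaled RGB blob (416x416 is the usual size).
    blob = cv2.dnn.blobFromImage(frame, 1 / 255.0, (416, 416),
                                 swapRB=True, crop=False)
    net.setInput(blob)
    for output in net.forward(out_layers):
        for det in output:  # det = [cx, cy, w, h, objectness, class scores...]
            scores = det[5:]
            class_id = int(np.argmax(scores))
            if scores[class_id] > 0.5:  # confidence threshold
                print(class_id, det[:4])  # class and relative box location
cap.release()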