4. How do the parameters of DR affect results?
By addressing these questions, this work makes the following contributions:
• Extension of DR to detect different airplane models in front of complex real-world backgrounds;
• Introduction of a new DR component, namely, ob-
ject motion (rotation and translation), which im-
proves detection accuracy;
• Investigation of the parameters of DR to evaluate
their importance for airplane detection;
• A comprehensive metric study comparing training on synthetic data with training on real data;
• Fine-tuning of models trained on a large synthetic dataset with a small set of real data.
2 PREVIOUS WORKS
Our work is related to airplane detection, synthetic
data for machine learning, domain adaptation and do-
main randomization.
The performance of computer vision algorithms increases logarithmically with the amount of training data. However, obtaining large amounts of annotated real-world data is a bottleneck for computer vision tasks. One way of dealing with this issue is to use synthetic data as a cheap and efficient means of assembling such large datasets (Bousmalis et al., 2018).
The use of synthetic data, however, introduces what is known as the reality gap: the inability of synthetic environments to fully reproduce real-world data, for numerous reasons including textures, material physics, lighting and domain distributions (Nowruzi et al., 2019). In an attempt to narrow the reality gap, DR simulates a sufficiently large amount of variation such that real-world data appears to the model as simply another domain variation. This can include randomization of view angles, textures, shapes, camera localisation, object positions and many other parameters (Valtchev and Wu, 2020). The underlying principle of DR is to create enough variance in the training data to force the model to learn only the features relevant to the task. DR thus narrows the reality gap by enriching the data generation phase.
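As an illustration of this idea, the following Python sketch samples one randomized scene configuration per rendered frame. It is a minimal sketch of a typical DR setup, not the exact pipeline of this work; all parameter names and value ranges are hypothetical placeholders.

import random

def sample_scene_parameters():
    # Sample one randomized scene configuration for synthetic image generation.
    # All ranges below are illustrative placeholders, not values used in this work.
    return {
        # Camera pose: distance and view angles around the object.
        "camera_distance_m": random.uniform(5.0, 100.0),
        "camera_azimuth_deg": random.uniform(0.0, 360.0),
        "camera_elevation_deg": random.uniform(0.0, 60.0),
        # Object placement and motion: random translation and rotation.
        "object_translation_m": [random.uniform(-10.0, 10.0) for _ in range(3)],
        "object_rotation_deg": [random.uniform(-180.0, 180.0) for _ in range(3)],
        # Appearance: texture drawn from a pool of random, non-photorealistic textures.
        "texture_id": random.randrange(1000),
        # Lighting intensity and direction.
        "light_intensity": random.uniform(0.2, 2.0),
        "light_azimuth_deg": random.uniform(0.0, 360.0),
        # Background drawn from a pool of cluttered real or synthetic scenes.
        "background_id": random.randrange(500),
    }

# Each training image is rendered from an independently sampled configuration,
# so real-world imagery becomes just one more variation of these parameters.
print(sample_scene_parameters())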
To address the reality gap, DR techniques have
been explored, including most notably the work of
(Tobin et al., 2017), where they synthesized images
of basic geometric objects on a table, in an attempt
to estimate their 3D positions, such that a robotic
arm could pick them up. Their accuracy varied de-
pending on domain parameters, achieving errors as
low as 1.5cm on average in terms of object location,
showing promise for synthetic data training. Notably,
they found that the number of images and the number of unique textures used in the images had the greatest influence on model accuracy. Camera positioning and occlusion also contributed meaningfully, while the addition of random noise to the images did not. (Tremblay et al., 2018) use DR for car detection, effectively abandoning photorealism in the creation of the synthetic dataset, which forces the network to learn only the features essential to the task of car detection. The results of this approach are comparable to those obtained with the Virtual KITTI dataset (Gaidon et al., 2016). One issue with the Virtual KITTI dataset is its limited sample size of 2,500 images, which can result in worse performance than that achieved with larger datasets.
(Loquercio et al., 2019) used domain randomiza-
tion to bridge the gap between the artificial world and
the real one, in the task of autonomous drone flight.
In their work, they synthesized arbitrary race courses
for the drone to learn to fly in, and then tested their
controller in arbitrary track configurations in the real
world. They achieved near-perfect course completion scores across many variations, including maximum speed constraints of up to 10 m/s and lap totals of fewer than 3.
In (Barisic et al., 2021), a network is trained with a texture-invariant object representation for aerial object detection. By randomly assigning atypical textures to UAV models, the authors show that shape plays a greater role than texture in aerial object detection. They also show that training on the synthetic dataset outperforms the baseline, and even real-world data, in situations with difficult lighting and distant objects.
Supervised-learning-based approaches for airplane detection, an important object detection task in both the military and civil aviation fields, often require a large amount of training data. Manually annotating an object such as an airplane in large image sets is generally expensive and sometimes unreliable: airplane models vary significantly in appearance, airport backgrounds are often complex and cluttered, and airplanes can appear at multiple scales in images. As a result, it is difficult to achieve accurate detection when training on a small amount of annotated real data. Moreover, most research has used satellite imagery as the data source. Accordingly, this work targets real-time airplane detection in airport areas, using synthetic images generated by domain randomization and processed by a deep learning model.