2.3 Phase One – Developing the Model
using Artificial Images
Using the guidance material, eight individual helideck marking images were created for each document, eight based on HSAC RP 161 and eight based on CAP 437, resulting in 16 distinct images for the machine learning process. These images include helidecks that are round, rectangular, and octagonal. From these 16 images, the category None was created manually by copying images and using a photo editor to change colors and remove key elements. This process resulted in a folder with 23 individual images, each categorized as HSAC, CAP 437, or None.
Before a dataset can be created, the model needs to be able to distinguish individual images from one another. To achieve this, each image was verified and classified manually, and the filename reflects the classification of the image: each filename starts with the prefix CAP 437, HSAC, or None.
Convolutional neural networks generally produce better results when more data is available. The 23 images currently available are not enough to properly train a convolutional neural network. Data augmentation was therefore used to automatically generate more images for the network to use during learning. The data augmentation consisted of taking a single image and altering its saturation, brightness, or rotation to generate additional images with different properties. TensorFlow provides an ImageDataGenerator function that can adjust the mentioned property values and save the newly generated image to a different location (Abadi, et al., 2015). Additionally, this function can perform operations such as flipping the image orientation, shifting horizontally and vertically, adjusting zoom levels to make the helideck appear closer or farther away, and shearing the image to make the helideck appear angled (Abadi, et al., 2015). The augmented image was sheared, mirrored horizontally, zoomed out, and given increased brightness. Repeating this step 100 times for each image results in over 4,343 images as a dataset for the neural network to learn from.
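The augmentation described above can be sketched with TensorFlow's ImageDataGenerator. The parameter values below are illustrative assumptions, not the settings used in the study, and the random input array stands in for one of the 23 classified images:

```python
import numpy as np
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Placeholder "helideck" image; real inputs would be the 23 classified files.
image = np.random.randint(0, 256, size=(1, 128, 128, 3)).astype("float32")

# Illustrative augmentation settings covering the operations named in the text.
datagen = ImageDataGenerator(
    rotation_range=20,             # random rotation in degrees
    width_shift_range=0.1,         # horizontal shift
    height_shift_range=0.1,        # vertical shift
    shear_range=0.2,               # shear to make the deck appear angled
    zoom_range=0.2,                # zoom in or out
    horizontal_flip=True,          # mirror the image
    brightness_range=(0.7, 1.3),   # brightness adjustment
)

# Draw 100 augmented variants of the single input image.
augmented = [next(datagen.flow(image, batch_size=1))[0] for _ in range(100)]
```

Passing `save_to_dir` to `flow` would write each generated variant to disk, which matches the workflow of saving augmented images to a different location.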
Just as important as the algorithm itself is the environment used to train it. While the trained model can be exported and reused on other hardware, the training process requires a more capable setup. For this training process, a desktop computer with a ZOTAC GeForce® GTX 1070 Ti Mini graphics card was used to train the model.
The graphics card accelerates the neural network training process through the TensorFlow library, which uses the card's CUDA cores for parallel processing (Abadi, et al., 2015). The software used in this process was Microsoft Visual Studio Code with the Python extension provided by Microsoft. The libraries used with Python 3.8.7 mainly consist of TensorFlow 2.4.1 and Keras 2.4.3, while scikit-learn was used for metrics (Pedregosa, et al., 2011).
A base convolutional neural network model is first created to start the process of finding an optimized model. The base model is constructed manually to increase productivity, and a gradient descent optimizer is selected. Based on the resulting accuracy and loss graphs, manual modifications are made to add and adjust layers until the model demonstrates the desired learning curve as well as the desired loss curve.
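A base model of this kind might look like the following sketch. The layer sizes and learning rate are assumptions for illustration, not the study's final architecture; only the three-class output (CAP 437, HSAC, None) and the gradient descent optimizer are taken from the text:

```python
from tensorflow.keras import layers, models, optimizers

# Minimal baseline CNN; layer sizes here are illustrative placeholders.
model = models.Sequential([
    layers.Input(shape=(128, 128, 3)),
    layers.Conv2D(32, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(64, activation="relu"),
    layers.Dense(3, activation="softmax"),  # CAP 437, HSAC, None
])

# Gradient descent (SGD) as the initial optimizer choice.
model.compile(
    optimizer=optimizers.SGD(learning_rate=0.01),
    loss="categorical_crossentropy",
    metrics=["accuracy"],
)
```

Calling `model.fit` on the augmented dataset would then produce the accuracy and loss histories that guide the manual layer adjustments.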
This initial model also defines the optimizer used for future fine-tuning, chosen from SGD, AdaDelta, RMSprop, and Adam. The optimizer is selected based on the graphs generated after each training session and on the performance of the model.
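Comparing the four candidate optimizers can be sketched as retraining the same small architecture with each one. The tiny builder function and default optimizer settings below are assumptions for illustration:

```python
from tensorflow.keras import layers, models, optimizers

def build_model():
    # Same tiny illustrative architecture each time; only the optimizer varies.
    return models.Sequential([
        layers.Input(shape=(128, 128, 3)),
        layers.Conv2D(16, (3, 3), activation="relu"),
        layers.Flatten(),
        layers.Dense(3, activation="softmax"),
    ])

# The four candidate optimizers named in the text, with default settings.
candidates = {
    "SGD": optimizers.SGD(),
    "AdaDelta": optimizers.Adadelta(),
    "RMSprop": optimizers.RMSprop(),
    "Adam": optimizers.Adam(),
}

compiled = {}
for name, opt in candidates.items():
    m = build_model()
    m.compile(optimizer=opt,
              loss="categorical_crossentropy",
              metrics=["accuracy"])
    compiled[name] = m
# Each compiled model would then be trained and its accuracy/loss
# curves compared to pick the optimizer for fine-tuning.
```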
During training, the model is modified until it reaches a validation accuracy above 90 percent. This threshold was chosen because the program is meant to be an aid to the pilot; in case of a misidentification, the pilot can still personally verify the helideck. In this process, the computer uses a loop to modify the number of convolution filters per layer and the number of nodes per dense layer to find a model with an accuracy above 90 percent.
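The search loop can be sketched as follows. The candidate values for filters and nodes, and the builder function, are hypothetical; the actual training and evaluation calls are indicated in comments because they require the image dataset:

```python
from tensorflow.keras import layers, models

TARGET_VAL_ACCURACY = 0.90

def build_candidate(n_filters, n_nodes):
    # Hypothetical builder: the search varies these two hyperparameters.
    return models.Sequential([
        layers.Input(shape=(128, 128, 3)),
        layers.Conv2D(n_filters, (3, 3), activation="relu"),
        layers.MaxPooling2D((2, 2)),
        layers.Flatten(),
        layers.Dense(n_nodes, activation="relu"),
        layers.Dense(3, activation="softmax"),
    ])

best_model = None
for n_filters in (16, 32, 64):          # candidate filter counts (assumed)
    for n_nodes in (32, 64, 128):       # candidate dense-layer sizes (assumed)
        candidate = build_candidate(n_filters, n_nodes)
        # candidate.compile(...) and candidate.fit(...) on the dataset,
        # then: val_acc = candidate.evaluate(x_val, y_val)[1]
        # if val_acc > TARGET_VAL_ACCURACY: best_model = candidate; stop.
```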
The accuracy of the predictions depends on the training and testing data. There are no universal rules for choosing the ratio between training and testing data to obtain a certain accuracy, although as the size of the training data increases, the accuracy of the model likewise increases (Medar, Rajpurohit, & Rashmi, 2017). Focusing on the model, rather than on the number of images it is training and testing on, gives the program the chance to obtain accuracies of 90 percent or higher. The training and testing ratio is set at 75 percent training and 25 percent testing. With this ratio, the model has enough images to learn from and adapt to reach accuracies above 90 percent. If the optimization process cannot reach 90 percent, the training and testing ratio will be modified and the process restarted to find a proper model.
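The 75/25 split can be expressed with scikit-learn, which the text notes was used alongside TensorFlow. The random arrays below are placeholders for the augmented images and their labels:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder data standing in for the augmented images and their labels.
images = np.random.rand(100, 128, 128, 3)
labels = np.random.randint(0, 3, size=100)  # 0=CAP 437, 1=HSAC, 2=None

# 75 percent training, 25 percent testing, as described above.
x_train, x_test, y_train, y_test = train_test_split(
    images, labels, test_size=0.25, random_state=42)
```

Changing `test_size` is the single adjustment needed if the ratio has to be modified and the optimization restarted.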
2.4 Phase Two – Developing the Model
using Real Images
Phase two of the process is similar to phase one,
except that instead of self-developed images actual