3.1 Creating the Model
The Convolutional Neural Network model that we designed contains five convolutional layers in total.
The first four layers used 3x3 kernels, and the last layer used a 1x1 kernel. Each convolutional layer was followed by a Batch Normalization layer.
To reduce the data size, we applied 2x2 MaxPooling after each of the first four convolutional layers. After the last convolutional layer, we flattened the data into a one-dimensional tensor, which was given as input to the fully connected layers.
Our model had six fully connected layers. The first five contained 128, 64, 32, 16, and 10 neurons, respectively, and the fully connected output layer contained a single neuron.
To prevent over-fitting, dropout layers were used after the convolutional and fully connected layers.
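A minimal Keras sketch of this architecture is given below. The input shape, filter counts, activation functions, and dropout rates are not stated above, so the values used here are illustrative assumptions:

from tensorflow.keras import layers, models

def build_model(input_shape=(128, 128, 1)):  # input shape is an assumption
    model = models.Sequential()
    model.add(layers.Input(shape=input_shape))
    # Four 3x3 convolutional layers, each followed by Batch Normalization,
    # Dropout, and 2x2 MaxPooling (filter counts are assumptions).
    for filters in (32, 64, 128, 128):
        model.add(layers.Conv2D(filters, (3, 3), activation='relu', padding='same'))
        model.add(layers.BatchNormalization())
        model.add(layers.Dropout(0.25))  # dropout rate is an assumption
        model.add(layers.MaxPooling2D((2, 2)))
    # Fifth convolutional layer with a 1x1 kernel, followed by Batch Normalization.
    model.add(layers.Conv2D(128, (1, 1), activation='relu'))
    model.add(layers.BatchNormalization())
    # Flatten into a one-dimensional tensor for the fully connected layers.
    model.add(layers.Flatten())
    # Five fully connected layers with 128, 64, 32, 16, and 10 neurons,
    # each followed by Dropout, then a single-neuron sigmoid output layer.
    for units in (128, 64, 32, 16, 10):
        model.add(layers.Dense(units, activation='relu'))
        model.add(layers.Dropout(0.25))  # dropout rate is an assumption
    model.add(layers.Dense(1, activation='sigmoid'))
    return model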
For the cross-validation method, we created 5 different data partitions by dividing the data into random groups of equal size. In each partition, 80% of the data was used for training and 20% for testing. The batch size was set to 1 to allow a more accurate gradient calculation and to reduce linearity.
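This partitioning corresponds to 5-fold cross-validation; a sketch using scikit-learn's KFold is shown below (the placeholder arrays stand in for the real data set, and their shapes are assumptions):

import numpy as np
from sklearn.model_selection import KFold

# Placeholder arrays standing in for the real data set (shapes are assumptions).
images = np.random.rand(100, 128, 128, 1)
labels = np.random.randint(0, 2, size=100)

kfold = KFold(n_splits=5, shuffle=True, random_state=42)  # 5 random, equal-sized groups
for train_idx, test_idx in kfold.split(images):
    x_train, x_test = images[train_idx], images[test_idx]  # 80% of the data for training
    y_train, y_test = labels[train_idx], labels[test_idx]  # 20% of the data for testing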
We trained the Convolutional Neural Network model with the Adam optimizer for 100 epochs and set the initial learning rate to 0.0001.
Binary cross-entropy was chosen as the loss function, as it provided the best binary classification results.
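In Keras, these training settings amount to a compile and fit call along the following lines (a sketch that reuses the model and fold data from the snippets above):

from tensorflow.keras.optimizers import Adam

model = build_model()
model.compile(optimizer=Adam(learning_rate=0.0001),  # initial learning rate
              loss='binary_crossentropy',            # binary cross-entropy loss
              metrics=['accuracy'])
model.fit(x_train, y_train, batch_size=1, epochs=100)  # batch size 1, 100 epochs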
After training the model, we passed the test data through it and assigned the predicted class labels by thresholding the output values of the sigmoid function.
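Concretely, this thresholding can be sketched as follows (the exact threshold is not stated above, so 0.5 is an assumption):

# Sigmoid outputs lie in (0, 1); threshold them to obtain hard class labels.
probs = model.predict(x_test)
y_pred = (probs > 0.5).astype(int).ravel()  # 0.5 threshold is an assumption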
All operations were executed on an NVIDIA GeForce GTX 1080 Ti workstation with 64 GB RAM. Rescaling was performed using MATLAB R2020b, and classification was performed using Python 3.9.7 and Keras 2.8.0 (with the TensorFlow 2.8.0 backend).
3.2 Data Augmentation
Since our data set was not large enough for the model to learn adequately, we augmented the data with the help of the ImageDataGenerator() function. In this step, our training and test data were augmented as follows (see the sketch below the list):
• by specifying the rotation_range parameter (40 degrees), which rotates the image randomly by up to the given degree,
• by specifying the rescale parameter as 1./255, which normalizes the pixel values,
• by specifying the zoom_range parameter as 0.2, which randomly zooms the image,
• by specifying the shear_range parameter as 0.2, which distorts the image along the axis direction,
• by specifying the width_shift_range parameter as 0.2, which shifts the image horizontally,
• by specifying the height_shift_range parameter as 0.2, which shifts the image vertically,
• by setting the horizontal_flip parameter to True, which flips the image horizontally.
As a result, we obtained 5 new pictures from each training picture, ending up with 800 pictures for each of the above-threshold and below-threshold classes.
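Put together, the settings listed above correspond to an ImageDataGenerator configured roughly as follows:

from tensorflow.keras.preprocessing.image import ImageDataGenerator

datagen = ImageDataGenerator(
    rotation_range=40,       # random rotation by up to 40 degrees
    rescale=1./255,          # normalize pixel values to [0, 1]
    zoom_range=0.2,          # random zoom
    shear_range=0.2,         # shear distortion along the axis direction
    width_shift_range=0.2,   # random horizontal shift
    height_shift_range=0.2,  # random vertical shift
    horizontal_flip=True,    # random horizontal flip
)
# Augmented batches can then be drawn with datagen.flow(x_train, y_train, ...).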
After reshaping the dimensions of our data, the
learning process was carried out by fitting our model.
We calculated our main success metric, accuracy, which is the number of correct predictions divided by the total number of predictions. Although accuracy was our main evaluation metric, evaluating it alone was not sufficient, so we also used the Precision and Recall metrics to better assess the reliability of the accuracy. Precision tells us how many of the samples predicted as positive are actually positive; in other words, it is formulated as TP/(TP+FP). Recall, on the other hand, gives the ratio of the actually positive samples that were correctly predicted as positive; it is formulated as TP/(TP+FN). When we averaged the 5 fold results, we obtained 93.25% Accuracy, 88.5% Precision, and 98% Recall.
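As a sketch, these metrics can be computed from the true and predicted labels of each fold with scikit-learn (the variable names follow the earlier snippets):

from sklearn.metrics import accuracy_score, precision_score, recall_score

# y_test: true labels; y_pred: thresholded predictions from the earlier snippet.
acc = accuracy_score(y_test, y_pred)    # correct predictions / total predictions
prec = precision_score(y_test, y_pred)  # TP / (TP + FP)
rec = recall_score(y_test, y_pred)      # TP / (TP + FN)
print(f"Accuracy: {acc:.4f}, Precision: {prec:.4f}, Recall: {rec:.4f}")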
Figure 3: Confusion Matrix Representation.