The research flow shown in Figure 2 proceeds through the
following steps. First, the dataset is selected: this study
uses a private dataset, collected independently, that follows
SIBI. Next, all images in the dataset pass through an
augmentation process. After augmentation, the data are used to
train several CNN models in experiments of 50 and 100 epochs,
with a batch size of 50 and a learning rate of 0.001, from
which the accuracy and AUC values are obtained. Finally, once
the CNN models have been trained, an application is built that
can translate in real time.
1. Preprocessing
The preprocessing stage starts from the collected data, which
cover twenty-six classes (the alphabet A-Z) and consist of
7,800 images. These images are processed using augmentation
techniques, and the augmentation process yields a total of
89,808 images.
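The paper does not say which augmentation transforms were applied. As a minimal sketch only, the example below assumes two common ones (a brightness change and a small horizontal shift) and applies them to a tiny grayscale image stored as nested lists. The function names `adjust_brightness`, `shift_right` and `augment` are hypothetical and are not taken from the study.

```python
# Illustrative only: these transforms are assumptions, not the paper's pipeline.

def adjust_brightness(image, factor):
    """Scale every pixel by `factor`, clamping the result to the 0-255 range."""
    return [[min(255, max(0, round(p * factor))) for p in row] for row in image]

def shift_right(image, pixels, fill=0):
    """Translate the image right by `pixels` columns, padding the left with `fill`."""
    width = len(image[0])
    pixels = max(0, min(pixels, width))
    return [[fill] * pixels + row[:width - pixels] for row in image]

def augment(image):
    """Return the original image plus three transformed copies."""
    return [
        image,
        adjust_brightness(image, 0.8),  # darker copy
        adjust_brightness(image, 1.2),  # brighter copy
        shift_right(image, 1),          # copy shifted by one pixel
    ]

sample = [
    [10, 20, 30],
    [40, 50, 60],
]
augmented = augment(sample)
print(len(augmented))  # one input image yields four training images
```

Each source image produces several training images in this way, which is how a set of 7,800 images can grow to a much larger one.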
2. Convolutional Neural Network (CNN)
CNN is one of the Deep Learning methods. A CNN applies
convolution operations across several processing layers, using
several elements that run in parallel and are inspired by the
biological nervous system. In a CNN, each neuron is
represented in two-dimensional form, so the method is well
suited to processing image input (Maggiori et al., 2017).
(a) Input Layer
The input layer receives the image data, which are converted
into a three-dimensional matrix (height, width and colour
channel) whose channels hold the red, green and blue values
(Felix et al., 2020).
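As a small illustration of this representation, the sketch below stores a 2x2 colour image as a height x width x channel structure. The channel order (0 = red, 1 = green, 2 = blue) is the usual convention and is assumed here. The helper names `shape` and `channel` are hypothetical.

```python
# A 2x2 colour image; each pixel is [red, green, blue] with values 0-255.
image = [
    [[255, 0, 0], [0, 255, 0]],
    [[0, 0, 255], [255, 255, 255]],
]

def shape(img):
    """Return (height, width, channels) of a nested-list image."""
    return (len(img), len(img[0]), len(img[0][0]))

def channel(img, index):
    """Extract one colour plane (assumed order: 0 = red, 1 = green, 2 = blue)."""
    return [[pixel[index] for pixel in row] for row in img]

print(shape(image))       # (2, 2, 3)
print(channel(image, 0))  # [[255, 0], [0, 255]]
```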
(b) Convolution Layer
The convolution layer is the major part of a CNN, as most of
the computation in a CNN is done in this layer. The operation
is the same as the convolution commonly performed in image
processing, which involves a kernel and sub-images. The
kernels used in the CNN are three-by-three in size. The
convolution operation is performed on each sub-image that is
the same size as the kernel (Alamsyah and Pratama, 2020).
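The sliding of a three-by-three kernel over same-sized sub-images can be sketched as follows. This is a plain-Python illustration, assuming a stride of 1, no padding and a single channel. As in most deep learning libraries, the kernel is not flipped. The kernel values and the function name `convolve2d` are chosen for the example and do not come from the study.

```python
def convolve2d(image, kernel):
    """Slide `kernel` over `image` (stride 1, no padding) and sum the products."""
    k = len(kernel)
    out_h = len(image) - k + 1
    out_w = len(image[0]) - k + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Multiply the k x k sub-image at (i, j) by the kernel, element-wise.
            total = 0
            for di in range(k):
                for dj in range(k):
                    total += image[i + di][j + dj] * kernel[di][dj]
            row.append(total)
        output.append(row)
    return output

image = [
    [1, 2, 3, 0],
    [4, 5, 6, 1],
    [7, 8, 9, 2],
    [1, 0, 1, 3],
]
# Hypothetical 3x3 kernel that responds to vertical edges.
kernel = [
    [1, 0, -1],
    [1, 0, -1],
    [1, 0, -1],
]
feature_map = convolve2d(image, kernel)
print(feature_map)  # [[-6, 12], [-4, 7]]
```

A 4x4 input with a 3x3 kernel gives a 2x2 feature map, because the kernel fits in only two positions along each axis.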
(c) Pooling Layers
The pooling layer follows the convolutional layer. It consists
of a filter with a given size and stride; the stride
determines how far the filter shifts at each step as it moves
over the entire feature map (activation map). The pooling
layers commonly used in practice are Max Pooling and Average
Pooling. For example, with 2x2 Max Pooling and a stride of 2,
the value taken at each filter position is the largest value
in the 2x2 area, whereas Average Pooling takes the average
value (Santoso and Ariyanto, 2018).
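The 2x2, stride-2 example can be worked through in a few lines. The sketch below is plain Python. The function name `pool2d` and the feature-map values are hypothetical.

```python
def pool2d(feature_map, size=2, stride=2, mode="max"):
    """Apply max or average pooling with a square window."""
    output = []
    for i in range(0, len(feature_map) - size + 1, stride):
        row = []
        for j in range(0, len(feature_map[0]) - size + 1, stride):
            # Collect the values covered by the window at position (i, j).
            window = [
                feature_map[i + di][j + dj]
                for di in range(size)
                for dj in range(size)
            ]
            if mode == "max":
                row.append(max(window))
            else:
                row.append(sum(window) / len(window))
        output.append(row)
    return output

feature_map = [
    [1, 3, 2, 4],
    [5, 6, 1, 2],
    [7, 2, 9, 0],
    [1, 8, 3, 4],
]
max_pooled = pool2d(feature_map, mode="max")
avg_pooled = pool2d(feature_map, mode="average")
print(max_pooled)  # [[6, 4], [8, 9]]
print(avg_pooled)  # [[3.75, 2.25], [4.5, 4.0]]
```

With a stride equal to the window size, the windows do not overlap, so each spatial dimension is halved.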
(d) Fully Connected Layer
The fully connected layer is the classification stage, a
multilayer perceptron (MLP), also known as a neural network.
In a fully connected layer, each neuron has a full connection
to all activations in the previous layer, just as in an MLP.
The activation is also computed exactly as in an MLP: a matrix
multiplication followed by a bias offset (Putra and Bunyamin,
2020).
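The matrix multiplication followed by a bias offset can be written out directly. The sketch below is plain Python, with weights and biases invented for illustration. In the study's setting the output would have twenty-six values, one per letter, but two are used here to keep the arithmetic visible. The function name `dense` is hypothetical.

```python
def dense(inputs, weights, biases):
    """Compute y[j] = sum_i inputs[i] * weights[i][j] + biases[j]."""
    return [
        sum(x * w_row[j] for x, w_row in zip(inputs, weights)) + biases[j]
        for j in range(len(biases))
    ]

inputs = [1.0, 2.0, 3.0]   # flattened activations from the previous layer
weights = [                # one row per input, one column per output neuron
    [0.5, -1.0],
    [0.0, 2.0],
    [1.0, 1.0],
]
biases = [0.5, -0.5]

outputs = dense(inputs, weights, biases)
print(outputs)  # [4.0, 5.5]
```

Every output neuron reads every input value, which is what "full connection" means here.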
(e) Dropout Layer
Dropout is a technique for preventing overfitting and speeding
up the learning process. Overfitting is a condition in which
the model achieves good results on the data it was trained on
but shows a discrepancy in the prediction process. Dropout
works by temporarily removing neurons, in either a hidden
layer or the visible layer, from the network (Nugroho et al.,
2020).
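The temporary removal of neurons can be sketched as below. The paper does not give the dropout rate or the exact variant used. This example assumes the common "inverted dropout" form, in which the surviving activations are scaled by 1 / (1 - rate) during training and nothing is dropped at prediction time. The function name `dropout` and the rate of 0.5 are assumptions.

```python
import random

def dropout(activations, rate, training=True, rng=random):
    """Zero each activation with probability `rate` and rescale the survivors."""
    if not 0 <= rate < 1:
        raise ValueError("rate must be in [0, 1)")
    if not training or rate == 0:
        # At prediction time every neuron is kept unchanged.
        return list(activations)
    keep = 1.0 - rate
    return [a / keep if rng.random() < keep else 0.0 for a in activations]

rng = random.Random(0)  # seeded so the example is repeatable
activations = [1.0, 2.0, 3.0, 4.0]
dropped = dropout(activations, rate=0.5, rng=rng)
print(dropped)  # each entry is either 0.0 or twice the original value
print(dropout(activations, rate=0.5, training=False))  # unchanged
```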
3 RESULT AND DISCUSSION
This study was conducted to classify SIBI Sign Language by
applying the CNN algorithm with the Hand Gesture Recognition
method. For the application built from the earlier model
implementation, the directory used to store the SIBI alphabet
imagery and vocabulary data must first be declared. The
imagery data obtained were divided into twenty-six classes,
one for each letter of the alphabet A-Z.
Another trial scenario in this study applied data augmentation
techniques before training, so that the resulting performance
would be better and overfitting would be avoided. After the
augmentation process, the training process was carried out to
build the model. The trials in this training process used
experiments of 50 and 100 epochs with a batch size of sixteen
and a learning rate of 0.001.
Table 1 shows the results of the experiment on the application
that was built:
Based on Table 1, the classification results of the
application, which uses the CNN algorithm and the Hand Gesture
Recognition method, were satisfactory. Out of a total of 150
image data,
as many as 128 data were successfully classified cor-
rectly. Based on the equation, the calculation of accu-
ICAISD 2023 - International Conference on Advanced Information Scientific Development