The Effect of Segmentation and CNN Architecture in Determining Accuracy Convolutional Neural Network
Manutur Siregar 1, Herman Mawengkang 2 and Suyanto 2
1 Master of Informatics Program, Faculty of Computer Science and Information Technology, Universitas Sumatra Utara, Medan, Indonesia
2 Department of Mathematics, Universitas Sumatra Utara, Medan, Indonesia
Keywords: CNN, Effect of Segmentation, Active Contour.
Abstract: A Convolutional Neural Network (CNN) is a type of deep learning model used for image detection and image classification. The images used can be obtained from data banks such as http://www.kaggle.com; these images usually come in different file formats, have different widths and heights, and in some cases contain noise. Segmentation is used to separate the part of the image that carries information from the noise contained in it. The types of segmentation used are active contour and K-Means, and both are compared with using no segmentation at all. The Convolutional Neural Network architecture was also varied to obtain a better level of accuracy, taking advantage of existing CNN architectures such as AlexNet and GoogleNet. From the research conducted, the best accuracy was obtained from the RGB Image model combined with the GoogleNet model, namely 98.37%. K-Means segmentation has better test results than active contour for classifying lung disease.
1 INTRODUCTION
To correctly diagnose lung images from a chest CT scan, a doctor must examine many structures and be aware of hundreds of potential abnormalities associated with other diseases, including whether the finding examined is a normal variant (Islam et al., 2021). The growing use of radiological images without a corresponding number of trained radiologists can lead to misdiagnosis (Summers, 2003).
For some new diseases such as COVID-19, radiologists who have not been trained in diagnosing them have a greater chance of making a wrong diagnosis. This causes anxiety among people in the community who need to go to the hospital. Therefore, classification of pulmonary disease from chest X-ray images is necessary to reduce errors and enable more efficient reading of CT scan images (Sarker et al., 2021).
The CT scan results obtained have different widths and heights, some images contain noise, and the images come in different file formats. A process to standardize the images is therefore needed so that they can be used as input for the system to be built. A convolutional neural network will be used to classify lung conditions, namely COVID-19, Pneumonia, Tuberculosis, and Normal. This can serve as an alternative method for radiologists to diagnose a disease or as a method to confirm a diagnosis.
2 LITERATURE REVIEW
2.1 Convolutional Neural Networks
A Convolutional Neural Network (CNN or ConvNet) consists of a neural network that extracts features from the input image and another neural network that classifies those features (Rajaraman et al., 2020). The input image enters the feature extraction network, and the extraction results then enter the classifier network, which operates on the image features and produces the output (Kim, 2017).
2.2 Convolutional Layers
The convolutional layer is the core building block of a CNN; most of the computation is done at this layer (Ahmed et al., 2021). Suppose we build a convolutional layer with a 28 x 28 sheet of neurons. Each neuron is connected to a small area of the input image, for example 5 x 5 pixels, which is the
receptive field of each neuron and indicates that the filter used is 5 x 5 in size. The whole input is covered by partially overlapping receptive fields, and all of these neurons share the same connection weights (weight sharing) (Alom et al., 2018).
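To make the idea of one shared 5 x 5 filter concrete, the following NumPy sketch slides a single kernel over a greyscale input with stride 1; the 32 x 32 input size, the random filter values, and the conv2d helper are illustrative assumptions, not code from the paper.

import numpy as np

def conv2d(image, kernel, stride=1):
    # Slide one shared kernel over the image (valid convolution, no padding).
    kh, kw = kernel.shape
    out_h = (image.shape[0] - kh) // stride + 1
    out_w = (image.shape[1] - kw) // stride + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Every output neuron reuses the same 5 x 5 weights (weight sharing).
            patch = image[i * stride:i * stride + kh, j * stride:j * stride + kw]
            out[i, j] = np.sum(patch * kernel)
    return out

image = np.random.rand(32, 32)      # hypothetical greyscale input
kernel = np.random.rand(5, 5)       # one shared 5 x 5 filter (the receptive field)
feature_map = conv2d(image, kernel)
print(feature_map.shape)            # (28, 28)

A 32 x 32 input with a 5 x 5 filter and stride 1 yields exactly the 28 x 28 sheet of neurons mentioned above.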
Figure 1: Convolutional Neural Network Architecture.
2.3 Pooling Layers
The pooling layer performs downsampling (sample reduction) on the convolution output. With pooling, the data representation becomes smaller, easier to manage, and easier to protect against overfitting (Babukarthik et al., 2020). There are two types of pooling, namely max pooling and average pooling. Max pooling takes the highest value of the matrix pixels covered by the kernel, while average pooling takes the average value of the matrix pixels covered by the kernel (Beale et al., 2020).
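A minimal NumPy sketch of the two pooling variants described above; the 2 x 2 non-overlapping window and the example matrix are assumptions chosen only for illustration.

import numpy as np

def pool2d(x, size=2, mode="max"):
    # Downsample x with a non-overlapping size x size window.
    h, w = x.shape[0] // size, x.shape[1] // size
    windows = x[:h * size, :w * size].reshape(h, size, w, size)
    if mode == "max":
        return windows.max(axis=(1, 3))    # highest value under the kernel
    return windows.mean(axis=(1, 3))       # average value under the kernel

x = np.array([[1, 3, 2, 4],
              [5, 6, 1, 2],
              [7, 2, 9, 0],
              [4, 8, 3, 5]], dtype=float)
print(pool2d(x, mode="max"))       # [[6. 4.] [8. 9.]]
print(pool2d(x, mode="average"))   # [[3.75 2.25] [5.25 4.25]]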
2.4 Normalization Layer
The distribution of the activations in each layer of a Convolutional Neural Network changes during the training phase and varies from layer to layer, which reduces the convergence speed of the optimization algorithm (Ioffe and Szegedy, 2015). A batch normalization layer normalizes each input channel across a mini-batch. To speed up the training of convolutional neural networks and reduce the sensitivity to network initialization, batch normalization layers are placed between convolutional layers and nonlinear layers such as the ReLU layer (Beale et al., 2020).
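Per channel, the layer computes x_hat = (x - mu_B) / sqrt(sigma_B^2 + eps) over the mini-batch and then applies a learned scale gamma and shift beta. A NumPy sketch of this forward pass, with a hypothetical (batch, height, width, channels) tensor shape, is given below.

import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    # Normalize each channel over the mini-batch and spatial axes.
    mean = x.mean(axis=(0, 1, 2), keepdims=True)   # per-channel mini-batch mean
    var = x.var(axis=(0, 1, 2), keepdims=True)     # per-channel mini-batch variance
    x_hat = (x - mean) / np.sqrt(var + eps)        # zero mean, unit variance
    return gamma * x_hat + beta                    # learned scale and shift

x = np.random.rand(8, 28, 28, 16)                  # hypothetical (batch, h, w, channels)
gamma = np.ones((1, 1, 1, 16))                     # scale, learned during training
beta = np.zeros((1, 1, 1, 16))                     # shift, learned during training
y = batch_norm(x, gamma, beta)
print(round(y.mean(), 3), round(y.std(), 3))       # approximately 0.0 and 1.0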
2.4.1 ReLU Layers
The Rectified Linear Unit (ReLU) layer applies the activation function f(x) = max(0, x) (Singh and Singh, 2021). This increases the nonlinearity of the decision function and of the network as a whole without affecting the receptive fields of the convolutional layer (El-Kenawy et al., 2020). Other functions can also be used to increase nonlinearity, such as the hyperbolic tangent f(x) = tanh(x) or f(x) = |tanh(x)|. This layer performs a thresholding operation in which, for each input element, any value less than zero is set to zero.
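Because the operation is element-wise, it is a one-liner in NumPy; the example vector below is arbitrary.

import numpy as np

def relu(x):
    return np.maximum(0, x)                          # f(x) = max(0, x)

print(relu(np.array([-2.0, -0.5, 0.0, 1.5, 3.0])))   # [0.  0.  0.  1.5 3. ]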
2.5 Fully Connected Layers
In the fully connected layer, each neuron has a full connection to all activations in the previous layer, exactly as in a multilayer perceptron (MLP). The activation is computed in the same way as in an MLP, namely a matrix multiplication followed by a bias offset (El-Kenawy et al., 2021).
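Equivalently, each fully connected layer computes y = Wx + b on the flattened feature vector; the dimensions in the sketch below (512 features, 4 classes) are hypothetical.

import numpy as np

features = np.random.rand(512)     # hypothetical flattened activations from the previous layer
W = np.random.rand(4, 512)         # one weight row per output class
b = np.random.rand(4)              # offset bias

logits = W @ features + b          # matrix multiplication followed by the bias offset
print(logits.shape)                # (4,)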
2.6 Digital Image
A digital image can be expressed as a two-dimensional function f(x, y), where x and y are coordinate positions and f is the amplitude at position (x, y), often called the intensity. For an 8-bit image, the intensity takes discrete values from 0 to 255 (Gazda et al., 2021).
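For example, a tiny greyscale image can be stored as a 2-D array of 8-bit intensities; the pixel values below are arbitrary.

import numpy as np

# f(x, y): a 4 x 4 greyscale image with intensities in the discrete range 0-255.
f = np.array([[  0,  64, 128, 255],
              [ 32,  96, 160, 224],
              [ 16,  80, 144, 208],
              [  8,  72, 136, 200]], dtype=np.uint8)
print(f[1, 2])   # intensity at position (1, 2) -> 160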
3 METHOD
3.1 Data Collection
This process is carried out by downloading the COVID-19, Pneumonia, Tuberculosis, and Normal images from the website www.kaggle.com and saving them to a local directory.
3.2 Data Preparation
The image data stored in the local directory are separated into 4 folders for the training phase and 4 folders for the testing phase. The training set contains 2116 images, while the testing set contains 369 images.
3.3 Data Processing
The image data obtained have different widths and heights, different bit depths, and different file extensions, so they must be processed into a uniform format before they can be used as input.
3.4 Active Contour Segmentation
Active contour is a segmentation process that uses closed curves; these curves can expand and contract to capture the desired object by minimizing the image energy using an external force, and they are also influenced by image characteristics such as edges.
3.4.1 Active Contour Segmentation
Preprocessing
This is the process of preparing the image so that it becomes an ideal input for the active contour segmentation process (Summers, 2003; Yamac et al., 2021). Preprocessing includes resizing the image, converting it to greyscale, and adjusting the contrast, as shown in the flowchart below. Resizing changes the width and height of the image to 224 x 224 pixels, greyscale conversion reduces the bit depth of the image to 8 bits, and contrast adjustment controls the level of color sharpness in the image.
Figure 2: Dataset Details.
Figure 3: Preprocessing Active Contour.
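A possible Python implementation of this preprocessing is sketched below; the use of scikit-image, adaptive histogram equalization as the contrast adjustment, and the file name are assumptions rather than the authors' exact pipeline.

import numpy as np
from skimage import color, exposure, io, transform

def preprocess_for_active_contour(path):
    # Resize to 224 x 224, convert to 8-bit greyscale and adjust the contrast.
    image = io.imread(path)
    if image.ndim == 3:                            # colour input -> greyscale
        image = color.rgb2gray(image[..., :3])
    image = transform.resize(image, (224, 224))    # uniform width and height
    image = exposure.equalize_adapthist(image)     # contrast adjustment
    return (image * 255).astype(np.uint8)          # back to an 8-bit depth

gray = preprocess_for_active_contour("chest_xray.png")   # hypothetical file name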
3.4.2 Segmentation Process
This process begins by reading the size of the preprocessed image. Next, two masks are created at the middle position of the image; the masks move left, right, up, and down to form a region. Finally, the image is converted back to RGB and saved to the local directory in the .png image format.
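One way to realize this step is with the morphological (Chan-Vese style) active contour in scikit-image, initialized with a mask in the middle of the image; the iteration count, mask placement, and file names below are illustrative assumptions, not the paper's exact settings.

import numpy as np
from skimage import color, io, segmentation

gray = io.imread("preprocessed.png", as_gray=True)    # hypothetical preprocessed input

# Initial mask in the middle of the image; the contour then expands and contracts.
h, w = gray.shape
init = np.zeros((h, w), dtype=bool)
init[h // 4:3 * h // 4, w // 4:3 * w // 4] = True

mask = segmentation.morphological_chan_vese(gray, 100, init_level_set=init)
segmented = gray * mask                               # keep only the region found by the contour

# Convert back to RGB and save as .png in the local directory.
rgb = color.gray2rgb((segmented * 255).astype(np.uint8))
io.imsave("segmented_active_contour.png", rgb)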
3.5 K-Means Segmentation
K-Means is a popular clustering algorithm that partitions data into a predefined number of clusters. It works by dividing the data into several cluster regions; the partitioning is based on the shortest distance between each data point and the centroid of each cluster.
Figure 4: Active contour segmentation process.
Figure 5: Preprocessing K-Means.
3.5.1 K-Means Segmentation Preprocessing
Preprocessing includes resizing the image, converting the image to greyscale, and converting the data type from uint8 to double, as shown in the flowchart below. Resizing changes the width and height of the image to 224 x 224 pixels, greyscale conversion reduces the bit depth of the image to 8 bits, and the conversion from uint8 to double is carried out so that the calculations in the K-Means phase can be performed.
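A sketch of this preprocessing, assuming scikit-image for resizing and greyscale conversion; the double data type corresponds here to a NumPy float64 array, and the file name is hypothetical.

import numpy as np
from skimage import color, io, transform

def preprocess_for_kmeans(path):
    # Resize to 224 x 224, convert to greyscale and return double (float64) data.
    image = io.imread(path)
    if image.ndim == 3:
        image = color.rgb2gray(image[..., :3])       # greyscale, 8-bit equivalent
    image = transform.resize(image, (224, 224))      # uniform width and height
    return image.astype(np.float64)                  # uint8 -> double for the K-Means phase

pixels = preprocess_for_kmeans("chest_xray.png")     # hypothetical file name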
3.5.2 Segmentation Process
This process begins by initializing the number of clusters to 2 so that, in further processing, the image can be divided into 2 clusters. The next step maps the result of this division back onto the input image vector so that the regions and the smallest objects can be identified. Image noise is then reduced with the median filter method, which replaces each pixel with the median (middle) value of the pixels around it. Finally, the image is converted back to type uint8 so that it can be saved in the local directory.
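The clustering, mapping, median filtering, and type conversion described above could look as follows with scikit-learn and SciPy; which of the two clusters is kept, the 3 x 3 median window, and the file names are assumptions.

import numpy as np
from scipy.ndimage import median_filter
from skimage import io
from sklearn.cluster import KMeans

gray = io.imread("preprocessed.png", as_gray=True).astype(np.float64)   # hypothetical input

# 1. Divide the pixel intensities into 2 clusters.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = kmeans.fit_predict(gray.reshape(-1, 1)).reshape(gray.shape)

# 2. Map the cluster labels back onto the image; here the brighter cluster is assumed to be the object.
kept = np.argmax(kmeans.cluster_centers_.ravel())
segmented = np.where(labels == kept, gray, 0.0)

# 3. Reduce noise with a median filter (each pixel replaced by the median of its neighbourhood).
segmented = median_filter(segmented, size=3)

# 4. Convert back to uint8 and save to the local directory.
io.imsave("segmented_kmeans.png", (segmented * 255).astype(np.uint8))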
3.6 Convolutional Neural Networks
The flow chart below is a Convolutional Neural Network flow chart with a very simple feature extraction section. In practice, feature extraction can be done repeatedly to get accurate results.
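As a rough idea of what such a network looks like in code, the sketch below stacks one convolution, batch normalization, ReLU, and max pooling block followed by a fully connected classifier for the four classes; the layer sizes and the use of Keras are illustrative assumptions and do not correspond to the paper's CNN Model 1 or Model 2.

from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    keras.Input(shape=(224, 224, 3)),                   # RGB input resized to 224 x 224
    layers.Conv2D(16, kernel_size=5, padding="same"),   # feature extraction block...
    layers.BatchNormalization(),
    layers.ReLU(),
    layers.MaxPooling2D(pool_size=2),
    layers.Flatten(),                                   # ...would be repeated in practice
    layers.Dense(4, activation="softmax"),              # COVID-19, Pneumonia, Tuberculosis, Normal
])
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
model.summary()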
Figure 6: K-Means segmentation process.
4 RESULT
4.1 Active Contour Segmentation
This process serves to separate the lung image from the existing background and noise; the following is the flow of the process. For some images, the lung can be separated from the background and noise, but for other images noise is still present in the active contour segmentation results, such as the following image.
Figure 7: Convolutional Neural Network Model.
4.2 Convert to RGB
The active contour segmentation process can cause some images to be categorized incorrectly. Because the dataset obtained consists of images with different sizes (width x height) and bit depths, the researchers carried out a simple process, namely resizing and converting to RGB, to standardize the input data. Here is the process flow.
4.3 Segmentation of K-Means
This process serves to separate the lung image from the background and the noise that exists. The segmentation process is carried out by dividing the data into several cluster regions; the data are partitioned based on the closest distance between each data point and the centroid of each cluster. Here is the process flow: for some images, the lung image can be separated from the background and noise, but for other images noise is still present in the K-Means segmentation results, like the following image. The results of the study are summarised in a comparison table of the accuracy obtained with each of the models studied.
Figure 8: Active Contour Segmentation.
Figure 9: RGB Image (Resize + Convert To RGB).
5 CONCLUSIONS
The research concluded:
1. The best accuracy was obtained from the RGB Image model, namely the input image that was resized, converted to grayscale, and then converted to RGB, combined with the GoogleNet model, namely 98.37%.
Figure 10: K-Means segmentation results with noise.
Figure 11: K-Means Segmentation Process.
Figure 12: Active Contour + Convolutional Neural Network (CNN) Model 1 Segmentation Results.
Figure 13: RGB Image Results + GoogleNet.
2. The fastest training and testing time was obtained from the RGB Image + Convolutional Neural Network (CNN) Model 2 model, namely 15 minutes 56 seconds with an accuracy of 92.96%.
Figure 14: Results of training and testing with various models.
3. The longest training and testing time was obtained from the RGB Image + GoogleNet model, which was 107 minutes 15 seconds with an accuracy of 98.37%.
4. With active contour segmentation, the researchers could obtain lung images free of image noise for some inputs, but in other images the lung was obtained together with noise. This is because the input images contain varying noise that cannot be removed with contrast adjustment and active contour segmentation alone.
5. For images with Active Contour segmentation, the
CNN 2 model has a better level of accuracy than
the CNN 1 model.
REFERENCES
Ahmed, K., Goldgof, G., Paul, R., Goldgof, D., and Hall,
L. (2021). Discovery of a generalization gap of con-
volutional neural networks on covid-19 x-rays classi-
fication. IEEE Access, 9:72970–72979.
Alom, Z., Taha, T., Yakopcic, C., Westberg, S., Sidike, P.,
Nasrin, M., Essen, B., Awwal, A. S., and Asari, V.
(2018). The history began from alexnet: A compre-
hensive survey on deep learning approaches. In ArXiv:
Computer Vision and Pattern Recognition.
Babukarthik, R., Adiga, V. K., Sambasivam, G., Chandramohan, D., and Amudhavel, J. (2020). Prediction of covid-19 using genetic deep learning convolutional neural network (gdcnn). IEEE Access, 8:177647–177666.
Beale, M., Martin, T., and Howard, B. (2020). Deep Learning Toolbox™ User's Guide. MATLAB.
El-Kenawy, E., Ibrahim, A., Mirjalili, S., Eid, M., and Hus-
sein, S. (2020). Novel feature selection and voting
classifier algorithms for covid-19 classification in ct
images. IEEE Access, 8:179317–179335.
El-Kenawy, E., Mirjalili, S., Ibrahim, A., Alrahmawy, M.,
El-Said, M., Zaki, R., and Eid, M. (2021). Advanced
meta-heuristics, convolutional neural networks, and
feature selectors for efficient covid-19 x ray chest im-
age classification. IEEE Access, 9:36019–36037.
Gazda, M., Plavka, J., Gazda, J., and Drotar, P. (2021). Self-supervised deep convolutional neural network for chest x-ray classification. IEEE Access, 9:151972–151982.
Ioffe, S. and Szegedy, C. (2015). Batch normalization: Ac-
celerating deep network training by reducing internal
covariate shift. ArXiv: Learning.
Islam, M., Karray, F., Alhajj, R., and Zeng, J. (2021). A review on deep learning techniques for the diagnosis of novel coronavirus (covid-19). IEEE Access, 9:30551–30572.
Kim, P. (2017). MATLAB Deep Learning: With Machine
Learning, Neural Networks and Artificial Intelligence.
Apress.
Rajaraman, S., Siegelman, J., Alderson, P., Folio, L., Folio, L., and Antani, S. (2020). Iteratively pruned deep learning ensembles for covid-19 detection in chest x-rays. IEEE Access, 8:115041–115050.
Sarker, S., Tan, L., Ma, W., Rong, S., Kwapong, O.,
and Darteh, O. (2021). Multi-classification network
for identifying covid-19 cases using deep convolu-
tional neural networks. Journal on Internet of Things,
3(2):39–51.
Singh, K. and Singh, A. (2021). Diagnosis of covid-19
from chest x-ray images using wavelets-based depth-
wise convolution network. Big Data Mining and An-
alytics, 4(2):84–93.
Summers, M. (2003). Road maps for advancement of ra-
diologic computer-aided detection in the 21st century.
Radiology, 229(1):11–13.
Yamac, M., Ahishali, M., Degerli, A., Kiranyaz, S., Chowd-
hury, M., and Gabbouj, M. (2021). Convolutional
sparse support estimator-based covid-19 recognition
from x-ray images. IEEE Transactions on Neural Net-
works and Learning Systems, 32(5):1810–1820.