Semantic Analysis of Chest X-Ray using an Attention-based CNN Technique
Rishabh Dhenkawat¹, Snehal Saini², Nagendra Pratap Singh¹
¹ Department of CSE, National Institute of Technology, Hamirpur (H.P.), India
² Department of ECE, National Institute of Technology, Hamirpur (H.P.), India
Keywords: CNN, LSTM, Deep Learning, Additive Attention, Teacher Forcing.
Abstract: The world is in the midst of a major pandemic. As of February 9, 2021, COVID-19 had infected
106M people around the globe and caused 2.33M deaths. To limit the further spread of the disease and to
provide accurate health care to existing patients, early detection of COVID-19 is important. According to
the World Health Organization (WHO), diagnosing pneumonia is the most common way of detecting
COVID-19. In the USA, 172K deaths caused by pneumonia and COVID-19 together were reported
between February 2020 and January 29, 2021. In many situations, a chest X-ray is used to determine the
type of pneumonia. We present a deep learning model that generates a report for a chest X-ray image using
image captioning with an attention mechanism.
1 INTRODUCTION
Computer vision-based diagnosis provides
automatic classification and suggestions for
reference, which improve the accuracy and
efficiency of diagnosis. In the past few years, many
machine learning and deep learning algorithms have
been used for the classification of medical images,
including SVM, K-nearest neighbors, and random
forest. They can be used in a variety of medical
image processing applications.
Traditional machine learning methods pose two
major difficulties. First, they produce inaccurate
results because of their limited ability to process
large inputs. Second, they rely on manual feature
extraction instead of learning valid features. Thus,
deep learning methods are preferred for medical
image processing.
Deep learning has a wide variety of
applications in healthcare image processing, such as
diagnosis and organ segmentation. The convolutional
neural network (CNN) has been used extensively in
several studies that involve reading and
interpreting CT images for medical applications. Deep
learning is a representation learning technique that
connects different layers and nonlinear components
efficiently to obtain various levels of representation.
CNNs have two essential characteristics: local
connectivity and shared weights. Because of these
features, which make it much easier to handle
complex data processing tasks, deep learning is
widely used in image analysis. The CNN architecture
is made up of three types of layers: convolution
layers, pooling layers, and fully connected layers.
Convolution layers extract features from the previous
layer, pooling layers reduce computational
complexity, and fully connected layers finally
combine the extracted features to produce the
output. A recurrent neural network (RNN) is used to
process sequence data and recognize patterns in it.
Since words in a sentence are semantically related,
word generation uses knowledge of the previous
words to predict the next word in the sentence. In an
RNN, the current output of a sequence is related to
the previous output, enabling word relationships to
be determined. Because of its feedback connections,
it is used to model temporal sequences and their
long-range dependencies.
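The feedback connection described above can be illustrated with a minimal sketch of a simple recurrent layer, in which each hidden state is computed from the current input and the previous hidden state. This is not the model used in this paper (which relies on an LSTM); the function names, the toy weights, and the input sequence are hypothetical and chosen only for illustration. It is written in plain Python so that it runs without any dependencies.

```python
import math


def matvec(M, v):
    """Multiply matrix M (a list of rows) by vector v."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in M]


def rnn_forward(inputs, W_x, W_h, b):
    """Return the hidden state after each time step of a simple recurrent layer."""
    h = [0.0] * len(b)  # initial hidden state
    states = []
    for x_t in inputs:
        from_input = matvec(W_x, x_t)
        from_state = matvec(W_h, h)
        # Feedback connection: the new state mixes the current input
        # with the state carried over from the previous time step.
        h = [math.tanh(a + r + c) for a, r, c in zip(from_input, from_state, b)]
        states.append(h)
    return states


# Toy, hand-picked values (assumptions for illustration only):
# 2-dimensional inputs and a 2-dimensional hidden state.
W_x = [[0.5, -0.3], [0.2, 0.8]]
W_h = [[0.1, 0.4], [-0.6, 0.3]]
b = [0.0, 0.0]
sequence = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

states = rnn_forward(sequence, W_x, W_h, b)
for t, h in enumerate(states):
    print(t, [round(value, 4) for value in h])
```

Changing only the first element of the sequence changes every later hidden state, which is the sense in which the output at one step depends on earlier steps.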
In this paper, we propose a CNN-LSTM approach to
chest X-ray image semantic analysis, based on an
attention mechanism, to produce a description of
chest X-ray images. In the deep learning model, we
use attention to highlight the infected regions in
the lungs. Two types of attention mechanisms in
deep learning are local attention and global attention.
In our model, we used local attention, also
known as additive attention or Bahdanau attention.
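As a rough illustration of additive (Bahdanau-style) attention, the sketch below scores each image-region feature vector f_i against a decoder state h with score_i = v · tanh(W1 f_i + W2 h), normalizes the scores with a softmax, and returns the weighted sum of the features as the context vector. The function names, dimensions, and numeric values are hypothetical. This section does not specify the model's actual layer sizes or parameters, so it is an assumption here that the features come from the CNN and the state from the LSTM.

```python
import math


def matvec(M, v):
    """Multiply matrix M (a list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def additive_attention(features, hidden, W1, W2, v):
    """Return (context vector, attention weights) for one decoding step."""
    projected_hidden = matvec(W2, hidden)
    scores = []
    for f in features:
        projected_feature = matvec(W1, f)
        # Additive scoring: v . tanh(W1 f + W2 h)
        activated = [math.tanh(a + c)
                     for a, c in zip(projected_feature, projected_hidden)]
        scores.append(sum(v_k * a_k for v_k, a_k in zip(v, activated)))
    weights = softmax(scores)
    dim = len(features[0])
    # Context vector: attention-weighted sum of the region features.
    context = [sum(w * f[d] for w, f in zip(weights, features))
               for d in range(dim)]
    return context, weights


# Toy values (assumptions for illustration only): three image regions
# with 2-dimensional features and a 2-dimensional decoder state.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
hidden = [0.5, -0.5]
W1 = [[1.0, 0.0], [0.0, 1.0]]
W2 = [[0.5, 0.0], [0.0, 0.5]]
v = [1.0, 1.0]

context, weights = additive_attention(features, hidden, W1, W2, v)
print("weights:", [round(w, 4) for w in weights])
print("context:", [round(c, 4) for c in context])
```

The attention weights form a probability distribution over the regions, so the regions with the largest weights are the ones the decoder attends to most when generating the next word.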