and Feng (2015) proposed an approach to enhance
a deep convolutional neural network for recognizing
handwritten Chinese characters. The enhancement
includes deformation, non-linear normalization,
imaginary strokes, path signature, and 8-directional
features. Sasaki, Horiuchi, and Kato (2015)
improved a system for recognizing ancient Japanese
characters in order to read ancient documents. They
used a CNN to extract features and a support vector
machine (SVM) to classify them. Tsai (2016), on the
other hand, used a deep convolutional neural network
to classify the three different types of handwritten
Japanese script. That work focuses
on the classification of the type of script, character
recognition within each type of script, and character
recognition across all three types of scripts.
2 WHAT IS DEEP LEARNING?
Deep learning is a field of machine learning that is
used to teach computers to do what humans are able
to do in real life, much as a person learns from his
or her experience.
Conventional machine learning algorithms use
computation to learn directly from data, without
depending on a predefined equation as a model.
Machine learning algorithms are widely used in
pattern recognition, for instance in face recognition,
in recognizing text from audio or video (speech
recognition), and in reading car numbers from
license plates (O'Shea & Nash, 2015). They are also
used in smart technology such as self-driving cars,
where they detect lights and people crossing the street.
Deep learning (DL) is considered the most efficient
technique, since experimental results show that it
performs better than other machine learning
algorithms. The main reason behind this is that DL
mimics brain functions. Furthermore, its methods
use multi-layer processing, which can reduce
processing time and give high accuracy. Sub-sampling
layers improve the results further: with CNNs and
auto-encoders, increasing the number of such layers
yields better timing and clearer images
(Hijazi, Kumar, & Rowen, 2015). In general, there are
three important types of neural networks that form the
basis for most pre-trained models in deep learning:
Artificial Neural Networks (ANN), Recurrent Neural
Networks (RNN), and Convolutional Neural Networks
(CNN). The next section discusses the CNN, the main
model used in this work, in detail.
3 CONVOLUTIONAL NEURAL
NETWORKS (CNN)
A CNN is a special type of neural network and a
main branch of DL. It is regarded as the best choice
for pattern recognition, particularly in the image
processing area. A CNN consists of one or more
convolutional layers, sometimes followed by a
subsampling layer, and after the subsampling layer
there are one or more fully connected layers, just as
in a conventional neural
network (Patterson & Gibson, 2017). The design of
the CNN is based on the human visual mechanism,
i.e., the visual cortex of the human brain. The visual
cortex contains many cells whose job is to detect
light in small, overlapping sub-regions of the visual
field. These sub-regions are known as receptive
fields, and the cells act as local filters over the input
space. Moreover, more complex cells have larger
receptive fields. The convolutional layer of a CNN
performs the function that is carried out by the cells
of the visual cortex (Patterson & Gibson, 2017).
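As an illustration of the layer sequence described above, the following minimal sketch in plain Python applies one local filter, one subsampling step, and one fully connected layer to a small input. It is not the network used in this work: the 5×5 image, the 2×2 filter, and the function names conv2d, max_pool, and fully_connected are assumptions chosen only to show how each output value depends on a small receptive field of the input.

```python
def conv2d(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1).

    Each output value depends only on one small receptive field of the
    input. As in most CNN libraries, the kernel is not flipped, so this
    is strictly a cross-correlation.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def max_pool(feature_map, size=2):
    """Subsample: keep the maximum of each non-overlapping size x size block."""
    out_h = len(feature_map) // size
    out_w = len(feature_map[0]) // size
    return [
        [
            max(feature_map[r * size + i][c * size + j]
                for i in range(size) for j in range(size))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def fully_connected(inputs, weights, biases):
    """Dense layer: every output is a weighted sum of every input."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ]


# Hypothetical 5x5 input with a vertical edge between columns 1 and 2.
image = [[0, 0, 1, 1, 1] for _ in range(5)]
# Hypothetical 2x2 filter that responds to a dark-to-bright step.
kernel = [[-1, 1],
          [-1, 1]]

feature_map = conv2d(image, kernel)        # 4x4 feature map
pooled = max_pool(feature_map, size=2)     # 2x2 map after subsampling
flat = [v for row in pooled for v in row]  # flatten for the dense layer
scores = fully_connected(flat, weights=[[1, 1, 1, 1]], biases=[0])
```

In this sketch the filter responds only where its receptive field covers the edge, and the same four filter coefficients are reused at every position of the image.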
3.1 Advantages of CNN
Recently, CNNs have attracted the attention of many
researchers, especially in image processing, because
of the advantages they offer. These advantages can
be summarized as follows (Hijazi et al., 2015):
1. Ruggedness to Shifts and Distortion in the
Image: detection using a CNN is rugged to
distortions such as changes in shape due to the
camera lens, different lighting conditions,
different poses, partial occlusions, and
horizontal and vertical shifts.
2. Fewer Memory Requirements: In theory, a
fully connected layer could be used to extract
all the features. For example, an input image of
size 32×32 and a hidden layer with 1000
features would require on the order of 10^6
coefficients (32 × 32 × 1000 ≈ 10^6), which
demands a very large amount of memory. In a
convolutional layer, however, the same
coefficients are reused at many locations across
the image, which reduces memory usage
drastically.
3. Easier and Better Training: In a CNN, the
number of parameters is reduced drastically,
which makes training a CNN less time-consuming
than training a traditional neural network. In