to achieve an optimal classification accuracy. In sec-
tion VI, the framework of the experiment is laid out
and results of the performed experiments are shown.
We discuss the results and their consequences for the
objectives of the paper in section VII and summarize
the findings in section VIII.
2 RELATED WORK
LeNet was one of the earliest Convolutional Neural
Networks used for visual detection tasks, including
character recognition and document analysis (Jackel et al.,
1995). The network extracted local geometric features
from the input image in a way that preserved the
approximate relative locations of these features. By
convolving the input image with a trainable kernel, the
network produced high-level feature maps, which were
then fed to a linear classification layer. The resulting
system achieved an overall OCR accuracy exceeding 99.00%.
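As an illustration of the convolve-then-classify idea described above, the following is a minimal sketch in plain Python. It is not the LeNet implementation: the function names `conv2d_valid` and `linear_scores`, the toy image, the kernel values and the classifier weights are all assumptions chosen for readability, and the kernel here is fixed rather than learned.

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (stride 1, no padding) and return the feature map.

    As in most deep learning libraries, the kernel is not flipped, so this is
    strictly a cross-correlation.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def linear_scores(feature_map, weights, biases):
    """Flatten the feature map and compute one score per class (weights . x + bias)."""
    flat = [value for row in feature_map for value in row]
    return [sum(w * x for w, x in zip(ws, flat)) + b
            for ws, b in zip(weights, biases)]


# Toy 4x4 image with a vertical edge between columns 1 and 2 (assumed data).
image = [[0, 0, 1, 1] for _ in range(4)]
# Hand-picked vertical-edge kernel; in a CNN these values would be trained.
kernel = [[-1, 1],
          [-1, 1]]

feature_map = conv2d_valid(image, kernel)   # 3x3 map, strongest response at the edge
# Two hypothetical classes, each with 9 weights (one per feature-map cell).
scores = linear_scores(feature_map, [[1] * 9, [-1] * 9], [0, 1])
```

The feature map keeps the spatial layout of the input, so the edge response appears in the middle column, which is the sense in which relative feature locations are preserved.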
Continued research on Online Handwritten Chinese
Character Recognition has demonstrated (Xiao et al.,
2017) that CNN-based methods can learn more
discriminative features from source data, which may lead
to a better end-to-end solution for Online Handwritten
Recognition problems. The authors of that work went on
to design a compact CNN classifier for Online Handwritten
Chinese Character Recognition, using DropWeight to prune
redundant connections in the CNN architecture while
maintaining an accuracy of 96.88%. For Handwritten Bangla
Digit Recognition, a CNN with Gaussian and Gabor filters
achieved 98.78% recognition accuracy (Alom et al., 2017).
Besides improving the recognition accuracy of
classifiers, researchers have also worked on writer
adaptation for online handwriting recognition, using
lexemes to identify the styles present in a particular
writer’s sample data, which reduced the average error
rate on handwritten words (Connell and Jain, 2002).
In the present paper, we demonstrate a method
to capture the variability induced by different writ-
ing styles, thus enhancing the generalization accu-
racy of the classifier. It is pertinent here to discuss
the progression in Gurmukhi script recognition. A
recognizer using pre-processing algorithms (Normal-
ization, Interpolation and Slant Correction) has been
proposed to recognize loops, headline, straight line
and dot features from online handwritten Gurmukhi
strokes collected on a pen-tablet interface from 60
writers (Sharma et al., 2007). Subsequently, a
post-processor was built to improve character
recognition accuracy by detecting and aggregating
strokes using set theory, recognizing characters with an
accuracy of 95.60% for single character stroke sequencing
(Kumar and Sharma, 2013). This work used a dataset of
27,231 samples categorized by the proficiency of the
writers. A significant shift came when Hidden Markov
Models (HMM) and Support Vector Machines (SVM) were used
for classification, with features extracted on the basis
of region and cursiveness. This experiment resulted in a
96.70% recognition rate for Gurmukhi characters (Verma
and Sharma, 2016). Their experiments consisted of
extracting features and then classifying them using an
SVM or HMM. Our aim in this work is to use the Deep
Learning concept of learning features, and then to
perform extensive experimentation with CNNs to obtain
better recognition accuracy. One of the main advantages
of using a CNN is that it extracts features
automatically and is invariant to shift and distortion
(Wong et al., 2016).
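Two of the pre-processing steps mentioned above for online strokes, normalization and interpolation, can be sketched as follows. This is only an illustrative sketch, not the procedure of Sharma et al. (2007): the function names `normalize` and `resample`, the unit-box scaling and the equal arc-length resampling are assumptions, and slant correction is not covered.

```python
import math


def normalize(stroke):
    """Translate and scale a stroke into the unit box, preserving aspect ratio.

    `stroke` is a list of (x, y) pen positions (assumed input format).
    """
    xs = [x for x, _ in stroke]
    ys = [y for _, y in stroke]
    min_x, min_y = min(xs), min(ys)
    # Use the larger side as the scale; fall back to 1.0 for a single dot.
    scale = max(max(xs) - min_x, max(ys) - min_y) or 1.0
    return [((x - min_x) / scale, (y - min_y) / scale) for x, y in stroke]


def resample(stroke, n):
    """Return n points equally spaced along the stroke by linear interpolation."""
    if len(stroke) < 2 or n < 2:
        return list(stroke)
    # Cumulative arc length at every original point.
    cum = [0.0]
    for (x0, y0), (x1, y1) in zip(stroke, stroke[1:]):
        cum.append(cum[-1] + math.hypot(x1 - x0, y1 - y0))
    total = cum[-1]
    if total == 0.0:
        return [stroke[0]] * n
    out, seg = [], 0
    for k in range(n):
        target = total * k / (n - 1)
        # Advance to the segment that contains the target arc length.
        while seg < len(cum) - 2 and cum[seg + 1] < target:
            seg += 1
        span = cum[seg + 1] - cum[seg]
        t = 0.0 if span == 0.0 else (target - cum[seg]) / span
        (x0, y0), (x1, y1) = stroke[seg], stroke[seg + 1]
        out.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
    return out


# Toy L-shaped stroke in tablet coordinates (assumed data).
raw = [(0, 0), (10, 0), (10, 10)]
unit = normalize(raw)
points = resample(unit, 5)
```

Resampling to a fixed number of points gives every stroke the same length regardless of writing speed, which is one common reason for an interpolation step before feature extraction or classification.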
3 GURMUKHI SCRIPT
The Punjabi language is spoken by about 130 million
people, mainly in West Punjab in Pakistan and East
Punjab in India. Indian Punjabi is written using the
Gurmukhi script, which has a fairly complex system
of tonal variance.
Some notable features of the Gurmukhi script are:
• Gurmukhi script is cursive and is written from left
to right and from top to bottom.
• A horizontal line, called a “shirorekha”, is found
on the upper part of almost all the characters.
• Any Gurmukhi word can be divided into three
zones: the Upper Zone, the Middle Zone and the
Lower Zone. Every stroke is classified into one of
the three zones. The upper zone is the region above
the headline, where some of the vowels reside. The
middle zone is the most populated zone, consisting
of consonants and some of the vowels. The lower
zone contains some vowels and half characters that
lie below the foot of consonants (Verma and
Sharma, 2017).
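The three-zone division described in the list above can be illustrated with a small sketch. This is a hypothetical heuristic, not a method from the cited work: the function name `stroke_zone`, the use of the mean vertical position of a stroke, and the `headline_y` and `baseline_y` parameters are all assumptions, and y is assumed to grow downward as on a typical tablet.

```python
def stroke_zone(stroke, headline_y, baseline_y):
    """Assign a stroke to the upper, middle or lower zone by its mean y value.

    `stroke` is a list of (x, y) points; y increases downward (assumed), so the
    upper zone has y smaller than `headline_y` and the lower zone has y larger
    than `baseline_y`, the assumed position of the foot of the consonants.
    """
    mean_y = sum(y for _, y in stroke) / len(stroke)
    if mean_y < headline_y:
        return "upper"
    if mean_y > baseline_y:
        return "lower"
    return "middle"


# Toy strokes with the headline at y=20 and the consonant foot at y=80 (assumed data).
zones = [
    stroke_zone([(0, 5), (10, 10)], 20, 80),
    stroke_zone([(0, 30), (5, 70)], 20, 80),
    stroke_zone([(0, 90), (4, 95)], 20, 80),
]
```

A real recognizer would first have to locate the headline and the foot of the consonants in the word; here they are simply passed in as known values.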
3.1 Different Styles of Writing Gurmukhi Script
A critical area of research in character recognition
is capturing the complex nature of a script, which
gives rise to the variation in writing styles.
The documented reasons for variation in handwriting
Recognition of Online Handwritten Gurmukhi Strokes using Convolutional Neural Networks