best feature extraction technique for classification is
hotspot, followed by mark direction and direction of
chain code, respectively. The average classification
rates obtained with hotspot, mark direction, and
direction of chain code are 87.8%, 86.9%, and 79.2%,
respectively. Our new technique significantly
outperforms the other feature extraction methods on the
two data sets containing digits. The mark direction
technique outperforms our method on the Thai data
set. The direction of chain code technique obtains
the worst performance by far. This technique is more
complicated and involves several subtleties which
require adapting it to different data sets. Much better
results for MNIST have been reported in the literature
(above 99% accuracy), but those studies used more
training patterns (60,000 compared to 10,000 in
our study). This data set has a very large number of
examples and few classes, which makes pixel-based
methods more effective. However, we believe that
with more fine-tuning, more examples, better
classifiers, and a combination of multiple feature
extraction methods, we would be able to obtain
similar performance.
5 CONCLUSIONS
The present study proposed a new technique for
feature extraction, named the hotspot technique. In this
technique, the distances between the hotspots and the
closest black pixels in each direction are used as the
representation of a character. There are two
key parameters to be taken into account: 1) the number
of hotspots and 2) the number of chain code directions.
The hotspot technique was applied to the MNIST and
Bangla numeric data sets and to a Thai character
data set.
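The representation summarized above can be illustrated with a minimal sketch. It is not the implementation used in the experiments; several details are assumptions made only for illustration: the hotspots are placed on an evenly spaced n_hotspots x n_hotspots grid, distances are Euclidean along each ray, and a ray that meets no black pixel is assigned a hypothetical cap max_dist. The function name hotspot_features and all parameter names are likewise illustrative.

```python
import numpy as np

# Chain-code directions as (row, col) steps. The first four are axis-aligned;
# the last four are the diagonals. The ordering is an assumption.
DIRECTIONS = [(0, 1), (-1, 0), (0, -1), (1, 0),
              (-1, 1), (-1, -1), (1, -1), (1, 1)]


def hotspot_features(image, n_hotspots=3, n_directions=4, max_dist=None):
    """Distance from each hotspot to the closest black pixel per direction.

    image: 2D array in which nonzero entries are black (ink) pixels.
    Returns a vector of length n_hotspots * n_hotspots * n_directions.
    """
    ink = np.asarray(image) != 0
    height, width = ink.shape
    if max_dist is None:
        # Assumed cap for rays that never reach a black pixel.
        max_dist = float(max(height, width))

    # Assumed layout: hotspots evenly spaced in the interior of the image.
    rows = [(i + 1) * height // (n_hotspots + 1) for i in range(n_hotspots)]
    cols = [(j + 1) * width // (n_hotspots + 1) for j in range(n_hotspots)]

    features = []
    for r in rows:
        for c in cols:
            for dr, dc in DIRECTIONS[:n_directions]:
                dist = max_dist
                rr, cc, steps = r, c, 0
                # Walk along the ray until ink is found or the image ends.
                while 0 <= rr < height and 0 <= cc < width:
                    if ink[rr, cc]:
                        dist = steps * np.hypot(dr, dc)
                        break
                    rr, cc, steps = rr + dr, cc + dc, steps + 1
                features.append(min(dist, max_dist))
    return np.array(features, dtype=float)
```

With these assumptions, the two key parameters directly set the length of the feature vector: a 3 x 3 grid of hotspots with 8 directions yields 72 distance values per character.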
For the two data sets with few classes, namely
the handwritten digit data sets, Bangla and MNIST,
the novel hotspot technique significantly outperforms
the other methods. However, the mark direction
technique outperforms the hotspot technique on the Thai
data set, which has many more classes (65). The
hotspot technique may need more examples for this
data set, possibly because it is less robust to variations
in the handwritten characters than the mark direction
technique. Still, our results on data sets of multiple
scripts show that the hotspot technique achieves the
highest average recognition rate.
In future work, we want to compare different
feature extraction techniques, including those
described in this paper, to pixel-based methods.
Several neural network architectures have obtained very
high recognition rates on the MNIST data set, and we
are interested in determining the utility of feature
extraction methods compared to strong classifiers
that work directly on pixel representations.
Furthermore, keypoint methods have not received much
attention in handwriting recognition, and we want
to explore the use of adaptive keypoints to achieve
more translation invariance, as well as generative
models that maximize the probability of generating the data.
ACKNOWLEDGEMENTS
We are sincerely grateful to Dr. Tapan K. Bhowmik
for providing the Bangla numeric data used in the
present study. We thank Jean Paul van Oosten for use-
ful remarks on a preliminary version of this paper.
ICPRAM 2012 - International Conference on Pattern Recognition Applications and Methods