SEGMENTATION OF TOUCHING LANNA CHARACTERS

Sakkayaphop Pravesjit, Arit Thammano

Abstract

Character segmentation is an important preprocessing step for character recognition, because incorrectly segmented characters are unlikely to be recognized correctly. Touching characters are among the most difficult cases that arise when segmenting handwritten text. This paper therefore focuses on the segmentation of touching and overlapping characters. In the proposed segmentation process, bounding box analysis is first used to divide the document image into images of isolated characters and images of touching characters. A thinning algorithm is then applied to extract the skeleton of the touching characters. Next, the skeleton is separated into several pieces. Finally, the separated pieces are reassembled to reconstruct two isolated characters. The proposed algorithm achieves an accuracy of 75.3%.
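
The first stage, bounding box analysis, can be illustrated with a minimal sketch. The abstract does not state how a box is judged to contain touching characters, so the rule used here (a box wider than 1.5 times the median box width is a candidate) is purely an assumption for illustration, as are the function names `bounding_boxes` and `split_by_width`. The input is assumed to be a binarized image given as a list of rows, with 1 for ink and 0 for background.

```python
from collections import deque
from statistics import median


def bounding_boxes(image):
    """Return (top, left, bottom, right) for each 8-connected ink component."""
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if image[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first flood fill over one connected component.
            seen[r][c] = True
            queue = deque([(r, c)])
            top, left, bottom, right = r, c, r, c
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and image[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


def split_by_width(boxes, ratio=1.5):
    """Split boxes into (isolated, touching) candidates.

    Assumed heuristic, not taken from the paper: a box wider than
    `ratio` times the median box width is treated as touching.
    """
    if not boxes:
        return [], []
    widths = [right - left + 1 for _, left, _, right in boxes]
    limit = ratio * median(widths)
    isolated = [b for b, w in zip(boxes, widths) if w <= limit]
    touching = [b for b, w in zip(boxes, widths) if w > limit]
    return isolated, touching


# Toy page: two narrow glyphs followed by one wide, merged glyph.
page = [
    [1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1],
    [1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 1, 1, 0, 1],
    [1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1],
]
isolated, touching = split_by_width(bounding_boxes(page))
```

On this toy page the two 3-pixel-wide components are kept as isolated characters and the 7-pixel-wide component is flagged as a touching candidate, which is the subset that the thinning and skeleton-splitting stages would then process.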



Paper Citation


in Harvard Style

Pravesjit S. and Thammano A. (2011). SEGMENTATION OF TOUCHING LANNA CHARACTERS. In Proceedings of the International Conference on Signal Processing and Multimedia Applications - Volume 1: SIGMAP, (ICETE 2011) ISBN 978-989-8425-72-0, pages 47-51. DOI: 10.5220/0003511300470051


in Bibtex Style

@conference{sigmap11,
author={Sakkayaphop Pravesjit and Arit Thammano},
title={SEGMENTATION OF TOUCHING LANNA CHARACTERS},
booktitle={Proceedings of the International Conference on Signal Processing and Multimedia Applications - Volume 1: SIGMAP, (ICETE 2011)},
year={2011},
pages={47-51},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0003511300470051},
isbn={978-989-8425-72-0},
}


in EndNote Style

TY - CONF
JO - Proceedings of the International Conference on Signal Processing and Multimedia Applications - Volume 1: SIGMAP, (ICETE 2011)
TI - SEGMENTATION OF TOUCHING LANNA CHARACTERS
SN - 978-989-8425-72-0
AU - Pravesjit S.
AU - Thammano A.
PY - 2011
SP - 47
EP - 51
DO - 10.5220/0003511300470051