Tamil Characters Recognition and Retrieval

Abdol Hamid Pilevar

Abstract

This paper considers the shape of the vertical projection curve. The behavior of the edges of the vertical projection curve is used to create the feature vectors of the characters. The edges of the curve are traced, and the direction of movement along the edges is mapped using the Eleven Direction Method (EDM). The direction codes are extracted and saved as the feature vectors of the characters. The method is tested on printed Tamil text documents. The test data are collected from various legal documents and contain alphabetic and special characters. EDM is also used to search for and retrieve characters from Tamil text databases. The effectiveness and performance of the proposed algorithm have been tested with 10 separate data samples in 6 different fonts. The experiments show that more than 97% of the Tamil characters are recognized correctly; the proposed algorithm and the selected features therefore perform satisfactorily.
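
The pipeline outlined in the abstract (vertical projection, direction coding of the curve, matching by code) might look roughly like the Python sketch below. The abstract does not define the eleven directions, so the quantization used here (column-to-column change clamped to -5..+5, giving 11 codes) is only an assumption for illustration, as are all function names and the mismatch-count distance; it should not be read as the author's EDM.

```python
from typing import Dict, List


def vertical_projection(image: List[List[int]]) -> List[int]:
    """Count foreground (1) pixels in each column of a binary glyph image."""
    if not image:
        return []
    width = len(image[0])
    return [sum(row[x] for row in image) for x in range(width)]


def direction_codes(profile: List[int], max_step: int = 5) -> List[int]:
    """Map each change between neighbouring columns to a code in 0..2*max_step.

    ASSUMPTION: with max_step=5 this yields 11 codes, a stand-in for the
    paper's eleven directions, whose real definition the abstract omits.
    """
    codes = []
    for left, right in zip(profile, profile[1:]):
        step = max(-max_step, min(max_step, right - left))  # clamp the slope
        codes.append(step + max_step)  # shift so codes are non-negative
    return codes


def code_distance(a: List[int], b: List[int]) -> int:
    """Count mismatching positions; a length difference counts as mismatches."""
    mismatches = sum(1 for x, y in zip(a, b) if x != y)
    return mismatches + abs(len(a) - len(b))


def retrieve(query: List[int], database: Dict[str, List[int]]) -> str:
    """Return the label whose stored code sequence is closest to the query."""
    return min(database, key=lambda label: code_distance(query, database[label]))


# Toy 5x4 binary glyph (hypothetical data, not taken from the paper).
glyph = [
    [0, 1, 1, 0],
    [1, 0, 0, 1],
    [1, 1, 1, 1],
    [1, 0, 0, 1],
    [1, 0, 0, 1],
]
profile = vertical_projection(glyph)   # [4, 2, 2, 4]
codes = direction_codes(profile)       # [3, 5, 7]

# Hypothetical two-entry feature database keyed by character label.
database = {"glyph_a": [3, 5, 7], "glyph_b": [5, 5, 5]}
best_match = retrieve(codes, database)  # "glyph_a"
```

A real system would first segment lines and characters and normalize glyph size before coding; those steps are omitted here.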



Paper Citation


in Harvard Style

Hamid Pilevar A. (2012). Tamil Characters Recognition and Retrieval. In Proceedings of the 7th International Conference on Software Paradigm Trends - Volume 1: ICSOFT, ISBN 978-989-8565-19-8, pages 487-493. DOI: 10.5220/0004030504870493


in Bibtex Style

@conference{icsoft12,
author={Abdol Hamid Pilevar},
title={Tamil Characters Recognition and Retrieval},
booktitle={Proceedings of the 7th International Conference on Software Paradigm Trends - Volume 1: ICSOFT},
year={2012},
pages={487-493},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0004030504870493},
isbn={978-989-8565-19-8},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 7th International Conference on Software Paradigm Trends - Volume 1: ICSOFT
TI - Tamil Characters Recognition and Retrieval
SN - 978-989-8565-19-8
AU - Hamid Pilevar A.
PY - 2012
SP - 487
EP - 493
DO - 10.5220/0004030504870493