IMPROVED ADAPTIVE BINARIZATION TECHNIQUE FOR DOCUMENT IMAGE ANALYSIS

Lal Chandra, Puja Lal, Raju Gupta, Arun Tayal, Dinesh Ganotra

Abstract

Although image-capture technology has moved from black-and-white (B&W) to color, the majority of document image analysis and extraction functions still work only on B&W documents. The quality of document images scanned directly as B&W is not good enough for further analysis. Moreover, documents are becoming increasingly complex, using a variety of background schemes, color combinations, and light text on dark backgrounds (reverse video). An efficient binarization algorithm is therefore an integral step of the preprocessing stage. In the proposed algorithm, we modify the adaptive Niblack method of thresholding (Rais et al., 2004) to make it more efficient and to handle reverse-video cases. The proposed algorithm is fast and invariant to factors that affect thresholding of document images, such as ambient illumination, contrast stretch, and shading effects. We also apply gamma correction before the proposed binarization algorithm. This gamma correction adapts to the brightness of the document image, with gamma obtained from a predetermined equation relating brightness to gamma. Based on experimental results, an optimal window size for the local binarization scheme is also proposed.
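As a rough illustration of the two building blocks named in the abstract, the sketch below shows plain power-law gamma correction followed by the classical Niblack local threshold, T = mean + k * std, computed over a sliding window. It is not the modified method proposed in the paper: the function names, the window size, the value k = -0.2, and the tie-breaking rule for flat regions are all assumptions chosen for illustration, and the paper's brightness-to-gamma equation and reverse-video handling are not reproduced here.

```python
import math


def gamma_correct(image, gamma):
    """Apply power-law (gamma) correction to an 8-bit grayscale image.

    `image` is a list of rows of grey levels in 0..255. The gamma value is
    supplied by the caller; the paper derives it from image brightness, but
    that equation is not given in the abstract, so it is not modelled here.
    """
    return [[round(255.0 * (p / 255.0) ** gamma) for p in row] for row in image]


def niblack_binarize(image, window=15, k=-0.2):
    """Binarize with the classical Niblack threshold T = mean + k * std.

    Returns rows of 0 (foreground/ink) and 1 (background). `window` and `k`
    are illustrative defaults, not values taken from the paper.
    """
    height, width = len(image), len(image[0])
    half = window // 2
    result = []
    for y in range(height):
        row_out = []
        for x in range(width):
            # Gather the local window, clipped at the image borders.
            values = [
                image[yy][xx]
                for yy in range(max(0, y - half), min(height, y + half + 1))
                for xx in range(max(0, x - half), min(width, x + half + 1))
            ]
            mean = sum(values) / len(values)
            variance = sum((v - mean) ** 2 for v in values) / len(values)
            threshold = mean + k * math.sqrt(variance)
            # ">=" keeps perfectly flat regions as background (an assumption).
            row_out.append(1 if image[y][x] >= threshold else 0)
        result.append(row_out)
    return result


if __name__ == "__main__":
    # A bright 5x5 page with a single dark pixel in the centre.
    page = [[200] * 5 for _ in range(5)]
    page[2][2] = 20
    for row in niblack_binarize(gamma_correct(page, 1.0), window=3):
        print(row)
```

This brute-force version recomputes the statistics for every pixel and is meant only to make the threshold rule concrete; a practical implementation would typically use integral images or a library routine to obtain the local mean and standard deviation.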

References

  1. Rais, N. B., Hanif, M. S., Taj, I. A., 2004. "Adaptive Thresholding Technique for Document Image Analysis". Proc. INMIC 2004, 8th Int. Multitopic Conf., 61-66. IEEE.
  2. Gonzalez, R. C., Woods, R. E., 2005. "Digital Image Processing". Pearson Prentice-Hall Inc.
  3. Niblack, W., 1990. "An Introduction to Digital Image Processing". Prentice-Hall Inc.
  4. Otsu, N., 1979. "A threshold selection method from gray level histograms". IEEE Trans. on Sys. Man. Cyber.
  5. Bernsen, J., 1986. "Dynamic thresholding of gray-level images". Proc. 8th ICPR, Paris, 1251-1255.
  6. Sauvola, J., Seppanen, T., Haapakoski, R., Pietikainen, M., 1997. "Adaptive Document Binarization". Int. Conf. Doc. Ana. Rec., 147-152.
  7. Parker, J. R., 1991. "Gray level thresholding in badly illuminated images". IEEE Trans. Patt. Ana. Mac. Intell., Volume 13, 813-819.
  8. www.newgensoft.com/2005/products/omniextract.htm


Paper Citation


in Harvard Style

Chandra L., Lal P., Gupta R., Tayal A. and Ganotra D. (2007). IMPROVED ADAPTIVE BINARIZATION TECHNIQUE FOR DOCUMENT IMAGE ANALYSIS. In Proceedings of the Second International Conference on Computer Vision Theory and Applications - Volume 1: VISAPP, ISBN 978-972-8865-73-3, pages 317-321. DOI: 10.5220/0002057003170321


in Bibtex Style

@conference{visapp07,
author={Lal Chandra and Puja Lal and Raju Gupta and Arun Tayal and Dinesh Ganotra},
title={IMPROVED ADAPTIVE BINARIZATION TECHNIQUE FOR DOCUMENT IMAGE ANALYSIS},
booktitle={Proceedings of the Second International Conference on Computer Vision Theory and Applications - Volume 1: VISAPP},
year={2007},
pages={317-321},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0002057003170321},
isbn={978-972-8865-73-3},
}


in EndNote Style

TY - CONF
JO - Proceedings of the Second International Conference on Computer Vision Theory and Applications - Volume 1: VISAPP
TI - IMPROVED ADAPTIVE BINARIZATION TECHNIQUE FOR DOCUMENT IMAGE ANALYSIS
SN - 978-972-8865-73-3
AU - Chandra L.
AU - Lal P.
AU - Gupta R.
AU - Tayal A.
AU - Ganotra D.
PY - 2007
SP - 317
EP - 321
DO - 10.5220/0002057003170321