Authors:
Lal Chandra (1); Puja Lal (1); Raju Gupta (1); Arun Tayal (1) and Dinesh Ganotra (2)
Affiliations:
(1) Newgen Software Technologies Limited, India; (2) GGSIP University, India
Keyword(s):
Binarization, Thresholding, Gamma Correction, Reverse Video, Contrast Stretch, ICR, OCR.
Related Ontology Subjects/Areas/Topics:
Computer Vision, Visualization and Computer Graphics; Image and Video Analysis; Image Formation and Preprocessing; Image Quality; Statistical Approach
Abstract:
Image-capturing technology has progressed from black and white (B&W) to color, yet most document image analysis and extraction functions still work only on B&W documents. Document images scanned directly as B&W are not of sufficient quality for further analysis. Moreover, documents are becoming increasingly complex, with varied background schemes, color combinations, and light text on dark backgrounds (reverse video). An efficient binarization algorithm is therefore an integral step of the preprocessing stage. The proposed algorithm modifies the Adaptive Niblack's Method (Rais et al., 2004) of thresholding to make it more efficient and to handle reverse-video cases. It is fast and invariant to factors that affect the thresholding of document images, such as ambient illumination, contrast stretch, and shading effects. Gamma correction is applied before the proposed binarization algorithm; the gamma value adapts to the brightness of the document image and is obtained from a predetermined equation relating brightness to gamma. Based on experimental results, an optimal window size for the local binarization scheme is also proposed.
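The pipeline described in the abstract (brightness-adaptive gamma correction, then local thresholding, with reverse video handled) can be illustrated with a minimal sketch. It uses the classic Niblack rule T = m + k*s (local mean m, local standard deviation s), not the authors' modified Adaptive Niblack method, whose details the abstract does not give. The linear brightness-to-gamma mapping, the global-mean test for reverse video, the window size of 15 and k = -0.2 are all placeholder assumptions, and every function name is hypothetical.

```python
import numpy as np


def gamma_from_brightness(gray, lo=0.5, hi=2.0):
    # ASSUMPTION: a placeholder linear mapping from mean brightness to gamma.
    # The paper uses its own predetermined brightness-versus-gamma equation,
    # which is not stated in the abstract.
    brightness = gray.mean() / 255.0
    return lo + (hi - lo) * brightness


def gamma_correct(gray, gamma):
    # Standard power-law correction on an 8-bit image, re-quantized to uint8.
    return np.round(255.0 * (gray / 255.0) ** gamma).astype(np.uint8)


def local_mean_std(img, window):
    # Mean and standard deviation over a window x window neighbourhood,
    # computed with integral images (summed-area tables).
    if window % 2 == 0:
        raise ValueError("window must be odd")
    pad = window // 2
    padded = np.pad(img.astype(np.float64), pad, mode="reflect")

    def box_sum(a):
        ii = np.cumsum(np.cumsum(a, axis=0), axis=1)
        ii = np.pad(ii, ((1, 0), (1, 0)))  # zero row/column at top and left
        return (ii[window:, window:] - ii[:-window, window:]
                - ii[window:, :-window] + ii[:-window, :-window])

    n = window * window
    mean = box_sum(padded) / n
    var = box_sum(padded ** 2) / n - mean ** 2
    return mean, np.sqrt(np.maximum(var, 0.0))


def binarize(gray, window=15, k=-0.2):
    # Returns a boolean mask that is True where a pixel is classified as text.
    gray = np.asarray(gray, dtype=np.uint8)
    # ASSUMPTION: treat a mostly dark page as reverse video and invert it,
    # so that text is always darker than its background afterwards.
    if gray.mean() < 128:
        gray = 255 - gray
    corrected = gamma_correct(gray, gamma_from_brightness(gray))
    mean, std = local_mean_std(corrected, window)
    threshold = mean + k * std  # classic Niblack threshold
    return corrected < threshold


# Usage: a light page with one dark stroke, and its reverse-video counterpart.
page = np.full((40, 40), 200, dtype=np.uint8)
page[18:22, 5:36] = 50
text_mask = binarize(page)
reverse_mask = binarize(255 - page)
```

In this sketch the reverse-video page is inverted before gamma correction, so both inputs follow the same path afterwards. Classic Niblack is known to be noisy in flat regions of real scans, which is one reason adaptive variants such as the one cited in the abstract exist.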