Authors:
Abhishek Agarwal; Pramod Kumar and Sorabh Kumar
Affiliation:
Newgen Software Technologies Ltd., India
Keyword(s):
ICR Cells Detection, Handwritten Form Processing, Touching Characters, Line Removal, Component Labeling, Form Removal, Character Preservation, Data Extraction, ICR/OCR/OMR Accuracy, Registration Marks, Form Processing.
Related Ontology Subjects/Areas/Topics:
Computer Vision, Visualization and Computer Graphics; Feature Extraction; Features Extraction; Image and Video Analysis; Informatics in Control, Automation and Robotics; Signal Processing, Sensors, Systems Modeling and Control
Abstract:
This paper presents methods to improve the accuracy of ICR detection in structured form processing. Forms are printed by different vendors using a variety of printers and settings. Every printer has its own scaling algorithm, so the final printed forms, though visually similar to the naked eye, contain considerable shift, expansion or shrinkage. This poses problems when data zones are close together, because the template reference points can then fall on neighbouring, identical-looking zones, which reduces data extraction accuracy. Moreover, these transformation defects result in inaccurate form removal, leaving behind line residues and noise that further degrade extraction accuracy. Our proposed algorithm works directly on filled forms, thereby eliminating the problem of differences between the template and the actual form. Template data can also be provided as an input to our algorithm to increase speed and accuracy. The algorithm has been tested on a variety of forms and the results have been very promising.