Authors: Enes Aslan (1); Tugrul Karakaya (1); Ethem Unver (1) and Yusuf Sinan AKGUL (2)
Affiliations: (1) Kuveyt Turk Participation Bank, Turkey; (2) Gebze Technical University, Turkey
Keyword(s): Invoice Processing, Part-based Modeling, Page Segmentation, Document Analysis, Information Extraction.
Related Ontology Subjects/Areas/Topics: Applications and Services; Computer Vision, Visualization and Computer Graphics; Document Imaging in Business
Abstract:
Automated invoice processing and information extraction have attracted considerable interest from business and
academic circles. Invoice processing is a critical and costly operation for participation banks because the
credit authorization process must be linked to real trade activity via invoices. Classical invoice
processing systems first assign each invoice to an invoice class, but any error in this classification
invalidates the subsequent parsing. This paper proposes a new invoice-class-free parsing method with
a two-phase structure. The first phase uses individual invoice part detectors, and the second phase employs an
efficient part-based modeling (PBM) approach. In the first phase, we employ different methods such as SVM,
maximum entropy and HOG to produce candidates for the various types of invoice parts. In the second phase,
the basic idea is to parse an invoice as a set of parts arranged in a deformable composition, similar to face
or human body detection in digital images. The main advantage of the PBM approach is that
the system can handle any type of invoice, a crucial functionality for business processes at participation
banks. The proposed system is tested with real invoices, and experimental results confirm the effectiveness of
the proposed approach.
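
The two-phase idea can be illustrated with a minimal sketch. The abstract does not give the actual model, features, or cost function, so everything below is an assumption made for illustration: the part names (`invoice_no`, `date`), the star-shaped layout around a single root part, the quadratic deformation penalty, and the hypothetical `Candidate` and `parse_invoice` names. Phase-one detector outputs are represented simply as scored candidates at normalized page positions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    part: str     # hypothetical part label, e.g. "invoice_no" or "date"
    x: float      # normalized page coordinates in [0, 1]
    y: float
    score: float  # confidence from a phase-one part detector


def deformation_cost(root, cand, ideal_offset, weight=1.0):
    """Quadratic penalty for deviating from the expected offset to the root.

    The quadratic form is an assumption, borrowed from common
    pictorial-structure formulations, not taken from the paper.
    """
    dx = (cand.x - root.x) - ideal_offset[0]
    dy = (cand.y - root.y) - ideal_offset[1]
    return weight * (dx * dx + dy * dy)


def parse_invoice(candidates, root_part, ideal_offsets, weight=1.0):
    """Pick the best-scoring layout under a star-shaped part model.

    For each root candidate, every other part independently chooses the
    candidate that maximizes detector score minus deformation cost.
    Returns (layout dict, total score), or (None, -inf) if no root exists.
    """
    by_part = {}
    for c in candidates:
        by_part.setdefault(c.part, []).append(c)

    best_total, best_layout = float("-inf"), None
    for root in by_part.get(root_part, []):
        total = root.score
        layout = {root_part: root}
        for part, offset in ideal_offsets.items():
            options = by_part.get(part, [])
            if not options:
                continue  # part not detected; leave it out of the layout
            best = max(
                options,
                key=lambda c: c.score - deformation_cost(root, c, offset, weight),
            )
            total += best.score - deformation_cost(root, best, offset, weight)
            layout[part] = best
        if total > best_total:
            best_total, best_layout = total, layout
    return best_layout, best_total
```

In this sketch a date candidate with a slightly lower detector score can still win if it sits where a date is expected relative to the invoice number, which is the intuition behind arranging parts in a deformable composition rather than trusting each detector in isolation.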