Authors: Daniel Esser ; Daniel Schuster ; Klemens Muthmann and Alexander Schill

Affiliation: TU Dresden, Germany

Keyword(s): Information Extraction, Few-exemplar Learning, One-shot Learning, Business Documents.

Related Ontology Subjects/Areas/Topics: Artificial Intelligence ; Data Mining ; Databases and Information Systems Integration ; Enterprise Information Systems ; Enterprise Resource Planning ; Enterprise Software Technologies ; Performance Evaluation and Benchmarking ; Sensor Networks ; Signal Processing ; Simulation and Modeling ; Simulation Tools and Platforms ; Soft Computing ; Software Engineering

Abstract: The automatic extraction of relevant information from business documents (sender, recipient, date, etc.) is a valuable task in the application domain of document management and archiving. Although current scientific and commercial self-learning solutions for document classification and extraction work well, they still require considerable on-site configuration effort by domain experts and administrators. Small office/home office (SOHO) users and private individuals often do not benefit from such systems. Low extraction effectiveness, especially in the starting period when only a small number of example documents is available, combined with the high effort needed to annotate new documents, drastically lowers their willingness to use a self-learning information extraction system. Therefore, we present a solution for information extraction that fits the requirements of these users. It adapts the idea of one-shot learning from computer vision to the domain of business document processing and requires only a minimal number of training documents to reach competitive extraction effectiveness. Our evaluation on a set of 12,500 documents covering 399 different layouts/templates achieves an F1 score of 88% on 10 commonly used fields such as document type, sender, recipient, and date. With only one document of each template in the training set, we already reach an F1 score of 78%.
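
The F1 scores quoted in the abstract combine precision and recall over extracted fields. The abstract does not say how the authors count matches, so the following is only a minimal sketch under assumptions: exact string matching per field, micro-averaging over all (document, field) pairs, and a hypothetical helper name `field_f1` with made-up sample values.

```python
def field_f1(gold, predicted):
    """Micro-averaged precision, recall and F1 over (document, field) pairs.

    Assumption: a predicted field counts as correct only if its value
    matches the gold value exactly. The paper may use a different rule.
    """
    tp = fp = fn = 0
    for doc_id, gold_fields in gold.items():
        pred_fields = predicted.get(doc_id, {})
        # Every predicted value is either a true or a false positive.
        for field, value in pred_fields.items():
            if gold_fields.get(field) == value:
                tp += 1
            else:
                fp += 1
        # Every gold value that was missed or extracted wrongly is a false negative.
        for field, value in gold_fields.items():
            if pred_fields.get(field) != value:
                fn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Illustrative data only, not taken from the paper's document set.
gold = {
    "doc1": {"sender": "ACME GmbH", "date": "2014-04-27"},
    "doc2": {"sender": "Foo AG", "date": "2014-01-05"},
}
predicted = {
    "doc1": {"sender": "ACME GmbH", "date": "2014-04-27"},
    "doc2": {"sender": "Foo AG"},  # the date field was not extracted
}
precision, recall, f1 = field_f1(gold, predicted)
```

In this toy case three of four gold fields are recovered with no wrong extractions, so precision is 1.0, recall is 0.75, and F1 is 6/7 (about 0.857).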

CC BY-NC-ND 4.0

Paper citation in several formats:
Esser, D.; Schuster, D.; Muthmann, K. and Schill, A. (2014). Few-exemplar Information Extraction for Business Documents. In Proceedings of the 16th International Conference on Enterprise Information Systems - Volume 3: ICEIS; ISBN 978-989-758-027-7; ISSN 2184-4992, SciTePress, pages 293-298. DOI: 10.5220/0004946702930298

@conference{iceis14,
author={Daniel Esser and Daniel Schuster and Klemens Muthmann and Alexander Schill},
title={Few-exemplar Information Extraction for Business Documents},
booktitle={Proceedings of the 16th International Conference on Enterprise Information Systems - Volume 3: ICEIS},
year={2014},
pages={293-298},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0004946702930298},
isbn={978-989-758-027-7},
issn={2184-4992},
}

TY - CONF

JO - Proceedings of the 16th International Conference on Enterprise Information Systems - Volume 3: ICEIS
TI - Few-exemplar Information Extraction for Business Documents
SN - 978-989-758-027-7
IS - 2184-4992
AU - Esser, D.
AU - Schuster, D.
AU - Muthmann, K.
AU - Schill, A.
PY - 2014
SP - 293
EP - 298
DO - 10.5220/0004946702930298
PB - SciTePress