Few-exemplar Information Extraction for Business Documents

Daniel Esser; Daniel Schuster; Klemens Muthmann; Alexander Schill

Topics: Data Mining; Enterprise Resource Planning; Performance Evaluation and Benchmarking

In Proceedings of the 16th International Conference on Enterprise Information Systems - Volume 3: ICEIS, 293-298, 2014, Lisbon, Portugal

Authors: Daniel Esser; Daniel Schuster; Klemens Muthmann and Alexander Schill

Affiliation: TU Dresden, Germany

Keyword(s): Information Extraction, Few-exemplar Learning, One-shot Learning, Business Documents.

Related Ontology Subjects/Areas/Topics: Artificial Intelligence; Data Mining; Databases and Information Systems Integration; Enterprise Information Systems; Enterprise Resource Planning; Enterprise Software Technologies; Performance Evaluation and Benchmarking; Sensor Networks; Signal Processing; Simulation and Modeling; Simulation Tools and Platforms; Soft Computing; Software Engineering

Abstract: The automatic extraction of relevant information from business documents (sender, recipient, date, etc.) is a valuable task in the application domain of document management and archiving. Although current scientific and commercial self-learning solutions for document classification and extraction work well, they still require considerable on-site configuration by domain experts and administrators. Small office/home office (SOHO) users and private individuals often do not benefit from such systems. Low extraction effectivity, especially in the starting period when only a small number of example documents is available, combined with the high effort of annotating new documents, drastically lowers their willingness to use a self-learning information extraction system. We therefore present a solution for information extraction that fits the requirements of these users. It adapts the idea of one-shot learning from computer vision to the domain of business document processing and requires only a minimal number of training documents to reach competitive extraction effectivity. In our evaluation on a set of 12,500 documents covering 399 different layouts/templates, the approach achieves an F1 score of 88% on 10 commonly used fields such as document type, sender, recipient, and date. With only one document of each template in the training set, it already reaches an F1 score of 78%.
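The abstract reports its results as F1 scores. For readers unfamiliar with the metric, F1 is the harmonic mean of precision and recall. The following is a minimal sketch of that calculation only; the function name `f1_score` and the counts in the usage line are hypothetical illustrations and are not taken from the paper or its evaluation.

```python
def f1_score(true_positives: int, false_positives: int, false_negatives: int) -> float:
    """Return the F1 score, the harmonic mean of precision and recall."""
    # With no correct extractions, precision and recall are both zero (or undefined),
    # so F1 is defined as 0.0 here to avoid a division by zero.
    if true_positives == 0:
        return 0.0
    # Precision: share of extracted values that are correct.
    precision = true_positives / (true_positives + false_positives)
    # Recall: share of expected values that were extracted.
    recall = true_positives / (true_positives + false_negatives)
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts for a single field (assumed values, not from the paper).
score = f1_score(true_positives=8, false_positives=2, false_negatives=2)
```

Because the harmonic mean is dominated by the smaller of its two inputs, a system cannot reach a high F1 by maximizing only precision or only recall.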

CC BY-NC-ND 4.0

Paper citation in several formats:

Esser, D., Schuster, D., Muthmann, K. and Schill, A. (2014). Few-exemplar Information Extraction for Business Documents. In Proceedings of the 16th International Conference on Enterprise Information Systems - Volume 3: ICEIS; ISBN 978-989-758-027-7; ISSN 2184-4992, SciTePress, pages 293-298. DOI: 10.5220/0004946702930298

@conference{iceis14,
author={Daniel Esser and Daniel Schuster and Klemens Muthmann and Alexander Schill},
title={Few-exemplar Information Extraction for Business Documents},
booktitle={Proceedings of the 16th International Conference on Enterprise Information Systems - Volume 3: ICEIS},
year={2014},
pages={293-298},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0004946702930298},
isbn={978-989-758-027-7},
issn={2184-4992},
}

TY - CONF
JO - Proceedings of the 16th International Conference on Enterprise Information Systems - Volume 3: ICEIS
TI - Few-exemplar Information Extraction for Business Documents
SN - 978-989-758-027-7
IS - 2184-4992
AU - Esser, D.
AU - Schuster, D.
AU - Muthmann, K.
AU - Schill, A.
PY - 2014
SP - 293
EP - 298
DO - 10.5220/0004946702930298
PB - SciTePress