Heimatkunde: Dataset for Multi-Modal Historical Document Analysis

Josef Baloun; Václav Honzík; Ladislav Lenc; Jiří Martínek; Pavel Král

Topics: Deep Learning; Neural Networks

In Proceedings of the 16th International Conference on Agents and Artificial Intelligence - Volume 3: ICAART, 995-1001, 2024, Rome, Italy

Authors: Josef Baloun ¹,²; Václav Honzík ¹; Ladislav Lenc ¹,²; Jiří Martínek ¹,² and Pavel Král ¹,²

Affiliations: ¹ Department of Computer Science and Engineering, University of West Bohemia, Univerzitní, Pilsen, Czech Republic; ² NTIS - New Technologies for the Information Society, University of West Bohemia, Univerzitní, Pilsen, Czech Republic

Keyword(s): BERT, Deep Learning, Layout Analysis, Multi-Modality, Transformer.

Abstract: This paper introduces a novel Heimatkunde dataset comprising printed documents in German, specifically designed for evaluating layout analysis methods with a focus on multi-modality. The dataset is openly accessible for research purposes. The study further presents baseline results for instance segmentation and multi-modal element classification. Three advanced models, Mask R-CNN, YOLOv8, and LayoutLMv3, are employed for instance segmentation, while a fusion-based model integrating BERT and various vision Transformers is proposed for multi-modal classification. Experimental findings reveal that optimal bounding box segmentation is achieved with YOLOv8 using an input image size of 1280 pixels, and the best segmentation masks are produced by LayoutLMv3 with PubLayNet weights. Moreover, the research demonstrates superior multi-modal classification results using BERT for the textual modality and a Vision Transformer for the image modality. The study concludes by suggesting the integration of the proposed models into the historical Porta fontium portal to enhance information retrieval from historical data.
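
The fusion-based classification mentioned in the abstract can be pictured with a minimal late-fusion sketch. It assumes that a text embedding and an image embedding are concatenated and passed through a single linear layer with softmax. The abstract does not state the fusion operator, so the concatenation, the toy dimensions, the class names, and the helper names (`fuse`, `softmax`, `classify`) are all illustrative assumptions, and the random vectors merely stand in for real BERT and Vision Transformer outputs.

```python
import math
import random

# Hypothetical element classes; the real label set is defined by the dataset.
CLASSES = ["text", "heading", "image"]


def fuse(text_emb, image_emb):
    """Late fusion by concatenation (an assumption, not the paper's stated operator)."""
    return list(text_emb) + list(image_emb)


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(text_emb, image_emb, weights, bias):
    """Apply one linear layer to the fused vector and return (label, probabilities)."""
    fused = fuse(text_emb, image_emb)
    logits = [
        sum(w * x for w, x in zip(row, fused)) + b
        for row, b in zip(weights, bias)
    ]
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return CLASSES[best], probs


# Toy stand-ins for encoder outputs; real embeddings would be far larger.
rng = random.Random(0)
TEXT_DIM, IMAGE_DIM = 4, 3
text_emb = [rng.uniform(-1, 1) for _ in range(TEXT_DIM)]
image_emb = [rng.uniform(-1, 1) for _ in range(IMAGE_DIM)]
weights = [
    [rng.uniform(-1, 1) for _ in range(TEXT_DIM + IMAGE_DIM)]
    for _ in CLASSES
]
bias = [0.0] * len(CLASSES)

label, probs = classify(text_emb, image_emb, weights, bias)
print(label, [round(p, 3) for p in probs])
```

In a trained system the weights would be learned jointly with, or on top of, the two encoders; here they are random, so the predicted label carries no meaning beyond showing the data flow.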

CC BY-NC-ND 4.0

Paper citation in several formats:

Baloun, J.; Honzík, V.; Lenc, L.; Martínek, J. and Král, P. (2024). Heimatkunde: Dataset for Multi-Modal Historical Document Analysis. In Proceedings of the 16th International Conference on Agents and Artificial Intelligence - Volume 3: ICAART; ISBN 978-989-758-680-4; ISSN 2184-433X, SciTePress, pages 995-1001. DOI: 10.5220/0012428500003636

@conference{icaart24,
author={Josef Baloun and Václav Honzík and Ladislav Lenc and Ji\v{r}í Martínek and Pavel Král},
title={Heimatkunde: Dataset for Multi-Modal Historical Document Analysis},
booktitle={Proceedings of the 16th International Conference on Agents and Artificial Intelligence - Volume 3: ICAART},
year={2024},
pages={995-1001},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0012428500003636},
isbn={978-989-758-680-4},
issn={2184-433X},
}

TY - CONF
JO - Proceedings of the 16th International Conference on Agents and Artificial Intelligence - Volume 3: ICAART
TI - Heimatkunde: Dataset for Multi-Modal Historical Document Analysis
SN - 978-989-758-680-4
IS - 2184-433X
AU - Baloun, J.
AU - Honzík, V.
AU - Lenc, L.
AU - Martínek, J.
AU - Král, P.
PY - 2024
SP - 995
EP - 1001
DO - 10.5220/0012428500003636
PB - SciTePress