
Paper

Authors: Wissam AlKendi 1; Franck Gechter 1,2; Laurent Heyberger 3 and Christophe Guyeux 4

Affiliations: 1 CIAD, UMR 7533, UTBM, F-90010 Belfort, France; 2 LORIA, UMR 7503, SIMBIOT Team, F-54506 Vandoeuvre-lès-Nancy, France; 3 FEMTO-ST Institute/RECITS, UMR 6174 CNRS, UTBM, F-90010 Belfort, France; 4 FEMTO-ST Institute/DISC, UMR 6174 CNRS, Université de Franche-Comté, F-90016 Belfort, France

Keyword(s): Belfort Civil Registers, Segmentation, Handwritten Text Recognition, Preprocessing, Text Skew.

Abstract: Historical documents are invaluable windows into the past. They play a critical role in shaping our perception of the world and its rich tapestry of stories. This paper presents techniques to facilitate the transcription of the French Belfort Civil Registers of Births, which are valuable historical resources spanning from 1807 to 1919. The methodology focuses on preprocessing steps such as binarization, skew correction, and text line segmentation, tailored to address the challenges posed by these documents including various text styles, marginal annotations, and a hybrid mix of printed and handwritten text. The paper also introduces this archive as a new database by developing a structured strategy for the components of the documents using XML tags, ensuring accurate formatting and alignment of transcriptions with image components at both the paragraph and text line levels for further enhancements to handwritten text recognition models. The results of the preprocessing phase show an accuracy rate of 96%, facilitating the preservation and study of this rich cultural heritage.

CC BY-NC-ND 4.0

Paper citation in several formats:
AlKendi, W.; Gechter, F.; Heyberger, L. and Guyeux, C. (2024). Belfort Birth Records Transcription: Preprocessing, and Structured Data Generation. In Proceedings of the 4th International Conference on Image Processing and Vision Engineering - IMPROVE; ISBN 978-989-758-693-4; ISSN 2795-4943, SciTePress, pages 32-43. DOI: 10.5220/0012715600003720

@conference{improve24,
author={Wissam AlKendi and Franck Gechter and Laurent Heyberger and Christophe Guyeux},
title={Belfort Birth Records Transcription: Preprocessing, and Structured Data Generation},
booktitle={Proceedings of the 4th International Conference on Image Processing and Vision Engineering - IMPROVE},
year={2024},
pages={32--43},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0012715600003720},
isbn={978-989-758-693-4},
issn={2795-4943},
}

TY - CONF

JO - Proceedings of the 4th International Conference on Image Processing and Vision Engineering - IMPROVE
TI - Belfort Birth Records Transcription: Preprocessing, and Structured Data Generation
SN - 978-989-758-693-4
IS - 2795-4943
AU - AlKendi, W.
AU - Gechter, F.
AU - Heyberger, L.
AU - Guyeux, C.
PY - 2024
SP - 32
EP - 43
DO - 10.5220/0012715600003720
PB - SciTePress