Using Deep Learning-based Object Detection to Extract Structure Information from Scanned Documents
Alice Nannini, Federico Galatolo, Mario Cimino, Gigliola Vaglini
2022
Abstract
Computer vision and object detection techniques developed in recent years dominate the state of the art and are increasingly applied to document layout analysis. This work proposes an automatic method for extracting meaningful information from scanned documents, based on recent object detection techniques. Specifically, state-of-the-art deep learning techniques designed to work on images are adapted to the domain of digital documents. The research focuses on play scripts, a document type that has not been considered in the literature. For this reason, a novel dataset has been annotated, selecting the most common and useful formats from hundreds of available scripts. The main contribution of this paper is a general understanding and a performance study of different implementations of object detectors applied to this domain. Deep neural networks such as Faster R-CNN and YOLO have been fine-tuned to identify text sections of interest via bounding boxes and to classify each of them into a predefined category. Several experiments have been carried out, applying different combinations of data augmentation techniques.
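The two ideas at the end of the abstract, labelled bounding boxes and combinations of data augmentation techniques, can be illustrated with a minimal Python sketch. It is not the authors' implementation. The `Region` type, the `scale` and `shift` transforms, the category name `"dialogue"`, and the augmentation names are all hypothetical and chosen only for illustration; the abstract does not state which categories or augmentations were used.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Region:
    """One annotated text section: a pixel bounding box plus a category."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float
    label: str  # hypothetical category name, not taken from the paper


def scale(region: Region, factor: float) -> Region:
    """Resize augmentation: the box must scale together with the page image."""
    return Region(region.x_min * factor, region.y_min * factor,
                  region.x_max * factor, region.y_max * factor, region.label)


def shift(region: Region, dx: float, dy: float) -> Region:
    """Translation augmentation: the box moves by the same offset as the image."""
    return Region(region.x_min + dx, region.y_min + dy,
                  region.x_max + dx, region.y_max + dy, region.label)


def augmentation_combinations(names):
    """Every non-empty subset of augmentation names, one experiment per subset."""
    return [combo
            for size in range(1, len(names) + 1)
            for combo in combinations(names, size)]


if __name__ == "__main__":
    box = Region(10, 20, 110, 40, "dialogue")  # hypothetical annotation
    print(scale(box, 0.5))
    print(shift(box, 5, -5))
    # Hypothetical augmentation names, used only to show the enumeration.
    for combo in augmentation_combinations(["scale", "shift", "noise"]):
        print(combo)
```

The point of the sketch is that any geometric augmentation applied to a page image has to be applied to its annotations as well, and that the number of experiments grows quickly with the number of augmentations considered (three techniques already give seven non-empty combinations).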
Paper Citation
in Harvard Style
Nannini A., Galatolo F., Cimino M. and Vaglini G. (2022). Using Deep Learning-based Object Detection to Extract Structure Information from Scanned Documents. In Proceedings of the 24th International Conference on Enterprise Information Systems - Volume 2: ICEIS, ISBN 978-989-758-569-2, pages 610-615. DOI: 10.5220/0011090600003179
in Bibtex Style
@conference{iceis22,
author={Alice Nannini and Federico Galatolo and Mario Cimino and Gigliola Vaglini},
title={Using Deep Learning-based Object Detection to Extract Structure Information from Scanned Documents},
booktitle={Proceedings of the 24th International Conference on Enterprise Information Systems - Volume 2: ICEIS},
year={2022},
pages={610-615},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0011090600003179},
isbn={978-989-758-569-2},
}
in EndNote Style
TY - CONF
JO - Proceedings of the 24th International Conference on Enterprise Information Systems - Volume 2: ICEIS
TI - Using Deep Learning-based Object Detection to Extract Structure Information from Scanned Documents
SN - 978-989-758-569-2
AU - Nannini A.
AU - Galatolo F.
AU - Cimino M.
AU - Vaglini G.
PY - 2022
SP - 610
EP - 615
DO - 10.5220/0011090600003179