Using Deep Learning-based Object Detection to Extract Structure Information from Scanned Documents

Alice Nannini, Federico Galatolo, Mario Cimino, Gigliola Vaglini

2022

Abstract

Computer vision and object detection techniques developed in recent years dominate the state of the art and are increasingly applied to document layout analysis. This work proposes an automatic method to extract meaningful information from scanned documents, based on the most recent object detection techniques. Specifically, state-of-the-art deep learning techniques designed to work on images are adapted to the domain of digital documents. The research focuses on play scripts, a document type that has not been considered in the literature. For this reason, a novel dataset has been annotated, selecting the most common and useful formats from hundreds of available scripts. The main contribution of this paper is a general understanding and a performance study of different implementations of object detectors applied to this domain. Deep neural networks such as Faster R-CNN and YOLO have been fine-tuned to identify text sections of interest via bounding boxes and to classify each one into a pre-defined category. Several experiments have been carried out, applying different combinations of data augmentation techniques.


Paper Citation


in Harvard Style

Nannini A., Galatolo F., Cimino M. and Vaglini G. (2022). Using Deep Learning-based Object Detection to Extract Structure Information from Scanned Documents. In Proceedings of the 24th International Conference on Enterprise Information Systems - Volume 2: ICEIS, ISBN 978-989-758-569-2, pages 610-615. DOI: 10.5220/0011090600003179


in Bibtex Style

@conference{iceis22,
author={Alice Nannini and Federico Galatolo and Mario Cimino and Gigliola Vaglini},
title={Using Deep Learning-based Object Detection to Extract Structure Information from Scanned Documents},
booktitle={Proceedings of the 24th International Conference on Enterprise Information Systems - Volume 2: ICEIS},
year={2022},
pages={610-615},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0011090600003179},
isbn={978-989-758-569-2},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 24th International Conference on Enterprise Information Systems - Volume 2: ICEIS
TI - Using Deep Learning-based Object Detection to Extract Structure Information from Scanned Documents
SN - 978-989-758-569-2
AU - Nannini A.
AU - Galatolo F.
AU - Cimino M.
AU - Vaglini G.
PY - 2022
SP - 610
EP - 615
DO - 10.5220/0011090600003179