Page Boundary Extraction of Bound Historical Herbaria
Krishna Chandrasekar, Steven Verstockt
2020
Abstract
When digitizing bound historical collections such as herbaria, it is important to extract the main page region so that it can be used for automated processing. The thickness of herbarium books also gives rise to deformations during imaging, which reduce the efficiency of automatic detection tasks. In this work, we address these problems by proposing an automatic page detection algorithm that estimates all the boundaries of the page and performs morphological corrections to reduce deformations. The algorithm extracts features from the Hue, Saturation and Value transformations of an RGB image to detect the main page polygon. It was evaluated on multiple textual and herbarium-type historical collections and obtains over 94% mean intersection over union on all of these datasets. Additionally, the algorithm was subjected to an ablation test to demonstrate the importance of the morphological corrections.
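The two quantities named in the abstract, the Hue/Saturation/Value transformation of an RGB image and the mean intersection over union (IoU) used for evaluation, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `to_hsv`, `iou` and `mean_iou`, the nested-list image layout, and the use of binary page masks in place of page polygons are all assumptions made here for illustration.

```python
import colorsys


def to_hsv(image_rgb):
    """Convert rows of (r, g, b) tuples in 0..255 to (h, s, v) tuples in 0..1.

    The nested-list image layout is an assumption of this sketch.
    """
    return [
        [colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0) for (r, g, b) in row]
        for row in image_rgb
    ]


def iou(pred_mask, true_mask):
    """Intersection over union of two equally sized binary masks."""
    inter = 0
    union = 0
    for pred_row, true_row in zip(pred_mask, true_mask):
        for p, t in zip(pred_row, true_row):
            if p and t:
                inter += 1
            if p or t:
                union += 1
    # Two empty masks are treated as a perfect match (a convention chosen here).
    return inter / union if union else 1.0


def mean_iou(pairs):
    """Average IoU over (predicted_mask, ground_truth_mask) pairs."""
    scores = [iou(pred, true) for pred, true in pairs]
    return sum(scores) / len(scores)
```

For example, a predicted page mask that covers two pixels where the ground truth covers only one of them scores an IoU of 0.5. A dataset-level figure such as the one reported above would be the average of such scores over all evaluated images.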
Paper Citation
in Harvard Style
Chandrasekar K. and Verstockt S. (2020). Page Boundary Extraction of Bound Historical Herbaria. In Proceedings of the 12th International Conference on Agents and Artificial Intelligence - Volume 1: ARTIDIGH, ISBN 978-989-758-395-7, pages 476-483. DOI: 10.5220/0009154104760483
in Bibtex Style
@conference{artidigh20,
author={Krishna Chandrasekar and Steven Verstockt},
title={Page Boundary Extraction of Bound Historical Herbaria},
booktitle={Proceedings of the 12th International Conference on Agents and Artificial Intelligence - Volume 1: ARTIDIGH},
year={2020},
pages={476-483},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0009154104760483},
isbn={978-989-758-395-7},
}
in EndNote Style
TY - CONF
JO - Proceedings of the 12th International Conference on Agents and Artificial Intelligence - Volume 1: ARTIDIGH
TI - Page Boundary Extraction of Bound Historical Herbaria
SN - 978-989-758-395-7
AU - Chandrasekar K.
AU - Verstockt S.
PY - 2020
SP - 476
EP - 483
DO - 10.5220/0009154104760483