Enforcing Graph Structures to Enhance Key Information Extraction in Document Analysis
Rajashree Majumder, Zhewei Wang, Ye Yue, Mukut Kalita, Jundong Liu
2025
Abstract
Key Information Extraction (KIE) is a critical and often final step in the comprehensive process of document analysis. Various graph-based solutions, including SDMG-R, have been proposed to address the challenges posed by the relationships between document components. In this paper, we propose a spatial structure-guided framework to integrate known structures of the data and tasks, which are represented as ground-truth graphs. This integration is enforced by minimizing a (dis-)similarity loss defined on graph edges. To optimize graph similarity, different loss functions are explored for the edge loss. In addition, we enhance the text feature extraction component in SDMG-R from character-level Bi-LSTM to word-level embeddings using a fine-tuned BERT, thereby integrating deeper language knowledge into the text labeling procedure. Experiments on the FUNSD and WildReceipt datasets demonstrate the effectiveness of our proposed model in extracting key information from document images with unseen templates, significantly outperforming baseline models.
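As a rough illustration of the edge-level (dis-)similarity loss mentioned in the abstract, the sketch below compares a matrix of predicted edge probabilities against a ground-truth adjacency matrix. The function names, the dense list-of-lists representation, and the choice of binary cross-entropy and mean squared error are assumptions made for illustration only; the abstract does not state which loss functions the authors explore or how the graphs are encoded.

```python
import math


def edge_bce_loss(pred, target, eps=1e-12):
    """Mean binary cross-entropy over all ordered node pairs (i != j).

    pred[i][j]   : predicted probability of an edge from node i to node j
    target[i][j] : 1 if the ground-truth graph has that edge, else 0
    (Hypothetical helper, not taken from the paper.)
    """
    n = len(pred)
    total, count = 0.0, 0
    for i in range(n):
        for j in range(n):
            if i == j:
                continue  # self-loops are ignored in this sketch
            # Clamp to avoid log(0) for hard 0/1 predictions.
            p = min(max(pred[i][j], eps), 1.0 - eps)
            y = target[i][j]
            total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
            count += 1
    return total / count if count else 0.0


def edge_mse_loss(pred, target):
    """Mean squared error over the same off-diagonal node pairs.

    (Hypothetical alternative edge loss, also an assumption.)
    """
    n = len(pred)
    total, count = 0.0, 0
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            total += (pred[i][j] - target[i][j]) ** 2
            count += 1
    return total / count if count else 0.0
```

In a training setup along these lines, such an edge term would typically be added to the node classification loss with a weighting factor, so that the predicted graph is pulled toward the known document structure while the text boxes are being labeled.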
Paper Citation
in Harvard Style
Majumder R., Wang Z., Yue Y., Kalita M. and Liu J. (2025). Enforcing Graph Structures to Enhance Key Information Extraction in Document Analysis. In Proceedings of the 20th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 3: VISAPP; ISBN 978-989-758-728-3, SciTePress, pages 618-625. DOI: 10.5220/0013240600003912
in Bibtex Style
@conference{visapp25,
author={Rajashree Majumder and Zhewei Wang and Ye Yue and Mukut Kalita and Jundong Liu},
title={Enforcing Graph Structures to Enhance Key Information Extraction in Document Analysis},
booktitle={Proceedings of the 20th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 3: VISAPP},
year={2025},
pages={618-625},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0013240600003912},
isbn={978-989-758-728-3},
}
in EndNote Style
TY - CONF
JO - Proceedings of the 20th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 3: VISAPP
TI - Enforcing Graph Structures to Enhance Key Information Extraction in Document Analysis
SN - 978-989-758-728-3
AU - Majumder R.
AU - Wang Z.
AU - Yue Y.
AU - Kalita M.
AU - Liu J.
PY - 2025
SP - 618
EP - 625
DO - 10.5220/0013240600003912
PB - SciTePress