Evaluation of Deep Learning Techniques for Entity Matching
Paulo Lima, Douglas Santana, Wellington Santos Martins, Leonardo Ribeiro
2023
Abstract
Application data inevitably has inconsistencies that may cause malfunctioning in daily operations and compromise analytical results. A particular type of inconsistency is the presence of duplicates, i.e., multiple and non-identical representations of the same information. Entity matching (EM) refers to the problem of determining whether two data instances are duplicates. Two deep learning solutions, DeepMatcher and Ditto, have recently achieved state-of-the-art results in EM. However, neither solution considered duplicates with character-level variations, which are pervasive in real-world databases. This paper presents a comparative evaluation of DeepMatcher and Ditto on datasets from a diverse array of domains with such variations and with textual patterns that were previously ignored. The results showed that both solutions experienced a considerable drop in accuracy, although Ditto was more robust than DeepMatcher.
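As a rough illustration of what a character-level variation looks like, the sketch below transposes two adjacent characters in a string and compares the result with the original. It is not the procedure used in the paper, and it involves neither DeepMatcher nor Ditto: the helper names `swap_adjacent` and `similarity` are hypothetical, and the score comes from Python's standard `difflib` module, chosen here only for convenience.

```python
from difflib import SequenceMatcher


def swap_adjacent(text: str, i: int) -> str:
    """Return text with the characters at positions i and i+1 transposed."""
    if not 0 <= i < len(text) - 1:
        raise ValueError("i must index a character that has a right neighbour")
    return text[:i] + text[i + 1] + text[i] + text[i + 2:]


def similarity(a: str, b: str) -> float:
    """Character-level similarity ratio in [0, 1], as computed by difflib."""
    return SequenceMatcher(None, a, b).ratio()


# Illustrative input only: the paper's own title stands in for a record value.
original = "Evaluation of Deep Learning Techniques"
variant = swap_adjacent(original, 3)  # "Evaluation" becomes "Evaulation"

print(variant)
print(original == variant)            # exact comparison fails on the pair
print(similarity(original, variant))  # the strings remain highly similar
```

The two strings are no longer equal, yet they still plainly refer to the same thing, which is the kind of duplicate pair the abstract describes.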
Paper Citation
in Harvard Style
Lima P., Santana D., Santos Martins W. and Ribeiro L. (2023). Evaluation of Deep Learning Techniques for Entity Matching. In Proceedings of the 25th International Conference on Enterprise Information Systems - Volume 1: ICEIS, ISBN 978-989-758-648-4, SciTePress, pages 247-254. DOI: 10.5220/0011996200003467
in Bibtex Style
@conference{iceis23,
author={Paulo Lima and Douglas Santana and Wellington Santos Martins and Leonardo Ribeiro},
title={Evaluation of Deep Learning Techniques for Entity Matching},
booktitle={Proceedings of the 25th International Conference on Enterprise Information Systems - Volume 1: ICEIS},
year={2023},
pages={247-254},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0011996200003467},
isbn={978-989-758-648-4},
}
in EndNote Style
TY - CONF
JO - Proceedings of the 25th International Conference on Enterprise Information Systems - Volume 1: ICEIS
TI - Evaluation of Deep Learning Techniques for Entity Matching
SN - 978-989-758-648-4
AU - Lima P.
AU - Santana D.
AU - Santos Martins W.
AU - Ribeiro L.
PY - 2023
SP - 247
EP - 254
DO - 10.5220/0011996200003467
PB - SciTePress