Improving Quality of Entity Resolution Using a Cascade Approach
Khizer Syed, Onais Khan Mohammed, John Talburt, Adeeba Tarannum, Altaf Mohammed, Mudasar Ali Mir, Mahboob Khan Mohammed
2025
Abstract
Entity Resolution (ER) is a critical technique in data management that determines whether two or more data references correspond to the same real-world entity. This process is essential for cleansing datasets and linking information across diverse records. A variant of this technique, Binary Entity Resolution, compares pairs of records directly and does not apply the transitive closure typically found in cluster-based approaches. In cluster-based ER, indirect linkages imply broader associations among multiple records (e.g., if A is linked with B and B is linked with C, then A is linked with C indirectly). Binary ER instead performs pairwise matching, and its output is simply a series of pairs drawn from two distinct sources. In this paper, we present a novel improvement to the cascade process used in entity resolution. Specifically, our data-centric, descending confidence cascade approach applies linking methods in descending order of their confidence levels. Higher-confidence methods, which are more accurate, are therefore applied first, potentially enhancing the accuracy of the lower-confidence methods that follow. As a result, our approach produces higher-quality matches than traditional methods that do not use a cascade, leading to more accurate entity resolution while keeping link quality high. This improvement is particularly significant in Binary ER, where the focus is on pairwise matches and the quality of each link is crucial.
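To make the descending confidence cascade concrete, the following is a minimal sketch and not the authors' implementation. It rests on several assumptions that the abstract does not state: the two matching rules (`exact_match`, `surname_dob_match`), their confidence scores, the record fields `name` and `dob`, and the toy data are all hypothetical. It also assumes that links are one-to-one and that records linked at a higher-confidence stage are removed from the candidate pool before lower-confidence stages run, which is one plausible way for earlier stages to improve the accuracy of later ones.

```python
from typing import Callable, Dict, List, Tuple

Record = Dict[str, str]
Matcher = Callable[[Record, Record], bool]


def exact_match(a: Record, b: Record) -> bool:
    """Hypothetical high-confidence rule: full name and date of birth agree."""
    return a["name"].lower() == b["name"].lower() and a["dob"] == b["dob"]


def surname_dob_match(a: Record, b: Record) -> bool:
    """Hypothetical lower-confidence rule: surname and date of birth agree."""
    surname_a = a["name"].lower().split()[-1]
    surname_b = b["name"].lower().split()[-1]
    return surname_a == surname_b and a["dob"] == b["dob"]


def cascade_link(
    source_a: Dict[str, Record],
    source_b: Dict[str, Record],
    methods: List[Tuple[float, Matcher]],
) -> List[Tuple[str, str, float]]:
    """Apply matchers in descending order of confidence.

    Assumption: once a record is linked it is removed from the pool,
    so later, weaker rules only see records that are still unlinked.
    """
    links: List[Tuple[str, str, float]] = []
    unmatched_a = dict(source_a)  # shallow copies; inputs are not modified
    unmatched_b = dict(source_b)
    for confidence, matcher in sorted(methods, key=lambda m: m[0], reverse=True):
        for id_a in list(unmatched_a):
            for id_b in list(unmatched_b):
                if matcher(unmatched_a[id_a], unmatched_b[id_b]):
                    links.append((id_a, id_b, confidence))
                    del unmatched_a[id_a]
                    del unmatched_b[id_b]
                    break  # one-to-one: move on to the next record in A
    return links


# Toy data (hypothetical): two sources, two records each.
source_a = {
    "a1": {"name": "M. Smith", "dob": "1990-01-01"},
    "a2": {"name": "Mary Smith", "dob": "1990-01-01"},
}
source_b = {
    "b1": {"name": "Mary Smith", "dob": "1990-01-01"},
    "b2": {"name": "Maria Smith", "dob": "1990-01-01"},
}

# Methods may be listed in any order; the cascade sorts them by confidence.
methods = [(0.6, surname_dob_match), (0.95, exact_match)]
links = cascade_link(source_a, source_b, methods)
print(links)
```

In this toy example the exact rule links a2 to b1 first, and the surname rule then links the remaining a1 to b2. If the surname rule ran first, a1 ("M. Smith") would be paired with b1 ("Mary Smith"), and the exact pair a2/b1 would be lost. The output is a flat list of pairs with no transitive closure, as in Binary ER.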
Paper Citation
in Harvard Style
Syed K., Mohammed O., Talburt J., Tarannum A., Mohammed A., Mir M. and Mohammed M. (2025). Improving Quality of Entity Resolution Using a Cascade Approach. In Proceedings of the 17th International Conference on Agents and Artificial Intelligence - Volume 3: ICAART; ISBN 978-989-758-737-5, SciTePress, pages 60-69. DOI: 10.5220/0013077400003890
in Bibtex Style
@conference{icaart25,
author={Khizer Syed and Onais Mohammed and John Talburt and Adeeba Tarannum and Altaf Mohammed and Mudasar Mir and Mahboob Mohammed},
title={Improving Quality of Entity Resolution Using a Cascade Approach},
booktitle={Proceedings of the 17th International Conference on Agents and Artificial Intelligence - Volume 3: ICAART},
year={2025},
pages={60-69},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0013077400003890},
isbn={978-989-758-737-5},
}
in EndNote Style
TY - CONF
JO - Proceedings of the 17th International Conference on Agents and Artificial Intelligence - Volume 3: ICAART
TI - Improving Quality of Entity Resolution Using a Cascade Approach
SN - 978-989-758-737-5
AU - Syed K.
AU - Mohammed O.
AU - Talburt J.
AU - Tarannum A.
AU - Mohammed A.
AU - Mir M.
AU - Mohammed M.
PY - 2025
SP - 60
EP - 69
DO - 10.5220/0013077400003890
PB - SciTePress