Improving Quality of Entity Resolution Using a Cascade Approach

Khizer Syed, Onais Khan Mohammed, John Talburt, Adeeba Tarannum, Altaf Mohammed, Mudasar Ali Mir, Mahboob Khan Mohammed

2025

Abstract

Entity Resolution (ER) is a critical technique in data management that determines whether two or more data references correspond to the same real-world entity. It is essential for cleansing datasets and linking information across diverse records. One variant, Binary Entity Resolution, compares pairs of records directly and does not apply the transitive closure typically found in cluster-based approaches. In cluster-based ER, indirect linkages imply broader associations among multiple records (e.g., if A is linked with B and B is linked with C, then A is linked with C indirectly). Binary ER instead performs pairwise matching only, so its output is simply a series of pairs drawn from two distinct sources. In this paper, we present a novel improvement to the cascade process used in entity resolution. Our data-centric, descending confidence cascade approach orders linking methods by confidence level, from highest to lowest. Higher-confidence methods, which are more accurate, are applied first, potentially enhancing the accuracy of the lower-confidence methods that follow. As a result, our approach produces better-quality matches than traditional methods that do not use a cascade, yielding more accurate entity resolution while maintaining high-quality links. This improvement is particularly significant in Binary ER, where the focus is on pairwise matches and the quality of each link is crucial.
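The descending confidence ordering described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation. The `LinkMethod` class, the `cascade_link` function, the two matching rules, their confidence scores, and the sample records are all hypothetical. The sketch also assumes that a record linked by a higher-confidence method is withheld from later methods, which is one plausible way an earlier pass could help a later one; the abstract does not state the exact mechanism.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Record = Dict[str, str]


@dataclass(frozen=True)
class LinkMethod:
    name: str                                  # hypothetical label for the rule
    confidence: float                          # assumed score in [0, 1]
    matches: Callable[[Record, Record], bool]  # pairwise comparison rule


def cascade_link(source_a: List[Record], source_b: List[Record],
                 methods: List[LinkMethod]) -> List[Tuple[str, str, str]]:
    """Link two sources pairwise, applying the highest-confidence method first."""
    links: List[Tuple[str, str, str]] = []
    used_a, used_b = set(), set()
    # Descending confidence: the most trusted rule runs before weaker ones.
    for method in sorted(methods, key=lambda m: m.confidence, reverse=True):
        for a in source_a:
            if a["id"] in used_a:
                continue  # assumption: already linked by a stronger rule
            for b in source_b:
                if b["id"] in used_b:
                    continue
                if method.matches(a, b):
                    links.append((a["id"], b["id"], method.name))
                    used_a.add(a["id"])
                    used_b.add(b["id"])
                    break
    # Binary ER output: plain pairs, with no transitive closure applied.
    return links


# Hypothetical rules, deliberately listed in the "wrong" order.
name_zip = LinkMethod(
    "name_zip", 0.80,
    lambda a, b: a["name"].lower() == b["name"].lower() and a["zip"] == b["zip"])
exact_ssn = LinkMethod(
    "exact_ssn", 0.99,
    lambda a, b: a["ssn"] != "" and a["ssn"] == b["ssn"])

source_a = [
    {"id": "a1", "ssn": "111", "name": "Ann Lee", "zip": "72201"},
    {"id": "a2", "ssn": "", "name": "Bob Ray", "zip": "72202"},
]
source_b = [
    {"id": "b1", "ssn": "", "name": "ann lee", "zip": "72201"},
    {"id": "b2", "ssn": "111", "name": "Ann Lee", "zip": "72201"},
    {"id": "b3", "ssn": "", "name": "Bob Ray", "zip": "72202"},
]

links = cascade_link(source_a, source_b, [name_zip, exact_ssn])
print(links)
```

In this toy data, running the weaker `name_zip` rule first would pair `a1` with `b1`. Because the cascade runs `exact_ssn` first, `a1` is paired with `b2` and is no longer available when `name_zip` runs, so `b1` stays unlinked.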

Paper Citation


in Harvard Style

Syed K., Mohammed O., Talburt J., Tarannum A., Mohammed A., Mir M. and Mohammed M. (2025). Improving Quality of Entity Resolution Using a Cascade Approach. In Proceedings of the 17th International Conference on Agents and Artificial Intelligence - Volume 3: ICAART; ISBN 978-989-758-737-5, SciTePress, pages 60-69. DOI: 10.5220/0013077400003890


in Bibtex Style

@conference{icaart25,
author={Khizer Syed and Onais Mohammed and John Talburt and Adeeba Tarannum and Altaf Mohammed and Mudasar Mir and Mahboob Mohammed},
title={Improving Quality of Entity Resolution Using a Cascade Approach},
booktitle={Proceedings of the 17th International Conference on Agents and Artificial Intelligence - Volume 3: ICAART},
year={2025},
pages={60-69},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0013077400003890},
isbn={978-989-758-737-5},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 17th International Conference on Agents and Artificial Intelligence - Volume 3: ICAART
TI - Improving Quality of Entity Resolution Using a Cascade Approach
SN - 978-989-758-737-5
AU - Syed K.
AU - Mohammed O.
AU - Talburt J.
AU - Tarannum A.
AU - Mohammed A.
AU - Mir M.
AU - Mohammed M.
PY - 2025
SP - 60
EP - 69
DO - 10.5220/0013077400003890
PB - SciTePress