Authors: Alieh Saeedi 1,2; Lucie David 2 and Erhard Rahm 1,2

Affiliations: 1 ScaDS.AI Dresden/Leipzig, Germany; 2 University of Leipzig, Germany

Keyword(s): Entity Resolution, Hierarchical Agglomerative Clustering, Multi-source ER, MSCD-HAC.

Abstract: We propose extensions to Hierarchical Agglomerative Clustering (HAC) to match and cluster entities from multiple sources, each of which can be either duplicate-free or dirty. The proposed scheme is evaluated against standard HAC as well as other entity clustering approaches in terms of efficiency and efficacy. All proposed algorithms can be run in parallel on a distributed cluster to improve scalability to large data volumes. The evaluation with diverse datasets shows that the new approach can utilize duplicate-free sources and achieves better match quality than previous methods.

CC BY-NC-ND 4.0

Paper citation in several formats:
Saeedi, A.; David, L. and Rahm, E. (2021). Matching Entities from Multiple Sources with Hierarchical Agglomerative Clustering. In Proceedings of the 13th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (IC3K 2021) - KEOD; ISBN 978-989-758-533-3; ISSN 2184-3228, SciTePress, pages 40-50. DOI: 10.5220/0010649600003064

@conference{keod21,
author={Alieh Saeedi and Lucie David and Erhard Rahm},
title={Matching Entities from Multiple Sources with Hierarchical Agglomerative Clustering},
booktitle={Proceedings of the 13th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (IC3K 2021) - KEOD},
year={2021},
pages={40-50},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0010649600003064},
isbn={978-989-758-533-3},
issn={2184-3228},
}

TY  - CONF
JO  - Proceedings of the 13th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (IC3K 2021) - KEOD
TI  - Matching Entities from Multiple Sources with Hierarchical Agglomerative Clustering
SN  - 978-989-758-533-3
IS  - 2184-3228
AU  - Saeedi, A.
AU  - David, L.
AU  - Rahm, E.
PY  - 2021
SP  - 40
EP  - 50
DO  - 10.5220/0010649600003064
PB  - SciTePress