Cost-Aware TrE-ND: Tri-embed Noise Detection for Enhancing Data Quality of Knowledge Graph

Jumana Alsubhi, Abdulrahman Gharawi, Lakshmish Ramaswamy

2024

Abstract

Knowledge graphs (KGs) are increasingly used in machine learning for tasks such as question answering, recommendation systems, and natural language processing. Whether constructed manually or automatically, KGs are inherently susceptible to noise. Existing techniques often fail to identify noisy triples precisely, which compromises the utility of KGs for downstream applications. Manual noise detection is also costly, ranging from $2 to $6 per triple, which highlights the need for cost-effective solutions, especially for large KGs. To tackle this problem, we introduce Tri-embed Noise Detection (TrE-ND), a highly accurate and cost-efficient noise detection approach for KGs. TrE-ND combines semantic depth, hierarchical modeling, and scalability for robust noise detection in large KGs. We also use TrE-ND to evaluate the overall quality of these KGs. We validate TrE-ND through comprehensive experiments on two widely recognized KG datasets, FB13 and WN11, each containing varying degrees of noise. Our findings demonstrate a substantial improvement in noise detection and KG evaluation accuracy compared to existing methods. TrE-ND flags noisy triples with an average accuracy of approximately 87%, even when up to 40% of the dataset is noisy. This simplifies subsequent verification by domain experts and makes it more cost-effective. Our proposed method therefore offers a viable solution for efficiently addressing the persistent issue of noise in KGs, and it paves the way for future research on cost-aware noise mitigation techniques and their applications in various domains.
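To make the cost argument concrete, the short Python sketch below applies the $2 to $6 per-triple range quoted in the abstract to a KG of a given size. The KG size (100,000 triples) and the share of triples sent to experts (40%) are hypothetical values chosen only for illustration; they are not results from the paper, and the function name `verification_cost` is likewise an assumption of this sketch.

```python
# Per-triple manual verification cost range quoted in the abstract (USD).
COST_LOW, COST_HIGH = 2.0, 6.0


def verification_cost(num_triples):
    """Return the (low, high) cost in USD of manually checking num_triples."""
    if num_triples < 0:
        raise ValueError("num_triples must be non-negative")
    return num_triples * COST_LOW, num_triples * COST_HIGH


# Hypothetical numbers, not from the paper: a 100,000-triple KG in which
# an automatic detector flags 40% of the triples for expert review.
total_triples = 100_000
flagged_triples = total_triples * 40 // 100

full_low, full_high = verification_cost(total_triples)
flag_low, flag_high = verification_cost(flagged_triples)

print(f"verify every triple:  ${full_low:,.0f} to ${full_high:,.0f}")
print(f"verify flagged only:  ${flag_low:,.0f} to ${flag_high:,.0f}")
```

Under these assumed numbers, restricting expert review to the flagged triples scales the manual cost in proportion to the flagged share. How much is actually saved in practice depends on the detector's accuracy, which this sketch does not model.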

Paper Citation


in Harvard Style

Alsubhi J., Gharawi A. and Ramaswamy L. (2024). Cost-Aware TrE-ND: Tri-embed Noise Detection for Enhancing Data Quality of Knowledge Graph. In Proceedings of the 16th International Conference on Agents and Artificial Intelligence - Volume 3: ICAART; ISBN 978-989-758-680-4, SciTePress, pages 1020-1027. DOI: 10.5220/0012431700003636


in Bibtex Style

@conference{icaart24,
author={Jumana Alsubhi and Abdulrahman Gharawi and Lakshmish Ramaswamy},
title={Cost-Aware TrE-ND: Tri-embed Noise Detection for Enhancing Data Quality of Knowledge Graph},
booktitle={Proceedings of the 16th International Conference on Agents and Artificial Intelligence - Volume 3: ICAART},
year={2024},
pages={1020-1027},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0012431700003636},
isbn={978-989-758-680-4},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 16th International Conference on Agents and Artificial Intelligence - Volume 3: ICAART
TI - Cost-Aware TrE-ND: Tri-embed Noise Detection for Enhancing Data Quality of Knowledge Graph
SN - 978-989-758-680-4
AU - Alsubhi J.
AU - Gharawi A.
AU - Ramaswamy L.
PY - 2024
SP - 1020
EP - 1027
DO - 10.5220/0012431700003636
PB - SciTePress