Entity Linking of Sound Recordings and Compositions with Pre-trained Language Models

Nikiforos Katakis, Pantelis Vikatos

2021

Abstract

In this paper, we present a Deep Learning (DL) approach to a real-world, large-scale music entity matching task. Poor data quality, missing information, and the absence of unique identifiers reduce the effectiveness of entity matching and pose many challenges to the matching process. We propose an efficient method for linking recordings to their compositions through metadata using pre-trained language models. We represent each entity as a vector and estimate the similarity between the vectors of a pair of entities. Our experiments show that applying language models such as BERT, DistilBERT, or ALBERT, pre-trained on large text corpora, significantly improves matching quality at an industrial level. We created a human-annotated dataset of sound recording and composition pairs obtained from music usage logs and publishers, respectively. The proposed language model achieves 95% precision and 96.5% recall, a strong result on this challenging task.
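The vector-and-similarity step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `embed` function below is a toy stand-in (hashed character trigrams) for a pre-trained language-model encoder such as BERT, and the names `embed`, `cosine`, `link`, the dimension `DIM`, and the default threshold are all hypothetical choices made only for illustration.

```python
import math
import zlib
from collections import Counter
from typing import List, Optional, Sequence, Tuple

DIM = 4096  # hypothetical vector size for the toy encoder


def embed(text: str, n: int = 3) -> List[float]:
    """Toy stand-in for a language-model encoder: hashed character n-grams."""
    normalized = " ".join(text.lower().split())  # lowercase, collapse whitespace
    padded = f" {normalized} "
    grams = Counter(padded[i:i + n] for i in range(len(padded) - n + 1))
    vec = [0.0] * DIM
    for gram, count in grams.items():
        # crc32 is deterministic across runs, unlike the built-in hash()
        vec[zlib.crc32(gram.encode("utf-8")) % DIM] += count
    return vec


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def link(recording: str, compositions: Sequence[str],
         threshold: float = 0.5) -> Tuple[Optional[str], float]:
    """Return the most similar composition, or None if below the threshold."""
    if not compositions:
        return None, 0.0
    rec_vec = embed(recording)
    scored = [(cosine(rec_vec, embed(c)), c) for c in compositions]
    best_score, best = max(scored)
    if best_score >= threshold:
        return best, best_score
    return None, best_score


if __name__ == "__main__":
    # Made-up metadata strings, used only to exercise the functions.
    recording = "Yesterday - The Beatles (Remastered 2009)"
    compositions = [
        "Yesterday / Lennon, McCartney",
        "Bohemian Rhapsody / Mercury",
        "Imagine / Lennon",
    ]
    print(link(recording, compositions, threshold=0.0))
```

In a real pipeline, `embed` would presumably be replaced by the pooled output of a pre-trained model, and the threshold would be tuned on annotated pairs to trade precision against recall.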


Paper Citation


in Harvard Style

Katakis N. and Vikatos P. (2021). Entity Linking of Sound Recordings and Compositions with Pre-trained Language Models. In Proceedings of the 17th International Conference on Web Information Systems and Technologies - Volume 1: WEBIST, ISBN 978-989-758-536-4, pages 474-481. DOI: 10.5220/0010713900003058


in Bibtex Style

@conference{webist21,
author={Nikiforos Katakis and Pantelis Vikatos},
title={Entity Linking of Sound Recordings and Compositions with Pre-trained Language Models},
booktitle={Proceedings of the 17th International Conference on Web Information Systems and Technologies - Volume 1: WEBIST},
year={2021},
pages={474-481},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0010713900003058},
isbn={978-989-758-536-4},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 17th International Conference on Web Information Systems and Technologies - Volume 1: WEBIST
TI - Entity Linking of Sound Recordings and Compositions with Pre-trained Language Models
SN - 978-989-758-536-4
AU - Katakis N.
AU - Vikatos P.
PY - 2021
SP - 474
EP - 481
DO - 10.5220/0010713900003058