Authors:
Italo Lopes Oliveira (1); Diego Moussallem (2); Luís Paulo Faina Garcia (3) and Renato Fileto (1)
Affiliations:
(1) Department of Informatics and Statistics, Federal University of Santa Catarina, Florianópolis, Santa Catarina, Brazil
(2) Data Science Group, University of Paderborn, North Rhine-Westphalia, Germany
(3) Computer Science Department, University of Brasilia, Brasília, Distrito Federal, Brazil
Keyword(s):
Entity Linking, Knowledge Embedding, Word Embedding, Deep Neural Network.
Abstract:
Entity Linking (EL) for microblog posts is still a challenge because of their usually informal language and limited textual context. Most current EL approaches for microblog posts expand each post's context by considering related posts, user interest information, spatial data, and temporal data. These approaches can therefore be too invasive, compromising user privacy, which in turn hinders data sharing and experimental reproducibility. Moreover, most of these approaches employ graph-based methods instead of state-of-the-art embedding-based ones. This paper proposes a knowledge-intensive EL approach for microblog posts called OPTIC. It relies on jointly trained word and knowledge embeddings to represent contexts given by the semantics of words and of the entity candidates for mentions found in the posts. These embedded semantic contexts feed a deep neural network that exploits semantic coherence along with the popularity of the entity candidates to disambiguate them. Experiments using the benchmark system GERBIL show that OPTIC outperforms most of the approaches on the NEEL challenge 2016 dataset.