Authors:
Ndiouma Bame (1); Ibrahima Gueye (2) and Hubert Naacke (3)
Affiliations:
(1) Département de Mathématiques-Informatique, Université Cheikh Anta Diop, Dakar, Senegal
(2) LTISI, Ecole Polytechnique de Thiès, Thiès, Senegal
(3) LIP6, Sorbonne Université, Paris, France
Keyword(s):
Geographic Similarity, Semantic Similarity, Event-POI Matching, Sentence Embedding, Open Data.
Abstract:
Users often share data about their daily activities through social networks. These event data are useful for a variety of use cases, such as point of interest (POI) recommendation. However, event data often lack information about POIs, so enriching event data with POI information is of utmost importance. This requires identifying the POI at which an event took place before completing the data. We face the problem of aligning two types of data sources, event data and POI data, which is difficult because they share neither a common identifier nor the same descriptive attributes. This work proposes and implements a complete methodology for enriching a large dataset of geolocated user events with POIs, using both geographical and semantic properties. This methodology for matching POIs with geolocated events comprises four steps: (i) we cross-reference the data using spatial proximity to define the geographical neighborhood of each event; (ii) we define the semantic neighborhood of each event based on a threshold on the semantic similarity, which exploits event data such as contextual descriptions and tags by comparing them with those of the POIs; (iii) the two types of similarity are combined, for each POI in the event's semantic neighborhood, into a geo-semantic similarity score; (iv) each event is matched with the POI of its semantic neighborhood that maximizes the geo-semantic similarity score. We propose a robust model of our methodology and evaluate the effectiveness of our approach.
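The four steps above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the distance measure, the thresholds, or the combination formula, so everything below is an assumption made for illustration. In particular, the haversine distance, the 200 m radius, the 0.2 semantic threshold, the linear weighting with `alpha`, the choice to search for semantic neighbors only inside the geographical neighborhood, and the bag-of-words cosine (a stand-in for a sentence-embedding similarity) are all hypothetical, as are the sample event and POIs.

```python
import math
from collections import Counter


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) points."""
    r = 6371000.0  # mean Earth radius in meters
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def cosine_bow(text_a, text_b):
    """Cosine similarity of word counts (stand-in for sentence embeddings)."""
    a, b = Counter(text_a.lower().split()), Counter(text_b.lower().split())
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def match_event(event, pois, radius_m=200.0, sem_threshold=0.2, alpha=0.5):
    """Return (poi_name, score) for the best POI, or None if no candidate remains.

    radius_m, sem_threshold and alpha are assumed parameters, not values
    taken from the paper.
    """
    # Step (i): geographical neighborhood = POIs within radius_m of the event.
    geo = []
    for poi in pois:
        d = haversine_m(event["lat"], event["lon"], poi["lat"], poi["lon"])
        if d <= radius_m:
            geo.append((poi, d))

    # Step (ii): semantic neighborhood = geographic neighbors whose text
    # similarity to the event reaches the threshold (assumed nesting).
    sem = []
    for poi, d in geo:
        s = cosine_bow(event["text"], poi["text"])
        if s >= sem_threshold:
            sem.append((poi, d, s))
    if not sem:
        return None

    # Step (iii): geo-semantic score as an assumed linear combination of a
    # distance-based similarity in [0, 1] and the semantic similarity.
    def score(d, s):
        geo_sim = 1.0 - d / radius_m
        return alpha * geo_sim + (1.0 - alpha) * s

    # Step (iv): keep the POI that maximizes the geo-semantic score.
    poi, d, s = max(sem, key=lambda t: score(t[1], t[2]))
    return poi["name"], score(d, s)


# Hypothetical sample data, for illustration only.
event = {"lat": 48.8584, "lon": 2.2945, "text": "sunset view from the tower with friends"}
pois = [
    {"name": "Tower", "lat": 48.8583, "lon": 2.2944, "text": "iron tower monument view"},
    {"name": "Cafe", "lat": 48.8586, "lon": 2.2950, "text": "coffee and pastries"},
    {"name": "Far Tower", "lat": 48.9000, "lon": 2.3500, "text": "tower view"},
]
result = match_event(event, pois)
print(result)
```

With this made-up data, the cafe is dropped at step (ii) because its description shares no words with the event, and the distant tower is dropped at step (i), so the nearby tower is the only remaining candidate. In practice, the bag-of-words function would be replaced by a similarity computed over sentence embeddings of the event description and tags.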