Authors:
Amin Mantrach and Jean-Michel Renders
Affiliation:
Xerox Research Centre Europe, France
Keyword(s):
Social media mining, Information retrieval, Social retrieval, Data fusion, Data aggregation, Multi-view problems, Multiple graphs, Collaborative recommendation, Similarity measures, Pseudo-relevance feedback.
Related Ontology Subjects/Areas/Topics:
Artificial Intelligence; Information Extraction; Knowledge Discovery and Information Retrieval; Knowledge-Based Systems; Mining Multimedia Data; Mining Text and Semi-Structured Data; Symbolic Systems; User Profiling and Recommender Systems
Abstract:
The growing importance of social media and heterogeneous relational data emphasizes the fundamental problem of combining different sources of evidence (or modes) efficiently. In this work, we consider the problem of people retrieval, where the requested information consists of persons rather than documents. The processed queries generally contain both textual keywords and social links, while the target collection consists of a set of documents with social metadata. Traditional approaches tackle this problem by early or late fusion where, typically, a person is represented by two sets of features: a word profile and a contact/link profile. Inspired by cross-modal similarity measures initially designed to combine image and text, we propose new ways of combining social and content aspects for retrieving people from a collection of documents with social metadata. To this end, we define a set of multimodal similarity measures between socially-labelled documents and queries, which can then be aggregated at the person level to provide a final relevance score for the general people retrieval problem. We then examine particular instances of this problem: author retrieval, recipient recommendation and alias detection. Experiments conducted on the ENRON email collection show the benefits of our proposed approach with respect to more standard fusion and aggregation methods.