Authors:
Christos Bouras 1 and Vassilis Tsogkas 2
Affiliations:
1 University of Patras and Computer Technology Institute and Press “Diophantus”, Greece; 2 Computer Technology Institute and Press “Diophantus”, Greece
Keyword(s):
Clustering, Text Preprocessing, User Personalization, n-Grams, W-kmeans.
Related Ontology Subjects/Areas/Topics:
Artificial Intelligence; Biomedical Engineering; Business Analytics; Data Analytics; Data Engineering; Data Management and Quality; Data Mining; Databases and Information Systems Integration; Datamining; Enterprise Information Systems; Health Information Systems; Knowledge Discovery and Information Retrieval; Knowledge-Based Systems; Sensor Networks; Signal Processing; Soft Computing; Statistics Exploratory Data Analysis; Symbolic Systems; Text Analytics
Abstract:
As online information sources grow rapidly, so does the volume of news content published online each day. Several approaches have been proposed for organizing this immense amount of data. In this work we explore the integration of multiple information retrieval techniques, such as text preprocessing, n-gram expansion, summarization, categorization and item/user clustering, into a single mechanism designed to consolidate and index news articles from major news portals around the web. Our goal is to allow users to seamlessly and quickly get the news of the day that appeals to them. We show how the application of each of the proposed techniques gradually improves the precision of the news articles suggested to a number of registered system users and how, in aggregate, these techniques provide a unified solution to the recommendation problem.