Authors:
Rocco Tripodi and Marcello Pelillo
Affiliation:
Ca' Foscari University, Italy
Keyword(s):
Document Clustering, Dominant Set, Game Theory.
Related Ontology Subjects/Areas/Topics:
Applications; Artificial Intelligence; Clustering; Data Engineering; Graphical and Graph-Based Models; Information Retrieval; Knowledge Engineering and Ontology Development; Knowledge-Based Systems; Natural Language Processing; Ontologies and the Semantic Web; Pattern Recognition; Software Engineering; Symbolic Systems; Theory and Methods
Abstract:
In this article we propose a new model for document clustering based on game-theoretic principles. Each document to be clustered is represented as a player, in the game-theoretic sense, and each cluster as a strategy that the players choose in order to maximize their payoff. The geometry of the data is modeled as a graph that encodes the pairwise similarity between documents, and the games are played among similar players. In each game the players update their strategies according to which strategies have been effective in previous games. The Dominant Set clustering algorithm is used to find the prototypical elements of each cluster. This information is used to divide the players into two disjoint sets: labeled players, which always play a definite strategy, and unlabeled players, which update their strategy at each iteration of the games. The evaluation of the system was conducted on 13 document datasets and shows that the proposed method performs well compared to different document clustering algorithms.
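
The iterative game described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions the abstract does not state: strategies are updated with discrete-time replicator dynamics, the payoff of a cluster is the similarity-weighted sum of the neighbours' current strategies, and the labeled players are passed in by hand instead of being extracted with the Dominant Set algorithm. The function name `cluster_by_games` and its parameters are hypothetical.

```python
import numpy as np

def cluster_by_games(W, labeled, n_clusters, n_iter=100):
    """Illustrative sketch only (assumed update rule, not the paper's code).

    W          : symmetric, non-negative similarity matrix (n x n), zero diagonal.
    labeled    : dict {document index: cluster index} for the labeled players.
    n_clusters : number of strategies (clusters).
    Returns one cluster index per document.
    """
    n = W.shape[0]
    # Unlabeled players start from a uniform mixed strategy over the clusters.
    X = np.full((n, n_clusters), 1.0 / n_clusters)
    # Labeled players always play one definite (pure) strategy.
    for i, c in labeled.items():
        X[i] = 0.0
        X[i, c] = 1.0
    fixed = np.array(sorted(labeled), dtype=int)

    for _ in range(n_iter):
        # Assumed payoff: each cluster earns the similarity-weighted
        # share of the neighbours that currently favour it.
        U = W @ X
        # Discrete replicator step: strategies that paid off gain weight.
        X_new = X * U
        sums = X_new.sum(axis=1, keepdims=True)
        # Players with zero total payoff (isolated nodes) keep their strategy.
        X_new = np.where(sums > 0, X_new / np.where(sums > 0, sums, 1.0), X)
        # Labeled players never change strategy.
        X_new[fixed] = X[fixed]
        X = X_new

    # Each document is assigned to its most probable cluster.
    return X.argmax(axis=1)
```

For example, with six documents forming two tightly connected groups of three and one labeled document per group, the unlabeled documents would be expected to adopt the strategy of the labeled document they are most similar to. Restricting the games to similar players corresponds here to the zero entries of `W`, which contribute nothing to a player's payoff.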