Authors: Yann Carbonne (1) and Christelle Jacob (2)
Affiliations: (1) University of Technology of Troyes, France; (2) Altran Technologies, France
Keyword(s):
Genetic Algorithm, Machine Learning, Natural Language Processing, Profiles Recognition, Clustering.
Related Ontology Subjects/Areas/Topics:
Artificial Intelligence; Computational Intelligence; Evolutionary Computing; Genetic Algorithms; Informatics in Control, Automation and Robotics; Intelligent Control Systems and Optimization; Knowledge Discovery and Information Retrieval; Knowledge-Based Systems; Machine Learning; Soft Computing; Symbolic Systems
Abstract:
People are often asked to provide information about themselves. These data are very heterogeneous and
result in as many “profiles” as contexts. Sorting a large number of profiles from different contexts and
assigning them back to a specific individual is a difficult problem. Semantic processing and machine
learning are key tools for achieving this goal. This paper describes a framework that addresses the problem by
means of concepts and algorithms selected from different Artificial Intelligence fields. First, a Vector Space
Model is customized to transpose semantic information into a mathematical model. Then, this model goes
through a Genetic Algorithm (GA), which is used as a supervised learning algorithm to train a computer
to determine how similar two profiles are. Among the GAs, this study introduces a new reproduction
method (Best Together) and compares it to some usual ones (Wheel, Binary Tournament). This paper also
evaluates the accuracy of the GAs' predictions for profile clustering through the computation of a similarity
score, as well as their ability to classify two profiles as similar or non-similar. We believe that the overall
methodology can be used for any kind of source using profiles and, more generally, for similar data
recognition.
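
As a rough illustration of the pipeline described in the abstract, the sketch below scores two profile vectors with a weighted cosine similarity and shows the two usual selection schemes the abstract names (Wheel and Binary Tournament). Everything here is an assumption made for illustration: the function names, the weighted-cosine form of the similarity, the 0.5 decision threshold, and the accuracy-based fitness are not taken from the paper. The Best Together method is not sketched because the abstract does not define it.

```python
import random

def weighted_similarity(a, b, weights):
    """Weighted cosine similarity between two equal-length feature vectors.

    Assumes non-negative weights (hypothetical GA chromosome: one weight per feature).
    """
    num = sum(w * x * y for w, x, y in zip(weights, a, b))
    norm_a = sum(w * x * x for w, x in zip(weights, a)) ** 0.5
    norm_b = sum(w * y * y for w, y in zip(weights, b)) ** 0.5
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return num / (norm_a * norm_b)

def fitness(weights, labelled_pairs, threshold=0.5):
    """Fraction of labelled pairs classified correctly as similar / non-similar.

    The threshold is an illustrative assumption, not a value from the paper.
    """
    correct = 0
    for a, b, is_similar in labelled_pairs:
        predicted = weighted_similarity(a, b, weights) >= threshold
        correct += predicted == is_similar
    return correct / len(labelled_pairs)

def wheel_selection(population, scores, rng):
    """Roulette wheel: pick an individual with probability proportional to its score."""
    total = sum(scores)
    if total == 0:
        return rng.choice(population)
    pick = rng.uniform(0, total)
    running = 0.0
    for individual, score in zip(population, scores):
        running += score
        if pick <= running:
            return individual
    return population[-1]

def binary_tournament(population, scores, rng):
    """Binary tournament: draw two distinct individuals at random, keep the fitter one."""
    i, j = rng.sample(range(len(population)), 2)
    return population[i] if scores[i] >= scores[j] else population[j]
```

In a full GA loop, each individual would be a weight vector, `fitness` would be evaluated on labelled profile pairs, and one of the selection functions would choose parents for crossover and mutation at each generation.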