Author:
Fabien Duchateau
Affiliation:
Université Lyon 1, France
Keyword(s):
Data Integration, Schema Matching, Ontology Alignment, Entity Resolution, Entity Matching, Selection of Correspondences.
Related Ontology Subjects/Areas/Topics:
Artificial Intelligence; Biomedical Engineering; Collaboration and e-Services; Complex Systems Modeling and Simulation; Data Engineering; Data Management and Quality; Data Structures and Data Management Algorithms; e-Business; Enterprise Information Systems; Health Information Systems; Information Integration; Integration/Interoperability; Interoperability; Knowledge Engineering and Ontology Development; Knowledge Management and Information Sharing; Knowledge-Based Systems; Modeling and Managing Large Data Systems; Ontologies and the Semantic Web; Sensor Networks; Simulation and Modeling; Software Agents and Internet Computing; Software and Architectures; Symbolic Systems
Abstract:
Web 2.0 and the low cost of storage have driven exponential growth in the volume of collected and produced data. However, the integration of distributed and heterogeneous data sources has become the bottleneck for many applications, and it still relies largely on manual tasks. One of these tasks, named matching or alignment, is the discovery of correspondences, i.e., semantically equivalent elements in different data sources. Most approaches that attempt to solve this challenge face the issue of deciding whether a pair of elements is a correspondence, given the similarity value(s) computed for this pair. In this paper, we propose a generic and flexible framework for selecting correspondences by relying on the discriminative similarity values for a pair. Experiments on a public dataset demonstrate improved quality, as well as robustness when new similarity measures are added without user intervention for tuning.
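To make the selection problem concrete, the sketch below shows a common baseline: average the similarity values of each candidate pair and keep the pairs that reach a threshold. It is an illustration of the decision step described in the abstract, not the framework proposed in the paper. The function name `select_correspondences`, the measure names, the element names, and all similarity values are hypothetical.

```python
from statistics import mean


def select_correspondences(similarities, threshold=0.5):
    """Baseline selection: keep pairs whose average similarity reaches the threshold.

    `similarities` maps a pair of element names to a dict of
    {measure name: similarity value in [0, 1]}. All names are illustrative.
    """
    selected = []
    for pair, values in similarities.items():
        # Aggregate all measures for this pair into a single score.
        if mean(values.values()) >= threshold:
            selected.append(pair)
    return selected


# Hypothetical candidate pairs with made-up similarity values.
candidates = {
    ("author", "writer"): {"edit_distance": 0.2, "synonym": 0.9},
    ("title", "title"): {"edit_distance": 1.0, "synonym": 1.0},
    ("price", "editor"): {"edit_distance": 0.1, "synonym": 0.0},
}

print(select_correspondences(candidates, threshold=0.5))
```

With these made-up values, raising the threshold from 0.5 to 0.6 drops the pair ("author", "writer") even though one measure rates it at 0.9. This sensitivity to aggregation and threshold tuning is the kind of issue that motivates selecting on discriminative similarity values instead.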