Authors:
Edemberg Rocha Silva (1); Bernadette Farias Lóscio (2) and Ana Carolina Salgado (2)
Affiliations:
(1) Federal Institute of Education, Science and Technology of Paraíba, Brazil
(2) Federal University of Pernambuco, Brazil
Keyword(s):
Semantic Balance, Semantic Clusters, Dynamic Data Integration Systems, Schema Evolution, Clustering Measure.
Related Ontology Subjects/Areas/Topics:
Coupling and Integrating Heterogeneous Data Sources; Databases and Information Systems Integration; Enterprise Information Systems; Organisational Issues on Systems Integration
Abstract:
Given the large volume of data sources on the Web, a system is needed that integrates them so that users can query them transparently. To make queries efficient, integration systems can group these sources into clusters according to the semantic similarity of their schemas. However, sources are autonomous: they may evolve their schemas and join or leave the integration system at any time. This autonomy can cause a problem that we define as the semantic unbalance of clusters. Semantic unbalance can compromise the formation of clusters and hence the efficiency of submitted queries. In this paper, we propose a solution, based on self-organization, to the semantic balance of clusters in dynamic data integration systems. We also introduce a measure to evaluate how semantically unbalanced the clusters are.