Authors:
Sonia Bergamaschi 1; Mirko Orsini 1; Francesco Guerra 1 and Claudio Sartori 2
Affiliations:
1 Università di Modena e Reggio Emilia, Italy; 2 Università di Bologna, Italy
Keyword(s):
Metadata, querying, data integration, data mining.
Related Ontology Subjects/Areas/Topics:
Coupling and Integrating Heterogeneous Data Sources; Databases and Information Systems Integration; Enterprise Information Systems
Abstract:
Research on data integration has provided languages and systems able to guarantee an integrated intensional representation of a given set of data sources. A significant limitation common to most proposals is that only intensional knowledge is considered, with little or no consideration for extensional knowledge.
In this paper we propose a technique to enrich the intension of an attribute with a new sort of metadata: the “relevant values”, extracted from the attribute values. Relevant values enrich schemata with domain knowledge; moreover, a user can exploit them in the interactive process of creating or refining a query. The technique, fully implemented in a prototype, is automatic and independent of the attribute domain. It is based on data mining clustering techniques and on semantics emerging from the data values. It can be parameterized with various similarity metrics and is a viable tool for dealing with frequently changing sources, as in the Semantic Web context.
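
To make the idea of extracting relevant values by clustering more concrete, the following is a minimal illustrative sketch, not the authors' prototype. It assumes string-valued attributes, a Jaccard similarity over character bigrams, single-link grouping above a threshold, and the shortest member as the cluster representative. All function names, the threshold, and the choice of representative are hypothetical; the abstract states only that the technique is clustering-based and parameterized by the similarity metric.

```python
from itertools import combinations


def bigrams(value):
    """Character bigrams of a normalized value (assumed tokenization)."""
    s = value.lower().strip()
    return {s[i:i + 2] for i in range(len(s) - 1)} or {s}


def jaccard(a, b):
    """Jaccard similarity between two sets; one possible metric among many."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def relevant_values(values, threshold=0.5, similarity=jaccard):
    """Group distinct attribute values whose similarity reaches the threshold
    (single-link, via union-find) and return {representative: members}.
    The representative is the shortest member, an assumption for illustration."""
    distinct = sorted(set(values))
    parent = {v: v for v in distinct}

    def find(v):
        # Follow parent links to the cluster root, compressing the path.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    grams = {v: bigrams(v) for v in distinct}
    for a, b in combinations(distinct, 2):
        if similarity(grams[a], grams[b]) >= threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for v in distinct:
        clusters.setdefault(find(v), []).append(v)
    return {min(m, key=lambda x: (len(x), x)): m for m in clusters.values()}


if __name__ == "__main__":
    sample = ["red wine", "red wines", "white wine", "beer", "beers", "beer"]
    for representative, members in relevant_values(sample).items():
        print(representative, "<-", members)
```

Passing a different function as `similarity` reflects the parameterization by similarity metric mentioned in the abstract; the representatives could then be offered to a user as candidate values while composing a query.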