Authors:
Mortaza S. Bargh 1; Jan van Dijk 1 and Sunil Choenni 2
Affiliations:
1 Research and Documentation Centre and Ministry of Security and Justice, Netherlands; 2 Research and Documentation Centre, Ministry of Security and Justice and Rotterdam University of Technology, Netherlands
Keyword(s):
Data Quality Issues, Data Quality Management, Knowledge Mapping, User Generated Inputs.
Related Ontology Subjects/Areas/Topics:
Architectural Concepts; Business Analytics; Data Engineering; Data Management and Quality; Data Warehouse Management; Information Quality; Organizational Concepts and Best Practices
Abstract:
Dealing with data quality related problems is an important issue that all organizations face in realizing and sustaining data intensive advanced applications. Upon detecting these problems in datasets, data analysts often register them in issue tracking systems in order to address them later on categorically and collectively. As there is no standard format for registering these problems, data analysts often describe them in natural languages and subsequently rely on ad-hoc, non-systematic, and expensive solutions to categorize and resolve registered problems. In this contribution we present a formal description of an innovative data quality resolving architecture to semantically and dynamically map the descriptions of data quality related problems to data quality attributes. Through this mapping, we reduce complexity – as the dimensionality of data quality attributes is far smaller than that of the natural language space – and enable data analysts to directly use the methods and tools proposed in literature. Furthermore, through managing data quality related problems, our proposed architecture offers data quality management in a dynamic way based on user generated inputs. The paper reports on a proof of concept tool and its evaluation.