Data Quality in Secondary Data Analysis: A Case Study of Ecological Data using a Semiotic-based Approach

Mila Kwiatkowska, Frank Pouw

Abstract

Data quality problems are widespread in secondary data when they are used for data warehousing and data mining. This paper advocates a broad semiotic approach to data quality. The main premises of this expanded semiotic framework are (1) data represent some reality, (2) data are created and interpreted by humans in a communication process, (3) data are used for specific purposes by humans, and (4) data cannot be created, interpreted and used without knowledge. Thus, the semiotic-based approach to data quality in secondary data analysis has four aspects: (1) representational, (2) communicational, (3) pragmatic, and (4) knowledge-based. To illustrate these four characteristics, we present a case study of ecological data analysis used in the creation of an ornithological data warehouse. We discuss the temporal data (ecological notion of time), spatial ecological data (communication processes and protocols used for data collection), and bioacoustic data processing (domain knowledge needed for the specification of data provenance).


Paper Citation


in Harvard Style

Kwiatkowska M. and Pouw F. (2019). Data Quality in Secondary Data Analysis: A Case Study of Ecological Data using a Semiotic-based Approach. In Proceedings of the 8th International Conference on Data Science, Technology and Applications - Volume 1: DATA, ISBN 978-989-758-377-3, pages 377-384. DOI: 10.5220/0007978403770384


in Bibtex Style

@conference{data19,
author={Mila Kwiatkowska and Frank Pouw},
title={Data Quality in Secondary Data Analysis: A Case Study of Ecological Data using a Semiotic-based Approach},
booktitle={Proceedings of the 8th International Conference on Data Science, Technology and Applications - Volume 1: DATA},
year={2019},
pages={377-384},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0007978403770384},
isbn={978-989-758-377-3},
}


in EndNote Style

TY - CONF

JO - Proceedings of the 8th International Conference on Data Science, Technology and Applications - Volume 1: DATA
TI - Data Quality in Secondary Data Analysis: A Case Study of Ecological Data using a Semiotic-based Approach
SN - 978-989-758-377-3
AU - Kwiatkowska M.
AU - Pouw F.
PY - 2019
SP - 377
EP - 384
DO - 10.5220/0007978403770384