Data Quality in Secondary Data Analysis: A Case Study of Ecological Data using a Semiotic-based Approach
Mila Kwiatkowska, Frank Pouw
2019
Abstract
Data quality problems are widespread in secondary data when they are used for data warehousing and data mining. This paper advocates a broad semiotic approach to data quality. The main premises of this expanded semiotic framework are (1) data represent some reality, (2) data are created and interpreted by humans in a communication process, (3) data are used for specific purposes by humans, and (4) data cannot be created, interpreted, and used without knowledge. Thus, the semiotic-based approach to data quality in secondary data analysis has four aspects: (1) representational, (2) communicational, (3) pragmatic, and (4) knowledge-based. To illustrate these four aspects, we present a case study of ecological data analysis used in the creation of an ornithological data warehouse. We discuss temporal data (ecological notion of time), spatial ecological data (communication processes and protocols used for data collection), and bioacoustic data processing (domain knowledge needed for the specification of data provenance).
Paper Citation
in Harvard Style
Kwiatkowska M. and Pouw F. (2019). Data Quality in Secondary Data Analysis: A Case Study of Ecological Data using a Semiotic-based Approach. In Proceedings of the 8th International Conference on Data Science, Technology and Applications - Volume 1: DATA, ISBN 978-989-758-377-3, pages 377-384. DOI: 10.5220/0007978403770384
in Bibtex Style
@conference{data19,
author={Mila Kwiatkowska and Frank Pouw},
title={Data Quality in Secondary Data Analysis: A Case Study of Ecological Data using a Semiotic-based Approach},
booktitle={Proceedings of the 8th International Conference on Data Science, Technology and Applications - Volume 1: DATA},
year={2019},
pages={377-384},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0007978403770384},
isbn={978-989-758-377-3},
}
in EndNote Style
TY - CONF
JO - Proceedings of the 8th International Conference on Data Science, Technology and Applications - Volume 1: DATA
TI - Data Quality in Secondary Data Analysis: A Case Study of Ecological Data using a Semiotic-based Approach
SN - 978-989-758-377-3
AU - Kwiatkowska M.
AU - Pouw F.
PY - 2019
SP - 377
EP - 384
DO - 10.5220/0007978403770384