Citable by Design - A Model for Making Data in Dynamic Environments Citable

Stefan Pröll, Andreas Rauber

2013

Abstract

Data forms the basis of research publications. Yet researchers still focus on the paper-based publication; data is seen as a supplement that may be offered as a download, often without further comment. Validation, verification, reproduction and re-use of existing knowledge are only possible when the research data is accessible and identifiable. For this reason, precise data citation mechanisms are required that allow experiments to be reproduced on exactly the same data basis. In this paper, we propose a model for citing, identifying and referencing specific data sets within their dynamic environments. Our model allows the selection of subsets that support experiment verification and the re-use of results in different contexts. The approach is based on assigning persistent identifiers to timestamped queries, which are executed against timestamped and versioned databases. This allows a transparent and scalable implementation that ensures identical result sets are delivered when the query is re-invoked.
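The idea in the last two sentences of the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the table layout, the `valid_from`/`valid_to` columns, the integer logical clock, the hash-based `hypothetical-pid:` identifier and all function names are assumptions made for illustration only. The sketch stores a query together with its execution timestamp under an identifier and re-executes it against a versioned table, so that later updates do not change the cited result.

```python
import hashlib
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed versioning scheme: rows are never overwritten, only closed.
conn.execute("""CREATE TABLE measurements (
    sensor TEXT, value REAL,
    valid_from INTEGER NOT NULL,
    valid_to INTEGER)""")  # valid_to is NULL while the version is current
conn.execute("""CREATE TABLE query_store (
    pid TEXT PRIMARY KEY, sql TEXT, executed_at INTEGER)""")

clock = 0  # logical clock standing in for a transaction timestamp


def tick():
    global clock
    clock += 1
    return clock


def upsert(sensor, value):
    """Close the current version of a row (if any) and append a new one."""
    now = tick()
    conn.execute(
        "UPDATE measurements SET valid_to = ? "
        "WHERE sensor = ? AND valid_to IS NULL", (now, sensor))
    conn.execute(
        "INSERT INTO measurements VALUES (?, ?, ?, NULL)",
        (sensor, value, now))


def run_as_of(sql, as_of):
    """Execute a query against the data as it was at time `as_of`."""
    return conn.execute(sql, {"as_of": as_of}).fetchall()


# A subset selection; the temporal predicate picks the versions valid
# at :as_of, and ORDER BY keeps the result order stable.
SUBSET_SQL = """SELECT sensor, value FROM measurements
    WHERE value > 10
      AND valid_from <= :as_of
      AND (valid_to IS NULL OR valid_to > :as_of)
    ORDER BY sensor"""


def cite(sql):
    """Store the query with its timestamp under a (hypothetical) identifier."""
    executed_at = tick()
    digest = hashlib.sha256(f"{sql}|{executed_at}".encode()).hexdigest()[:12]
    pid = "hypothetical-pid:" + digest
    conn.execute("INSERT INTO query_store VALUES (?, ?, ?)",
                 (pid, sql, executed_at))
    return pid, run_as_of(sql, executed_at)


def resolve(pid):
    """Re-execute the stored query as of its original timestamp."""
    sql, executed_at = conn.execute(
        "SELECT sql, executed_at FROM query_store WHERE pid = ?",
        (pid,)).fetchone()
    return run_as_of(sql, executed_at)


upsert("a", 12.0)
upsert("b", 8.0)
pid, cited = cite(SUBSET_SQL)          # subset as cited
upsert("a", 5.0)                       # the data keeps changing afterwards
upsert("b", 20.0)
current = run_as_of(SUBSET_SQL, clock)  # same query on the current state
```

Here `resolve(pid)` returns the same rows as `cited`, while `current` reflects the later updates. In a real deployment the identifier would be issued by a persistent identifier service and the timestamps would come from the database system rather than a counter; both are simplified here.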



Paper Citation


in Harvard Style

Pröll S. and Rauber A. (2013). Citable by Design - A Model for Making Data in Dynamic Environments Citable. In Proceedings of the 2nd International Conference on Data Technologies and Applications - Volume 1: DATA, ISBN 978-989-8565-67-9, pages 206-210. DOI: 10.5220/0004589102060210


in Bibtex Style

@conference{data13,
author={Stefan Pröll and Andreas Rauber},
title={Citable by Design - A Model for Making Data in Dynamic Environments Citable},
booktitle={Proceedings of the 2nd International Conference on Data Technologies and Applications - Volume 1: DATA},
year={2013},
pages={206-210},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0004589102060210},
isbn={978-989-8565-67-9},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 2nd International Conference on Data Technologies and Applications - Volume 1: DATA
TI - Citable by Design - A Model for Making Data in Dynamic Environments Citable
SN - 978-989-8565-67-9
AU - Pröll S.
AU - Rauber A.
PY - 2013
SP - 206
EP - 210
DO - 10.5220/0004589102060210