Toward Pay-As-You-Go Data Integration for Healthcare Simulations

Philipp Baumgärtel, Gregor Endler, Richard Lenz

2014

Abstract

ProHTA (Prospective Health Technology Assessment) aims to understand the impact of innovative medical processes and technologies at an early stage. To that end, large-scale healthcare simulations are used to estimate the effects of potential innovations and to detect areas with high potential for improving the healthcare supply chain. The data needed to validate and adjust these simulations typically comes from heterogeneous sources and is often preaggregated and insufficiently documented, so new data management techniques are required to cope with these conditions. Because the initial integration effort is high, we propose a pay-as-you-go approach using RDF in which data storage is separated from semantic annotation. The proposed system integrates various data sources automatically at first, and it provides methods for searching semantically annotated data and for loading it into the simulation. Users can then add annotations to the data to enable semantic integration on demand. In this paper, we demonstrate the feasibility of this approach with a prototype implementation and discuss its benefits and the remaining challenges.
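
To make the separation of data storage from semantic annotation more concrete, the following is a minimal sketch, not the authors' prototype. It assumes a toy in-memory triple store in plain Python instead of a real RDF library, and every name in it (the `example.org` prefixes, the `denotes` predicate, the `registry` and `insurer` sources, and all values) is hypothetical and invented for illustration.

```python
# Hypothetical namespaces; neither prefix comes from the paper.
DATA = "http://example.org/data/"
ANNO = "http://example.org/annotation/"


class TripleStore:
    """A tiny in-memory set of (subject, predicate, object) triples."""

    def __init__(self):
        self.triples = set()

    def add(self, s, p, o):
        self.triples.add((s, p, o))

    def match(self, s=None, p=None, o=None):
        """Return all triples matching the pattern; None acts as a wildcard."""
        return sorted(
            t for t in self.triples
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)
        )


def load_rows(store, source_name, rows):
    """Initial integration: store every cell as a triple, with no semantics attached."""
    for i, row in enumerate(rows):
        subject = f"{DATA}{source_name}/row{i}"
        for column, value in row.items():
            store.add(subject, f"{DATA}{source_name}/{column}", value)


def find_columns(annotations, concept):
    """Search: columns that a user has annotated with the given concept."""
    return [s for s, _, _ in annotations.match(p=ANNO + "denotes", o=concept)]


data = TripleStore()         # raw data
annotations = TripleStore()  # semantic annotations, kept separately

# Step 1: automatic initial integration of two made-up sources.
load_rows(data, "registry", [{"yr": 2010, "cases": 120}, {"yr": 2011, "cases": 135}])
load_rows(data, "insurer", [{"year": 2010, "n": 80}])

# Step 2 (on demand): a user annotates only the columns needed right now.
annotations.add(DATA + "registry/cases", ANNO + "denotes", "incidence")
annotations.add(DATA + "insurer/n", ANNO + "denotes", "incidence")

# Step 3: search by annotation, then load the matching values for a simulation.
columns = find_columns(annotations, "incidence")
values = [o for col in columns for _, _, o in data.match(p=col)]
```

In this sketch the raw data is loaded without any schema mapping, and the differently named columns `cases` and `n` only become comparable once a user annotates both with the same concept. That deferred, incremental effort is the pay-as-you-go idea; a real system would presumably use an RDF store and a query language such as SPARQL instead of the `match` helper.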



Paper Citation


in Harvard Style

Baumgärtel P., Endler G. and Lenz R. (2014). Toward Pay-As-You-Go Data Integration for Healthcare Simulations. In Proceedings of the International Conference on Health Informatics - Volume 1: HEALTHINF, (BIOSTEC 2014) ISBN 978-989-758-010-9, pages 172-177. DOI: 10.5220/0004734201720177


in Bibtex Style

@conference{healthinf14,
author={Philipp Baumgärtel and Gregor Endler and Richard Lenz},
title={Toward Pay-As-You-Go Data Integration for Healthcare Simulations},
booktitle={Proceedings of the International Conference on Health Informatics - Volume 1: HEALTHINF, (BIOSTEC 2014)},
year={2014},
pages={172-177},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0004734201720177},
isbn={978-989-758-010-9},
}


in EndNote Style

TY - CONF
JO - Proceedings of the International Conference on Health Informatics - Volume 1: HEALTHINF, (BIOSTEC 2014)
TI - Toward Pay-As-You-Go Data Integration for Healthcare Simulations
SN - 978-989-758-010-9
AU - Baumgärtel P.
AU - Endler G.
AU - Lenz R.
PY - 2014
SP - 172
EP - 177
DO - 10.5220/0004734201720177