MODEL-DRIVEN AD HOC DATA INTEGRATION IN THE CONTEXT OF A POPULATION-BASED CANCER REGISTRY

Yvette Teiken, Martin Rohde, Hans-Jürgen Appelrath

Abstract

The major task of a population-based Cancer Registry (CR) is the identification of risk groups and risk factors. This analysis draws on data about the social background of the population. Integrating such data is not part of the routine processes at the CR. Therefore, it must be performed by data warehouse experts, which results in high costs. This paper proposes an approach that allows epidemiologists and physicians at the CR to carry out this ad hoc data integration on their own. We use model-driven software design (MDSD) with a domain-specific language (DSL), which allows the epidemiologists and physicians to describe the data to be integrated in a language they already know. This description, or rather model, is used to create an extension of the existing data pool as well as a web service and a web application for data integration. End users can thus perform the integration themselves, which makes ad hoc data integration very cost-efficient.
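The idea of deriving a data pool extension from a model can be illustrated with a minimal sketch. It is not the authors' DSL or generator: the classes `Dimension` and `IntegrationModel`, the function `generate_ddl`, and the example data set (an unemployment rate by region and year) are all hypothetical names chosen for illustration, and the generated SQL assumes that dimension tables with an `id` column already exist.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Dimension:
    # Hypothetical: a dimension assumed to exist already in the data pool.
    name: str        # name of the dimension table, e.g. "region"
    key_column: str  # foreign-key column that references the dimension table


@dataclass
class IntegrationModel:
    # Hypothetical stand-in for a DSL model describing the data to integrate.
    name: str
    measure: str
    dimensions: List[Dimension] = field(default_factory=list)


def generate_ddl(model: IntegrationModel) -> str:
    """Derive a fact-table definition (the data pool extension) from the model."""
    columns = [
        f"    {d.key_column} INTEGER NOT NULL REFERENCES {d.name}(id)"
        for d in model.dimensions
    ]
    columns.append(f"    {model.measure} NUMERIC NOT NULL")
    return f"CREATE TABLE fact_{model.name} (\n" + ",\n".join(columns) + "\n);"


# Illustrative model: a social-background indicator by region and year.
model = IntegrationModel(
    name="unemployment",
    measure="unemployment_rate",
    dimensions=[Dimension("region", "region_id"), Dimension("year", "year_id")],
)
ddl = generate_ddl(model)
print(ddl)
```

In the same spirit, a generator could derive a web service interface and an upload form from the same model, so that the end user only has to maintain the model itself.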



Paper Citation


in Harvard Style

Teiken Y., Rohde M. and Appelrath H. (2010). MODEL-DRIVEN AD HOC DATA INTEGRATION IN THE CONTEXT OF A POPULATION-BASED CANCER REGISTRY. In Proceedings of the 5th International Conference on Software and Data Technologies - Volume 1: DMIA, (ICSOFT 2010) ISBN 978-989-8425-22-5, pages 337-343. DOI: 10.5220/0003044703370343


in Bibtex Style

@conference{dmia10,
author={Yvette Teiken and Martin Rohde and Hans-Jürgen Appelrath},
title={MODEL-DRIVEN AD HOC DATA INTEGRATION IN THE CONTEXT OF A POPULATION-BASED CANCER REGISTRY},
booktitle={Proceedings of the 5th International Conference on Software and Data Technologies - Volume 1: DMIA, (ICSOFT 2010)},
year={2010},
pages={337-343},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0003044703370343},
isbn={978-989-8425-22-5},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 5th International Conference on Software and Data Technologies - Volume 1: DMIA, (ICSOFT 2010)
TI - MODEL-DRIVEN AD HOC DATA INTEGRATION IN THE CONTEXT OF A POPULATION-BASED CANCER REGISTRY
SN - 978-989-8425-22-5
AU - Teiken Y.
AU - Rohde M.
AU - Appelrath H.
PY - 2010
SP - 337
EP - 343
DO - 10.5220/0003044703370343