ETL Patterns on YAWL - Towards to the Specification of Platform-independent Data Warehousing Populating Processes

Bruno Oliveira, Orlando Belo

Abstract

The implementation of data warehouse populating processes (ETL) is considered a complex task, not only because of the amount of data processed but also because of the complexity of the tasks involved. The implementation and maintenance of such processes face various design drawbacks, such as changes in business requirements, which in turn require adapting existing data structures and reusing existing parts of the ETL system. We consider that a more abstract view of ETL processes and their data structures is needed, as well as a more effective mapping to real execution primitives, so that an ETL solution can be validated before it proceeds to its final implementation. In this work we propose using standard solutions, which have already proven very useful in software development, for the implementation of standard ETL processes. In this paper we approach ETL modelling from a new perspective, using YAWL, a workflow language, as the means to obtain platform-independent ETL models.



Paper Citation


in Harvard Style

Oliveira B. and Belo O. (2014). ETL Patterns on YAWL - Towards to the Specification of Platform-independent Data Warehousing Populating Processes. In Proceedings of the 16th International Conference on Enterprise Information Systems - Volume 1: ICEIS, ISBN 978-989-758-027-7, pages 299-307. DOI: 10.5220/0004947302990307


in Bibtex Style

@conference{iceis14,
author={Bruno Oliveira and Orlando Belo},
title={ETL Patterns on YAWL - Towards to the Specification of Platform-independent Data Warehousing Populating Processes},
booktitle={Proceedings of the 16th International Conference on Enterprise Information Systems - Volume 1: ICEIS},
year={2014},
pages={299-307},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0004947302990307},
isbn={978-989-758-027-7},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 16th International Conference on Enterprise Information Systems - Volume 1: ICEIS
TI - ETL Patterns on YAWL - Towards to the Specification of Platform-independent Data Warehousing Populating Processes
SN - 978-989-758-027-7
AU - Oliveira B.
AU - Belo O.
PY - 2014
SP - 299
EP - 307
DO - 10.5220/0004947302990307