6 CONCLUSIONS AND FUTURE WORK
In this paper we propose the use of a pattern-
oriented approach for ETL modelling and
implementation. Each ETL pattern represents an
ETL task (or set of tasks) that is regularly used in a
real-world DWS – SKP, SCD, CDC, DQV, or IDL.
We can view these patterns as "black boxes" that,
given proper input metadata, produce a specific
output according to their internal specification. This
approach provides an easy method for specifying
complex ETL processes, as well as some proven
software engineering practices for ETL system
implementation. With our approach it is possible to
reduce some planning problems, especially those
related to changing business requirements and
implementation errors. We can change the execution
order of a pattern and its input data without
compromising other tasks or the final
process implementation.
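The "black box" view described above can be illustrated with a minimal Python sketch. It is not the YAWL specification used in this paper: the Pattern class, the dqv and skp functions, the metadata keys (required, key, lookup) and the sample rows are all hypothetical names chosen for illustration. Each pattern receives rows plus its own input metadata and returns rows, so callers never touch its internals, and two independent patterns can be swapped in the execution order.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Row = Dict[str, object]


@dataclass(frozen=True)
class Pattern:
    """Hypothetical black box: rows plus input metadata in, rows out."""
    name: str
    metadata: Dict[str, object]
    run: Callable[[List[Row], Dict[str, object]], List[Row]]

    def __call__(self, rows: List[Row]) -> List[Row]:
        # Callers only see this interface, never the internal logic.
        return self.run(rows, self.metadata)


def dqv(rows: List[Row], meta: Dict[str, object]) -> List[Row]:
    # Toy data-quality step: drop rows that miss a required field.
    required = meta["required"]
    return [r for r in rows if all(r.get(f) is not None for f in required)]


def skp(rows: List[Row], meta: Dict[str, object]) -> List[Row]:
    # Toy surrogate-key step: swap a natural key for its surrogate key,
    # dropping rows whose natural key is unknown.
    key, lookup = meta["key"], meta["lookup"]
    return [{**r, key: lookup[r[key]]} for r in rows if r.get(key) in lookup]


def run_process(patterns: List[Pattern], rows: List[Row]) -> List[Row]:
    # Execute the patterns in the given order.
    for pattern in patterns:
        rows = pattern(rows)
    return rows


# Illustrative data and metadata (assumptions, not taken from the paper).
source_rows = [
    {"customer": "C1", "amount": 10},
    {"customer": "C2", "amount": None},
    {"customer": "C9", "amount": 5},
]
quality = Pattern("DQV", {"required": ["customer", "amount"]}, dqv)
keys = Pattern("SKP", {"key": "customer", "lookup": {"C1": 1, "C2": 2}}, skp)

result_a = run_process([quality, keys], source_rows)
result_b = run_process([keys, quality], source_rows)
```

In this toy case both orderings yield the same rows because the two patterns act on independent fields; that will not hold for every pair of patterns, which is why validating a composed process before implementation remains useful.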
Through the use of YAWL, we demonstrated
how to design a complete ETL system using a set of
ETL patterns. The YAWL specification provides a
simple yet very powerful notation that, coupled with
powerful execution primitives and data support
structures, makes YAWL well suited to the
validation of ETL processes before proceeding to their
final implementation. Using this ETL modelling
approach, designers and developers only need to
know how to interact with patterns, regardless of their
internal specification.
As future work, we intend to provide an extended
family of YAWL patterns that allows a
complete ETL system to be built from scratch. Additionally,
we expect to provide specific XML schemas for the
definition of patterns’ metadata and explore the use
of selection and exception handling services.
ICEIS 2014 - 16th International Conference on Enterprise Information Systems