A NEW LOOK INTO DATA WAREHOUSE MODELLING

Nikolay Nikolov

2007

Abstract

The dominant paradigm of data warehouse design is the star schema (Kimball, 1996). For years, the main debate within the scientific community has concerned not whether this paradigm is really the only way, but rather its details (e.g. “to snowflake or not to snowflake”; Kimball et al., 1998). Confining the discourse entirely to the star schema paradigm prevents the search for better alternatives. We argue that the star schema paradigm is an artifact of the transactional perspective and does not account for the analytic perspective. The most popular formalized method for deriving the star schema (Golfarelli et al., 1998) underlines just that by taking only the entity-relationship model (ERM) as input. Although this design approach follows the natural data flow and workflow, it does not necessarily offer the best performance. The main thrust of our argument is that the query model should be used on a par with the ERM as a starting point in the data warehouse design process. The rationale is that the final design should reflect not just the structure inherent in the data model, but also that of the expected workload. Such an approach results in a schema that may look very different from the traditional star schema, but the performance improvement it may achieve justifies going off the beaten track.

References

  1. Agrawal, S. et al., 2004. Integrating Vertical and Horizontal Partitioning into Automated Physical Database Design. Proc. 2004 SIGMOD Int. Conf. on Manag. of Data.
  2. Bizarro, P., Madeira, H., 2002. Adding a Performance-Oriented Perspective to Data Warehouse Design. Proc. of 4th Int. Conf. on Data Warehousing and Knowledge Discovery (DaWaK).
  3. Golfarelli, M. et al., 1998. Conceptual Design of Data Warehouses from E/R Schemes. In Proc. 31st HICSS.
  4. Inmon, W., 1996. Building the data warehouse, John Wiley & Sons, Inc. New York, NY, USA.
  5. Kimball, R., 1996. The data warehouse toolkit: practical techniques for building dimensional data warehouses, John Wiley & Sons, Inc. New York, NY, USA.
  6. Kimball, R. et al., 1998. The data warehouse lifecycle toolkit, John Wiley & Sons, Inc. New York, NY, USA.
  7. Martello, S., Toth, P., 1990. Knapsack problems: algorithms and computer implementations, John Wiley & Sons, Inc. New York, NY, USA.
  8. Papadomanolakis, E., Ailamaki, A., 2004. AutoPart: Automating Schema Design for Large Scientific Databases Using Data Partitioning. Proc. 16th Int. Conf. on Scient. and Stat. Datab. Manag. (SSDBM).
  9. Stonebraker, M. et al., 2005. C-Store: A Column-oriented DBMS. Proc. of the 31st Int. Conf. on Very Large Databases (VLDB).
  10. TPC-H Standard Specification Revision 2.1.0, 2002. http://www.tpc.org


Paper Citation


in Harvard Style

Nikolov N. (2007). A NEW LOOK INTO DATA WAREHOUSE MODELLING. In Proceedings of the Ninth International Conference on Enterprise Information Systems - Volume 1: ICEIS, ISBN 978-972-8865-88-7, pages 540-543. DOI: 10.5220/0002347205400543


in Bibtex Style

@conference{iceis07,
author={Nikolay Nikolov},
title={A NEW LOOK INTO DATA WAREHOUSE MODELLING},
booktitle={Proceedings of the Ninth International Conference on Enterprise Information Systems - Volume 1: ICEIS},
year={2007},
pages={540-543},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0002347205400543},
isbn={978-972-8865-88-7},
}


in EndNote Style

TY - CONF
JO - Proceedings of the Ninth International Conference on Enterprise Information Systems - Volume 1: ICEIS
TI - A NEW LOOK INTO DATA WAREHOUSE MODELLING
SN - 978-972-8865-88-7
AU - Nikolov N.
PY - 2007
SP - 540
EP - 543
DO - 10.5220/0002347205400543