K-MEANS BASED APPROACH FOR OLAP DIMENSION UPDATES
Fadila Bentayeb
2008
Abstract
Current data warehouse models usually treat OLAP dimensions as static entities. In practice, however, structural changes to dimension schemas are often necessary to adapt the multidimensional database to changing requirements. This paper presents a new structural update operator for OLAP dimensions, named Rollup-WithKmeans, based on the k-means clustering method. The operator creates a new level to which a pre-existing level in an OLAP dimension hierarchy rolls up. To define the domain of the new level and the aggregation function from the existing level to the new level, the operator partitions all instances of the existing level into k clusters with the k-means clustering algorithm. We propose two ways to choose the features for k-means clustering. The first uses the descriptors of the pre-existing level in its dimension table, while the second describes the new level by measure attributes in the fact table. We also carried out experiments within the Oracle 10g DBMS, which validated the relevance of our approach.
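The idea in the abstract can be illustrated with a minimal Python sketch. It is not the paper's Oracle implementation: the function names (`kmeans`, `features_from_facts`, `rollup_with_kmeans`), the `cluster_<n>` labels, the city data, and the choice of summing measures per instance are all assumptions made for illustration. Each instance of an existing level is described by a numeric feature tuple, clustered with a plain Lloyd-style k-means, and mapped to a member of the new level.

```python
import random

def kmeans(points, k, iterations=100, seed=0):
    """Plain Lloyd-style k-means over a list of equal-length numeric tuples.

    Returns one cluster index per point. Initial centroids are k distinct
    input points drawn with a seeded generator (an illustrative choice).
    """
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    assignment = [None] * len(points)
    for _ in range(iterations):
        # Assign every point to its nearest centroid (squared Euclidean distance).
        new_assignment = [
            min(range(k),
                key=lambda c: sum((x - y) ** 2 for x, y in zip(p, centroids[c])))
            for p in points
        ]
        if new_assignment == assignment:
            break  # converged: no point changed cluster
        assignment = new_assignment
        # Move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # keep the old centroid if a cluster goes empty
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return assignment

def features_from_facts(fact_rows):
    """Second feature option: describe each instance by its measures.

    fact_rows is an iterable of (instance_id, measure_tuple). Summing the
    measures per instance is an assumption; other aggregates are possible.
    """
    totals = {}
    for instance_id, measures in fact_rows:
        previous = totals.get(instance_id, (0,) * len(measures))
        totals[instance_id] = tuple(a + b for a, b in zip(previous, measures))
    return totals

def rollup_with_kmeans(level_instances, k, seed=0):
    """Map each instance of an existing level to a member of a new level.

    level_instances maps instance_id -> numeric feature tuple, taken either
    from dimension-table descriptors or from features_from_facts().
    """
    ids = sorted(level_instances)
    labels = kmeans([tuple(level_instances[i]) for i in ids], k, seed=seed)
    return {i: f"cluster_{label}" for i, label in zip(ids, labels)}

# Hypothetical "city" level described by two numeric descriptors.
cities = {
    "A": (1, 1), "B": (1, 2), "C": (2, 1),
    "X": (100, 100), "Y": (101, 100), "Z": (100, 101),
}
city_to_group = rollup_with_kmeans(cities, k=2)
```

The returned dictionary plays the role of the roll-up function from the existing level to the new one, and its distinct values form the domain of the new level. In practice the features would likely need scaling before clustering, since k-means is sensitive to attribute ranges.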
Paper Citation
in Harvard Style
Bentayeb F. (2008). K-MEANS BASED APPROACH FOR OLAP DIMENSION UPDATES. In Proceedings of the Tenth International Conference on Enterprise Information Systems - Volume 1: ICEIS, ISBN 978-989-8111-36-4, pages 531-534. DOI: 10.5220/0001717905310534
in Bibtex Style
@conference{iceis08,
author={Fadila Bentayeb},
title={K-MEANS BASED APPROACH FOR OLAP DIMENSION UPDATES},
booktitle={Proceedings of the Tenth International Conference on Enterprise Information Systems - Volume 1: ICEIS},
year={2008},
pages={531-534},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0001717905310534},
isbn={978-989-8111-36-4},
}
in EndNote Style
TY - CONF
JO - Proceedings of the Tenth International Conference on Enterprise Information Systems - Volume 1: ICEIS
TI - K-MEANS BASED APPROACH FOR OLAP DIMENSION UPDATES
SN - 978-989-8111-36-4
AU - Bentayeb F.
PY - 2008
SP - 531
EP - 534
DO - 10.5220/0001717905310534