A Constraint-based Mining Approach for Multi-attribute Index Selection

B. Ziani, F. Rioult, Y. Ouinten

Abstract

The index selection problem (ISP) concerns selecting an appropriate set of indexes that minimizes the total cost of a given workload under a storage constraint. Since the ISP has been proven NP-hard, most studies focus on heuristic algorithms that obtain approximate solutions. The problem becomes more difficult for indexes defined on multiple tables, such as bitmap join indexes, because it requires exploring a large search space. Studies on selecting bitmap join indexes have mainly focused on pruning the search space by means of data mining techniques or heuristic strategies. The main shortcoming of these approaches is that the index selection process is performed in two steps: the generation of a large number of indexes is followed by a pruning phase. An alternative is to constrain the input data earlier in the selection process, thereby reducing the output size and directly discovering the indexes that are of interest to the administrator. For example, to select a set of indexes, the administrator may put limits on the number of attributes or on the cardinality of the attributes to be included in the index configuration being sought. In this paper we address the bitmap join index selection problem using a constraint-based approach. Unlike previous approaches, the selection is performed in one step by introducing constraints into the selection process. The proposed approach is evaluated using the APB-1 benchmark.
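
The one-step idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes a generic level-wise (Apriori-style) search over the attribute sets used by workload queries, and it checks two administrator constraints (a maximum number of attributes and a maximum attribute cardinality) during candidate generation rather than in a separate pruning phase. The function name, parameters, attribute names, and cardinalities are all hypothetical.

```python
from itertools import combinations


def mine_index_candidates(workload, cardinality, min_support,
                          max_attrs, max_cardinality):
    """Return attribute sets that are frequent in the workload and satisfy
    the constraints. Hypothetical helper, for illustration only."""
    # Cardinality constraint: discard attributes that violate it before any
    # candidate is generated, so they never enter the search space.
    allowed = {a for q in workload for a in q
               if cardinality[a] <= max_cardinality}
    queries = [frozenset(q) & allowed for q in workload]

    def support(attrs):
        # Number of workload queries that use every attribute in attrs.
        return sum(1 for q in queries if attrs <= q)

    # Level 1: frequent single attributes.
    level = [frozenset([a]) for a in sorted(allowed)
             if support(frozenset([a])) >= min_support]
    result = list(level)
    size = 1
    # Size constraint: stop growing candidates once max_attrs is reached.
    while level and size < max_attrs:
        size += 1
        candidates = {a | b for a, b in combinations(level, 2)
                      if len(a | b) == size}
        level = [c for c in candidates if support(c) >= min_support]
        result.extend(level)
    # Sorted tuples give a deterministic, readable output.
    return sorted(tuple(sorted(s)) for s in result)


# Made-up workload: each query is the set of dimension attributes it filters on.
workload = [
    {"gender", "city"},
    {"gender", "city", "month"},
    {"gender", "month"},
    {"city", "customer_id"},
]
# Made-up attribute cardinalities.
cardinality = {"gender": 2, "city": 50, "month": 12, "customer_id": 100000}

candidates = mine_index_candidates(workload, cardinality, min_support=2,
                                   max_attrs=2, max_cardinality=100)
print(candidates)
```

In this sketch the high-cardinality attribute is excluded before mining starts and no candidate larger than the size limit is ever generated, which is the sense in which the constraints shrink the output directly instead of filtering it afterwards.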



Paper Citation


in Harvard Style

Ziani B., Rioult F. and Ouinten Y. (2012). A Constraint-based Mining Approach for Multi-attribute Index Selection. In Proceedings of the 14th International Conference on Enterprise Information Systems - Volume 1: ICEIS, ISBN 978-989-8565-10-5, pages 93-98. DOI: 10.5220/0003964600930098


in Bibtex Style

@conference{iceis12,
author={B. Ziani and F. Rioult and Y. Ouinten},
title={A Constraint-based Mining Approach for Multi-attribute Index Selection},
booktitle={Proceedings of the 14th International Conference on Enterprise Information Systems - Volume 1: ICEIS},
year={2012},
pages={93-98},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0003964600930098},
isbn={978-989-8565-10-5},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 14th International Conference on Enterprise Information Systems - Volume 1: ICEIS
TI - A Constraint-based Mining Approach for Multi-attribute Index Selection
SN - 978-989-8565-10-5
AU - Ziani B.
AU - Rioult F.
AU - Ouinten Y.
PY - 2012
SP - 93
EP - 98
DO - 10.5220/0003964600930098