Are Multi-way Joins Actually Useful?

Michael Henderson, Ramon Lawrence

2013

Abstract

Multi-way joins improve performance by avoiding the extra I/Os incurred by multiple partitioning steps. Several multi-way join algorithms have been proposed, and the research results are encouraging. However, commercial database systems do not currently use multi-way joins. Practical issues include modifying the optimizer and execution system to support multi-way operators and ensuring robust and reliable performance. The contribution of this work is an implementation and experimental evaluation of multi-way joins in PostgreSQL. We provide algorithms that modify the optimizer to cost multi-way joins and to create and execute query plans that have more than two input operators. Experimental results show that multi-way joins are beneficial for several queries in a production database system and can be effectively exploited by the optimizer; however, there are implementation issues that must be resolved to guarantee robust performance.



Paper Citation


in Harvard Style

Henderson M. and Lawrence R. (2013). Are Multi-way Joins Actually Useful? In Proceedings of the 15th International Conference on Enterprise Information Systems - Volume 1: ICEIS, ISBN 978-989-8565-59-4, pages 13-22. DOI: 10.5220/0004412100130022


in Bibtex Style

@conference{iceis13,
author={Michael Henderson and Ramon Lawrence},
title={Are Multi-way Joins Actually Useful?},
booktitle={Proceedings of the 15th International Conference on Enterprise Information Systems - Volume 1: ICEIS},
year={2013},
pages={13-22},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0004412100130022},
isbn={978-989-8565-59-4},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 15th International Conference on Enterprise Information Systems - Volume 1: ICEIS
TI - Are Multi-way Joins Actually Useful?
SN - 978-989-8565-59-4
AU - Henderson M.
AU - Lawrence R.
PY - 2013
SP - 13
EP - 22
DO - 10.5220/0004412100130022