FunctionGuard - A Query Engine for Expensive Scientific Functions in Relational Databases

Anh Pham, Mohamed Eltabakh

Abstract

Expensive user-defined functions (UDFs) pose unique challenges for database management systems (DBMSs) at query time. This is mostly due to the black-box nature of these functions, the inability to optimize their internals, and the potential inefficiency of common optimization heuristics, e.g., “selection push-down”. Moreover, a growing and increasingly diverse set of modern scientific applications depends on DBMSs while making extensive use of expensive UDFs, which calls for efficient techniques to support these functions. In this paper, we propose the “FunctionGuard” system, which leverages disk-based persistent caching in novel ways to achieve cross-query optimizations for expensive UDFs. The unique features of FunctionGuard include: (1) dynamic extraction of the dependencies between UDFs and data sources, and identification of the potentially cacheable functions; (2) cache-aware query optimization through newly introduced query operators; (3) proactive cache refreshing that partially migrates the cost of the expensive calls from query time to idle and under-utilized periods; and (4) integration with state-of-the-art techniques that generate efficient query plans in the presence of expensive functions. The system is implemented within the PostgreSQL DBMS, and the results show the effectiveness of the proposed algorithms and optimizations.
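To make the idea of disk-based persistent caching across queries more concrete, the following is a minimal sketch, not FunctionGuard's actual implementation. It assumes a deterministic function whose result depends only on its arguments and on one data source, tracked here by a hypothetical integer `source_version`. The class name `PersistentUdfCache`, the table `udf_cache`, and the function `expensive_score` are all illustrative names that do not come from the paper. SQLite stands in for the on-disk store purely for convenience.

```python
import json
import os
import sqlite3
import tempfile


class PersistentUdfCache:
    """Disk-backed cache for results of a deterministic, expensive function."""

    def __init__(self, path):
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS udf_cache ("
            " func TEXT, args TEXT, source_version INTEGER, result TEXT,"
            " PRIMARY KEY (func, args))"
        )
        self.conn.commit()

    def call(self, func, args, source_version):
        """Return func(*args), reusing a stored result when it is still valid."""
        key = json.dumps(args)
        row = self.conn.execute(
            "SELECT source_version, result FROM udf_cache"
            " WHERE func = ? AND args = ?",
            (func.__name__, key),
        ).fetchone()
        if row is not None and row[0] == source_version:
            return json.loads(row[1])  # cache hit: skip the expensive call
        result = func(*args)  # miss or stale entry: pay the cost once
        self.conn.execute(
            "INSERT OR REPLACE INTO udf_cache VALUES (?, ?, ?, ?)",
            (func.__name__, key, source_version, json.dumps(result)),
        )
        self.conn.commit()
        return result


calls = []  # records every real invocation of the expensive function


def expensive_score(x):
    """Hypothetical stand-in for an expensive scientific UDF."""
    calls.append(x)
    return x * x


db_path = os.path.join(tempfile.mkdtemp(), "udf_cache.db")
cache = PersistentUdfCache(db_path)
first = cache.call(expensive_score, [7], source_version=1)   # computed
second = cache.call(expensive_score, [7], source_version=1)  # served from disk
third = cache.call(expensive_score, [7], source_version=2)   # source changed: recomputed
```

Because the entries live on disk, a later session that reopens the same file can reuse them, which is the cross-query benefit the abstract describes. A proactive refresher in this simplified setting would simply re-run stale entries during idle time instead of waiting for the next query to hit them.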



Paper Citation


in Harvard Style

Pham A. and Eltabakh M. (2014). FunctionGuard - A Query Engine for Expensive Scientific Functions in Relational Databases. In Proceedings of 3rd International Conference on Data Management Technologies and Applications - Volume 1: DATA, ISBN 978-989-758-035-2, pages 95-106. DOI: 10.5220/0004992300950106


in Bibtex Style

@conference{data14,
author={Anh Pham and Mohamed Eltabakh},
title={FunctionGuard - A Query Engine for Expensive Scientific Functions in Relational Databases},
booktitle={Proceedings of 3rd International Conference on Data Management Technologies and Applications - Volume 1: DATA},
year={2014},
pages={95-106},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0004992300950106},
isbn={978-989-758-035-2},
}


in EndNote Style

TY - CONF
JO - Proceedings of 3rd International Conference on Data Management Technologies and Applications - Volume 1: DATA
TI - FunctionGuard - A Query Engine for Expensive Scientific Functions in Relational Databases
SN - 978-989-758-035-2
AU - Pham A.
AU - Eltabakh M.
PY - 2014
SP - 95
EP - 106
DO - 10.5220/0004992300950106