STATIC OPTIMIZATION OF DATA INTEGRATION PLANS IN GLOBAL INFORMATION SYSTEMS

Janusz R. Getta

2011

Abstract

Global information systems provide their users with a centralized and transparent view of many heterogeneous and distributed sources of data. Requests to access data issued at a central site are decomposed and processed at the remote sites, and the results are returned to the central site. A data integration component of the system processes the data retrieved and transmitted from the remote sites according to previously prepared data integration plans. This work addresses the problem of static optimization of data integration plans in a global information system. Static optimization means that a data integration plan is transformed into a more efficient form before it is used for data integration. We adopt an online approach to data integration, in which packets of data transmitted over a wide area network are integrated into the final result as soon as they arrive at the central site. We show how a data integration expression obtained from a user request can be transformed into a collection of data integration plans, one for each argument of the expression. This work proposes a number of static optimization techniques that change the order of operations and eliminate materialization and constant arguments from data integration plans implemented as relational algebra expressions.
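The online approach summarized above can be illustrated with a minimal sketch. It assumes the simplest possible case, an integration expression that is an equijoin of two remote arguments R and S, with one plan per argument that folds each arriving packet into the running result. The class and method names (`OnlineJoin`, `on_r_packet`, `on_s_packet`) and the tuple layout are hypothetical and chosen only for illustration; they are not taken from the paper, whose plans are general relational algebra expressions.

```python
from collections import defaultdict


class OnlineJoin:
    """Illustrative only: incrementally maintains R JOIN S on a shared key.

    Each argument has its own "plan" (a method) that is run whenever a
    packet of rows for that argument arrives at the central site.
    """

    def __init__(self):
        self.r_index = defaultdict(list)  # R rows received so far, grouped by key
        self.s_index = defaultdict(list)  # S rows received so far, grouped by key
        self.result = []                  # join result accumulated so far

    def on_r_packet(self, packet):
        """Plan for argument R: result += packet JOIN S-so-far, then R += packet."""
        for key, value in packet:
            for s_value in self.s_index.get(key, ()):
                self.result.append((key, value, s_value))
            self.r_index[key].append(value)

    def on_s_packet(self, packet):
        """Plan for argument S: result += R-so-far JOIN packet, then S += packet."""
        for key, value in packet:
            for r_value in self.r_index.get(key, ()):
                self.result.append((key, r_value, value))
            self.s_index[key].append(value)


# Packets may arrive in any interleaving; the result grows as they do.
join = OnlineJoin()
join.on_r_packet([(1, "a"), (2, "b")])
join.on_s_packet([(1, "x")])
join.on_r_packet([(1, "c")])
```

In this sketch both arguments are kept in memory so that later packets can still be matched against earlier ones. Techniques of the kind the abstract mentions, such as eliminating materialization, would aim to avoid storing intermediate state like this where the expression allows it.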



Paper Citation


in Harvard Style

Getta, J. R. (2011). STATIC OPTIMIZATION OF DATA INTEGRATION PLANS IN GLOBAL INFORMATION SYSTEMS. In Proceedings of the 13th International Conference on Enterprise Information Systems - Volume 1: ICEIS, ISBN 978-989-8425-53-9, pages 141-150. DOI: 10.5220/0003423901410150


in Bibtex Style

@conference{iceis11,
author={Janusz R. Getta},
title={STATIC OPTIMIZATION OF DATA INTEGRATION PLANS IN GLOBAL INFORMATION SYSTEMS},
booktitle={Proceedings of the 13th International Conference on Enterprise Information Systems - Volume 1: ICEIS},
year={2011},
pages={141-150},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0003423901410150},
isbn={978-989-8425-53-9},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 13th International Conference on Enterprise Information Systems - Volume 1: ICEIS
TI - STATIC OPTIMIZATION OF DATA INTEGRATION PLANS IN GLOBAL INFORMATION SYSTEMS
SN - 978-989-8425-53-9
AU - Getta, Janusz R.
PY - 2011
SP - 141
EP - 150
DO - 10.5220/0003423901410150