From Static to Agile - Interactive Particle Physics Analysis in the SAP HANA DB

David Kernert, Norman May, Michael Hladik, Klaus Werner, Wolfgang Lehner

2015

Abstract

In order to confirm their theoretical assumptions, physicists employ Monte-Carlo generators to produce millions of simulated particle collision events and compare them with the results of the detector experiments. The traditional, static analysis workflow of physicists involves creating and compiling a C++ program for each study, and loading large data files for every run of their program. To make this process more interactive and agile, we created an application that loads the data into the relational in-memory column store DBMS SAP HANA, exposes raw particle data as database views and offers an interactive web interface to explore this data. We expressed common particle physics analysis algorithms using SQL queries to benefit from the inherent scalability and parallelization of the DBMS. In this paper we compare the two approaches, i.e. manual analysis with C++ programs and interactive analysis with SAP HANA. We demonstrate the tuning of the physical database schema and the SQL queries used for the application. Moreover, we show the web-based interface that allows for interactive analysis of the simulation data generated by the EPOS Monte-Carlo generator, which is developed in conjunction with the ALICE experiment at the Large Hadron Collider (LHC), CERN.
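
As a rough illustration of the idea of exposing raw particle data as a view and expressing an analysis step in SQL, the sketch below counts, per simulated event, the particles above a transverse-momentum cut. It is not the authors' schema or their queries: the table `particles`, the view `particle_kinematics`, the column names, and the sample rows are all hypothetical, and Python's built-in `sqlite3` stands in for SAP HANA only so that the example is self-contained.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical particle table and view."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical raw table: one row per particle, with momentum components.
    conn.execute(
        "CREATE TABLE particles (event_id INTEGER, px REAL, py REAL, pz REAL)"
    )
    rows = [
        (1, 0.3, 0.4, 1.0),   # pT about 0.5
        (1, 3.0, 4.0, 2.0),   # pT about 5.0
        (1, 0.6, 0.8, -1.0),  # pT about 1.0
        (2, 1.2, 1.6, 0.5),   # pT about 2.0
        (2, 0.0, 0.1, 3.0),   # pT about 0.1
    ]
    conn.executemany("INSERT INTO particles VALUES (?, ?, ?, ?)", rows)
    # Hypothetical view exposing a derived quantity: squared transverse momentum.
    conn.execute(
        "CREATE VIEW particle_kinematics AS "
        "SELECT event_id, px * px + py * py AS pt2 FROM particles"
    )
    return conn


def multiplicity_above_pt(conn, pt_min):
    """Return (event_id, count) for particles with pT above pt_min."""
    query = """
        SELECT event_id, COUNT(*) AS n
        FROM particle_kinematics
        WHERE pt2 > ?
        GROUP BY event_id
        ORDER BY event_id
    """
    # Compare squared values so that no square-root function is needed in SQL.
    return conn.execute(query, (pt_min * pt_min,)).fetchall()


if __name__ == "__main__":
    conn = build_demo_db()
    print(multiplicity_above_pt(conn, 0.9))
```

Because the selection and aggregation are stated declaratively, the database engine decides how to scan and parallelize the work, which is the property the abstract refers to when it mentions the scalability of the DBMS.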



Paper Citation


in Harvard Style

Kernert D., May N., Hladik M., Werner K. and Lehner W. (2015). From Static to Agile - Interactive Particle Physics Analysis in the SAP HANA DB. In Proceedings of 4th International Conference on Data Management Technologies and Applications - Volume 1: DATA, ISBN 978-989-758-103-8, pages 16-25. DOI: 10.5220/0005503700160025


in Bibtex Style

@conference{data15,
author={David Kernert and Norman May and Michael Hladik and Klaus Werner and Wolfgang Lehner},
title={From Static to Agile - Interactive Particle Physics Analysis in the SAP HANA DB},
booktitle={Proceedings of 4th International Conference on Data Management Technologies and Applications - Volume 1: DATA},
year={2015},
pages={16-25},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0005503700160025},
isbn={978-989-758-103-8},
}


in EndNote Style

TY - CONF
JO - Proceedings of 4th International Conference on Data Management Technologies and Applications - Volume 1: DATA
TI - From Static to Agile - Interactive Particle Physics Analysis in the SAP HANA DB
SN - 978-989-758-103-8
AU - Kernert D.
AU - May N.
AU - Hladik M.
AU - Werner K.
AU - Lehner W.
PY - 2015
SP - 16
EP - 25
DO - 10.5220/0005503700160025