R-Pref: Rapid Prototyping of Database Preference Queries in R

Patrick Roocks, Werner Kießling

Abstract

Preferences are a well-established framework for database queries with soft constraints. Such queries select the best objects from large data sets according to a strict partial order induced by intuitive and semantically rich preference constructors. Together with functionality like grouping and aggregation, adapted from well-known database mechanisms, a very flexible preference framework has emerged in the last decade. In this paper we present R-Pref, an implementation of the preference framework in the statistical computing language R. R-Pref comprises fewer than 1000 lines of code and adheres to the formal foundations of preferences. It allows rapid prototyping of new preferences and related concepts. As an example, we present a use case in which a simple text mining task based on pattern matching is enriched by preferences. We argue that R-Pref paves the way for rapidly exploring new fields of application for preferences. In particular, new semantic constructs for preference-related operations, together with equivalences of preference terms, which are highly important for optimization, can be quickly evaluated.
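To illustrate what selecting the best objects under a strict partial order means, here is a minimal sketch in Python (not R, and not R-Pref's actual interface). All names (`low`, `high`, `pareto`, `select_best`) and the sample data are hypothetical and chosen only for illustration. The Pareto composition shown is a simplified form that assumes numeric base preferences, where "not worse" coincides with "better or equal".

```python
from typing import Callable, Dict, List

Row = Dict[str, float]
# A preference is modelled as a strict partial order:
# better(a, b) is True when a is strictly preferred to b.
Preference = Callable[[Row, Row], bool]


def low(attr: str) -> Preference:
    """Hypothetical base constructor: smaller values of attr are better."""
    return lambda a, b: a[attr] < b[attr]


def high(attr: str) -> Preference:
    """Hypothetical base constructor: larger values of attr are better."""
    return lambda a, b: a[attr] > b[attr]


def pareto(p: Preference, q: Preference) -> Preference:
    """Simplified Pareto composition (assumption: numeric base preferences).

    a is better than b if it is better in one preference
    and not worse in the other.
    """
    def better(a: Row, b: Row) -> bool:
        return (p(a, b) and not q(b, a)) or (q(a, b) and not p(b, a))
    return better


def select_best(rows: List[Row], better: Preference) -> List[Row]:
    """Return the rows that no other row is strictly preferred to.

    Naive quadratic scan; fine for a sketch, not for large data sets.
    """
    return [r for r in rows if not any(better(o, r) for o in rows)]


# Made-up sample data: prefer a low price and a high power.
cars = [
    {"id": 1, "price": 10000, "power": 80},
    {"id": 2, "price": 12000, "power": 120},
    {"id": 3, "price": 15000, "power": 100},
    {"id": 4, "price": 9000, "power": 60},
]

pref = pareto(low("price"), high("power"))
best_ids = [r["id"] for r in select_best(cars, pref)]
print(best_ids)  # [1, 2, 4]
```

Car 3 is dropped because car 2 is both cheaper and more powerful. The remaining cars are mutually incomparable, which is why the result is a set of best objects rather than a single winner.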

References

  1. Chomicki, J. (2003). Preference Formulas in Relational Queries. In TODS '03: ACM Transactions on Database Systems, volume 28, pages 427-466, New York, NY, USA. ACM Press.
  2. Csardi, G. and Nepusz, T. (2006). The igraph software package for complex network research. InterJournal, Complex Systems:1695.
  3. Feinerer, I., Hornik, K., and Meyer, D. (2008). Text Mining Infrastructure in R. Journal of Statistical Software, 25(5):1-54.
  4. Grothendieck, G. (2012). sqldf: Perform SQL Selects on R Data Frames. R package version 0.4-6.4.
  5. Hafenrichter, B. and Kießling, W. (2005). Optimization of Relational Preference Queries. In Proceedings of the 16th Australasian Database Conference - Volume 39, ADC '05, pages 175-184, Darlinghurst, Australia. Australian Computer Society, Inc.
  6. Kießling, W. (2002). Foundations of Preferences in Database Systems. In VLDB '02: Proceedings of the 28th International Conference on Very Large Data Bases, pages 311-322, Hong Kong, China. VLDB.
  7. Kießling, W. (2005). Preference Queries with SV-Semantics. In Haritsa, J. R. and Vijayaraman, T. M., editors, COMAD '05: Advances in Data Management 2005, Proceedings of the 11th International Conference on Management of Data, pages 15-26, Goa, India. Computer Society of India.
  8. Kießling, W., Endres, M., and Wenzel, F. (2011). The Preference SQL System - An Overview. Bulletin of the Technical Committee on Data Engineering, IEEE Computer Society, 34(2):11-18.
  9. R Core Team (2012). R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria. ISBN 3-900051-07-0.
  10. Roocks, P. (2013). R-Pref Documentation, Sources and Use Case. http://ursaminor.informatik.uni-augsburg.de/trac/wiki/R-Pref.
  11. Stefanidis, K., Koutrika, G., and Pitoura, E. (2011). A Survey on Representation, Composition and Application of Preferences in Database Systems. ACM Transactions on Database Systems, 36(4).
  12. Urbanek, S. (2012). RJDBC: Provides access to databases through the JDBC interface. R package version 0.2-1.
  13. Urbanek, S. (2013). Rserve: Binary R server. R package version 0.6-8.1.
  14. Zhang, W., Yoshida, T., and Tang, X. (2011). A comparative study of TF*IDF, LSI and multi-words for text classification. Expert Systems with Applications, 38(3):2758-2765.


Paper Citation


in Harvard Style

Roocks P. and Kießling W. (2013). R-Pref: Rapid Prototyping of Database Preference Queries in R. In Proceedings of the 2nd International Conference on Data Technologies and Applications - Volume 1: DATA, ISBN 978-989-8565-67-9, pages 104-111. DOI: 10.5220/0004590301040111


in Bibtex Style

@conference{data13,
author={Patrick Roocks and Werner Kießling},
title={R-Pref: Rapid Prototyping of Database Preference Queries in R},
booktitle={Proceedings of the 2nd International Conference on Data Technologies and Applications - Volume 1: DATA},
year={2013},
pages={104-111},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0004590301040111},
isbn={978-989-8565-67-9},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 2nd International Conference on Data Technologies and Applications - Volume 1: DATA
TI - R-Pref: Rapid Prototyping of Database Preference Queries in R
SN - 978-989-8565-67-9
AU - Roocks P.
AU - Kießling W.
PY - 2013
SP - 104
EP - 111
DO - 10.5220/0004590301040111