PROGRAMMING THE KDD PROCESS USING XQUERY

Andrea Romei, Franco Turini

Abstract

XQuake is a language and system for programming data mining processes over native XML databases in the spirit of inductive databases. It extends XQuery to support KDD tasks. This paper focuses on the language features needed to define the steps of the mining process. The main objective is to show how expressively the language handles mining operations as extensions of basic XQuery expressions. To this end, the paper presents an extended application to web log analysis.



Paper Citation


in Harvard Style

Romei A. and Turini F. (2011). PROGRAMMING THE KDD PROCESS USING XQUERY. In Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2011) ISBN 978-989-8425-79-9, pages 123-131. DOI: 10.5220/0003626501310139


in Bibtex Style

@conference{kdir11,
author={Andrea Romei and Franco Turini},
title={PROGRAMMING THE KDD PROCESS USING XQUERY},
booktitle={Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2011)},
year={2011},
pages={123-131},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0003626501310139},
isbn={978-989-8425-79-9},
}


in EndNote Style

TY - CONF
JO - Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2011)
TI - PROGRAMMING THE KDD PROCESS USING XQUERY
SN - 978-989-8425-79-9
AU - Romei A.
AU - Turini F.
PY - 2011
SP - 123
EP - 131
DO - 10.5220/0003626501310139