DataCommandr: Column-oriented Data Integration, Transformation and Analysis

Alexandr Savinov

Abstract

In this paper, we describe DataCommandr, a novel approach to data integration, transformation and analysis. Its main distinguishing feature is that it is based on operations with columns, rather than operations with tables as in the relational model or operations with cells as in spreadsheet applications. This data processing model is free of typical set operations such as join, group-by and map-reduce, which are difficult to comprehend and slow at run time. Because it makes rather complex transformations easy to describe and performs well on analytical workflows, this approach can be viewed as an alternative to existing technologies in the area of ad-hoc and agile data analysis.
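To make the contrast with set operations concrete, the following is a minimal sketch of what defining new columns, instead of joining or grouping tables, might look like. It is an illustration under stated assumptions, not DataCommandr's actual syntax or implementation: the tables `items` and `products`, the column names, and the use of plain Python lists as columns are all hypothetical.

```python
# Illustrative sketch only: plain Python lists stand in for columns.
# None of these names or structures come from DataCommandr itself.

# Each table is a set of columns; a column maps a row index to a value.
items = {
    "product": ["beer", "chips", "beer", "wine"],
    "amount": [2, 1, 3, 1],
}
products = {
    "name": ["beer", "chips", "wine"],
    "price": [1.5, 2.0, 9.0],
}

# Link column (used here in place of a join): for every item row,
# store the index of the matching product row.
name_to_row = {name: row for row, name in enumerate(products["name"])}
items["product_ref"] = [name_to_row[p] for p in items["product"]]

# Calculated column: follow the link to read the price, then multiply.
items["total"] = [
    products["price"][ref] * amount
    for ref, amount in zip(items["product_ref"], items["amount"])
]

# Accumulated column (used here in place of group-by): a new column on
# products that is updated once for every item row linked to it.
products["revenue"] = [0.0] * len(products["name"])
for ref, total in zip(items["product_ref"], items["total"]):
    products["revenue"][ref] += total

print(products["revenue"])
```

In this sketch no intermediate joined or grouped table is ever produced; each step only adds one column to an existing table, which is the general idea the abstract points to.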



Paper Citation


in Harvard Style

Savinov A. (2016). DataCommandr: Column-oriented Data Integration, Transformation and Analysis. In Proceedings of the International Conference on Internet of Things and Big Data - Volume 1: IoTBD, ISBN 978-989-758-183-0, pages 339-347. DOI: 10.5220/0005877203390347


in Bibtex Style

@conference{iotbd16,
author={Alexandr Savinov},
title={DataCommandr: Column-oriented Data Integration, Transformation and Analysis},
booktitle={Proceedings of the International Conference on Internet of Things and Big Data - Volume 1: IoTBD},
year={2016},
pages={339-347},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0005877203390347},
isbn={978-989-758-183-0},
}


in EndNote Style

TY - CONF
JO - Proceedings of the International Conference on Internet of Things and Big Data - Volume 1: IoTBD
TI - DataCommandr: Column-oriented Data Integration, Transformation and Analysis
SN - 978-989-758-183-0
AU - Savinov A.
PY - 2016
SP - 339
EP - 347
DO - 10.5220/0005877203390347