Low Level Big Data Compression
Jaime Salvador-Meneses, Zoila Ruiz-Chavez, Jose Garcia-Rodriguez
2018
Abstract
In recent years, several specialized algorithms have been developed to work with categorical information. The performance of these algorithms depends on two important factors: the processing technique (the algorithm) and the representation of the information. Many machine learning algorithms require the information to be held in memory, local or distributed, before processing. Many current compression techniques do not achieve an adequate balance between compression ratio and decompression speed. In this work we propose a mechanism for storing and processing categorical information using bit-level compression. The method compresses and decompresses data by blocks, so that processing the compressed information resembles processing the original information. The proposed method allows the compressed data to be kept in memory, which drastically reduces memory consumption. The experimental results show a high compression ratio, while block decompression is very efficient. Both factors contribute to building a system with good performance.
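To make the idea of bit-level, block-wise storage of categorical data concrete, the following is a minimal sketch of generic fixed-width bit packing. It is not the authors' implementation: the abstract does not specify the encoding, so the 64-bit block size, the integer category codes, and the function names `bits_needed`, `pack`, `unpack_block`, and `iter_values` are all assumptions made for illustration.

```python
from typing import Iterator, List

WORD_BITS = 64  # assumed block width; the abstract does not state one


def bits_needed(n_categories: int) -> int:
    """Bits required to encode codes 0..n_categories-1 (at least 1)."""
    return max(1, (n_categories - 1).bit_length())


def pack(codes: List[int], n_categories: int) -> List[int]:
    """Pack integer category codes into blocks, several codes per block."""
    width = bits_needed(n_categories)
    per_block = WORD_BITS // width
    blocks = []
    for start in range(0, len(codes), per_block):
        word = 0
        for slot, code in enumerate(codes[start:start + per_block]):
            if not 0 <= code < n_categories:
                raise ValueError(f"code {code} out of range")
            # Place each code in its own fixed-width slot of the block.
            word |= code << (slot * width)
        blocks.append(word)
    return blocks


def unpack_block(word: int, width: int, count: int) -> List[int]:
    """Decode the first `count` codes stored in a single block."""
    mask = (1 << width) - 1
    return [(word >> (slot * width)) & mask for slot in range(count)]


def iter_values(blocks: List[int], n_categories: int, length: int) -> Iterator[int]:
    """Yield the original codes one block at a time, without
    materializing the whole decompressed column."""
    width = bits_needed(n_categories)
    per_block = WORD_BITS // width
    remaining = length
    for word in blocks:
        count = min(per_block, remaining)
        yield from unpack_block(word, width, count)
        remaining -= count
```

Under these assumptions, a column with 5 distinct categories needs 3 bits per value, so 21 values fit in one 64-bit block, and an algorithm can iterate over `iter_values(...)` much as it would over the uncompressed column.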
Paper Citation
in Harvard Style
Salvador-Meneses J., Ruiz-Chavez Z. and Garcia-Rodriguez J. (2018). Low Level Big Data Compression. In Proceedings of the 10th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (IC3K 2018) - Volume 1: KDIR; ISBN 978-989-758-330-8, SciTePress, pages 353-358. DOI: 10.5220/0007228003530358
in Bibtex Style
@conference{kdir18,
author={Jaime Salvador-Meneses and Zoila Ruiz-Chavez and Jose Garcia-Rodriguez},
title={Low Level Big Data Compression},
booktitle={Proceedings of the 10th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (IC3K 2018) - Volume 1: KDIR},
year={2018},
pages={353-358},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0007228003530358},
isbn={978-989-758-330-8},
}
in EndNote Style
TY - CONF
JO - Proceedings of the 10th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (IC3K 2018) - Volume 1: KDIR
TI - Low Level Big Data Compression
SN - 978-989-758-330-8
AU - Salvador-Meneses J.
AU - Ruiz-Chavez Z.
AU - Garcia-Rodriguez J.
PY - 2018
SP - 353
EP - 358
DO - 10.5220/0007228003530358
PB - SciTePress