COMPRESSED DATABASE STRUCTURE TO MANAGE LARGE SCALE DATA IN A DISTRIBUTED ENVIRONMENT

B. M. Monjurul Alom, Frans Henskens, Michael Hannaford

Abstract

Lossless data compression is attractive in database systems because it can improve query performance and reduce storage requirements. Although many compression techniques hold the whole database in main memory, problems arise when the amount of data grows over time and when the data has high cardinality. Managing a large, rapidly evolving volume of data in a scalable way is very challenging. This paper describes a disk-based, single-vector approach for data of large cardinality that incorporates data compression in a distributed environment. The approach provides a substantial improvement in storage performance compared with other high-performance database systems. The compressed database structure presented provides direct addressability in a distributed environment, thereby reducing retrieval latency when handling large volumes of data.


Paper Citation


in Harvard Style

Alom B. M. M., Henskens F. and Hannaford M. (2008). COMPRESSED DATABASE STRUCTURE TO MANAGE LARGE SCALE DATA IN A DISTRIBUTED ENVIRONMENT. In Proceedings of the Third International Conference on Software and Data Technologies - Volume 3: ICSOFT, ISBN 978-989-8111-53-1, pages 37-44. DOI: 10.5220/0001875600370044


in Bibtex Style

@conference{icsoft08,
author={B. M. Monjurul Alom and Frans Henskens and Michael Hannaford},
title={COMPRESSED DATABASE STRUCTURE TO MANAGE LARGE SCALE DATA IN A DISTRIBUTED ENVIRONMENT},
booktitle={Proceedings of the Third International Conference on Software and Data Technologies - Volume 3: ICSOFT},
year={2008},
pages={37-44},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0001875600370044},
isbn={978-989-8111-53-1},
}


in EndNote Style

TY - CONF
JO - Proceedings of the Third International Conference on Software and Data Technologies - Volume 3: ICSOFT
TI - COMPRESSED DATABASE STRUCTURE TO MANAGE LARGE SCALE DATA IN A DISTRIBUTED ENVIRONMENT
SN - 978-989-8111-53-1
AU - Alom B. M. M.
AU - Henskens F.
AU - Hannaford M.
PY - 2008
SP - 37
EP - 44
DO - 10.5220/0001875600370044