MITIGATION OF LARGE-SCALE RDF DATA LOADING WITH THE EMPLOYMENT OF A CLOUD COMPUTING SERVICE
Hyun Namgoong, Harshit Kumar, Hong-Gee Kim
2010
Abstract
A growing need for interoperable, structured web data has made RDF (Resource Description Framework) widely used. To let various applications share this data, several RDF stores providing data management services have been developed. Here, we present a systematic approach to the high latency of loading data into these stores. It enables fast loading of very large RDF data sets, and we demonstrate it with an existing RDF store. The approach employs a cloud computing service and delegates the preparation work to machines that are rented temporarily at low cost. Our implementation for the native version of the Sesame RDF repository was tested on the LUBM 1000-university data set (138 million triples). It showed a local store loading time of 16.2 minutes, plus a preparation time on the cloud service of approximately an hour, which can be reduced by adding machines to the cluster.
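To put the reported figures in perspective, the following minimal sketch derives the average local loading throughput and the approximate end-to-end time from the numbers in the abstract. The function names and the `speedup` parameter are illustrative assumptions, not part of the authors' implementation. The abstract states only that adding machines reduces the preparation time, not by how much, so any `speedup` value other than 1.0 is purely hypothetical.

```python
# Figures reported in the abstract.
TRIPLES = 138_000_000        # LUBM 1000-university data set
LOCAL_LOAD_MINUTES = 16.2    # local store loading time
PREP_MINUTES = 60.0          # "approximately an hour" on the cloud service


def load_throughput(triples: int, minutes: float) -> float:
    """Average triples per second during the local loading phase."""
    return triples / (minutes * 60)


def end_to_end_minutes(prep_minutes: float, load_minutes: float,
                       speedup: float = 1.0) -> float:
    """Total wall-clock minutes for preparation plus local loading.

    `speedup` is a hypothetical factor applied to the cloud preparation
    phase only; it is an assumption for illustration, not a measured value.
    """
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return prep_minutes / speedup + load_minutes


if __name__ == "__main__":
    rate = load_throughput(TRIPLES, LOCAL_LOAD_MINUTES)
    total = end_to_end_minutes(PREP_MINUTES, LOCAL_LOAD_MINUTES)
    print(f"local loading: about {rate:,.0f} triples per second")
    print(f"end to end: about {total:.1f} minutes")
```

With the reported numbers, the local loading phase averages roughly 142,000 triples per second, and preparation plus loading takes a little over 76 minutes in total.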
Paper Citation
in Harvard Style
Namgoong H., Kumar H. and Kim H. (2010). MITIGATION OF LARGE-SCALE RDF DATA LOADING WITH THE EMPLOYMENT OF A CLOUD COMPUTING SERVICE. In Proceedings of the International Conference on Knowledge Engineering and Ontology Development - Volume 1: KEOD, (IC3K 2010) ISBN 978-989-8425-29-4, pages 489-492. DOI: 10.5220/0003142204890492
in Bibtex Style
@conference{keod10,
author={Hyun Namgoong and Harshit Kumar and Hong-Gee Kim},
title={MITIGATION OF LARGE-SCALE RDF DATA LOADING WITH THE EMPLOYMENT OF A CLOUD COMPUTING SERVICE},
booktitle={Proceedings of the International Conference on Knowledge Engineering and Ontology Development - Volume 1: KEOD, (IC3K 2010)},
year={2010},
pages={489-492},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0003142204890492},
isbn={978-989-8425-29-4},
}
in EndNote Style
TY - CONF
JO - Proceedings of the International Conference on Knowledge Engineering and Ontology Development - Volume 1: KEOD, (IC3K 2010)
TI - MITIGATION OF LARGE-SCALE RDF DATA LOADING WITH THE EMPLOYMENT OF A CLOUD COMPUTING SERVICE
SN - 978-989-8425-29-4
AU - Namgoong H.
AU - Kumar H.
AU - Kim H.
PY - 2010
SP - 489
EP - 492
DO - 10.5220/0003142204890492