RDF Doctor: A Holistic Approach for Syntax Error Detection and Correction of RDF Data

Ahmad Hemid, Lavdim Halilaj, Abderrahmane Khiat, Steffen Lohmann


Over the years, the demand for interoperability support between diverse applications has significantly increased. The Resource Description Framework (RDF), among other solutions, is utilized as a data modeling language that allows knowledge from various domains to be encoded in a unified representation. Moreover, a vast amount of data from heterogeneous data sources is continuously published in documents using the RDF format. These RDF documents must be syntactically correct to enable software agents to process them further. Although a number of approaches have been proposed for ensuring error-free RDF documents, they commonly fail on the first encountered error and are therefore unable to identify all syntax errors at once. In this paper, we tackle the problem of simultaneous error identification and propose RDF-Doctor, a holistic approach for detecting and resolving syntactic errors in a semi-automatic fashion. First, we define a comprehensive list of errors that can be detected, along with customized error messages that give users a better understanding of the actual errors. Next, a subset of syntactic errors is corrected automatically by matching them against predefined error messages. Finally, for a particular set of errors, customized and meaningful messages are delivered to users to facilitate manual correction. The results of empirical evaluations provide evidence that the presented approach is able to effectively detect a wide range of syntax errors and automatically correct a large subset of them.

