a query language was developed – SPARQL – which can be used to “express queries across diverse data sources, whether the data is stored natively as RDF or viewed as RDF via middleware” (W3C, 2013).
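For illustration only (the rdfs:label property and the English-language filter are chosen as examples and are not part of the cited definition), a minimal SPARQL query matches graph patterns against RDF data, as in the following sketch:

  # Minimal illustrative SPARQL query: list up to ten resources that
  # carry an English rdfs:label, returning each resource and its label.
  PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

  SELECT ?resource ?label
  WHERE {
    ?resource rdfs:label ?label .
    FILTER (lang(?label) = "en")
  }
  LIMIT 10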
The huge amount of such linked data became one of the main showcases for the successful, community-driven adoption of Semantic Web technologies (Berners-Lee et al., 2001). The initiative aims at developing best practices for opening up data on the Web, interlinking the resulting open data sets, and permitting Web developers to make use of that rich source of information.
Additionally, the machine-understandable data made available in this way, together with the existing practices and technologies, provides benefits for both end-users and enterprises. Knowledge bases play an important role in enhancing the intelligence of the Web and in supporting information integration (DBpedia, 2013a).
DBpedia is one of the main projects of the Semantic Web, together with FOAF (Friend Of A Friend) (Brickley and Miller, 2000), SIOC (Semantically-Interlinked Online Communities) (Breslin and Bojars, 2004), GoPubMed (Doms and Schroeder, 2005) and NextBio (NextBio, 2013). It extracts structured information from Wikipedia and makes this information accessible on the Web under open licence terms. Wikipedia is an important source of information nowadays, and many Web sites tend to provide information extracted from it in a semantic and meaningful way. We use DBpedia to provide a common knowledge model – expressed by a controlled vocabulary and various well-known ontologies – and a stable service (endpoint), in order to give useful information to our users.
The data contained in DBpedia is quite impressive. As stated on the Web site (DBpedia, 2013a), the English version of the DBpedia knowledge base currently describes 3.77 million things, out of which 2.35 million are classified in a consistent ontology, in different categories like places, creative works, organizations, species, and many others. It provides localization in more than 100 languages, and all these versions together describe 20.8 million things, out of which 10.5 million overlap (are interlinked) with concepts from the English DBpedia. The full dataset consists of 1.89 billion pieces of information stored as RDF triples.
Unfortunately, there are few initiatives concerned with querying and exploring such a big amount of structured data in a way that benefits end-users or neophytes. Thus, it is a real challenge to work with all this data, to understand DBpedia's inner structure, and to reuse it in one's own future Semantic Web work. In most cases, the existing applications require substantial knowledge of key Semantic Web technologies. These facts raised the idea of our Qsense project. The DBpedia data set enables quite astonishing query answering possibilities against Wikipedia data. There is a public SPARQL endpoint over the DBpedia data set (DBpedia, 2013b). The endpoint is provided using Virtuoso (OpenLink, 2013) as the back-end database engine.
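As a sketch of how this endpoint can be used (the resource URI below is chosen purely as an example), a query such as the following can be submitted at http://dbpedia.org/sparql to retrieve the triples describing a single DBpedia resource:

  # Illustrative query against the public DBpedia endpoint: list the
  # properties and values attached to one resource.
  SELECT ?property ?value
  WHERE {
    <http://dbpedia.org/resource/DBpedia> ?property ?value .
  }
  LIMIT 50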
For making all kinds of queries against DBpedia, we need a sound approach with respect to the Linked Data principles (Bizer et al., 2009), which means we need a method of publishing RDF data on the Web and of interlinking data between different data sources. Linked Data on the Web can be accessed using Semantic Web browsers (or hyperdata software), just as the traditional Web of documents is accessed using traditional browsers. By using Semantic Web technologies, users can navigate, browse, and visualize – in an intelligent manner – different data sources by following self-described RDF links. Additionally, these technologies are used to model human knowledge as ontologies that can then be managed automatically – stored, queried, filtered, transformed, reused, etc. (Allemang and Hendler, 2011).
The information in the Semantic Web is structured as triples of the form ⟨subject, predicate, object⟩ – the RDF model. Altogether, the DBpedia dataset consists of more than 100 million RDF triples and, because of this, it is provided for download as a set of smaller RDF files.
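As a small illustration of the triple model (the concrete resource and class are examples of ours, not part of the data set description above), the statement “Berlin is a city” becomes a triple whose subject is the DBpedia resource for Berlin, whose predicate is rdf:type, and whose object is the class dbo:City; an ASK query can check whether such a triple exists:

  # Illustrative ASK query: does the graph contain the triple
  # <Berlin, rdf:type, dbo:City>? The answer is a boolean.
  PREFIX dbo: <http://dbpedia.org/ontology/>

  ASK {
    <http://dbpedia.org/resource/Berlin> a dbo:City .
  }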
Understanding and manually interrogating this large amount of data is a very difficult task, especially for beginner students studying the Semantic Web or related fields.
3 REQUIREMENTS AND FEATURES
In the Semantic Web community, an important concern is what someone is looking for and how the RDF document is structured. Learning how the information is structured reduces the amount of work that users may have to perform. Also, people who have knowledge of RDF want to perform specific queries by using the SPARQL language.
One possibility is to download dedicated software, like Twinkle (Dodds, 2007), which needs Java to be installed, or to use a SPARQL endpoint – a public Web service.
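For example, a user who knows only a term of interest could paste a keyword lookup such as the following sketch into a public endpoint (assuming it supports the SPARQL 1.1 string functions); the simple CONTAINS filter is used here purely for illustration, since real endpoints usually offer faster full-text search facilities:

  # Illustrative keyword lookup: find resources whose English label
  # contains the term "semantic web" (case-insensitive substring match).
  PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

  SELECT DISTINCT ?resource ?label
  WHERE {
    ?resource rdfs:label ?label .
    FILTER (lang(?label) = "en" && CONTAINS(LCASE(?label), "semantic web"))
  }
  LIMIT 20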
For such Semantic Web developers, we present Qsense – a solution tailored to each very specific use case, taking into account the limited information about the term and the language users