JOPA: Accessing Ontologies in an Object-oriented Way
Martin Ledvinka and Petr Křemen
Czech Technical University in Prague, Technická 2, Praha, Czech Republic
Keywords:
Ontology, Persistence, Application Access, Complexity.
Abstract:
Accessing OWL ontologies programmatically by complex IT systems brings many problems stemming from ontology evolution, their open-world nature and their expressiveness. This paper presents the Java OWL Persistence API (JOPA), a persistence layer that allows using the object-oriented paradigm for accessing semantic web ontologies. Compared to other approaches, it supports validation of ontological assumptions on the object level, advanced caching, a transactional approach, unification and optimization of repository access through the OntoDriver component, as well as accessing multiple repository contexts at the same time. Additionally, we present a complexity analysis of OntoDriver operations that allows optimizing object-oriented access performance for the underlying storage mechanisms. Last but not least, we compare our object-oriented solution to the low-level Sesame API in terms of efficiency.
1 INTRODUCTION
Large expressive ontologies are a powerful tool for knowledge modelling. However, their complexity requires proper design and development of end-user information systems. During information system design, its creators face the challenge of choosing an appropriate software library that is reasonably easy to use and maintain, but that allows exploiting the ontology in its complexity (Křemen and Kouba, 2012).
On the one hand, information systems, which by their nature accept the closed-world assumption, have to deal with the distributed and open-world knowledge represented in ontologies. On the other hand, ontological changes often do not affect the information system data model assumptions and thus can be smoothly applied without information system recompilation and redeployment (e.g. taxonomy/metadata extension). Furthermore, the expressive power of semantic web ontologies is significantly higher than that of relational databases.
For example, consider an ontology specifying that each Person has a name. Due to the open-world assumption, the ontology is consistent even if a particular Person does not have a recorded name. However, a genealogical application accessing the ontology needs the name to be known, which causes the application to crash whenever it receives (consistent, but application-incompatible) data from an ontological source specifying a Person without a name.
This paper presents a solution for these issues: the Java OWL Persistence API (JOPA), see (Ledvinka and Křemen, 2014), a persistence layer that allows using the object-oriented paradigm for accessing semantic web ontologies. Compared to other approaches, it supports validation of ontological assumptions on the object level ((Křemen and Kouba, 2012) and (Křemen, 2012)), advanced caching, transactions, unification and optimization of repository access through the OntoDriver component, as well as accessing multiple repository contexts at the same time. Additionally, we present a complexity analysis of OntoDriver operations that allows optimizing object-oriented access performance for the underlying storage mechanisms. Lastly, we compare our solution to the low-level Sesame API in terms of efficiency.
Section 2 shows the relationship of our work to state-of-the-art research. Section 3 introduces the design and implementation of a prototype system for ontology-based information system access. Section 4 analyses the complexity of the operations defined in the API (Application Programming Interface) for storage access. The paper is concluded in Section 5.
2 RELATED WORK
Some object-oriented solutions try to approximate
ontological OWL reasoning (Motik et al., 2009)
by means of procedural code, like (Meditskos and
Bassiliades, 2008), or (Poggi, 2009). However, this
significantly limits the expressive power of the on-
tology and is memory-consuming on the information
system side.
There is another research direction, which does not compromise reasoning completeness while maintaining its scalability: simplifying programmatic access to semantic web ontologies stored in optimized transactional ontology storages. This is also where our solution lies. The two main existing approaches are presented in the next sections.
2.1 Domain-independent APIs
Many APIs for programmatic access to ontologies
make no assumptions about the particular ontology
schema. This paradigm is exploited in frameworks
like OWL API (Horridge and Bechhofer, 2011),
Sesame (Broekstra et al., 2002) or Jena (Carroll et al.,
2004). These systems are generic, allowing one to exploit the full range of ontological expressiveness, but trading it for verbosity and poor maintainability of the resulting code. Furthermore, using these tools requires software designers to have deep knowledge of the underlying ontological structures. Compared to these systems, our solution provides an object-ontological mapping that helps software designers keep the design readable, consistent and short, see Section 4.2.2.
2.2 Domain-specific APIs
There are already several established solutions where the ontology schema is compiled directly into the object model. This paradigm makes use of an OOM (object-ontological mapping). Representatives of this paradigm are e.g. Empire (Grove, 2010) or AliBaba (http://rdf4j.org/, accessed 02-12-2014). Compared to the former, these systems actually access ontologies in a frame-based (or object-oriented) manner.
Object-ontological mappings bind the information system tightly to the particular ontology. This significantly simplifies programmatic access and is less demanding on the developer's expertise in semantic web ontologies. On the other hand, these solutions bury most of the ontological expressiveness. Compared to these systems, our solution allows one to monitor ontology changes during its evolution, to access both asserted and inferred knowledge, and to provide individual types as well as extra properties through additional Java annotations.
A thorough discussion of these architectures can be found in (Ledvinka and Křemen, 2014) and in (Křemen, 2012). JOPA, introduced in Section 3, aims at taking the best of both types, as shown in Figure 1. It provides a compiled object-based mapping of the ontology schema similar to the domain-specific approaches described above, while also enabling access to the dynamically changing aspects of the ontology (see Section 3).
3 JOPA
JOPA stands for Java OWL Persistence API. It is in essence an API for efficient access to ontologies in Java, designed to resemble its relational-world counterpart, the Java Persistence API (JCP, 2009).
In this section, we introduce the layered architecture of JOPA.
Architecture. From the architectural point of view, JOPA is divided into two main parts:
OOM realizes the object-ontological mapping and works as a persistence provider for the user application. Its API resembles JPA 2 (JCP, 2009), but provides additional features specific to ontologies.
OntoDriver provides access to the underlying storage, optimized for the purposes of object-oriented applications. OntoDriver has a generic API which decouples the underlying storage API from JOPA.
Figure 2 shows a possible configuration of an in-
formation system using JOPA, together with some in-
sight into the architecture of JOPA. The application
object model is defined by means of a set of integrity
constraints which guard that the ontological data are
usable for the application and vice versa. Thanks to
the well-defined API between the OOM part of JOPA
and OntoDriver, there can be various implementations
of OntoDriver and the user can switch between them
(and between the underlying storages) without having
to modify the actual application code.
3.1 JOPA OOM
Let us now briefly describe the main features of the
object-ontological mapping layer of JOPA.
Figure 1: JOPA and related approaches.

The OOM layer is mainly represented by the EntityManager interface, which corresponds to its JPA 2 counterpart (JCP, 2009) to a large extent. It contains the CRUD (create, retrieve, update, delete) operations find, persist, merge and remove, but enhances them with versions supporting context descriptors (Ledvinka and Křemen, 2014). It also contains operations for transaction management and cache access.
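As an illustration of the interface just described, the following is a minimal sketch of persisting and loading an entity through the EntityManager, assuming a JPA-like API; the way the EntityManager instance is configured, the import path and the IRI are our own assumptions, not taken from the paper, and for brevity the sketch accesses the package-private fields of the Student class from Listing 1 directly.

import java.net.URI;

import cz.cvut.kbss.jopa.model.EntityManager;

// Illustrative sketch only; Student is the entity class from Listing 1.
class EntityManagerExample {
    static void persistAndFind(EntityManager em) {
        Student student = new Student();
        student.id = URI.create("http://example.org/student#jane"); // hypothetical IRI
        student.name = "Jane Doe";

        em.getTransaction().begin();
        em.persist(student);                 // realized by the OntoDriver persist operation
        em.getTransaction().commit();

        // loading uses the OntoDriver find operation (and the cache, if the entity is present there)
        Student loaded = em.find(Student.class, student.id);
        System.out.println(loaded.name);
    }
}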
Mapping. As was stated earlier, the mapping be-
tween ontological data and the application domain
model is based on a contract expressed using integrity
constraints. In practice, these integrity constraints are
compiled into a set of Java annotations that guard their
semantics on the object level. This allows easy valida-
tion of cardinality constraints, as well as domains and
ranges of annotation/data or object properties. In this
respect, JOPA is very similar to Empire (Grove, 2010). An example of a JOPA entity class can be seen in Listing 1. The @Types field contains information about the ontology classes to which the individual represented by an instance of this type belongs.
@OWLClass(iri = "http://example.org/Student")
class Student {
    @Id
    URI id;

    @DataProperty(iri = "http://example.org/name")
    String name;

    @DataProperty(iri = "http://example.org/email")
    String email;

    @ObjectProperty(iri = "http://example.org/course")
    Set<Course> courses;

    @Inferred
    @Types
    Set<String> types;

    @Properties
    Map<String, Set<String>> properties;
}

Listing 1: Example of a business entity class declaration with JOPA annotations representing the object-ontological mapping.
It is necessary to point out that the identity of a business object in JOPA is given not only by the individual's IRI (Internationalized Resource Identifier), but also by the ontology class mapped by its type, i.e. in the underlying ontology the individual must be explicitly stated to be of the specified ontology class. It should also be noted that JOPA does not support blank nodes and anonymous individuals. JOPA does not support class subsumption either, due to the lack of a clear relationship between ontological and object identity, as well as the lack of multiple inheritance in many object-oriented languages, like Java. To facilitate working with entity relationships, JOPA does support operation cascading.
Unmapped Properties. In addition to the fixed set of modelled properties, JOPA enables the application to also access properties which are not part of the object model. The property values are currently restricted to their string representation and are represented by a map whose keys are property IRIs and whose values are sets of property values. The map is annotated with the @Properties annotation. This way the application has, although limited, access to the dynamic part of the ontological data without having to adjust the domain model. See the properties attribute in Listing 1.
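For illustration, the following sketch reads and adds a value of a property that is not part of the object model through the @Properties map of the Student entity from Listing 1; the nickname property IRI is made up for this example, and the package-private field is accessed directly for brevity.

import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class UnmappedPropertiesExample {
    static void addNickname(Student student) {
        Map<String, Set<String>> props = student.properties;
        if (props == null) {
            props = new HashMap<>();
            student.properties = props;
        }
        // keys are property IRIs, values are string representations of the property values
        props.computeIfAbsent("http://example.org/nickname", key -> new HashSet<>())
             .add("Johnny");
    }
}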
Inferred Attributes. Ontologies contain two types of information: explicit (asserted) and implicit (inferred). Inferred information cannot be changed directly, as it is derived from the asserted knowledge by a reasoner and can change only through modification of the explicitly stated information. As a consequence, it is necessary to prevent modification of inferred data. JOPA supports both asserted (in read/write mode) and inferred (read-only) attributes. This support is realized by means of the @Inferred and @Asserted annotations. The @Asserted annotation is optional; every field not annotated with @Inferred is considered asserted and is allowed to be modified.
Contexts. Another feature of JOPA is its ability to
work with ontologies distributed in several contexts
(graphs). When the underlying storage supports this
feature, the application is able to specify not only in
which context an instance should be searched for, but
also contexts for individual attributes of the instance.
If the context is not specified, the default one is used.
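A hypothetical sketch of loading an entity from a specific context follows; it assumes a JPA-like find overload taking a context descriptor, as described in (Ledvinka and Křemen, 2014). The descriptor class name, the import paths and the context IRI are our assumptions for illustration, not necessarily the actual JOPA API.

import java.net.URI;

import cz.cvut.kbss.jopa.model.EntityManager;
import cz.cvut.kbss.jopa.model.descriptors.Descriptor;
import cz.cvut.kbss.jopa.model.descriptors.EntityDescriptor;

class ContextExample {
    static Student findInContext(EntityManager em, URI studentId) {
        // the descriptor points the whole entity to a named context (graph)
        Descriptor descriptor = new EntityDescriptor(
                URI.create("http://example.org/contexts/university-a"));
        // without a descriptor, the default context would be used
        return em.find(Student.class, studentId, descriptor);
    }
}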
Transactions and Caching. JOPA supports transactional processing of the ontological data. However, the mechanism is different from standard relational-based persistence, because reasoning makes it more difficult to reflect pending changes to the transaction that produced them. For example, when a property value is changed during a transaction $T_1$, only $T_1$ has to be able to see the effects of that change even before commit. JOPA itself does not employ any reasoning and offloads this burden to the underlying OntoDriver implementation. The OntoDriver is free to choose any strategy for keeping track of transactional changes. When a business transaction commits, JOPA tells the OntoDriver to make the pending changes persistent in the storage.

Since applications often manipulate the same data, it is reasonable to use a cache to reduce the necessity to query the storage. JOPA contains a second-level cache (JCP, 2009), which is shared between all open persistence contexts and enables quick entity lookup. Another performance-improving feature is the support for lazily loaded attributes (lazily loaded attribute values are retrieved from the data source only upon application request).
3.2 OntoDriver
OntoDriver is a software layer designed to unify access to various ontology storages. It achieves this goal by presenting a single API to JOPA and enabling the implementation to use whatever framework is required by the underlying storage, e.g. the Sesame API for a Sesame storage or Jena for SDB. In this regard, OntoDriver is similar to a JDBC (Java Database Connectivity) driver known from the relational world. But in contrast to JDBC, where all operations are done using SQL (Structured Query Language) statements, OntoDriver provides dedicated CRUD operations, which give the implementations more opportunity for optimizations, since they know beforehand what operation is executed.
However, the OntoDriver API does not elimi-
nate the possibility of using SPARQL (Harris and
Seaborne, 2013) queries for information retrieval and
SPARQL Update (Gearon et al., 2013) statements for
data manipulation.
3.2.1 OntoDriver API
The key idea behind OntoDriver is a unified API pro-
viding access to ontology storages. To formally de-
scribe the API, let us first define basic ontological ter-
minology:
Theoretical Background. We consider programmatic access to OWL 2 DL ontologies, corresponding in expressiveness to the description logic $\mathcal{SROIQ}(D)$ (for the sake of compactness, we neglect datatypes and literals $(D)$ and use description logic notation). In the next sections, consider an OWL 2 DL ontology $\mathcal{O} = (\mathcal{T}, \mathcal{A})$, consisting of a TBox $\mathcal{T} = \{\tau_I\}$ and an ABox $\mathcal{A} = \{\alpha_I\}$, where $\alpha_I$ is either of the form $C(i)$ (class assertion) or $P(i, j)$ (object property assertion), where $i, j \in N_i$ are OWL named individuals, $C \in N_c$ is a named class and $P \in N_r$ is a named object property. Other axiom types belong to $\mathcal{T}$. W.l.o.g. we do not consider $C(i)$ and $P(i, j)$ for complex $C$ and $P$ here. We do not consider anonymous individuals either. See the full definition of OWL 2 DL (Motik et al., 2009) and $\mathcal{SROIQ}(D)$ (Horrocks et al., 2006).
In addition to ontological (open-world) knowledge, a set $S_C = \{\gamma_i\}$ of integrity constraints is used to capture the contract between an ontology and an information system object model. Each integrity constraint $\gamma_i$ has the form of an OWL axiom with closed-world semantics, as defined in (Tao et al., 2010).
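For instance (an illustrative example of ours, not taken from the formal definitions above), the requirement of the genealogical application from Section 1 that every Person has a known name could be captured by the integrity constraint

\[ \gamma_{name}: \; Person \sqsubseteq \exists hasName.\top , \]

interpreted under the closed-world semantics of (Tao et al., 2010): every individual asserted to be a Person must have an explicitly asserted hasName value, otherwise the constraint is violated and the contract with the object model is broken.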
By a multi-context ontology we denote a tuple $\mathcal{M} = (\mathcal{O}_d, \mathcal{O}_1, \ldots, \mathcal{O}_n)$, where each $\mathcal{O}_I$ is an ontology identified by a unique IRI and is called a context, and $\mathcal{O}_d$ denotes the default ontology (default context), which is used when no other context is specified. This structure basically corresponds to an RDF dataset with named graphs (Cyganiak et al., 2014). An ontology store is a software layer that provides access to $\mathcal{M}$.
An axiom descriptor $\delta_a$ is a tuple $(i, \{(r_1, b_1), \ldots, (r_k, b_k)\})$, where $i \in N_i$, $r_m \in N_r$, $b_m \in \{0,1\}$ and $m \in 1 \ldots k$. The $b_m$'s specify whether inferred values for the given role should be included as well. The axiom descriptor is used to specify for which information the OntoDriver is queried.
An axiom value descriptor $\delta_v$ is a tuple $(i, \{(r_1, v_1), \ldots, (r_k, v_k)\})$, where $i \in N_i$, $r_m \in N_r$, $v_m \in N_i$ and $m \in 1 \ldots k$. The $v_m$'s represent property assertion values for the given individual and property. The axiom value descriptor specifies information which shall be inserted into the storage.
Please note that for the sake of readability we have omitted context information from the formal definitions. In reality, a context can be specified for the whole descriptor and for each role.
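The two descriptors translate naturally into simple value classes. The following is a minimal illustrative Java sketch; the class and field names are our own simplification, not necessarily the actual OntoDriver classes.

import java.net.URI;
import java.util.Map;
import java.util.Set;

// Sketch of the axiom descriptor delta_a: an individual plus
// (property, include-inferred?) pairs the driver is queried for.
final class AxiomDescriptor {
    final URI individual;                 // i, a member of N_i
    final Map<URI, Boolean> properties;   // r_m -> b_m

    AxiomDescriptor(URI individual, Map<URI, Boolean> properties) {
        this.individual = individual;
        this.properties = properties;
    }
}

// Sketch of the axiom value descriptor delta_v: an individual plus
// property -> values that shall be inserted into the storage.
final class AxiomValueDescriptor {
    final URI individual;                 // i, a member of N_i
    final Map<URI, Set<URI>> values;      // r_m -> set of v_m

    AxiomValueDescriptor(URI individual, Map<URI, Set<URI>> values) {
        this.individual = individual;
        this.values = values;
    }
}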
OntoDriver API. The core operations of the OntoDriver API are as follows:

$find(\mathcal{M}, \delta_a): 2^{\mathcal{M}} \times N_i \times N_r^k \times \{0,1\}^k \rightarrow 2^{N_i \times N_r \times N_i}$, where $\delta_a$ is an axiom descriptor,
- Given an individual, load values for the specified properties,
- Used by EntityManager.find() in OOM,

$persist(\mathcal{M}, \delta_v) = \mathcal{O}_d \cup \{\alpha_1 \ldots \alpha_s\}$, where $\alpha_1 \ldots \alpha_s$ are property assertion axioms created from the role-value pairs in $\delta_v$,
- Persist axioms representing entity attribute values,
- Used by EntityManager.persist() in OOM,

$remove(\mathcal{M}, \delta_a) = \mathcal{O}_d \setminus \{\alpha'_1 \ldots \alpha'_t\}$, where $\alpha'_1 \ldots \alpha'_t$ are property assertion axioms for the roles specified in $\delta_a$,
- Remove axioms representing entity attribute values,
- Used by EntityManager.remove() in OOM,

$update(\mathcal{M}, \delta_v) = (\mathcal{O}_d \setminus \{\alpha'_1 \ldots \alpha'_t\}) \cup \{\alpha_1 \ldots \alpha_s\}$, where $\alpha'_1 \ldots \alpha'_t$ are the original property assertion axioms for the roles $r_1 \ldots r_k$ defined in $\delta_v$ and $\alpha_1 \ldots \alpha_s$ are new property assertion axioms created from the role-value pairs in $\delta_v$,
- Remove old and assert new values for entity attributes,
- Used by EntityManager.merge() or on attribute change during a transaction in OOM,

$getTypes(\mathcal{M}, i, b): 2^{\mathcal{M}} \times N_i \times \{0,1\} \rightarrow 2^{N_c}$, where the resulting axioms represent types of the specified individual $i$ and $b$ specifies whether inferred types should be included as well,
- Get types of the specified named individual,
- Used by EntityManager.find() in OOM,

$updateTypes(\mathcal{M}, i, \{c_1 \ldots c_k\}) = (\mathcal{O}_d \setminus \{\alpha'_1 \ldots \alpha'_t\}) \cup \{\alpha_1 \ldots \alpha_k\}$, where $c_m \in N_c$, the $\alpha'_m$ are the original class assertion axioms and the $\alpha_m$ are the new class assertion axioms for the given individual $i$,
- Update class assertion axioms for the given individual by removing obsolete types and adding new ones,
- Used by EntityManager.persist(), EntityManager.merge() or on attribute change during a transaction in OOM,

$validateIC(\mathcal{M}, \{\gamma_1 \ldots \gamma_k\}): 2^{\mathcal{M}} \times 2^{N_i \times N_r \times N_i} \times S_C \rightarrow \{0,1\}$, where $\gamma_m \in S_C$ and $m \in 1 \ldots k$,
- Validate the specified integrity constraints, verifying reasoning-time integrity constraints which cannot be validated at runtime (Křemen and Kouba, 2012),
- Called on transaction commit in OOM.
The actual programming interface written in Java
contains, besides methods representing the above op-
erations, also methods for issuing statements (pre-
sumably SPARQL and SPARQL Update) and trans-
action managing methods. We omit these here for the
sake of brevity.
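To make the shape of the API more concrete, the following is a minimal illustrative Java sketch of the core operations listed above, building on the descriptor sketch from earlier in this section; the method and type names are our simplification of the formal definitions, not the actual OntoDriver interface.

import java.net.URI;
import java.util.Collection;
import java.util.Set;

// Placeholders for the sketch: a single assertion and an integrity constraint.
final class Axiom { URI subject; URI property; URI value; }
interface IntegrityConstraint { }

interface OntoDriverConnection {

    // find(M, delta_a): load (possibly inferred) assertion values for the
    // individual and properties named in the descriptor.
    Collection<Axiom> find(AxiomDescriptor descriptor);

    // persist(M, delta_v): assert the property values from the descriptor.
    void persist(AxiomValueDescriptor descriptor);

    // remove(M, delta_a): remove assertions of the listed properties
    // for the given individual (epistemic remove).
    void remove(AxiomDescriptor descriptor);

    // update(M, delta_v): remove the original values of the listed
    // properties and assert the new ones.
    void update(AxiomValueDescriptor descriptor);

    // getTypes(M, i, b): classes the individual belongs to,
    // optionally including inferred types.
    Set<URI> getTypes(URI individual, boolean includeInferred);

    // updateTypes(M, i, {c_1 ... c_k}): replace the individual's class assertions.
    void updateTypes(URI individual, Set<URI> types);

    // validateIC(M, {gamma_1 ... gamma_k}): check the given integrity constraints.
    boolean validateIntegrityConstraints(Set<IntegrityConstraint> constraints);
}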
3.2.2 Prototype of OntoDriver
To evaluate our design of OntoDriver, we have created
a prototypical implementation. For this prototype, we have chosen to use the Sesame API. One of the main reasons for such a decision was that there exist Sesame API connectors for some of the most advanced ontology repositories, including GraphDB (successor of OWLIM, see (Bishop et al., 2010)) and Virtuoso (Erling, 2012). The implementation can thus be used to
access a variety of storages. More optimized imple-
mentations of OntoDriver which would exploit spe-
cific features of the underlying storages can be cre-
ated, but the prototype was intended as a general proof
of concept for the layered design of JOPA.
The Sesame OntoDriver uses neither SPARQL nor the SeRQL (Broekstra et al., 2002) language to perform data manipulation. We use the Sesame filtering API, which filters statements according to subject, predicate and object (i.e. it basically corresponds to triple pattern matching in a SPARQL query). On the one hand, this requires, for example, asking for each property of an individual separately (or asking for all of them by leaving the property unbound). On the other hand, a SPARQL query corresponding to the find operation (see above) would be a union of triple patterns. We expect the performance of asking for the property values one by one using the Sesame filtering API to be similar to using a single SPARQL query with a union of triple patterns, because eventually both strategies lead to the evaluation of multiple triple patterns with the subject and property bound and no joins. Nevertheless, we have more fine-grained control over the operation itself, because we are able to specify whether inferred statements should or should not be included in the query result. This is an important feature of JOPA and generally cannot be expressed in standard SPARQL statements. However, we plan to investigate the possibility of using SPARQL instead of the Sesame filtering API as well.
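As an illustration of the filtering approach, the following sketch uses the Sesame 2.x API to read all values of a single property of an individual, with explicit control over the inclusion of inferred statements; the repository setup and the IRIs are hypothetical.

import org.openrdf.model.Statement;
import org.openrdf.model.URI;
import org.openrdf.model.ValueFactory;
import org.openrdf.repository.Repository;
import org.openrdf.repository.RepositoryConnection;
import org.openrdf.repository.RepositoryResult;

class SesameFilteringExample {
    static void printCourses(Repository repository) throws Exception {
        RepositoryConnection con = repository.getConnection();
        try {
            ValueFactory vf = con.getValueFactory();
            URI student = vf.createURI("http://example.org/student#john");   // hypothetical individual
            URI hasCourse = vf.createURI("http://example.org/course");
            boolean includeInferred = false;   // the OntoDriver decides this per attribute
            RepositoryResult<Statement> result =
                    con.getStatements(student, hasCourse, null, includeInferred);
            while (result.hasNext()) {
                System.out.println(result.next().getObject());
            }
            result.close();
        } finally {
            con.close();
        }
    }
}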
Another important point is how the Sesame On-
toDriver deals with transactions. As was mentioned
in Section 3.1, JOPA transfers the burden of mak-
ing changes done in a transaction visible to the
transaction itself to the OntoDriver. The prototype
handles this task by creating local graphs of added
and removed statements. When the store is queried
for some knowledge, the added and removed trans-
actional snapshots are used to enhance the results
returned by the storage to reflect the transactional
changes. These local graphs are of course unique to
every transaction on the OntoDriver level. Currently,
this approach is handicapped by the fact that such lo-
cal graphs do not provide any reasoning support, so
they represent only explicit assertions. A solution
to this drawback would be for example using an in-
memory reasoner, e.g. Pellet (Sirin et al., 2007), for
the local graphs.
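A simplified sketch of this local-graph strategy is shown below; the class is our own illustration of the idea, with Statement standing for an arbitrary RDF statement type.

import java.util.HashSet;
import java.util.Set;

class TransactionSnapshot<Statement> {
    private final Set<Statement> added = new HashSet<>();
    private final Set<Statement> removed = new HashSet<>();

    void recordAdd(Statement s)    { added.add(s);   removed.remove(s); }
    void recordRemove(Statement s) { removed.add(s); added.remove(s);   }

    // Enhance statements returned by the storage so that the owning transaction
    // sees its own pending changes (explicit assertions only, no reasoning).
    Set<Statement> enhance(Set<Statement> fromStorage) {
        Set<Statement> result = new HashSet<>(fromStorage);
        result.removeAll(removed);
        result.addAll(added);
        return result;
    }
}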
We are also considering another possible solu-
tion for keeping the transactional changes. This so-
lution would require temporary contexts created by
the store, which would hold the transactional changes
kept currently in the local graphs. This would enable
us to transfer the reasoning task over to the underly-
ing storage. This solution remains an idea for future development.
4 OPERATION COMPLEXITY
ANALYSIS
The OntoDriver API enables us to examine the complexity of the operations it consists of. In this section we consider this complexity with regard to several selected ontology storages. A careful reader may have noticed that some of the operations in the API could share the same implementation; for instance, $update(\mathcal{M}, \delta_v)$ can be implemented using $remove(\mathcal{M}, \delta_a)$ and $persist(\mathcal{M}, \delta_v)$. Thus, we concentrate the analysis on the following operations:
- $find(\mathcal{M}, \delta_a)$,
- $persist(\mathcal{M}, \delta_v)$,
- $remove(\mathcal{M}, \delta_a)$.
When done with theoretical complexity analysis,
we will look at the prototypical implementation of
OntoDriver described in Section 3.2.2 and measure
the operation complexity on some real data.
4.1 Complexity Analysis
For the theoretical complexity analysis, we have selected two well-known storages, each representing a different approach to reasoning: one performing total materialization on data insertion, the other reasoning at query time and doing no materialization (the difference being similar to forward and backward chaining strategies in rule systems):
- GraphDB, formerly known as OWLIM (Bishop et al., 2010), is a Sesame SAIL (Storage And Inference Layer) with a rule-based reasoner using forward chaining,
- Stardog (http://www.stardog.com, accessed 02-12-2014) performs real-time model checking with no materialization.
Each of these strategies has its pros and cons. Total materialization is fast in querying, as there is no reasoning performed at query execution time. On the other hand, statement removal and insertion are slow. In addition, it is necessary to specify the reasoning expressiveness before any data is inserted. Total materialization can also cause significant inflation of the dataset size. Real-time reasoning keeps the dataset compact and is fast on insertion; however, performing reasoning at query time can be time-consuming.
4.1.1 A Note on Indexes
The most important part of every ontology storage is its index, which determines how quickly the data can be accessed. Ontology repositories follow the trend of data storages from other domains and use B-trees (Comer, 1979). GraphDB uses a modified version of B-trees, the B+ tree (Hepp et al., 2007). There is not much information about the indexing strategies of Stardog, but we were able to determine from a post in the Stardog forum (http://tinyurl.com/ke4ozf7, accessed 25-01-2015) that it also uses a B+ tree.

To efficiently access data, which are statements consisting of three parts, subject (S), predicate (P) and object (O), the storages usually contain multiple indexes. Since there exist six orderings of the three statement parts, there could be up to six different indexes. With an increasing number of indexes, the space required to store the data and the indexes obviously grows. Another problem of multiple indexes is their updating when the data is modified. Given the fact that most storages also support contexts, the number of possible indexes grows even more.

Therefore, storages usually restrict themselves to only a few indexes, based on the structure of the most frequent queries. It is often the case that the property is bound in such queries. Thus, storages mostly use PSO and POS indexes, with others optionally available. The PSO index searches statements first by predicate, then by subject and last by object. The POS index is similar, only switching object and subject. Although the indexes are designed for generic RDF statements, they are adequate in our setup, as the ontological axioms manipulated by OntoDriver have the form of atomic class assertions or atomic property assertions, both being serialized as single RDF triples. The PSO and POS indexes are also the default ones used by GraphDB (Ontotext, 2014) and Stardog (Stardog, 2014).
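As a rough back-of-the-envelope illustration (our own figures, not from the paper) of why these B+ tree indexes scale well: with a branching factor $b = 100$ and $n = 10^9$ statements,

\[ \log_b n = \log_{100} 10^9 = \frac{9}{2} = 4.5 , \]

so a lookup in a PSO or POS index touches only about five tree nodes even for a billion statements.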
4.1.2 Analysing Complexity of Typical
Operations
In the following paragraphs we will examine the time complexity of each of the operations enumerated at the beginning of this section with regard to the selected storages, with a short comment on possible implementations of these operations in OntoDriver.
f ind(M ,δ
a
). Multiple strategies can be employed
to realize the find operation, but in essence they all
perform a search for property assertion axioms where
the individual and property are bound. Therefore,
the PSO index will be triggered. However, while
GraphDB will proceed directly to finding the corre-
sponding data, Stardog must first perform reasoning
and rewrite the query according to the schema seman-
tics. The complexity can be seen in Table 1.
We would like to stress here that the find operation
is theoretically very favourable in terms of possible
performance, because it does not require any joins,
as it is supposed to return a simple union of property
values for a single individual. Therefore it is straight-
forwardly mappable to the PSO index.
Table 1: Asymptotic time complexity of the selected operations for GraphDB and Stardog. $b$ is the branching factor of the index B+ tree, $n$ is the size of the dataset. The complexity of processing B+ trees is described in (Comer, 1979). $C_R$ is the reasoning cost, which depends on the selected language expressiveness, and $m$ is the number of reasoning cycles performed in the materialization of statements inserted into GraphDB.

Storage  | $T_{find}$             | $T_{persist}$                               | $T_{remove}$
GraphDB  | $O(\log_b n)$          | $O(\sum_{i=0}^{m} C_{R_i} \times \log_b n)$ | $O(\sum_{i=0}^{m} C_{R_i} \times \log_b n)$
Stardog  | $O(C_R) + O(\log_b n)$ | $O(\log_b n)$                               | $O(\log_b n)$

$persist(\mathcal{M}, \delta_v)$. Persisting the assertion values specified in $\delta_v$ requires insertion of the corresponding statements into the storage's indexes (in our case the PSO and POS indexes).
In addition, in GraphDB, a materialization of statements inferred from the inserted knowledge is performed. Thus, from a set of statements $K_E$ inserted into the database, a new set $K^0_I$ of statements is derived, $K^0_I$ being in turn inserted into the dataset, triggering more materialization, until a set $K^m_I$ is inserted from which no additional knowledge can be deduced. This, of course, makes the persist operation in GraphDB more complex than in Stardog. Again, the theoretical complexity is shown in Table 1.
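As a small illustrative example of this materialization process (our own, using a hypothetical TBox fragment): assume the TBox contains $UndergraduateStudent \sqsubseteq Student$ and $Student \sqsubseteq Person$, and the explicit statement set $K_E = \{UndergraduateStudent(s)\}$ is inserted. Forward chaining then derives

\[ K^0_I = \{Student(s)\}, \qquad K^1_I = \{Person(s)\}, \]

and no further statements can be derived from $K^1_I$, so the materialization stops with $m = 1$; both the explicit statement and the two inferred ones are written to the indexes, and the reasoning cost $C_{R_i}$ is paid in each cycle.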
$remove(\mathcal{M}, \delta_a)$. Doing remove in an ontology requires knowledge of what exactly should be removed. Thus, JOPA performs an epistemic remove, i.e. only values of properties mapped in the object model are removed. Therefore, if the dataset contains property values which are not mapped by the object model managed by JOPA, these values are retained. In case the
properties (see Section 3.1), the unmapped values are
contained in this attribute and, to be consistent with
the epistemic remove, JOPA deletes all statements
where the removed individual is the subject.
remove in Stardog is again relatively straightforward. Since JOPA only allows removal of explicit statements, there is no reasoning required. The procedure thus consists of finding the relevant statements and removing them from the index.
The situation is more interesting in GraphDB, be-
cause with the removal of explicit statements, some
inferred knowledge may become irrelevant. GraphDB
resolves the operation with a combination of forward
and backward chaining (Bishop et al., 2010). In
short, all possible inferred data is found from the re-
moved statements first (this is the forward chaining
part). From the results, backward chaining is per-
formed to determine whether the implicit knowledge
is backed by explicit knowledge other than that being
removed. If not, the inferred statements are removed
as well. Asymptotically, the complexity of remove in
GraphDB is the same as persist, see Table 1.
The asymptotic complexities suggest that
GraphDB is more suitable for read-oriented ap-
plications, especially when the expressiveness of
reasoning increases. In these cases the cost of infer-
ence in GraphDB is paid when the dataset is loaded
and at the actual runtime the queries will presumably
be much faster. On the other hand, applications
performing a lot of data modifications will benefit
from the non-materializing approach of Stardog.
Of course, these are only theoretical results which
consider asymptotic complexities and disregard pos-
sible hidden constants. We are currently working on a
benchmark which would validate these expectations.
4.2 OntoDriver in Practice
In this section we briefly evaluate the benefits of OntoDriver from two important perspectives: performance and code metrics.
4.2.1 Performance of JOPA with OntoDriver
After examining the theoretical complexity of the OntoDriver API, we can proceed to measuring the performance of our OntoDriver prototype. The goal of this evaluation is to determine the performance differences between ontology access using JOPA and OntoDriver and using the Sesame API directly. Since the OntoDriver prototype internally uses the Sesame API, we hardly expect JOPA to outperform a pure Sesame API solution; instead, we concentrate on the possible performance penalties stemming from the additional logic JOPA has to do. As an extra result, we discuss the real performance w.r.t. our theoretical approximation of the asymptotic complexity of GraphDB. The test machine setup is as follows:
- Linux Mint 17 (64-bit)
- Java 8 update 31 (HotSpot), -Xms6g -Xmx6g
- Sesame API 2.7.14, GraphDB 6.0 RC6
- Intel i5 2.67GHz
- 8GB RAM
A class diagram of the benchmark schema is
shown in Figure 3. The application model is rather
small, but sufficient to exercise most of the features
supported by JOPA. The application model and the
datasets are based on the UOBM benchmark (Ma
et al., 2006), the datasets were generated using a gen-
erator application (Zhou et al., 2013).
Results of the benchmark are shown in Table 2. The find operation was loading approximately 450 instances of UndergraduateStudent. Each of them was connected to three courses on average. Thus, the total number of loaded individuals with properties was more than 1800, representing over 5000 statements. The persist test inserted 500 new instances of UndergraduateStudent, connected to four existing courses and a new paper, into the ontology. The update evaluation updated the name and telephone of each of the previously persisted students, removed a reference to one of their courses and added another one instead. Finally, the remove benchmark removed the 500 persisted undergraduate students.

Figure 3: Benchmark application model. Although small in size, it exercises most of the concepts supported by JOPA, including inferred entity types and data and object properties with lazy loading.

Table 2: Benchmark results. The times are averages of 100 runs of the benchmark. $T_{load}$ represents the amount of time it took GraphDB to load the respective datasets.

Dataset  | $T_{load}$/s | $T_{persist}$/s (JOPA / Sesame) | $T_{find}$/s (JOPA / Sesame) | $T_{update}$/s (JOPA / Sesame) | $T_{remove}$/s (JOPA / Sesame)
UOBM 1   | 38           | 4.158 / 2.209                   | 13.738 / 13.353              | 32.28 / 9.571                  | 36.456 / 2.740
UOBM 5   | 213          | 4.245 / 2.252                   | 13.830 / 13.366              | 32.461 / 9.993                 | 36.718 / 2.918
UOBM 10  | 424          | 4.255 / 2.260                   | 13.840 / 13.293              | 32.625 / 10.077                | 36.433 / 3.024
Benchmark Results Discussion. The benchmark
results show that JOPA performs comparably when
loading or persisting entities. It is important to point
out that to mimic the behaviour of JOPA on entity
loading, the Sesame API runner was verifying that
the object property values were of the correct type.
However, there is a significant performance gap be-
tween Sesame API and JOPA in update and remove.
A major part of this gap is due to the fact that JOPA first has to load the entities before updating or removing them.
(and inserted) the required statements without load-
ing them first. Of course, the benchmark is skewed
in this regard, because a real world application would
most likely require the entity loading anyway. Also,
JOPA currently does not support the getReference
method (JCP, 2009), which would be suitable for the
update and remove scenarios. Still, there is a large
margin for improvement in JOPA for these operations.
Table 2 also shows that dataset loading times grow
linearly for GraphDB. Other times do not show any
particular trends. The sample is probably too small to
be compared to our theoretical findings.
In the future, we would like to try comparing dif-
ferent strategies of implementing OntoDriver. We are
also currently working on a performance comparison
of Stardog and GraphDB, using a set of SPARQL
queries specifically designed to correspond to the op-
erations defined in the OntoDriver API.
4.2.2 JOPA and OntoDriver versus Sesame API
One of the most important advantages of using JOPA
with OntoDriver is the ability to treat ontological indi-
viduals with their properties as cohesive objects with
attributes and possibly add behaviour to those ob-
jects, thus increasing readability and maintainability
of the application. This cannot be achieved using
plain Sesame API without writing a lot of wrapper
code. This difference is very similar to what the de-
veloper gains when using JPA instead of pure JDBC.
Due to the lack of space we cannot show a code ex-
ample using the two approaches here, but we believe
that readers familiar with Sesame API, OWL API or
Jena know how tedious and clumsy code using these
frameworks can be.
To provide at least a short insight into the differ-
ence of the amount of code that needs to be written
for basic ontological data manipulation using JOPA
and Sesame API, Table 3 compares the number of
lines of code of classes used to perform the bench-
mark described in the previous section. The entity
classes are not included in the comparison, because
they are reusable objects not specific to the runners.
Table 3: Benchmark runner code length. LOC represents the number of lines of code in each of the runner classes. The benchmark project is available at https://krizik.felk.cvut.cz/kbss/JopaDemo/, the runner classes are JopaRunner and SesameRunner.

Framework    LOC
JOPA         123
Sesame API   296
We believe that even on such a small example we
have demonstrated the benefits of using JOPA with
OntoDriver against low level APIs like Sesame.
ICEIS2015-17thInternationalConferenceonEnterpriseInformationSystems
220
5 CONCLUSIONS
We have introduced JOPA as a solution for object-
oriented access to ontologies. We have described the
architecture of JOPA and its prototypical implemen-
tation, including the OntoDriver component with its
API and implementation using Sesame API. Onto-
Driver represents a software layer which separates
object-ontological mapping in JOPA from access to
different storages, providing a general and concise
API. Based on this API, we have examined the complexity of its operations both theoretically, with regard to two well-known storages, and practically, in a benchmark of our prototype.
In the future, we intend to work on implemen-
tations of OntoDriver for other storages and a more
thorough benchmark of these storages in cooperation
with the corresponding drivers.
ACKNOWLEDGEMENTS
This work was supported by grant No. SGS13/204/OHK3/3T/13 "Effective solving of engineering problems using semantic technologies" of the Czech Technical University in Prague.
REFERENCES
Bishop, B., Kiryakov, A., Ognyanoff, D., Peikov, I., Ta-
shev, Z., and Velkov, R. (2010). OWLIM: A family of
scalable semantic repositories. Semantic Web – Inter-
operability, Usability, Applicability.
Broekstra, J., Kampman, A., and van Harmelen, F. (2002).
Sesame: A Generic Architecture for Storing and
Querying RDF and RDF Schema. In Proceedings of
the First International Semantic Web Conference on
The Semantic Web, pages 54–68.
Carroll, J. J., Dickinson, I., Dollin, C., Reynolds, D.,
Seaborne, A., and Wilkinson, K. (2004). Jena: im-
plementing the semantic web recommendations. In
Proceedings of the 13th international World Wide Web
conference (Alternate Track Papers & Posters), pages
74–83.
Comer, D. (1979). The Ubiquitous B-Tree. Computing Sur-
veys, 11(2).
Cyganiak, R., Wood, D., and Lanthaler, M. (2014). RDF
1.1 Concepts and Abstract Syntax. Technical report,
W3C.
Erling, O. (2012). Virtuoso, a Hybrid RDBMS/Graph Col-
umn Store. IEEE Data Engineering Bulletin, 35(1):3–
8.
Gearon, P., Passant, A., and Polleres, A. (2013). SPARQL
1.1 Update. Technical report, W3C.
Grove, M. (2010). Empire: RDF & SPARQL Meet JPA.
semanticweb.com.
Harris, S. and Seaborne, A. (2013). SPARQL 1.1 Query
Language. Technical report, W3C.
Hepp, M., de Leenheer, P., de Moor, A., and Sure,
Y. (2007). Ontology Management: Semantic Web,
Semantic Web Services, and Business Applications.
Springer.
Horridge, M. and Bechhofer, S. (2011). The OWL API: A
Java API for OWL ontologies. Semantic Web – Inter-
operability, Usability, Applicability.
Horrocks, I., Kutz, O., and Sattler, U. (2006). The even
more irresistible SROIQ. In Proc. of the 10th Int.
Conf. on Principles of Knowledge Representation and
Reasoning (KR 2006), pages 57–67.
JCP (2009). JSR 317: Java™ Persistence API, Version 2.0.
Křemen, P. (2012). Building Ontology-Based Information Systems. PhD thesis, Czech Technical University, Prague.
Křemen, P. and Kouba, Z. (2012). Ontology-Driven Information System Design. IEEE Transactions on Systems, Man, and Cybernetics: Part C, 42(3):334–344.
Ledvinka, M. and Křemen, P. (2014). JOPA: Developing Ontology-Based Information Systems. In Proceedings of the 13th Annual Conference Znalosti 2014.
Ma, L., Yang, Y., Qiu, Z., Xie, G., Pan, Y., and Liu, S.
(2006). Towards a Complete OWL Ontology Bench-
mark. In ESWC’06 Proceedings of the 3rd European
conference on The Semantic Web: research and appli-
cations, pages 125–139.
Meditskos, G. and Bassiliades, N. (2008). A Rule-Based
Object-Oriented OWL Reasoner. IEEE Transactions
on Knowledge and Data Engineering, 20(3):397–410.
Motik, B., Parsia, B., and Patel-Schneider, P. F. (2009).
OWL 2 Web Ontology Language Structural Specifi-
cation and Functional-Style Syntax. W3C recommen-
dation, W3C.
Ontotext (2014). GraphDB-SE–GraphDB6–Ontotext
Wiki. http://owlim.ontotext.com/display/GraphDB6/
GraphDB-SE+Indexing+Specifics.
Poggi, A. (2009). Developing Ontology Based Applications
with O3L. WSEAS Trans. on Computers.
Sirin, E., Parsia, B., Grau, B. C., Kalyanpur, A., and Katz,
Y. (2007). Pellet: A practical OWL-DL reasoner. Web
Semantics: Science, Services and Agents on the World
Wide Web, 5(2).
Stardog (2014). Stardog Docs. http://docs.stardog.com/.
Tao, J., Sirin, E., Bao, J., and McGuinness, D. L. (2010).
Integrity Constraints in OWL. In Fox, M. and Poole,
D., editors, AAAI. AAAI Press.
Zhou, Y., Grau, B. C., Horrocks, I., Wu, Z., and Baner-
jee, J. (2013). Making the Most of your Triple Store:
Query Answering in OWL 2 Using an RL Reasoner.
In Proceedings of the 22nd international conference
on World Wide Web.
JOPA:AccessingOntologiesinanObject-orientedWay
221