A Generic Mapping-based Query Translation from SPARQL to Various
Target Database Query Languages
Franck Michel, Catherine Faron-Zucker and Johan Montagnat
I3S, UMR 7271, University Nice Sophia Antipolis, CNRS, Sophia Antipolis, France
Keywords:
Linked Data, Query rewriting, SPARQL, RDF, NoSQL, xR2RML.
Abstract:
Fostering the development of SPARQL interfaces to heterogeneous databases is a key to efficiently expose
legacy data as RDF on the Web. To deal with the variety of modern database formats and query languages,
this paper describes a two-step approach to translate a SPARQL query into an equivalent target database query.
First, given an xR2RML mapping describing how native database entities can be mapped to RDF, a SPARQL
query is translated into a pivot abstract query language independent of the database. In a second step, the pivot
query is translated into the target database query language, considering the specific database capabilities. The
paper focuses on the first step of the query translation, from SPARQL to a pivot query that takes into account
join constraints and SPARQL filters, and embeds conditions entailed by matching SPARQL graph patterns
with relevant mappings. It discusses the query optimisations that can be implemented at this level, and briefly
describes an application to the case of MongoDB, a NoSQL document store.
1 INTRODUCTION
The exposure of legacy data as RDF is an increasingly
hot topic as new data integration challenges emerge.
Notably, the Web-scale data integration progressively
gives birth to the Web of Data thanks to the open
publication, in RDF, of data sets on the Web. Two
approaches generally apply: legacy data can all be
translated into a materialized RDF graph or data can
be accessed on-the-fly as a virtual RDF graph using
the SPARQL query language. Although materialization
is of interest in some contexts, it is hardly a
one-size-fits-all solution in practice, due to the size of the
generated graphs. Dynamic access, on the other hand,
scales better and guarantees data freshness.
In the last decade, translation methods for re-
lational databases (RDB) have matured, spurred
by the publication of the R2RML mapping lan-
guage (Das et al., 2012). Several methods were
proposed to achieve SPARQL access to relational
data, either in the context of RDB-backed RDF
stores (Chebotko et al., 2009; Sequeda and Miranker,
2013; Elliott et al., 2009) or using arbitrary relational
schemas (Bizer and Cyganiak, 2006; Unbehauen
et al., 2013a; Priyatna et al., 2014; Rodríguez-Muro
and Rezk, 2015). The mapping of XML data
to RDF has been addressed extensively by works
such as (Bischof et al., 2012) and (Bikakis et al.,
2015), among which the latter proposes a transla-
tion from SPARQL to XQuery. Furthermore, exten-
sions of R2RML were proposed such as RML (Dimou
et al., 2014) to map heterogeneous data formats (e.g.
CSV/TSV, XML or JSON), and xR2RML (Michel
et al., 2015a) to enable the mapping of an extensible
scope of databases to RDF.
At the same time, new actors, the NoSQL
databases, have achieved remarkable success. Initially
confined to serve as the core system of Big
Data applications for which they were designed, they
are being increasingly adopted as general-purpose
databases, fostered by their open source licenses and
the lightweight, easy-to-start packaging of some of
them. Today, this overwhelming success makes them
a natural candidate for RDF-based data integration
systems, and in particular to feed the Web of Linked
Data.
In this regard, it is necessary to develop
SPARQL access methods for heterogeneous
databases. These methods will vary greatly depending on the
target database query languages: for instance RDBs support
joins, nested queries and string manipulations,
but this is hardly the case of some NoSQL document
stores like MongoDB or CouchDB. Thus, to avoid
defining yet another SPARQL translation method for
each and every query language, this paper introduces
a two-step approach. First, given a set of mappings
Michel, F., Faron-Zucker, C. and Montagnat, J.
A Generic Mapping-based Query Translation from SPARQL to Various Target Database Query Languages.
In Proceedings of the 12th International Conference on Web Information Systems and Technologies (WEBIST 2016) - Volume 2, pages 147-158
ISBN: 978-989-758-186-1
Copyright © 2016 by SCITEPRESS - Science and Technology Publications, Lda. All rights reserved
of the target database to RDF, a SPARQL query is
translated into a pivot abstract query by matching
SPARQL graph patterns with relevant mappings. This
step can be made generic if the mapping language
used is generic enough to apply to an extensible set
of databases. In a second step, the abstract query
is translated into the target database query language,
taking into account the specific database capabilities.
Our focus, in this paper, is on the first step. Lever-
aging previous works on R2RML-based SPARQL-to-
SQL methods, we define a pivot abstract query lan-
guage and a method to translate a SPARQL query
into an abstract query, utilizing xR2RML to describe
the mapping of a target database to RDF. The method
determines a reduced set of mappings matching each
SPARQL graph pattern, and takes into account join
constraints implied by shared variables and cross-
references denoted in the mappings, and SPARQL fil-
ters. Lastly, common query optimization techniques
are applied to the abstract query in order to alleviate
the work required in the second step.
This paper is organized as follows. Section 2 first
reviews related works, in particular R2RML-based
SPARQL-to-SQL methods that we build upon. Then
section 3 describes the xR2RML mapping language,
and section 4 presents the main contribution of this
paper: the translation of a SPARQL query into an
abstract query under xR2RML mappings. Section 5
presents an application of our method to the case of
the MongoDB database. Finally we underline current
limitations and highlight perspectives in section 6.
2 RELATED WORKS
Various methods have been defined to translate
SPARQL queries into another query language, that
are generally tailored to the expressiveness of the tar-
get query language. For instance, SPARQL-to-SQL
translation methods harness the ability of SQL to sup-
port joins, unions, nested queries and various string
manipulation functions. Typically, a conjunction of
two basic graph patterns (BGP) results in the inner
join of their respective translations; their union re-
sults in an SQL UNION ALL clause; the SPARQL OP-
TIONAL clause between two BGPs results in a left
outer join, and a SPARQL FILTER results in an en-
capsulating SQL SELECT WHERE clause.
Chebotko’s algorithm (Chebotko et al., 2009) fo-
cuses on the SPARQL-to-SQL query translation in
the context of RDB-based triple stores. Priyatna et
al. (Priyatna et al., 2014) have extended it to support
custom R2RML mappings; they address the prob-
lem of eliminating null answers by adding not null
conditions for SPARQL variables. Two limitations
can be underlined though: (i) R2RML triples maps
must have constant predicate maps, i.e. the predi-
cates of the generated RDF triples cannot be built
from database values; and (ii) triple patterns are con-
sidered and translated independently of each other,
even when they share SPARQL variables. Unneces-
sary complexity is thus entailed in the SQL query with
nested queries whose solutions are ruled out only dur-
ing the final join step.
Unbehauen et al. (Unbehauen et al., 2013a) de-
fine the concept of compatibility between the RDF
terms of a triple pattern and R2RML term maps.
This more general approach effectively manages vari-
able predicate maps, which clears the first aforemen-
tioned limitation. Furthermore, they reduce the num-
ber of candidate triples maps for each triple pattern by
pre-checking join constraints implied by shared vari-
ables. This clears the second aforementioned limita-
tion. Yet, two limitations can be noticed: (i) R2RML
referencing object maps are not considered, there-
fore joins implied by shared variables are dealt with
but joins declared in the mapping graph are ignored.
(ii) The rewriting maps each term map to a set of
columns, called column group, that enables filtering,
join and data type compatibility checks. This relies
on SQL capabilities (CASE, CAST, string concate-
nation, etc.), making it hardly applicable out of the
scope of SQL-based systems.
Similarly, approaches have been proposed to deal
with XML databases. SPARQL2XQuery (Bikakis
et al., 2015) relies on the ability of XQuery to sup-
port joins, nested queries and complex filtering. For
instance a SPARQL FILTER is translated into an en-
capsulating For-Let-Where XQuery clause.
The rich expressiveness of SQL and XQuery
makes it possible to translate a SPARQL query into
a single, possibly deeply nested, target query, whose
semantics is strictly equivalent to that of the SPARQL
query. In the general case however, the target query
language may not support joins, unions and/or sub-
queries. NoSQL databases typically make a trade-off
between query language expressiveness and scalabil-
ity. For instance, MongoDB does not support joins,
and only supports nested queries under strong restric-
tions. To tackle this issue, we propose to translate
the SPARQL query into a pivot abstract query inde-
pendent of the target database, under xR2RML map-
pings. Our method relies on and extends the afore-
mentioned SPARQL-to-SQL approaches. It supports
variable predicate maps, it deals both with implicit
joins (shared variable) and explicit joins (mapping-
defined). It also considers query execution perfor-
mance issues by pushing as much as possible of the
WEBIST 2016 - 12th International Conference on Web Information Systems and Technologies
{ "id": 105632, "firstname": "John",
  "emails": ["john@foo.com", "john@example.org"],
  "contacts": ["chris@example.org", "alice@foo.com"] }
{ "id": 327563, "firstname": "Alice",
  "emails": ["alice@foo.com"],
  "contacts": ["john@foo.com"] }
Listing 1: MongoDB collection “people” containing two documents.
<#Mbox>
  xrr:logicalSource [ xrr:query "db.people.find({emails:{$ne:null}})" ];
  rr:subjectMap [ rr:template "http://example.org/member/{$.id}" ];
  rr:predicateObjectMap [
    rr:predicate foaf:mbox;
    rr:objectMap [ xrr:reference "$.emails.*" ] ].
<#Knows>
  xrr:logicalSource [ xrr:query "db.people.find({contacts:{$size:{$gte:1}}})" ];
  rr:subjectMap [ rr:template "http://example.org/member/{$.id}" ];
  rr:predicateObjectMap [
    rr:predicate foaf:knows;
    rr:objectMap [
      rr:parentTriplesMap <#Mbox>;
      rr:joinCondition [ rr:child "$.contacts.*"; rr:parent "$.emails.*" ] ] ].
Listing 2: xR2RML example mapping graph.
original query to the native query engines, thus mak-
ing sub-queries as selective as possible.
3 THE xR2RML MAPPING
LANGUAGE
The xR2RML mapping language (Michel et al.,
2015a) is designed to map an extensible scope of re-
lational and non-relational databases to RDF. It is in-
dependent of any query language or data model. It
is backward compatible with R2RML and it relies on
RML for the handling of various data formats. It can
translate data with mixed embedded formats and generate
RDF lists and containers. Below we briefly describe
the main xR2RML features and introduce a running
example.
xR2RML Language Description. An xR2RML
mapping defines a logical source (property
xrr:logicalSource) as the result of executing a
query (property xrr:query) against an input database.
Data from the logical source is mapped to RDF
triples using triples maps. A triples map consists of
several term maps: they are functions that extract
values from a query result set, and translate them
into RDF terms. A subject map generates the subject
of RDF triples, and multiple predicate-object maps
produce the predicate and object terms. Optionally, a
graph map is used to name a target graph. Listing 2
depicts two triples maps <#Mbox> and <#Knows>, each
consisting of a subject map, a predicate map and an
object map.
Term maps extract data from query results by eval-
uating xR2RML data element references, hereafter
named xR2RML references. The syntax of xR2RML
references depends on the target database e.g. a
column name in case of a relational database, an
XPath expression in case of a native XML database,
or a JSONPath (see footnote 1) expression in case of JSON
documents like in MongoDB. xR2RML references are
used with properties xrr:reference and rr:template.
The value of an xrr:reference property is a single
xR2RML reference, whereas the value of an
rr:template property is a template string possibly
involving several references. Properties xrr:reference
and rr:template may also accept mixed-syntax path
expressions that are useful to deal with content of
mixed formats: for instance, a JSON value embedded
in the cells of a relational table can be addressed by
specifying the column name followed by a JSONPath
expression.
Running Example. We define a running example
that we shall use throughout this paper. The reader in-
terested in more detailed examples of xR2RML may
look at (Michel et al., 2015a; Callou et al., 2015).
We consider a MongoDB database with a collection
people depicted in Listing 1: each JSON document
provides the identifier, email addresses and contacts
of a person; contacts are given by their email ad-
Footnote 1: http://goessner.net/articles/JsonPath/
<AbstractQuery> ::= <AtomicQuery> | <Query> | <Query> FILTER <SPARQL filter>
<Query> ::= <AbstractQuery> INNER JOIN <AbstractQuery> ON {v_1, ... v_n} |
            <AbstractQuery> AS child INNER JOIN <AbstractQuery> AS parent
                ON child/<Ref> = parent/<Ref> |
            <AbstractQuery> LEFT OUTER JOIN <AbstractQuery> ON {v_1, ... v_n} |
            <AbstractQuery> UNION <AbstractQuery>
<AtomicQuery> ::= {From, Project, Where}
Listing 3: Grammar of the Abstract Pivot Query Language.
dresses. Listing 2 defines two xR2RML triples maps.
The logical source of triples map <#Mbox>, respec-
tively <#Knows>, is a MongoDB query that retrieves
documents having a non-null emails field, respec-
tively a contacts array field with at least one element.
Both subject maps use a template to build IRI terms
by concatenating http://example.org/member/ with
the value of JSON field id. Applied to the first docu-
ment in Listing 1, the triples maps generate three RDF
triples:
<http://example.org/member/105632>
    foaf:mbox "john@foo.com";
    foaf:mbox "john@example.org";
    foaf:knows <http://example.org/member/327563>.
When the evaluation of an xR2RML reference
produces several RDF terms, the xR2RML processor
creates one triple for each term. Alternatively, it can
group them in an RDF list (rdf:List) or container
(rdf:Seq, rdf:Bag and rdf:Alt). This is achieved using
specific values of the rr:termType property. Besides,
property xrr:nestedTermMap is a means to create
nested lists and containers, and to qualify terms
of a list or container with a language tag or data type.
Like R2RML, xR2RML can model cross-references:
a referencing object map uses subject
values produced by the subject map of another triples
map (the parent) as objects. Properties rr:child and
rr:parent specify the join condition between docu-
ments of the current triples map (the child), and the
parent triples map. This is illustrated in triples map
<#Knows>, that joins email addresses from the emails
and contacts fields.
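
To make the running example concrete, the following Python sketch mimics how the two triples maps of Listing 2 could be applied to the documents of Listing 1. It is only an illustration under simplifying assumptions: the helper names (eval_ref, apply_template, mbox_triples, knows_triples) are hypothetical, only the JSONPath forms "$.field" and "$.field.*" are handled, and the logical source queries are emulated with plain Python tests rather than sent to MongoDB.

```python
import re

# The two documents of collection "people" (Listing 1).
PEOPLE = [
    {"id": 105632, "firstname": "John",
     "emails": ["john@foo.com", "john@example.org"],
     "contacts": ["chris@example.org", "alice@foo.com"]},
    {"id": 327563, "firstname": "Alice",
     "emails": ["alice@foo.com"],
     "contacts": ["john@foo.com"]},
]

TEMPLATE = "http://example.org/member/{$.id}"


def eval_ref(doc, ref):
    """Evaluate a tiny JSONPath subset: '$.field' or '$.field.*' (assumption)."""
    field = ref[2:].split(".")[0]        # drop the leading "$."
    value = doc.get(field)
    if value is None:
        return []
    if ref.endswith(".*"):
        return list(value)               # one value per array element
    return [value]


def apply_template(doc, template):
    """Replace each {reference} of the template with its first value."""
    return re.sub(r"\{([^}]+)\}",
                  lambda m: str(eval_ref(doc, m.group(1))[0]), template)


def mbox_triples(docs):
    """<#Mbox>: one foaf:mbox triple per element of the emails array."""
    for doc in docs:
        if doc.get("emails") is None:    # emulates the logical source query
            continue
        subject = apply_template(doc, TEMPLATE)
        for email in eval_ref(doc, "$.emails.*"):
            yield (subject, "foaf:mbox", email)


def knows_triples(docs):
    """<#Knows>: join child '$.contacts.*' with parent '$.emails.*'."""
    for child in docs:
        if not child.get("contacts"):    # emulates "at least one contact"
            continue
        subject = apply_template(child, TEMPLATE)
        contacts = set(eval_ref(child, "$.contacts.*"))
        for parent in docs:
            if contacts & set(eval_ref(parent, "$.emails.*")):
                yield (subject, "foaf:knows", apply_template(parent, TEMPLATE))


triples = list(mbox_triples(PEOPLE)) + list(knows_triples(PEOPLE))
```

In this sketch the join condition of <#Knows> is evaluated by a nested loop on the client side, which is one possible way to handle a cross-reference when the database cannot compute the join itself.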
4 REWRITING A SPARQL
QUERY INTO AN ABSTRACT
QUERY
4.1 Abstract Query Language
Our pivot abstract query language complies with the
grammar defined in Listing 3. Operators INNER JOIN
ON, LEFT OUTER JOIN ON and UNION follow the se-
mantics of SQL operators of the same name, with the
difference that the semantics of UNION is that of the
SQL UNION ALL, i.e. it keeps duplicate entries. They
are entailed by the dependencies between graph pat-
terns of the SPARQL query. The first INNER JOIN no-
tation is entailed by join constraints implied by shared
variables. The second INNER JOIN notation, including
the AS child, AS parent and ON child/<Ref>
= parent/<Ref> notations, is entailed by join constraints
expressed in xR2RML mappings using referencing
object maps. The computation of these operators
shall be delegated to the target database if it
supports them (i.e. if the target query language has
equivalent operators like in the case of a relational
database), or to the query processing engine otherwise
(case of MongoDB).
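
When the target database cannot evaluate these operators, the query processing engine has to compute them on intermediate results. The sketch below is one possible in-memory reading of their semantics, assuming that each intermediate result is a list of solutions represented as Python dicts mapping variable names to values; the function names are hypothetical and the handling of unbound variables is deliberately left out.

```python
def inner_join(left, right, variables):
    """INNER JOIN ... ON {v_1, ... v_n}: merge the pairs of solutions that
    agree on every join variable. With no join variable, every pair is
    kept, i.e. the result is a Cartesian product."""
    return [{**l, **r} for l in left for r in right
            if all(l[v] == r[v] for v in variables)]


def left_outer_join(left, right, variables):
    """LEFT OUTER JOIN: like inner_join, but a left solution without any
    matching right solution is kept unchanged."""
    result = []
    for l in left:
        matches = [{**l, **r} for r in right
                   if all(l[v] == r[v] for v in variables)]
        result.extend(matches if matches else [dict(l)])
    return result


def union(left, right):
    """UNION with the semantics of SQL UNION ALL: duplicates are kept."""
    return list(left) + list(right)
```

A nested-loop join is used here for brevity; an engine processing large intermediate results would more likely rely on a hash join, which is a design choice outside the scope of this sketch.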
The translation of a SPARQL query into an ab-
stract query consists of three steps:
1. A SPARQL graph pattern is decomposed into an
abstract expression exhibiting only operators from
the abstract query language and SPARQL triple
patterns: see function trans_m in section 4.2;
2. The xR2RML triples maps that are likely to generate
RDF triples matching each triple pattern are
identified: see function bind_m in section 4.3; and
3. Each triple pattern is translated into one or several
atomic abstract queries (<AtomicQuery>), under
the set of xR2RML triples maps identified in step 2.
Each atomic query is made as selective as possible
by pushing relevant SPARQL filter conditions:
see function transTP_m in section 4.4.
4.2 Translation of a SPARQL Graph
Pattern
Function trans_m (Definition 1) translates a well-designed
SPARQL graph pattern (Pérez et al., 2009)
into an abstract query, while making no assumption
with respect to the target database query capabilities.
It relies on function transTP_m (section 4.4) to translate
each triple pattern, and extends the translation algorithms
defined in (Chebotko et al., 2009), (Unbehauen
et al., 2013a) and (Priyatna et al., 2014). In
particular we propose a generalized management of
SPARQL filters: the goal is to push down SPARQL
Definition 1. Translation of a SPARQL query into an abstract query under xR2RML mappings.
Let m be an xR2RML mapping graph consisting of a set of xR2RML triples maps. Let gp be a well-designed
SPARQL graph pattern. trans_m(gp) is the translation, under m, of gp into an abstract query. trans_m is defined as
follows:
trans_m(gp) = trans_m(gp, true)
if gp consists of a single triple pattern tp, trans_m(gp, f) = transTP_m(tp, sparqlCond(tp, f))
if gp is (P FILTER f’), trans_m(gp, f) = trans_m(P, f && f’) FILTER sparqlCond(P, f && f’)
if gp is (P1 AND P2), trans_m(gp, f) = trans_m(P1, f) INNER JOIN trans_m(P2, f) ON var(P1) ∩ var(P2)
if gp is (P1 OPTIONAL P2), trans_m(gp, f) =
    trans_m(P1, f) LEFT OUTER JOIN trans_m(P2, f) ON var(P1) ∩ var(P2)
if gp is (P1 UNION P2), trans_m(gp, f) =
    trans_m(P1, f) LEFT OUTER JOIN trans_m(P2, f) ON var(P1) ∩ var(P2)
    UNION
    trans_m(P2, f) LEFT OUTER JOIN trans_m(P1, f) ON var(P1) ∩ var(P2)
filters into the translation of each triple pattern, in or-
der to make inner queries more selective and limit the
size of intermediate results. A SPARQL filter f can be
considered as a conjunction of n conditions (n ≥ 1):
C_1 && ... && C_n. Function sparqlCond, further detailed
in (Michel et al., 2015b), discriminates between conditions
with regard to two criteria:
(i) A condition whose variables all appear in a single
triple pattern tp of the SPARQL query is pushed
into the translation of tp by function transTP_m. This
ensures that filters are applied at the earliest stage, as
opposed to the encapsulating SELECT WHERE strategy
in SPARQL-to-SQL translations.
(ii) For a condition wherein at least one variable is
shared by several triple patterns, a FILTER operator is
created to represent the join criteria.
Note that a condition may match both criteria.
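
The two criteria can be sketched in Python as follows. This is not the sparqlCond function detailed in (Michel et al., 2015b), only a simplified reading of the criteria above: triple patterns and conditions are handled as plain strings, variables are found with a regular expression, and the filter is naively split on "&&" (which assumes that "&&" never occurs inside a string literal or a nested expression). All function names are hypothetical.

```python
import re


def conjuncts(filter_expr):
    """Split a filter into its conditions C_1 && ... && C_n (naive split)."""
    return [c.strip() for c in filter_expr.split("&&")]


def variables(text):
    """Names of the SPARQL variables (?name) occurring in a piece of text."""
    return set(re.findall(r"\?\w+", text))


def sparql_cond(tp, filter_expr):
    """Criterion (i): conditions whose variables all appear in pattern tp."""
    tp_vars = variables(tp)
    return [c for c in conjuncts(filter_expr) if variables(c) <= tp_vars]


def filter_operator_conditions(tps, filter_expr):
    """Criterion (ii): conditions involving at least one variable that is
    shared by several triple patterns."""
    counts = {}
    for tp in tps:
        for v in variables(tp):
            counts[v] = counts.get(v, 0) + 1
    return [c for c in conjuncts(filter_expr)
            if any(counts.get(v, 0) > 1 for v in variables(c))]
```

An empty list returned by sparql_cond corresponds to the condition "true", i.e. nothing can be pushed into the translation of the triple pattern.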
Running Example. We illustrate this process with
the running example introduced in section 3 and the
SPARQL query Q depicted below, in which tp_1, tp_2
and tp_3 denote the triple patterns and c_1 and c_2 denote
the conditions of the SPARQL filter.
SELECT ?x WHERE {
  ?x foaf:mbox ?mbox1.                      # tp_1
  ?y foaf:mbox "john@foo.com".              # tp_2
  ?x foaf:knows ?y.                         # tp_3
  FILTER (
    contains(str(?mbox1), "foo.com") &&     # c_1
    ?x != ?y )                              # c_2
}
Let us compute function sparqlCond for each triple
pattern:
- tp_1 has two variables, ?x and ?mbox1. No condition
  involves both variables, but c_1 involves
  ?mbox1 and has no other variable, thereby c_1
  matches criterion (i) for tp_1. Condition c_2 involves
  ?x but it also involves ?y that is not in tp_1. Hence,
  sparqlCond(tp_1, c_1 && c_2) = c_1.
- tp_2 has one variable, ?y, and no condition involves
  only ?y. Hence, no condition can be pushed
  into the translation of tp_2, which we denote by
  sparqlCond(tp_2, c_1 && c_2) = true.
- tp_3 has two variables, ?x and ?y, and only
  condition c_2 involves them both. Hence,
  sparqlCond(tp_3, c_1 && c_2) = c_2.
- Lastly, only condition c_2 involves variables shared
  by several triple patterns (?x and ?y); this shall
  entail a FILTER operator.
Finally we come up with the following abstract query:
trans_m(Q, c_1 && c_2) = transTP_m(tp_1, c_1)
    INNER JOIN transTP_m(tp_2, true) ON {}
    INNER JOIN transTP_m(tp_3, c_2) ON {?x, ?y}
    FILTER(c_2)
Note that an INNER JOIN on an empty set of variables
is equivalent to a Cartesian product (a CROSS JOIN in
SQL).
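
The recursive structure of Definition 1 can be illustrated with a short Python sketch that renders the resulting abstract query as a string. The encoding is an assumption made for the example only: a graph pattern is a tuple ("TP", name, variables), ("AND", P1, P2), ("OPTIONAL", P1, P2), ("UNION", P1, P2) or ("FILTER", P, conditions), and a filter is a dict mapping a condition name to the set of variables it involves. The calls to transTP_m are left symbolic, and the parentheses around the UNION operands are added for readability.

```python
def variables(gp):
    """Set of variables appearing in a graph pattern."""
    if gp[0] == "TP":
        return set(gp[2])
    if gp[0] == "FILTER":
        return variables(gp[1])
    return variables(gp[1]) | variables(gp[2])


def triple_patterns(gp):
    """List of the triple patterns of a graph pattern."""
    if gp[0] == "TP":
        return [gp]
    if gp[0] == "FILTER":
        return triple_patterns(gp[1])
    return triple_patterns(gp[1]) + triple_patterns(gp[2])


def fmt(names):
    """Render a conjunction of condition names, or 'true' if it is empty."""
    return " && ".join(sorted(names)) if names else "true"


def trans(gp, f=None):
    """Translate graph pattern gp, under filter f, into an abstract query."""
    f = f or {}
    kind = gp[0]
    if kind == "TP":
        # Criterion (i): push conditions whose variables all appear in tp.
        pushed = [c for c, vs in f.items() if vs <= set(gp[2])]
        return f"transTP({gp[1]}, {fmt(pushed)})"
    if kind == "FILTER":
        f2 = {**f, **gp[2]}
        tps = triple_patterns(gp[1])
        shared = {v for v in variables(gp[1])
                  if sum(v in tp[2] for tp in tps) > 1}
        # Criterion (ii): keep conditions that involve a shared variable.
        kept = [c for c, vs in f2.items() if vs & shared]
        return f"{trans(gp[1], f2)} FILTER({fmt(kept)})"
    on = "{" + ", ".join(sorted(variables(gp[1]) & variables(gp[2]))) + "}"
    left, right = trans(gp[1], f), trans(gp[2], f)
    if kind == "AND":
        return f"{left} INNER JOIN {right} ON {on}"
    if kind == "OPTIONAL":
        return f"{left} LEFT OUTER JOIN {right} ON {on}"
    if kind == "UNION":
        return (f"({left} LEFT OUTER JOIN {right} ON {on}) UNION "
                f"({right} LEFT OUTER JOIN {left} ON {on})")
    raise ValueError(f"unknown graph pattern: {kind}")


# Query Q of the running example: (tp_1 AND tp_2 AND tp_3) FILTER (c_1 && c_2)
tp1 = ("TP", "tp1", {"?x", "?mbox1"})
tp2 = ("TP", "tp2", {"?y"})
tp3 = ("TP", "tp3", {"?x", "?y"})
Q = ("FILTER", ("AND", ("AND", tp1, tp2), tp3),
     {"c1": {"?mbox1"}, "c2": {"?x", "?y"}})
abstract_query = trans(Q)
```

Applied to Q, this sketch yields the same operator structure as the abstract query given above for the running example.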
4.3 Binding xR2RML Triples Maps to
Triple Patterns
Before we define function transTP_m, which translates
SPARQL triple patterns into atomic abstract queries,
we elaborate on how to determine which
xR2RML triples maps are likely to generate RDF
triples matching the triple pattern.
In the following, we assume that xR2RML triples maps
are normalized in the sense defined by (Rodríguez-Muro
and Rezk, 2015) for R2RML: a normalized
triples map contains exactly one predicate-object map
with exactly one predicate map and one object map,
and any rr:class property is replaced by an equivalent
predicate-object map with a constant predicate
rdf:type. Also, we denote by TM.sub, TM.pred and
Definition 2. Binding of xR2RML mappings to SPARQL triple patterns.
Let m be a set of xR2RML triples maps, and gp be a well-designed graph pattern. bind_m(gp) is the set of triple
pattern bindings of gp under m, defined as follows:
bind_m(gp) = bind_m(gp, true)
if gp consists of a single triple pattern tp, bind_m(gp, f) is the pair (tp, TMSet) where TMSet = {TM | TM ∈ m ∧
    compatible(TM.sub, tp.sub, f) ∧ compatible(TM.pred, tp.pred, f) ∧ compatible(TM.obj, tp.obj, f)}
if gp is (P1 AND P2), bind_m(gp, f) = reduce(bind_m(P1, f), bind_m(P2, f)) ∪ reduce(bind_m(P2, f), bind_m(P1, f))
if gp is (P1 OPTIONAL P2), bind_m(gp, f) = bind_m(P1, f) ∪ reduce(bind_m(P2, f), bind_m(P1, f))
if gp is (P1 UNION P2), bind_m(gp, f) = bind_m(P1, f) ∪ bind_m(P2, f)
if gp is (P FILTER f’), bind_m(gp, f) = bind_m(P, f && f’)
TM.obj respectively the subject map, the predicate
map and the object map of triples map TM. Lastly,
we adapt the concept of triple pattern binding defined
by Unbehauen et al. as follows:
Definition 3. Let m be an xR2RML mapping graph
consisting of a set of xR2RML triples maps, and tp be
a triple pattern. A triples map TM ∈ m is bound to tp
if it is likely to produce triples matching tp. A triple
pattern binding is a pair (tp, TMSet) where TMSet is
the set of triples maps of m that are bound to tp.
Function bind_m, depicted in Definition 2, determines,
for a graph pattern gp, the bindings of each
triple pattern of gp. It takes into account join constraints
implied by shared variables, and the SPARQL
filter constraints whose unsatisfiability can be verified
statically. This is achieved by means of two functions:
compatible and reduce. These functions were introduced
by (Unbehauen et al., 2013a), but important
details were left unspecified. In particular, the authors did
not formally define what the compatibility between a
term map and a triple pattern term means, and they
did not investigate the static compatibility between a
term map and a SPARQL filter. Below we describe
these functions in detail and extend them to fit our
context of an abstract query language.
Function compatible (Definition 4) checks if a
term map is compatible with a triple pattern term and
a SPARQL filter. A term map is always considered
compatible with a variable triple pattern term, unless
a SPARQL filter contradicts the term map. These
situations are identified in function compatibleFilter
(Definition 5); they pertain to type constraints expressed
using SPARQL operators isIRI, isLiteral or
isBlank, as well as language and data type constraints
expressed using operators lang and datatype. For instance,
if variable ?var is matched with an object
map that produces literals (rr:termType rr:Literal),
the SPARQL constraint isIRI(?var) is unsatisfiable.
When the triple pattern term is not a variable, function
compatible identifies the similar situations wherein
the triple pattern term and the term map cannot
match with regard to the type of the triple pattern
term (literal, IRI, blank node), its language tag (e.g.
"string"@en) or its data type (e.g. "10"^^xsd:integer).
Function reduce uses the variables shared by two
triple patterns to detect unsatisfiable join constraints,
and thus reduces the set of triples maps bound to each
triple pattern. For instance, let us consider two triple
patterns tp_1 and tp_2 that have a shared variable v;
triples map TM_1 is bound to tp_1 and triples map TM_2
is bound to tp_2. If the term map associated to v in TM_1
generates literals whereas the term map associated to
v in TM_2 generates IRIs, we say that the term maps are
incompatible. Consequently, function reduce rules
out TM_1 from the bindings of tp_1 and TM_2 from the
bindings of tp_2. In other words, reduce(bind_m(tp_1),
bind_m(tp_2)) returns the reduced bindings of tp_1 such
that the term maps associated to v in the bindings of
tp_1 are compatible with the term maps associated to v
in the bindings of tp_2. A formal definition of function
reduce is given in (Michel et al., 2015b), and the concept
of compatibility between term maps is shown in
Definition 6.
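
The behaviour of reduce on the literal/IRI example above can be sketched in Python. The formal definition is given in (Michel et al., 2015b); what follows is only one plausible reading, under these assumptions: a triple pattern is a dict with keys "sub", "pred" and "obj", a triples map is a dict with the same keys holding term maps, and compatibility between term maps is restricted to the term type and language tag checks of Definition 6 (templates and referencing object maps are ignored). All names are hypothetical.

```python
def compatible_term_maps(tm1, tm2):
    """Definition 6, restricted to term types and language tags."""
    return (tm1["termType"] == tm2["termType"]
            and tm1.get("language") == tm2.get("language"))


def shared_positions(tp1, tp2):
    """Pairs (position in tp1, position in tp2) holding the same variable."""
    return [(p1, p2) for p1, t1 in tp1.items() for p2, t2 in tp2.items()
            if t1 == t2 and t1.startswith("?")]


def reduce_bindings(binding1, binding2):
    """Keep the triples maps bound to tp1 that are compatible, on every
    shared variable, with at least one triples map bound to tp2."""
    (tp1, tms1), (tp2, tms2) = binding1, binding2
    pairs = shared_positions(tp1, tp2)
    if not pairs:                        # no shared variable: nothing to check
        return binding1
    kept = [tm1 for tm1 in tms1
            if any(all(compatible_term_maps(tm1[p1], tm2[p2])
                       for p1, p2 in pairs)
                   for tm2 in tms2)]
    return (tp1, kept)
```

Calling reduce_bindings in both directions, as done for the AND case of Definition 2, removes the incompatible triples maps from the bindings of both triple patterns.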
Running Example. Let us consider the SPARQL
query Q proposed in section 4.2. We first compute the
triple pattern bindings for tp_1, tp_2 and tp_3 independently.
The constant predicate of tp_1 and tp_2 matches
the constant predicate map of triples map <#Mbox>.
The subject and object of tp_1 are variables and the
constant object of tp_2 (“john@foo.com”) is compatible
with the object map of <#Mbox>. Consequently
<#Mbox> is bound to both:
bind_m(tp_1, c_1 && c_2) = (tp_1, {<#Mbox>})
bind_m(tp_2, c_1 && c_2) = (tp_2, {<#Mbox>})
Similarly we can show that <#Knows> is bound to tp_3:
bind_m(tp_3, c_1 && c_2) = (tp_3, {<#Knows>}).
Now let us consider the join constraint implied by the
shared variable ?y:
?y foaf:mbox "john@foo.com". # tp_2
?x foaf:knows ?y. # tp_3
?y is the subject in tp_2 that is bound to <#Mbox>, thus
?y is associated to <#Mbox>'s subject map. ?y is also
Definition 4. Compatibility between a term map, a triple pattern term and a SPARQL filter.
Let tpTerm be a triple pattern term, termMap be a term map of an xR2RML triples map TM and f be a SPARQL
filter. It holds that termMap is compatible with tpTerm and f, denoted by compatible(termMap, tpTerm, f), if
termMap is compatible with filter f denoted by compatibleFilter(termMap, f), and either (i) tpTerm is a variable
or (ii) none of the following assertions holds:
- tpTerm is a literal and the term type of termMap is not rr:Literal;
- tpTerm is an IRI and the term type of termMap is not rr:IRI;
- tpTerm is a blank node and the term type of termMap is not one of rr:BlankNode, xrr:RdfList, xrr:RdfBag,
  xrr:RdfSeq, xrr:RdfAlt;
- tpTerm is a literal with a language tag L, and the language of termMap is undefined or different from L;
- tpTerm is a literal with a datatype T, and the datatype of termMap is either undefined or different from T;
- termMap is constant-valued with value V, and tpTerm is different from V;
- termMap is template-valued with template string T, and tpTerm cannot match T;
- termMap is a ReferencingObjectMap and the subject map of the parent triples map is not compatible with
  tpTerm, i.e. ¬compatibleTermMaps(termMap.parentTriplesMap.subjectMap, tpTerm).
Definition 5. Compatibility between a term map and a SPARQL filter.
Let termMap be an xR2RML term map and f be a SPARQL filter. termMap is compatible with f, denoted as
compatibleFilter(termMap, f) if f =“true” or none of the following assertions holds:
- a necessary condition of f is isIRI(?var) and the term type of termMap is not rr:IRI;
- a necessary condition of f is isLiteral(?var) and the term type of termMap is not rr:Literal;
- a necessary condition of f is isBlank(?var) and the term type of termMap is not rr:BlankNode;
- a necessary condition of f is lang(?var)=“L” or langMatches(lang(?var),“L”), and the language of termMap
  is either not defined or different from L;
- a necessary condition of f is datatype(?var)=<T> and the datatype of termMap is either undefined or different
  from <T>.
Definition 6. Compatibility between two term maps.
Let termMap_1 and termMap_2 be two xR2RML term maps. It holds that termMap_1 and termMap_2 are compatible,
denoted by compatibleTermMaps(termMap_1, termMap_2), if none of the following assertions holds:
- termMap_1 and termMap_2 have different term types (property rr:termType);
- termMap_1 and termMap_2 have different language tags, or one has a language tag and the other does not;
- termMap_1 and termMap_2 are both template-valued, and they have incompatible template strings;
- termMap_1 (resp. termMap_2) is a referencing object map and the subject map
  of its parent triples map is not compatible with termMap_2 (resp. termMap_1),
  i.e. ¬compatibleTermMaps(termMap_1.parentTriplesMap.subjectMap, termMap_2) (resp.
  ¬compatibleTermMaps(termMap_1, termMap_2.parentTriplesMap.subjectMap)).
the object in tp_3 that is bound to <#Knows>, thus ?y
is associated to <#Knows>'s object map. The latter
is a referencing object map whose parent is <#Mbox>.
Therefore, reduce(bind_m(tp_2, c_1 && c_2), bind_m(tp_3,
c_1 && c_2)) checks if the subject map of <#Mbox> is compatible
with the object map of <#Knows>, which amounts
to checking if the subject map of <#Mbox> is compatible
with itself, which is obviously the case. We can then show that
reduce(bind_m(tp_2, c_1 && c_2), bind_m(tp_3, c_1 && c_2)) =
(tp_2, {<#Mbox>}), and
reduce(bind_m(tp_3, c_1 && c_2), bind_m(tp_2, c_1 && c_2)) =
(tp_3, {<#Knows>})
In a similar manner, we can show that the join constraint
implied by variable ?x, shared by tp_1 and tp_3,
does not rule out any binding. Lastly, we obtain:
bind_m(tp_1 AND tp_2 AND tp_3, c_1 && c_2) =
{(tp_1, {<#Mbox>}),
(tp_2, {<#Mbox>}),
(tp_3, {<#Knows>})}
4.4 Translation of a SPARQL Triple Pattern
Below we define the transTP_m function and elaborate on its main concepts. The interested reader is referred to (Michel et al., 2015b) for the comprehensive transTP_m algorithm.
Definition 7. Function transTP_m.
Let m be an xR2RML mapping graph consisting of a set of xR2RML triples maps, gp be a well-designed graph pattern, tp be a triple pattern of gp, and f be a SPARQL filter expression. Let getBoundTMs_m be the function that, given gp, tp and f, returns the set of triples maps of m that are bound to tp in bind_m(gp, f). We denote by transTP_m(tp, f) the translation, under getBoundTMs_m(gp, tp, f), of “tp FILTER f” into an abstract query whose results can be translated into RDF triples matching “tp FILTER f”.
The abstract query generated by function transTP_m consists of operators from the abstract query language and atomic abstract queries. An atomic abstract query is obtained by matching a triple pattern with a triples map and is denoted by {From, Project, Where}, which we describe below. We have seen in the definition of function bind_m that several triples maps may be bound to a single triple pattern tp, each of which may produce a subset of the RDF triples matching tp. In such a case, transTP_m translates tp into a UNION of per-triples-map atomic abstract queries. Additionally, we have seen that an xR2RML triples map may denote a cross-reference by means of a referencing object map, e.g. child triples map TM1 produces the subject and predicate terms while parent triples map TM2 produces object terms. This construct is translated by transTP_m into the INNER JOIN of two atomic abstract queries:
  {From1, Project1, Where1} AS child
  INNER JOIN
  {From2, Project2, Where2} AS parent
  ON child/childRef = parent/parentRef
where childRef and parentRef denote the values of properties rr:child and rr:parent respectively. This is illustrated by the translation of tp3 in Listing 4. Interestingly, we notice that INNER JOINs may be implied by shared SPARQL variables as well as by cross-references denoted in the mappings. Similarly, UNIONs may arise from the SPARQL UNION operator or from the binding of several triples maps to a triple pattern.
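To make the shape of these abstract queries concrete, here is a minimal Python sketch of a possible in-memory representation. The class and function names (`AtomicQuery`, `UnionQuery`, `InnerJoinQuery`, `union_of_bound_triples_maps`, `join_referencing_object_map`) are hypothetical and are not taken from the authors' prototype; the example values are those of the translation of tp3 in Listing 4.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass(frozen=True)
class AtomicQuery:
    """An atomic abstract query {From, Project, Where}."""
    from_part: str
    project: Tuple[str, ...]
    where: Tuple[str, ...]


@dataclass(frozen=True)
class UnionQuery:
    members: Tuple["AbstractQuery", ...]


@dataclass(frozen=True)
class InnerJoinQuery:
    child: "AbstractQuery"
    parent: "AbstractQuery"
    on: str


AbstractQuery = Union[AtomicQuery, UnionQuery, InnerJoinQuery]


def union_of_bound_triples_maps(queries: List[AbstractQuery]) -> AbstractQuery:
    """One atomic query per bound triples map; a UNION only if there are several."""
    if not queries:
        raise ValueError("a triple pattern needs at least one bound triples map")
    return queries[0] if len(queries) == 1 else UnionQuery(tuple(queries))


def join_referencing_object_map(child: AbstractQuery, parent: AbstractQuery,
                                child_ref: str, parent_ref: str) -> InnerJoinQuery:
    """A referencing object map becomes an INNER JOIN on rr:child / rr:parent."""
    return InnerJoinQuery(child, parent, f"child/{child_ref} = parent/{parent_ref}")


child = AtomicQuery(
    "db.people.find({contacts: {$size: {$gte: 1}}})",
    ("$.id AS ?x", "$.contacts.*"),
    ("isNotNull($.id)", "isNotNull($.contacts.*)", "sparqlFilter(?x != ?y)"),
)
parent = AtomicQuery(
    "db.people.find({emails: {$ne: null}})",
    ("$.emails.*", "$.id AS ?y"),
    ("isNotNull($.emails.*)", "isNotNull($.id)", "sparqlFilter(?x != ?y)"),
)
tp3_query = join_referencing_object_map(child, parent, "$.contacts.*", "$.emails.*")
print(tp3_query.on)
```

Immutable (frozen) dataclasses are used here so that later rewriting steps build new query trees instead of mutating shared ones; this is a choice of the sketch, not a requirement of the method.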
We now describe the three components of an
atomic abstract query.
- The From part provides the concrete query that the abstract query relies on. It contains the logical source of a triples map, i.e. the xrr:query property and an optional iterator (property rml:iterator). In our running example, triples map <#Mbox> is bound to tp1, so the From part consists of the query in the logical source of <#Mbox>: db.people.find({'emails':{$ne:null}})
- Project is the set of xR2RML references that must be projected, i.e. returned as part of the query results. An xR2RML reference may be e.g. a column name in a relational database, a JSONPath expression for a MongoDB database or an XPath expression for a native XML database. If an xR2RML reference corresponds to a variable of the triple pattern, it is always projected. In our running example, the subject and object of tp1 are the ?x and ?mbox1 variables. The references in the subject map and object map of triples map <#Mbox> must be projected, hence the Project part for tp1: {$.id AS ?x, $.emails.* AS ?mbox1}. Furthermore, the child and parent joined references of a referencing object map must be projected in order to accommodate databases that do not support joins. In the relational database case, those projected references (columns) are unnecessary since the database can compute the join operation. Conversely, in MongoDB for instance, the join shall be processed by the query processing engine, therefore joined references are necessary.
- Where is a set of conditions about xR2RML references. They are entailed by matching each term of triple pattern tp with its corresponding term map in triples map TM: the subject of tp is matched with TM's subject map, the predicate with TM's predicate map and the object with TM's object map. Additional conditions are entailed from the SPARQL filter f.
In (Michel et al., 2015b) we show that three types of condition may be created:
(i) A SPARQL variable in the triple pattern is turned into a not-null condition on the xR2RML reference corresponding to that variable in the term map, denoted by isNotNull(<xR2RML reference>);
(ii) A constant triple pattern term (IRI or literal) is turned into an equality condition on the xR2RML reference corresponding to that RDF term in the term map, denoted by equals(<xR2RML reference>, value);
(iii) A SPARQL filter condition about a SPARQL variable is turned into a filter condition, denoted by sparqlFilter(<xR2RML reference>, f).
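The construction of the Project and Where parts described above can be sketched as follows. This is a simplified illustration, not the algorithm of (Michel et al., 2015b): the helper names `project_part` and `where_part` are hypothetical, the predicate `ex:mbox` is an invented placeholder, a term matched with a constant-valued term map is assumed to carry no reference (`None`), and the filter condition is rendered as in Listing 4, i.e. without repeating the xR2RML reference.

```python
from typing import List, Optional, Tuple

# Each pair associates a triple pattern term with the xR2RML reference of the
# matching term map, or None when the term map is constant-valued (assumption).
TermAndRef = Tuple[str, Optional[str]]


def project_part(pairs: List[TermAndRef]) -> List[str]:
    """References that correspond to a variable are always projected."""
    return [f"{ref} AS {term}" for term, ref in pairs
            if ref is not None and term.startswith("?")]


def where_part(pairs: List[TermAndRef], sparql_filter: Optional[str] = None) -> List[str]:
    conditions = []
    for term, ref in pairs:
        if ref is None:
            continue  # nothing to test in the database for a constant term map
        if term.startswith("?"):
            conditions.append(f"isNotNull({ref})")           # condition type (i)
        else:
            conditions.append(f'equals({ref}, "{term}")')    # condition type (ii)
    if sparql_filter is not None:
        conditions.append(f"sparqlFilter({sparql_filter})")  # condition type (iii)
    return conditions


# tp1 and tp2 of the running example, both matched with <#Mbox>.
tp1 = [("?x", "$.id"), ("ex:mbox", None), ("?mbox1", "$.emails.*")]
tp2 = [("?y", "$.id"), ("ex:mbox", None), ("john@foo.com", "$.emails.*")]
c1 = 'contains(str(?mbox1), "foo.com")'

print(project_part(tp1), where_part(tp1, c1))
print(project_part(tp2), where_part(tp2))
```

The joined references of a referencing object map, which must also be projected, are left out of this sketch for brevity.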
Running Example. Triple pattern tp2 is matched with <#Mbox>. It has the variable ?y in the subject position and a constant term in the object position. Consequently the Where part for tp2 contains two conditions: isNotNull($.id) and equals($.emails.*, "john@foo.com"). When we put all the pieces together, we can rewrite the SPARQL query Q into the abstract query depicted in Listing 4.

trans_m(tp1 AND tp2 AND tp3, c1 && c2) =
  transTP_m(tp1, c1)
  INNER JOIN transTP_m(tp2, true) ON {}
  INNER JOIN transTP_m(tp3, c2) ON {?x, ?y}
  FILTER (?x != ?y)

transTP_m(tp1, c1) =
  { From: {"db.people.find({emails: {$ne: null}})"},
    Project: {$.id AS ?x, $.emails.* AS ?mbox1},
    Where: {isNotNull($.id), isNotNull($.emails.*),
            sparqlFilter(contains(str(?mbox1), "foo.com"))} }

transTP_m(tp2, true) =
  { From: {"db.people.find({emails: {$ne: null}})"},
    Project: {$.id AS ?y},
    Where: {isNotNull($.id), equals($.emails.*, "john@foo.com")} }

transTP_m(tp3, c2) =
  { From: {"db.people.find({contacts: {$size: {$gte: 1}}})"},
    Project: {$.id AS ?x, $.contacts.*},
    Where: {isNotNull($.id), isNotNull($.contacts.*),
            sparqlFilter(?x != ?y)} } AS child
  INNER JOIN
  { From: {"db.people.find({emails: {$ne: null}})"},
    Project: {$.emails.*, $.id AS ?y},
    Where: {isNotNull($.emails.*), isNotNull($.id),
            sparqlFilter(?x != ?y)} } AS parent
  ON child/$.contacts.* = parent/$.emails.*

Listing 4: Rewriting of SPARQL query Q into an abstract query.
4.5 Abstract Query Optimization
At this point, our method produces abstract queries that are effective, i.e. that preserve the semantics of SPARQL queries. Yet, their structure may show unnecessary complexity, and entail inefficient queries when translated into a target query language. Although query optimizations may be postponed to the final translation step, it is interesting to figure out which ones can be achieved on the abstract representation first, and leave only database-specific optimizations to the latter stage. SPARQL-to-SQL methods proposed various SQL query optimizations (Unbehauen et al., 2013b; Rodríguez-Muro and Rezk, 2015; Elliott et al., 2009), that are often independent of SQL. Below we review some of these techniques referring to the terminology defined in (Unbehauen et al., 2013b). We show that some of them are implemented in our method by construction, and others apply in the context of our abstract query language.
Filter Optimization. In a naive approach, strings
generated by R2RML templates are dealt with using
an SQL comparison of the resulting strings rather than
the database values used in the template. This is no-
tably the case of IRIs that are generally built accord-
ing to a string template. As a consequence, the query
evaluation cannot take advantage of existing indexes
and performs poorly. In our approach, equality conditions apply to xR2RML references rather than to the generated IRIs, hence the Filter Optimization is enforced by construction.
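The idea behind Filter Optimization can be illustrated with a short Python sketch that turns a constant IRI into a condition on the underlying database value by inverting the template, instead of comparing generated strings. The function `invert_template` and the `example.org` template are hypothetical and only serve the illustration; how the authors' implementation performs this step is not described here.

```python
import re
from typing import Dict, Optional


def invert_template(template: str, iri: str) -> Optional[Dict[str, str]]:
    """Return the value taken by each {reference} of `template` in `iri`,
    or None if the IRI cannot have been produced by the template."""
    references = re.findall(r"\{([^}]*)\}", template)
    # Escape the constant parts and capture whatever stands for each reference.
    constant_parts = re.split(r"\{[^}]*\}", template)
    pattern = "(.+?)".join(re.escape(part) for part in constant_parts)
    match = re.fullmatch(pattern, iri)
    if match is None:
        return None
    return dict(zip(references, match.groups()))


template = "http://example.org/member/{$.id}"
values = invert_template(template, "http://example.org/member/105")
# The condition targets the reference itself, so an index on it remains usable.
conditions = [f'equals({ref}, "{val}")' for ref, val in (values or {}).items()]
print(conditions)
```

A template with several adjacent references and no separator between them cannot be inverted unambiguously; the sketch does not attempt to handle that case.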
Filter Pushing. As mentioned earlier, the transla-
tion of a SPARQL filter into an encapsulating SELECT
WHERE clause tends to lower the selectivity of inner
queries, and the query evaluation process may have to
deal with unnecessarily large intermediate results. In
our approach, Filter Pushing is achieved by construction in function trans_m by pushing down SPARQL filters, as much as possible, into the translation of each triple pattern.
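As an illustration of Filter Pushing, the Python sketch below assigns each filter condition to the triple patterns that mention all of its variables. This selection rule is an assumption chosen because it reproduces the arguments c1, true and c2 passed to transTP_m in Listing 4; the exact rule used by trans_m may differ. The predicates `ex:mbox` and `ex:knows` are invented placeholders.

```python
import re
from typing import Dict, List, Set


def variables(expression: str) -> Set[str]:
    """SPARQL variables (?name) occurring in a triple pattern or a condition."""
    return set(re.findall(r"\?\w+", expression))


def push_filters(triple_patterns: Dict[str, str], conditions: List[str]) -> Dict[str, str]:
    pushed = {}
    for name, pattern in triple_patterns.items():
        pattern_vars = variables(pattern)
        # Assumption: a condition is pushed only if the pattern binds all its variables.
        applicable = [c for c in conditions if variables(c) <= pattern_vars]
        pushed[name] = " && ".join(applicable) if applicable else "true"
    return pushed


triple_patterns = {
    "tp1": "?x ex:mbox ?mbox1",
    "tp2": '?y ex:mbox "john@foo.com"',
    "tp3": "?x ex:knows ?y",
}
c1 = 'contains(str(?mbox1), "foo.com")'
c2 = "?x != ?y"
print(push_filters(triple_patterns, [c1, c2]))
```

Note that in Listing 4 the condition ?x != ?y is also kept as an outer FILTER, so pushing a condition down does not by itself remove it from the enclosing query.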
Self-join Elimination. A self-join may occur
when several triples maps share the same logical
source. This can result in several triple patterns be-
ing translated into atomic abstract queries with the
same From part. The Self-Join Elimination consists
in merging the criteria of both atomic queries into a
single equivalent query. In our running example (Listing 4), the atomic query in transTP_m(tp2, true) and the second atomic query in transTP_m(tp3, c2) have the same From part and project the same variable ?y. Using join commutativity, those two queries can be merged into a single one, depicted as the third atomic abstract query in Listing 5.
trans_m(tp1 AND tp2 AND tp3, c1 && c2) =
  { From: {"db.people.find({emails: {$ne: null}})"},
    Project: {$.id AS ?x, $.emails.* AS ?mbox1},
    Where: {isNotNull($.id), isNotNull($.emails.*),
            sparqlFilter(contains(str(?mbox1), "foo.com"))} }
  INNER JOIN
  { From: {"db.people.find({contacts: {$size: {$gte: 1}}})"},
    Project: {$.id AS ?x, $.contacts.*},
    Where: {isNotNull($.id), isNotNull($.contacts.*),
            sparqlFilter(?x != ?y)} } AS child
  ON {?x, ?y}
  INNER JOIN
  { From: {"db.people.find({emails: {$ne: null}})"},
    Project: {$.emails.*, $.id AS ?y},
    Where: {isNotNull($.emails.*), isNotNull($.id),
            equals($.emails.*, "john@foo.com"),
            sparqlFilter(?x != ?y)} } AS parent
  ON child/$.contacts.* = parent/$.emails.* )
  FILTER (?x != ?y)

Listing 5: Optimization of query Q by self-join elimination.
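The merge performed by the Self-Join Elimination can be sketched in Python as follows, using plain dictionaries for atomic abstract queries. The function `merge_self_join` is hypothetical, and the applicability test (same From part and at least one common "reference AS ?variable" projection) is a simplification of the condition stated in the text.

```python
from typing import Dict, List, Optional

AtomicQuery = Dict[str, object]


def merge_unique(first: List[str], second: List[str]) -> List[str]:
    """Concatenate two lists, keeping the first occurrence of each item."""
    merged = list(first)
    for item in second:
        if item not in merged:
            merged.append(item)
    return merged


def merge_self_join(q1: AtomicQuery, q2: AtomicQuery) -> Optional[AtomicQuery]:
    shared = [p for p in q1["project"] if " AS ?" in p and p in q2["project"]]
    if q1["from"] != q2["from"] or not shared:
        return None  # not a self-join on a shared variable: leave the queries as they are
    return {
        "from": q1["from"],
        "project": merge_unique(q1["project"], q2["project"]),
        "where": merge_unique(q1["where"], q2["where"]),
    }


# transTP_m(tp2, true) and the parent query of transTP_m(tp3, c2) from Listing 4.
tp2_query = {
    "from": "db.people.find({emails: {$ne: null}})",
    "project": ["$.id AS ?y"],
    "where": ["isNotNull($.id)", 'equals($.emails.*, "john@foo.com")'],
}
tp3_parent = {
    "from": "db.people.find({emails: {$ne: null}})",
    "project": ["$.emails.*", "$.id AS ?y"],
    "where": ["isNotNull($.emails.*)", "isNotNull($.id)", "sparqlFilter(?x != ?y)"],
}
merged = merge_self_join(tp3_parent, tp2_query)
print(merged)
```

The merged query carries the same projections and the same set of conditions as the third atomic abstract query of Listing 5, although the conditions may appear in a different order.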
Optional-self-join Elimination. The self-join issue can equally occur in the case of an OPTIONAL triple pattern that is translated into a LEFT OUTER JOIN. Similarly to the Self-Join Elimination, we can merge abstract atomic queries, with the difference that null values must be allowed for terms that only appear in the right operand of the left join. As a result, isNotNull conditions of the right operand are removed, and equals conditions of the form equals(expr, value) are replaced with a new type of condition including an isNull condition and an OR operator:
isNull(expr) OR equals(expr, value).
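The rewriting of the right operand's conditions can be sketched as below. The tuple encoding of conditions and the function name `relax_for_left_join` are assumptions of this illustration, and the input is assumed to contain only conditions on terms that appear solely in the right operand of the left join.

```python
from typing import List, Tuple

# A condition is encoded as (kind, argument, ...), e.g. ("equals", "$.emails.*", "john@foo.com").
Condition = Tuple[str, ...]


def relax_for_left_join(right_conditions: List[Condition]) -> List[str]:
    """Allow null values for terms that only appear in the right operand."""
    relaxed = []
    for condition in right_conditions:
        kind = condition[0]
        if kind == "isNotNull":
            continue  # removed: the optional part may be absent
        if kind == "equals":
            _, expr, value = condition
            relaxed.append(f'isNull({expr}) OR equals({expr}, "{value}")')
        else:
            relaxed.append(f"{kind}({', '.join(condition[1:])})")
    return relaxed


right_operand = [("isNotNull", "$.emails.*"),
                 ("equals", "$.emails.*", "john@foo.com")]
print(relax_for_left_join(right_operand))
```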
Self-union Elimination. A UNION operator can be created either due to the SPARQL UNION operator or during the translation of a triple pattern to which several triples maps are bound (in function transTP_m). Similarly to the Self-Join Elimination, a union of several atomic abstract queries sharing the same logical source can be merged into a single query.
Projection Pushing. In future work, we intend to study the relevance and applicability of Projection Pushing (Elliott et al., 2009), which helps to deal efficiently with queries on distinct values of variables bound with constant term maps, such as:
SELECT DISTINCT ?p WHERE {?s ?p ?o}.
5 APPLICATION TO MongoDB
MongoDB is a NoSQL database that stores data as
JSON documents (more precisely Binary-JSON). Its
JavaScript interface defines a declarative query lan-
guage exemplified in the logical sources of Listing 2.
In recent years, MongoDB has become a leader in the
NoSQL market, making it an interesting candidate for
RDF-based data integration systems, and a potential
contributor to the Web of Data.
In our running example, we have shown how to
translate a SPARQL query into an abstract query, un-
der xR2RML mappings of arbitrary MongoDB doc-
uments to RDF. Unlike SQL or XQuery whose ex-
pressiveness is similar to that of SPARQL, the expres-
siveness of the MongoDB query language is far more
limited: joins are not supported and filters are sup-
ported with strong restrictions (e.g. no comparison
between fields of a document). Consequently, in the
abstract query of Listing 5, the INNER JOIN and FIL-
TER operators on the one hand, and the sparqlFilter
conditions on the other hand, cannot be translated into
equivalent MongoDB queries. Hence, they shall be
computed by the query processing engine. In (Michel
et al., 2015b), we show that it is possible to rewrite an
atomic abstract query with isNotNull and equals con-
ditions into a union of MongoDB queries that shall
retrieve at least all matching documents.
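Since joins and the ?x != ?y filter cannot be delegated to MongoDB, the query processing engine has to evaluate them on retrieved documents. The Python sketch below shows one way this could be done with an in-memory hash join. It is not the authors' engine: the function names and the sample documents are invented, and documents are represented as plain dictionaries rather than fetched through a MongoDB driver.

```python
from typing import Dict, List, Tuple

Document = Dict[str, object]


def field_values(document: Document, field: str) -> List[object]:
    """Values of `field`, with array elements unnested as in the JSONPath $.field.*"""
    value = document.get(field)
    if value is None:
        return []
    return list(value) if isinstance(value, list) else [value]


def inner_join(children: List[Document], parents: List[Document],
               child_field: str, parent_field: str) -> List[Tuple[Document, Document]]:
    # Index the parent documents by joined value, then probe with the children.
    index: Dict[object, List[Document]] = {}
    for parent in parents:
        for value in field_values(parent, parent_field):
            index.setdefault(value, []).append(parent)
    return [(child, parent)
            for child in children
            for value in field_values(child, child_field)
            for parent in index.get(value, [])]


# Invented sample documents standing for the results of the two From queries.
people = [
    {"id": 1, "emails": ["john@foo.com"], "contacts": ["mary@foo.com"]},
    {"id": 2, "emails": ["mary@foo.com"], "contacts": ["john@foo.com", "mary@foo.com"]},
]
joined = inner_join(people, people, "contacts", "emails")
# FILTER (?x != ?y) is applied by the engine after the join.
pairs = [(child["id"], parent["id"]) for child, parent in joined
         if child["id"] != parent["id"]]
print(pairs)
```

Building the index on one side keeps the join linear in the number of joined values, which matters when intermediate results are large.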
Implementation. To validate our method, we are developing an open source prototype implementation available on GitHub². The prototype currently implements the rewriting of an atomic abstract query into concrete MongoDB queries. The translation of a SPARQL query into an abstract query and its evaluation by the query processing engine are ongoing development work at the time of writing.
2
https://github.com/frmichel/morph-xr2rml/tree/query re
write
6 CONCLUSION AND PERSPECTIVES
The method proposed in this paper aims at fostering
the development of SPARQL interfaces to heteroge-
neous databases, as we believe this is a key to the ad-
vent of the Web of Data.
Leveraging R2RML-based SPARQL-to-SQL
works, we have defined a method to translate a
SPARQL query into a pivot abstract query, utilizing
xR2RML to describe the mapping of a target database
to RDF. The method determines a reduced set of
mappings matching each SPARQL triple pattern,
and takes into account join constraints and SPARQL
filters. Lastly, several query optimizations are ap-
plied to the abstract query in order to facilitate the
subsequent translation into the target query language.
At this stage, our method has some limitations
that we may address in the future. Firstly, SPARQL
named graphs and solution modifiers (DISTINCT,
OFFSET, LIMIT, ORDER BY, HAVING) are not con-
sidered. Besides, like in most SPARQL rewriting ap-
proaches so far, SPARQL 1.1 is not fully supported, in
particular with respect to property paths and negation
operators (NOT EXISTS, MINUS). In the translation
of a triple pattern, the management of SPARQL fil-
ters is postponed to the translation into a target query
using the sparqlFilter condition. Yet, further works
shall try to raise filters at the abstract query level. For
instance SPARQL operators BIND or VALUES may be
turned into equivalent equals conditions, and BOUND
into isNotNull conditions. Secondly, thanks to the ex-
ample of MongoDB, we have shown that bridging the
gap between the expressiveness of SPARQL and that
of the target query language may entail the genera-
tion of multiple independent target database queries,
delegating several steps to the query processing engine, e.g. joins or complex filtering. Therefore, some classical query optimization questions shall arise during the development of the query processing engine, such as determining the most efficient order in which to compute INNER JOINs of intermediate queries. In this regard, the query processing engine may need to embed query plan optimization logic such as the bind join (Haas et al., 1997), to inject intermediary results into a subsequent query, and join re-ordering based on the number of results that queries shall retrieve, very similarly to the methods applied in distributed SPARQL query engines (Schwarte et al., 2011; Görlitz and Staab, 2011).
We are currently developing a prototype implementation of our method. In the short-term we intend to run performance evaluations. Beyond this, we envisage two real-life use cases. Firstly, in the context of the Zoomathia research project³, a taxonomic reference designed to support studies in Conservation Biology was translated into a SKOS⁴ thesaurus (Callou et al., 2015). The taxonomic reference is stored in a MongoDB database, and the RDF graph is materialized at once. Our perspective is to provide a dynamic access to the SKOS thesaurus using our SPARQL-to-MongoDB prototype. Secondly, we are having discussions with researchers who intend to explore the added value of Semantic Web technologies to support ecology and agronomic studies. They maintain a large MongoDB database of phenotype information about thousands of plants, that they wish to access using SPARQL. This context would be a significant and realistic use case of our method.
REFERENCES
Bikakis, N., Tsinaraki, C., Stavrakantonakis, I., Gi-
oldasis, N., and Christodoulakis, S. (2015). The
SPARQL2XQuery interoperability framework. World
Wide Web, 18(2):403–490.
Bischof, S., Decker, S., Krennwallner, T., Lopes, N., and
Polleres, A. (2012). Mapping between RDF and XML
with XSPARQL. J. Data Semantics, 1(3):147–185.
Bizer, C. and Cyganiak, R. (2006). D2R server - Publishing
Relational Databases on the Semantic Web. In ISWC.
Callou, C., Michel, F., Faron-Zucker, C., Martin, C., and
Montagnat, J. (2015). Towards a Shared Reference
Thesaurus for Studies on History of Zoology, Ar-
chaeozoology and Conservation Biology. In SW For
Scientific Heritage, ESWC.
Chebotko, A., Lu, S., and Fotouhi, F. (2009). Seman-
tics preserving SPARQL-to-SQL translation. Data &
Knowledge Engineering, 68(10):973–1000.
Das, S., Sundara, S., and Cyganiak, R. (2012). R2RML:
RDB to RDF mapping language.
Dimou, A., Vander Sande, M., Colpaert, P., Verborgh, R.,
Mannens, E., and Van de Walle, R. (2014). RML:
A generic language for integrated RDF mappings of
heterogeneous data. In LDOW.
Elliott, B., Cheng, E., Thomas-Ogbuji, C., and Ozsoyoglu,
Z. M. (2009). A complete translation from SPARQL
into efficient SQL. In IDEAS’09, pages 31–42. ACM.
Görlitz, O. and Staab, S. (2011). SPLENDID: SPARQL
Endpoint Federation Exploiting VOID Descriptions.
In Intl. Ws. COLD.
Haas, L., Kossmann, D., Wimmers, E., and Yang, J. (1997).
Optimizing Queries across Diverse Data Sources. In
VLDB, pages 276–285.
Michel, F., Djimenou, L., Faron-Zucker, C., and Montag-
nat, J. (2015a). Translation of Relational and Non-
Relational Databases into RDF with xR2RML. In We-
bIST, pages 443–454.
³ http://www.cepam.cnrs.fr/zoomathia
⁴ http://www.w3.org/2009/08/skos-reference/skos.html
Michel, F., Faron-Zucker, C., and Montagnat, J.
(2015b). Mapping-based SPARQL access to a
MongoDB database. Technical report, CNRS.
https://hal.archives-ouvertes.fr/hal-01245883v4.
Pérez, J., Arenas, M., and Gutierrez, C. (2009). Semantics
and complexity of SPARQL. ACM Transactions on
Database Systems, 34(3):1–45.
Priyatna, F., Corcho, O., and Sequeda, J. (2014). Formali-
sation and experiences of R2RML-based SPARQL to
SQL query translation using Morph. In WWW.
Rodríguez-Muro, M. and Rezk, M. (2015). Efficient
SPARQL-to-SQL with R2RML mappings. J. Web Semantics,
33:141–169.
Schwarte, A., Haase, P., Hose, K., Schenkel, R., and
Schmidt, M. (2011). Fedx: Optimization techniques
for federated query processing on Linked Data. In
ISWC, pages 601–616. Springer.
Sequeda, J. F. and Miranker, D. P. (2013). Ultrawrap:
SPARQL execution on relational data. J. Web Seman-
tics, 22:19–39.
Unbehauen, J., Stadler, C., and Auer, S. (2013a). Accessing
relational data on the web with sparqlmap. In Seman-
tic Technology, pages 65–80. Springer.
Unbehauen, J., Stadler, C., and Auer, S. (2013b). Optimiz-
ing SPARQL-to-SQL Rewriting. In Proceedings of
IIWAS ’13, page 324. ACM.
WEBIST 2016 - 12th International Conference on Web Information Systems and Technologies