using the target data. Roth (2011) proposes a
completeness-driven approach to query planning. It
forwards queries by considering peers and mappings
that promise large result sets, as well as mappings with
low information loss. Histograms are used to estimate
the potential data contribution of mappings.
In terms of query semantics preservation,
Kantere et al. (2009) present GrouPeer, an adaptive,
automated approach to clustering peers based on
their common interests. This work allows peers to
individually decide whether to answer the
successively rewritten query or to automatically
rewrite its original version. Delveroudis
et al. (2009) discuss the semantic loss that occurs
during query reformulation. They propose an
algorithm that estimates the semantic loss of
rewritten queries from the syntactic differences
between the original query and its
reformulations.
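As a rough illustration of estimating loss from syntactic differences, the sketch below computes a Jaccard distance over query terms. This is a stand-in measure chosen for illustration only, not the algorithm of the cited work; the function name `syntactic_loss` and the whitespace tokenization are assumptions.

```python
def syntactic_loss(original: str, reformulated: str) -> float:
    """Jaccard distance between the term sets of two queries.

    Returns 0.0 when both queries use the same terms and 1.0 when
    they share none. Hypothetical helper, for illustration only.
    """
    orig_terms = set(original.lower().split())
    new_terms = set(reformulated.lower().split())
    union = orig_terms | new_terms
    if not union:
        # Two empty queries differ in nothing.
        return 0.0
    return 1.0 - len(orig_terms & new_terms) / len(union)
```

A purely syntactic measure like this treats synonyms as losses, which is one reason to bring in an ontology-based reference instead.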
Unlike these related works, our
approach enhances the query routing process by
preserving the semantics of the original
query as closely as possible. To this end, we use a
semantic reference to avoid semantic loss during
reformulation and a set of metrics to assess the
semantic value of a query. This value is used to
avoid forwarding queries with high semantic loss.
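A minimal sketch of how such a value could gate forwarding is given below. The score used here (the fraction of the original query's concepts still present after reformulation), the function names, and the 0.5 threshold are all assumptions made for illustration; they are not the metric definitions of this paper.

```python
from typing import Set


def preserved_fraction(original: Set[str], reformulated: Set[str]) -> float:
    """Share of the original query's concepts kept by the reformulation.

    Hypothetical stand-in for a semantic value in [0, 1].
    """
    if not original:
        # Nothing to lose when the original query has no concepts.
        return 1.0
    return len(original & reformulated) / len(original)


def should_forward(original: Set[str], reformulated: Set[str],
                   threshold: float = 0.5) -> bool:
    """Forward the reformulated query only if enough semantics survive."""
    return preserved_fraction(original, reformulated) >= threshold
```

For example, a query over the concepts `Student` and `Course` that is reformulated to `Student` alone scores 0.5 and is forwarded under the default threshold, but dropped if the threshold is raised to 0.8.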
7 CONCLUSION AND FUTURE WORK
In this work, we addressed the problem of preserving
the original query semantics in query routing
processes. We argue that the queries reformulated
along the set of peers should be analyzed against
the semantics of the original query. Evaluating query
semantics may contribute not only to reducing
query routing time, but also to narrowing
the search space to peers that can
actually contribute relevant answers.
To this end, we have proposed a semantic
reference, which is built from a domain
ontology (available as background knowledge)
according to a given submitted query. We use the
semantic reference to avoid semantic loss at query
reformulation time. Furthermore, we have specified
three metrics for the query routing process,
namely query preservation, query enrichment and
query semantic value. These metrics were used
in our experiments, which showed that we
can minimize semantic loss and avoid
forwarding queries with high semantic loss. As a
result, our approach helps to produce query results
that better meet the users’ needs.
As future work, we are integrating our approach
with an information quality management service. The
idea is to combine the three defined metrics with other
ones in order to select the most relevant neighbour
peers to which queries are routed.
REFERENCES
Arenas, M., Perez, J., Reutter, J. L., Riveros, C., 2010.
Foundations of schema mapping management. In
Proc. of PODS, Indianapolis, USA, pages 227–238.
Baader, F., Calvanese, D., McGuinness, D., Nardi, D.,
Patel-Schneider, P. (Eds.), 2003. The Description
Logic Handbook: Theory, Implementation and
Applications. Cambridge University Press.
Campos, L. M., Fernández-Luna, J. M., Huete, J. F.,
Vicente-López, E., 2013. XML search personalization
strategies using query expansion, reranking and a
search engine modification. In Proc. of 28th Annual
ACM Symposium on Applied Computing - SAC '13,
Coimbra, Portugal, pages 872-877.
Carpineto, C., Romano, G., 2012. A Survey of Automatic
Query Expansion in Information Retrieval. In ACM
Computing Surveys, v.44, n.1, pages 1-50.
Delveroudis, Y., Lekeas, P. V., 2007. Managing Semantic
Loss during Query Reformulation in PDMS. In SWOD
IEEE, pages 51-53.
Delveroudis, Y., Lekeas, P. V., Souliou, D., 2009. On
Estimating Semantic Loss in Peer Data Management
Systems. In Proc. of First International Conference on
Advances in P2P Systems, Sliema, Malta, pages 51-53.
Euzenat, J., Shvaiko, P., 2007. Ontology Matching.
Springer-Verlag.
Han, J. W., Kamber, M., 2006. Data Mining: Concepts and
Techniques, 2nd edition. Elsevier Inc.
Kantere, V., Tsoumakos, D., Sellis, T., Roussopoulos, N.,
2009. GrouPeer: Dynamic Clustering of P2P
Databases. In Information Systems Journal, v. 34, n. 1,
pages 62–86.
Pires, C. E., 2009. Ontology-based Clustering in a Peer
Data Management System. Ph.D. thesis, CIn/UFPE,
Recife, Brazil.
Pires, C. E., Souza, D., Pachêco, T., Salgado, A. C., 2009.
A Semantic-based Ontology Matching Process for
PDMS. In Proc. of 2nd International Conference on
Data Management in Grid and P2P Systems
(Globe’09), Linz, Austria, pages 124-135.
Rijsbergen, C. J., 1979. Information Retrieval, 2nd Ed.
Stoneham, MA: Butterworths.
Roth, A., 2011. Efficient Query Answering in Peer Data
Management Systems. Ph.D. thesis, Humboldt-Universität
zu Berlin, Germany.
Roth, A., Skritek, S., 2013. Peer Data Management. In
Data Exchange, Information, and Streams, v. 5, pages