3.2 Network Structure
In our system model, we assume the cache nodes to be
connected via a communication protocol such as the
Internet Protocol (IP), i.e., nodes can send or receive
messages using distinct network addresses. On top
of this communication structure, we assume a large
number of nodes that can unexpectedly join or leave
our network of cache nodes. Each node keeps the network
addresses of only a comparatively small subset of nodes,
which limits the impact of unexpected node joins or
departures to local parts of the cache network. When
two nodes mutually know their network addresses, we
say they are connected via a link. As in common
peer-to-peer models such as (Stoica et al., 2001),
every node in the cache network can be reached via
multi-hop routing over the links. We continue with
a detailed description of distributed request process-
ing, describe how to maintain the cache network and
finally propose a mechanism to reorganize the net-
work structure during load-balancing.
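The link definition above can be illustrated with a minimal sketch. The names CacheNode, is_link and reachable are hypothetical and not part of the described system; the sketch only assumes that each node stores a set of known addresses and that a link exists when two nodes know each other mutually.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class CacheNode:
    # Hypothetical node record: its own address plus the small
    # subset of addresses it keeps.
    address: str
    known: set = field(default_factory=set)


def is_link(a, b):
    # A link exists only if both nodes know each other's address.
    return (a.address != b.address
            and b.address in a.known
            and a.address in b.known)


def reachable(nodes, start, target):
    # Breadth-first search over links: True if target can be
    # reached from start via multi-hop routing.
    seen = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        if current == target:
            return True
        for addr in nodes[current].known:
            if (addr in nodes and addr not in seen
                    and is_link(nodes[current], nodes[addr])):
                seen.add(addr)
                queue.append(addr)
    return False
```

In this sketch a one-way entry (one node knows another, but not vice versa) does not count as a link, which follows the mutual-knowledge definition given in the text.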
Distributed Query Processing. To enable efficient
processing of range queries, the geographic relation
of the data being cached by the nodes has to be re-
flected in the link structure of the cache network. A
Delaunay triangulation of the cache foci constitutes
such a link structure which preserves the topological
relationship. Figure 4 depicts a Delaunay triangula-
tion where triangle vertices (black dots) represent the
cache foci and the triangle edges (bold lines) represent
links between the corresponding nodes. The Delaunay
property is met when no vertex lies inside the
circumcircle of any triangle (a formal definition can be
found in, e.g., (de Berg et al., 2000)).
In a Delaunay-based link topology, it is possible to
apply greedy routing, i.e., a node forwards a given
query to the neighbor that has the closest cache focus
to the requested query region until no closer neighbor
is found. The node at which greedy routing termi-
nates processes the query, as it has the closest cache
focus and thus most likely keeps the requested data in
its cache. In case of cache misses, the node fetches
missing data from the data back-end. On an arbitrary
graph, greedy routing does not always find the global
optimum; it may terminate prematurely at a local optimum,
i.e., at a node whose cache focus is not the closest to
the query region. However, it is proven in (Bose et al.,
1999) that greedy routing always succeeds in finding the
closest node in Delaunay-based link topologies.
For performance reasons, we do not force the
whole link topology to conform to the Delaunay
property at all times, but occasionally allow minor
deviations from that property in certain parts of the
link structure. Consequently, the accuracy of greedy
routing will degrade, as the global optimum cannot
always be found in an imperfect Delaunay triangulation.
Nevertheless, this does not affect the correctness of
the query results, as every node is able to request
missing data from the data back-end. Yet, inaccurate
routing may decrease the cache hit-rate, as a misrouted
query will be processed by a node whose cache focus is
not closest to the query region and which thus may not
have cached as much of the requested data as the optimal
node. In our system, the movement of cache foci may
violate the Delaunay property in certain regions and
thus decrease the cache hit-rate. Therefore, it is
necessary to reorganize the link structure once the
routing accuracy has degraded too far.
Figure 4: Move a node’s cache focus position. (a) Greedy routing. (b) Delaunay test. (c) Link reorganization.
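One way to detect where moved cache foci violate the Delaunay property is the standard circumcircle (in-circle) predicate. The sketch below is an assumption-laden illustration, not the test used by the system: the names in_circumcircle and violates_delaunay are hypothetical, and the check is done centrally over a list of triangles, whereas a distributed node would only see its local neighborhood.

```python
def orientation(a, b, c):
    # Twice the signed area of triangle (a, b, c);
    # positive if the vertices are in counterclockwise order.
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def in_circumcircle(a, b, c, p):
    # Standard in-circle determinant; True if p lies strictly
    # inside the circumcircle of triangle (a, b, c).
    adx, ady = a[0] - p[0], a[1] - p[1]
    bdx, bdy = b[0] - p[0], b[1] - p[1]
    cdx, cdy = c[0] - p[0], c[1] - p[1]
    det = ((adx * adx + ady * ady) * (bdx * cdy - cdx * bdy)
           - (bdx * bdx + bdy * bdy) * (adx * cdy - cdx * ady)
           + (cdx * cdx + cdy * cdy) * (adx * bdy - bdx * ady))
    # The sign of det depends on the vertex order, so normalize it.
    return det * orientation(a, b, c) > 0


def violates_delaunay(foci, triangles):
    # foci: node id -> (x, y); triangles: list of 3-tuples of node ids.
    for tri in triangles:
        a, b, c = (foci[n] for n in tri)
        for node, p in foci.items():
            if node not in tri and in_circumcircle(a, b, c, p):
                return True
    return False
```

For example, with foci a=(0, 0), b=(4, 0), c=(2, 3), d=(6, 3) and triangles (a, b, c) and (b, d, c), the property holds; moving the focus of d to (3, 2) places it inside the circumcircle of (a, b, c), so the check reports a violation.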
Network Maintenance. Particularly for application
fields that can cope with partial routing inaccuracy, a
set of protocols has been devised that establishes and
maintains a Delaunay-based link topology in a distributed
setting under node churn, i.e., while nodes join or leave
unexpectedly (Lee et al., 2008). Moreover, it has been
shown that the topology converges to an accurate Delaunay
topology once the churn has stopped. This property is
extremely useful for our purposes, as it ensures that the
system’s efficiency returns to normal after adapting to
node churn or repositioned cache foci. Thus, this
maintenance protocol can be used for node joins and
departures. However, to provide the flexibility needed
for our multi-level load-balancing, we extend this
protocol in the following section.
Network Reorganisation. The maintenance protocol
outlined in the previous section can be extended by a
new primitive MOVE which moves a node’s cache fo-
cus to a certain position and updates the link structure
if required. Figure 4(a) depicts an exemplary network
before a node has moved. Suppose that the cache fo-
cus of node a ought to be moved into the center of
triangle (c, d, e). With greedy routing (visualized as
red arrows) we are able to find the closest node to
DATA 2013 - 2nd International Conference on Data Management Technologies and Applications