
metrics changes depending on the location service
and on the application requirements, but it may
include hops, bandwidth, stability of the nodes,
processing power, etc.
The location service creates an overlay network
on top of the compute nodes, which supports the
application server lookup operation. The users'
preferred applications evolve over time. For instance, a
local news service may become a top news
application due to a notable event (e.g. a tied local
election or an accident), or an e-commerce site
may jump to the top due to aggressive marketing. A
large jump in the preference order may produce a
large increase in the number of clients (a factor of
n2/n1 if a Zipf distribution (Adamic, 2002) is followed). This
will necessarily lead to an increase in the number
of servers needed to cope with the demand (e.g. Content
Delivery Network applications (Vakali, 2003)
distribute replicas of pages to handle load peaks). A
generic algorithm to control replica deployment
was proposed in (Bernardo, 1998). The location
service must be able to handle this peak of updates
and, in parallel, the concurrent peak of lookups.
Centralized approaches, based on a home location
server, may fail under a peak of millions of requests.
Caching solutions may also fail, because they may
conceal the appearance of new application server
replicas.
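To make the Zipf argument concrete, the following is a minimal sketch; the ranks and the exponent alpha are hypothetical values, and popularity is assumed to be proportional to 1/rank^alpha, in line with (Adamic, 2002):

def zipf_client_ratio(old_rank, new_rank, alpha=1.0):
    # Multiplicative change in expected clients when an application moves
    # from old_rank to new_rank, assuming a Zipf popularity distribution.
    return (old_rank / new_rank) ** alpha

# Example with hypothetical ranks: jumping from rank 1000 to rank 10
# (alpha = 1) multiplies the expected client population by 100.
print(zipf_client_ratio(1000, 10))   # 100.0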
The envisioned location service provides two
operations: lookup(id, range) and update(id,
serv_reference, range). Each application server
registers its reference on the location service,
associated with a unique application identifier (id),
for a certain range. Clients search for one or more
replicas of an application within a given range of the network.
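A minimal sketch of this interface is given below, assuming a single in-memory table and a numeric range (e.g. in hops); the real service distributes this state over the overlay network, and the range matching here is only a placeholder:

from collections import defaultdict

class LocationService:
    def __init__(self):
        # id -> list of (server_reference, registration_range) pairs
        self._table = defaultdict(list)

    def update(self, app_id, serv_reference, reg_range):
        # An application server registers its reference for app_id within reg_range.
        self._table[app_id].append((serv_reference, reg_range))

    def lookup(self, app_id, search_range):
        # Return registered replicas of app_id; the numeric comparison is a naive
        # placeholder for real range matching, which depends on the overlay topology.
        return [ref for ref, r in self._table[app_id] if r <= search_range]

# Usage sketch with hypothetical identifiers:
ls = LocationService()
ls.update("app-42", "server-A", 2)
ls.update("app-42", "server-B", 5)
print(ls.lookup("app-42", 3))   # ['server-A']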
3 LOCATION-LIKE SERVICES
Several existing services support the functionalities
required by the location service. They differ in how
lookups are performed: they either use flooding
(broadcast when available) or guided search.
Flooding approaches are common in micro-
location services (e.g. Jini (Gupta, 2002)), in
unstructured peer-to-peer (P2P) networks (e.g.
Gnutella), and in routing algorithms for Ad Hoc
networks (e.g. AODV (Perkins, 2003)). Updates are
made on a local node, resulting in a random
distribution of information. A flooding approach
requires (almost) no setup and adapts
particularly well to unstable networks, unstable data
and unstable nodes. However, it has high search
costs and does not scale as the number of clients
and the lookup range increase
(Schollmeier, 2002). Therefore, it is not suited to
providing a global view of a system. Strategies for
reducing the lookup costs include (Chawathe, 2003):
the creation of supernodes; the replication of
information on neighbor nodes; the use of selective
flooding to reduce the number of messages; and the
control of the message flow. Supernodes create
centralization points in a distributed network, which
interconnect lower-power and more unstable nodes.
They define a backbone that carries most of the
flooded messages. As a result, a small-world effect is
created that reduces the range needed to run lookups.
However, supernodes also create concentration
points, which can become a bottleneck in the system
through link and server saturation, or through the
increased message delay that results from flow control.
Replication of id information distributes the load
across several nodes. When replication is done at
supernodes (e.g. a Clip2 Reflector replicates
information for all its subordinate nodes), it restricts
flooding to a second hierarchical layer (connecting
supernodes) with a slight increase in update costs (see Table 1).
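The cost argument against plain flooding can be sketched as follows; the dictionaries describing the overlay (neighbors) and the locally registered ids (local_ids) are illustrative assumptions, not data structures of any of the cited systems:

def flood_lookup(neighbors, local_ids, start, app_id, lookup_range):
    # neighbors: node -> list of connected nodes; local_ids: node -> set of ids
    # registered locally. The query is forwarded at most lookup_range hops.
    visited = {start}
    frontier = [start]
    hits = [start] if app_id in local_ids[start] else []
    messages = 0
    for _ in range(lookup_range):           # expand the search one hop per round
        next_frontier = []
        for node in frontier:
            for peer in neighbors[node]:
                messages += 1               # every forwarded query costs one message
                if peer not in visited:
                    visited.add(peer)
                    next_frontier.append(peer)
                    if app_id in local_ids[peer]:
                        hits.append(peer)
        frontier = next_frontier
    return hits, messages

The message count grows with the node degree and the lookup range, which is the scaling limitation noted above; supernodes and selective flooding mitigate it by shrinking the set of nodes that actually forward each query.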
On the other hand, guided search approaches
create an id table. Updates and lookups for an id are
made on the nodes dedicated to that id, selected using
the operation route(id). The table can be kept on a
centralized node, or partitioned and distributed over
several nodes. Centralized approaches (e.g. Napster)
simplify routing but introduce a single point of failure
that can slow down the entire system. The performance
of distributed approaches depends on the structure of
the id and on the geometry of the overlay network
defined by the nodes (Gummadi, 2003). Distributed
approaches include the vast majority of naming and
routing services, as well as structured P2P systems.
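The route(id) idea can be sketched with hash-based partitioning of the id table over a fixed node set; both the hashing scheme and the static membership are illustrative assumptions, not the routing of any particular system discussed here:

import hashlib

class GuidedOverlay:
    def __init__(self, nodes):
        self.nodes = sorted(nodes)                  # static node set (assumption)
        self.tables = {n: {} for n in self.nodes}   # each node holds one partition of the id table

    def route(self, app_id):
        # Deterministically select the node dedicated to app_id.
        digest = hashlib.sha1(app_id.encode()).hexdigest()
        return self.nodes[int(digest, 16) % len(self.nodes)]

    def update(self, app_id, serv_reference):
        self.tables[self.route(app_id)].setdefault(app_id, []).append(serv_reference)

    def lookup(self, app_id):
        return self.tables[self.route(app_id)].get(app_id, [])

Each update and lookup touches a single dedicated node instead of flooding the overlay, at the cost of maintaining the routing structure when nodes join or leave.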
DNS is a good example of the first group. DNS
relies on a hierarchical structure of nodes matched
with the identifier hierarchy. This approach
simplifies routing because the name completely
defines the resolution path: if h is the maximum
hierarchical level, a resolution may need to ascend
towards the root and descend again, so the path has a
maximum length of 2h-1. However, it retains most of
the limitations of centralized approaches, benefiting
only from the fragmentation of information over
several nodes. DNS improves its scalability through
extensive use of caching and node replication. Caching
reduces the amount of information exchanged among
peers, but it prevents the use of DNS for mobile or
on-off entities; this was not a requirement at the time,
because IP addresses did not change frequently. The
inflexibility of DNS routing (a single path towards
the node holding the required id) limits the benefits of
node replication. The localization of id resolution
(the selection of the nearest replica) is only
supported by DNS extensions (e.g. Internet2
Distributed Storage Infrastructure (Beck, 1998)).
Structured P2P systems are based on distributed hash
tables (DHT). Location servers (nodes) and