b) Reducing the physical RTT: Since caching reduces
the number of data objects that have to be transferred
across the network, more backbone bandwidth is
automatically freed for public sharing. The resulting
lower chance of network congestion reduces the T
value above as a fringe benefit.
Our literature search indicates that researchers
are continually trying to find effective caching
techniques to achieve the following:
a) Yielding a high hit ratio
without deleterious effects:
The solution computation must be optimal in a real-
time sense. That is, this solution should be ready
before the problem has disappeared. Otherwise, the
solution would correct a spurious problem and
inadvertently lead to undesirable side effects (i.e.
deleterious effects) that cause system instability or
failure. The survey in (Wang 1999) presented the
various ways (deployed and experimental) whereby
the hit ratio can be enhanced. But the most researched topic
is replacement algorithms, which decide which
objects in the cache should be evicted first to make
room for the newcomers.
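To make the role of a replacement algorithm concrete, here is a minimal LRU cache sketch (illustrative code, not taken from any of the surveyed papers):

```python
from collections import OrderedDict

class LRUCache:
    """Minimal LRU replacement sketch: evict the least recently used object
    when the cache is full, to make room for the newcomer."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.store = OrderedDict()  # key -> object, oldest entry first

    def get(self, key):
        if key not in self.store:
            return None             # cache miss
        self.store.move_to_end(key) # mark as most recently used
        return self.store[key]

    def put(self, key, obj):
        if key in self.store:
            self.store.move_to_end(key)
        self.store[key] = obj
        if len(self.store) > self.capacity:
            self.store.popitem(last=False)  # evict least recently used
```

The other algorithms named below differ mainly in the eviction criterion (frequency, object size, retrieval cost, and so on) rather than in this overall structure.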
b) Maintaining the hit ratio
on the fly: This school of thought
is called dynamic cache size tuning (Wu 2004), and
the only published model, which works with a
variable cache size, is the MACSC (Model for
Adaptive Cache Size Control) (Wong 2003). In
contrast, other extant dynamic caching techniques
work with a static/fixed cache size; they may yield a
high hit ratio but do not maintain it.
2.1 Replacement Algorithm
Some researchers compared the hit ratios by
different replacement algorithms that work with a
static cache size by trace-driven simulations (Cao
1997, Wasf 1996, Asawf 1995); LRU (least
recently used), LFU (least frequently used), Size
(Williams 1996), LRU-Threshold, Log(Size)+LRU,
Hyper-G, Pitkow/Recker, Lowest Latency First,
Hybrid (Wooster 1997), and LRV (lowest relative
value). The simulations showed that the following
five consistently produce the highest hit ratios: a)
LRU (least recently used), b) Size, c) Hybrid, d)
LRV (lowest relative value) and e) GreedyDual-
Size (or GD-Size). The GD-Size yielded the highest
hit ratio of them all, but its hit ratio can dip suddenly
because of the fixed-size cache, which cannot store
enough data objects to maintain a high hit ratio. So
far, the only known model for dynamic cache size
tuning from the literature is the MACSC (Wong
2003). All the available MACSC performance data,
however, was produced together with the basic LRU
replacement algorithm as a component. In effect, the
MACSC makes the LRU mechanism adaptive
because it adjusts the cache size on the fly; the cache
size is now a variable. Since the previous experience
(Cao 1997) had confirmed that the GD-Size
algorithm yields a higher hit ratio than LRU, it is
logical to combine MACSC and GD-Size to create
an even more efficient caching framework. This
combination, which is proposed in this paper, creates
the novel dynamic GD-Size (DGD-Size)
replacement strategy. The DGD-Size should be able
to maintain a higher hit ratio than the previous LRU-
based MACSC’s.
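For reference, the GreedyDual-Size algorithm (Cao 1997) can be sketched as follows: every cached object carries a credit H = L + cost/size, the object with the lowest H is evicted, and the global inflation value L is raised to that evicted credit. The class and method names below are illustrative, not from the cited paper:

```python
import heapq

class GDSizeCache:
    """Sketch of GreedyDual-Size: evict the object with the lowest credit
    H = L + cost/size, then inflate the global value L to that credit."""
    def __init__(self, capacity):
        self.capacity = capacity   # total bytes available
        self.used = 0
        self.L = 0.0               # global inflation value
        self.entries = {}          # key -> (H, size)
        self.heap = []             # (H, key); may hold stale entries

    def access(self, key, size, cost=1.0):
        if key in self.entries:
            _, size = self.entries[key]       # already cached: keep its size
        else:
            while self.used + size > self.capacity and self.entries:
                self._evict()
            if size > self.capacity:
                return                        # larger than the whole cache
            self.used += size
        h = self.L + cost / size              # credit renewed on every access
        self.entries[key] = (h, size)
        heapq.heappush(self.heap, (h, key))

    def _evict(self):
        while self.heap:
            h, key = heapq.heappop(self.heap)
            if key in self.entries and self.entries[key][0] == h:
                self.L = h                    # inflate L to the evicted credit
                self.used -= self.entries[key][1]
                del self.entries[key]
                return
```

Because H folds in object size and retrieval cost, frequently accessed, cheap-to-keep objects survive longer than large, rarely used ones, which is why GD-Size outperforms plain LRU on hit ratio.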
2.2 The MACSC Concept
Figure 2 shows the popularity profile over time for
the same set of data objects. The A, B and C curves
represent the three instances of the profile changes.
These instances were caused by changes in user
preference towards certain data objects. Any such
change is reflected by the current profile standard
deviation, for example, SD_A, SD_B and SD_C. From
the perspective of a changeable popularity spread,
any replacement algorithm designed to gain a high
hit ratio with a fixed-size cache C_S works well only
for C_S ≥ L·SD. That is, the cache size C_S
accommodates the given hit ratio equal to L
times the standard deviation SD of the data object
popularity profile. If the expected hit ratio is for
L = 1 (i.e. 68.3%), then the cache size is
initialized accordingly. In the MACSC case, C_S is
continuously adjusted with respect to the current
relative data object popularity profile on the fly.
Conceptually, C_S(t) ≥ L·SD always holds; C_S(t) is the
adjusted cache size at time t. From two successive
measured standard deviation values the MACSC
computes the popularity ratio (PR) for tuning the
cache size. The PR is also called the standard
deviation ratio (SR) as shown by equation (2.1),
where New_CS is the new cache size adjustment and
the current SR is inside the brackets.
Figure 2: Spread changes of the relative data object
popularity profile over time.
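The 68.3% figure quoted above is the one-standard-deviation coverage of a normal distribution; it can be checked numerically (illustrative code):

```python
from math import erf, sqrt

def coverage(L):
    """Fraction of a normal distribution lying within L standard
    deviations of the mean; L = 1 gives roughly 68.3%."""
    return erf(L / sqrt(2))
```

Choosing a larger L (e.g. L = 2, about 95.4%) therefore corresponds to targeting a higher hit ratio with a proportionally larger cache.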
New_CS = Old_CS × (SD_this_popularity_profile / SD_last_popularity_profile) ... (2.1)
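The MACSC resizing rule in equation (2.1) can be sketched as a single function; the name macsc_resize and the use of the population standard deviation are illustrative assumptions:

```python
import statistics

def macsc_resize(old_cs, last_profile, this_profile):
    """MACSC cache-size tuning per equation (2.1): scale the cache size by
    the ratio (SR) of the current to the previous popularity profile
    standard deviation."""
    sr = statistics.pstdev(this_profile) / statistics.pstdev(last_profile)
    return old_cs * sr
```

For example, if the popularity spread doubles between two measurements, the cache size is doubled so that it still covers L standard deviations of the profile.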
A NOVEL DYNAMIC GREEDYDUAL-SIZE REPLACEMENT STRATEGY TO MAKE TIME-CRITICAL WEB
APPLICATIONS SUCCESSFUL