TEMPORAL ASPECTS-BASED REPLACEMENT

IN MEDIA OBJECT CACHES

Hagen H

opfner, Andreas Thenn and Maximilian Schirmer

Mobile Media Group, Bauhaus-Universit

at Weimar, Bauhausstr. 11, 99423 Weimar, Germany

Keywords:

Caching, Cache Replacement, Temporal Aspects.

Abstract:

Caching is an appropriate and well-known approach for reducing data transmissions in distributed information

systems by creating and maintaining redundant data. As cache memory is limited and cached data might get

outdated, it is impossible to store everything forever. If a cache is full, a replacement strategy decides on the

cache entry that needs to be replaced by new data. There exist various strategies utilising different indicators

for making this decision. Almost all of them do not take the content and the context of the systems’ users

into account. In this paper, we present two novel cache replacement strategies called TA and aTA that utilize

temporal aspects included in media objects such as websites, speciﬁed by the user, or learnt from her or his

behaviour. Our evaluation results show that, in the used application scenario, TA and aTA outperform classical

replacement schemes like LRU.

1 INTRODUCTION AND

MOTIVATION

Data transmissions are the most cost-intensive and

time-consuming subtasks in most distributed infor-

mation systems. Since wireless networks get more

and more important, energy requirements also justify

the need for reducing the number of unnecessarily re-

peated transmissions. Besides replication and hoard-

ing, caching is an appropriate and widely-used ap-

proach for this purpose (H

opfner et al., 2009). The

basic idea is to implicitly store received data as close

as possible to the data processing subsystem. Well-

known examples are caches in Web-based systems

(Wang, 1999). Web browsers, e.g., cache Web ob-

jects such as images, videos, or HTML documents

in the ﬁle system of the device running the browser.

The browser reuses this locally available data if the

user requests the same Web object again. Hence, a re-

peated transmission is not necessary. Certain research

questions result from this simple idea:

(1) Which data should be or needs to be cached?

(2) What happens if cached data gets outdated?

(3) How to reuse cached data?

(4) What happens if the cache is full?

Although all questions are important in this paper, we

only address the fourth one. For focussing purposes,

we assume as “answers” to the other three questions

that (1) all received objects are cached, (2) there are

no updates on cached data, and (3) cache entries are

identiﬁed by an identiﬁer and reused in an atomic

manner. Question four results from the fact that cache

memory is limited. In case of a full cache, the system

has to decide on removing or replacing, respectively,

cached data. This decision is made by a cache re-

placement strategy. Various approaches have been re-

searched in the past. In general, they differ in the pa-

rameters they factor into the decision. FIFO (first-in-

first-out) (Tanenbaum, 2007), e.g., replaces the oldest

cache entry, while LRU (Bennett and Kruskal, 1975)

replaces the least recently used one. Hence, time or

recency of accesses are taken into account. Almost

all strategies published until today ignore the con-

tents of cached data. This, of course, is necessary if a

strategy aims for being as general as possible. How-

ever, in user-centered information systems, generic

approaches might thwart the users’ expectations, as

the subjective importance of a cached object for the

user might change over time. For example, informa-

tion about a meeting taking place tomorrow is very

important before and right after the meeting. There-

fore, it should stay in cache. If one uses the system

the day after the meeting, the subjective importance

decreased and the data might be replaced.

In this paper, we analyse various temporal aspects

of a user-based access to media objects. This includes

Höpfner H., Thenn A. and Schirmer M..

TEMPORAL ASPECTS-BASED REPLACEMENT IN MEDIA OBJECT CACHES.

DOI: 10.5220/0003799700730082

In Proceedings of the 8th International Conference on Web Information Systems and Technologies (WEBIST-2012), pages 73-82

ISBN: 978-989-8565-08-2

 2012 SCITEPRESS (Science and Technology Publications, Lda.)

temporal aspects within media objects, as well as user

preferences (e.g. prefer information for weekends)

and user behaviour. Based on these temporal aspects,

we developed two novel cache replacement strategies,

called TA (temporal aspects-based replacement) and

aTA (adaptive TA). Both were implemented and suc-

cessfully evaluated.

The remainder of this paper is structured as fol-

lows: Section 2 presents an application scenario that

motivates the research. Section 3 discusses the related

work. Section 4 introduces and classiﬁes the temporal

aspect considered in our work. Section 5 describes the

novel replacement strategies. Section 6 presents the

evaluation scenario and the evaluation results. Sec-

tion 7 summarizes the paper with a conclusion and

outlines future research directions.

2 APPLICATION SCENARIO

Today is Monday, August 29, 2011. Anna and Irene

utilise smartphones for organising their spare time ac-

tivities. Both installed the WeIS app, which accesses

the WeIS Web service. WeIS is the so called “We

imar

Information System”. It collects and integrates infor-

mation about events (e.g., cinema schedules, concert

information) from various Web sources. The WeIS

app caches received data on the smartphone. As Anna

and Irene have different interests, they browse the

WeIS content in a different manner. Anna prefers

to go to the cinema on weekends and therefore fre-

quently reads movie descriptions and watches trailers

of movies that will be shown on the next weekend.

She did this a week ago, too. Hence, Annas cache

contains movie information and trailers that were rel-

evant for August 27 and August 28 as well as those

that are relevant for September 3 and September 4.

Obviously, upcoming events are more important than

passed ones. She also had a short look at the cin-

ema schedule for next Wednesday. As Anna speciﬁed

a “weekend data caching”, the Wednesday informa-

tion is not cached. However, in case of a full cache,

if Anna reads the cinema schedule for the week after

this week, the information about last week’s movies

is replaced rather than this week’s information. If she

uses the app again on September 5, the August in-

formation became less important, and so on. Irene

is more keen on going to concerts, no matter if they

take place on weekends or not. Hence, all data (band

biographies, audio ﬁles, etc.) accessed by Irene is

cached on her smartphone. Furthermore, she fre-

quently reads the newsletters of her favourite artists

using WeIS. If the cache is full, the importance of the

cache entries depends on the event date in relation to

the current date. Outdated events are less important

than recent ones, and recent events are more impor-

tant than future ones. Hence, Irene’s app replaces

the less important concert information. As shown

by both examples, the importance of cached data de-

pends on the temporal aspects of the cached informa-

tion as well as on the current date. Furthermore, im-

portance changes over time.

3 RELATED WORK

As already brieﬂy mentioned in Section 1, our re-

search falls into the area of distributed information

systems (dIS) in general. More precisely, our work

is related to redundant data management approaches

in dIS. According to the classiﬁcation presented in

opfner et al., 2009), there exist three categories: (1)

replication, (2) hoarding, and (3) caching. They differ

in their approach to specifying the chosen data, their

possibility of using redundant data without having a

connection to the master copy, their required software

components, the potential dynamics of the data, and

their handling of updates to the redundant data. For

details on (1) and (2), we refer to (H

opfner et al.,

2009). The basic idea of caching is to reuse implicitly

received data. If a user requests data for the ﬁrst time,

the client keeps processed data locally. This copy is

used later on for answering new requests on the same

information. There are various cache related issues

to be considered. In this paper, we introduce new

replacement strategies that are necessary to shrink a

cache if cache memory is full. The literature intro-

duced many different approaches to decide about the

cached data items that are replaced by new incoming

data (Wang, 1999; Podlipnig and B

osz

ormenyi, 2003;

Romano and ElAarag, 2008). Almost all of them try

to replace data that will most likely be useless in the

future, taking one of the following indicators into ac-

count: time point t

of the last access to data item i, the

time T

since the last access, and the access frequency

. Other indicators are the size s

of a data item, the

costs c

of retransmitting it, or the access latency l

There are only a few approaches that use the se-

mantics of the cached data or the context of the sys-

tem’s usage. The LSR strategy (Calsavara, 2003) re-

places the least semantically related object but does

not take temporal aspects into account. Even more en-

hanced caching strategies like location-aware seman-

tic caches (Ren and Dunham, 2000) and their replace-

ment strategies (i.e., FAR) do not analyse cached data

in this regard. In contrast to this, the cache replace-

ment strategies introduced in this paper take the tem-

poral relevancy of and within data items into account.

WEBIST2012-8thInternationalConferenceonWebInformationSystemsandTechnologies

4 TEMPORAL ASPECTS

The term “media object” used in our work extends

the deﬁnition of “Web objects” given in (Podlipnig

and B

osz

ormenyi, 2003). The term “Web object”

subsumes “all possible objects (HTML pages, im-

ages, videos, etc.) that can be stored in a proxy

cache.” Media objects are Web objects plus other

remotely accessible documents like emails, calendar

items, etc. Temporal aspects of a media object result

from explicit meta information speciﬁcations (e.g.,

well-deﬁned date information of an event) or from the

contents of the object (e.g., a date mentioned in the

text of an HTML page). In the following we, refer to

explicit temporal aspects as external. There are vari-

ous approaches for harvesting temporal aspects from

the contents of a media object (Lienhart, 1999; Sa-

quete and Mart

ınez-Barco, 2000; Morita et al., 2000;

Mart

ınez-Barco et al., 2002). They are also built into

common desktop applications. Apples Mail.app, e.g.,

analyses emails and allows to create a calendar entry

if a date is found in the text of the email. However,

these information retrieval approaches are out of the

scope of this paper. We assume, that there is a proper

way to harvest such internal temporal aspects.

As the importance of a media object is a subjective

measure, we have to take the user into account. He

or she might use an explicit deﬁnition of the impor-

tance, e.g., a certain user might be more interested in

events taking place on weekends. In addition, one can

analyse the user’s access behaviour to cached media

objects. If he or she reuses media objects that have a

temporal relevancy within a week, the system should

keep such objects in cache. We call such temporal as-

pects implicit. Please note, that implicit temporal as-

pects are not comparable to the access statistic used,

e.g., for LRU. It changes over time an does not refer

to a single media object, while standard LRU analyses

accesses per object.

temporal aspects

media object

external internal

user

explicit implicit

Figure 1: Taxonomy of caching relevant temporal aspects.

Figure 1 illustrates a taxonomy of the considered

temporal aspects. The application scenario (cf. Sec-

tion 2) contains all of them:

External: The temporal aspect of a movie trailer is

not explicitly given by the movie ﬁle itself. How-

ever, it results from the cinema schedule that is

easily wrapped from the well-structured website

of the cinema.

Internal: Artists might inform their fans on upcom-

ing concerts in their newsletters. As newsletters

are mostly unstructured, this information needs to

be retrieved from the plain text.

Explicit: Anne conﬁgured her WeIS app in a way

that it caches only those data relevant for week-

ends.

Implicit: Irenes WeIS app analyses her access be-

haviour and “recognises” that she most frequently

accesses information that becomes relevant one

month after the access date.

For the rest of the paper, we assume that each

cached media object has a deﬁned validity date.

5 REPLACEMENT STRATEGIES

The temporal aspect-based replacement strategies

presented in this paper are, to a certain degree, subjec-

tive. We assume that outdated media objects (validity

date is a thing of the past) are less important then cur-

rent media objects and those with a validity date in

the near future. Furthermore, we assume that media

objects becoming obsolete later in future are less im-

portant than those that become obsolete in near future.

However, all of them are more important than already

outdated media objects.

time

importance

past ← now → future



Figure 2: Validity date based importance of media objects.

5.1 TA Replacement

As illustrated in Figure 2, the importance of a cached

media object depends on the current point in time

and therefore changes over time. Let M be the set

of all media objects in the cache. The expiration

date of an object m ∈ M is given as exp(m) and

the importance of m is given as imp(m). At a cer-

tain point in time (now) the set M

⊆ M contains

all outdated media objects (∀m ∈ M

|exp(m) < now)

and the set M

⊆ M contains all not outdated me-

dia objects (∀m ∈ M

|exp(m) ≥ now). Consequently,

TEMPORALASPECTS-BASEDREPLACEMENTINMEDIAOBJECTCACHES

∩ M

0 and M

∪ M

= M must hold. Hence,

a correct importance-based order of the cached media

objects is given if

• ∀m, n ∈ M

((imp(m) < imp(n)) ↔ (exp(m) <

exp(n))) ∨ |M

| < 1,

• ∀m, n ∈ M

((imp(m) ≤ imp(n)) ↔ (exp(m) >

exp(n))) ∨ |M

| < 1, and

• ∀m ∈ M

∀n ∈ M

(imp(m) < imp(n))

hold. We are not interested in absolute importance

values, but rather in sorting the replacement queue of

the media objects according to the relative importance

among cached objects. Similar to the well-known

LRU queue, the least important media object repre-

sents the head of the queue and would be removed if

a replacement is required. The queue is ordered ac-

cording to the importance deﬁned above. The closer

a media object is placed at the tail of the queue, the

more important it is. However, in contrast to the LRU

queue, our TA queue also changes over time, even if

no objects are referenced in between. Hence, mate-

rialising the TA queue would be inefﬁcient as time

passes by and reordering would become a frequent

task even if no new media objects appear. Therefore,

we decided to realize the TA queue as a virtual queue

that is evaluated only if a replacement is required (cf.

Figure 3

object reference expiration date imp.

C 6.2.2011 1

A 10.2.2011 2

D 8.2.2011 3

B 8.2.2011 4

Figure 3: Virtual TA queue on February 7th, 2011.

Media objects are simply stored in the ﬁle system.

The virtual TA queue is a table that maintains object

references (absolute ﬁle names) and expiration dates.

For illustration purposes, we added an importance

value in Figure 3. At this, object C is least important

(imp(C) = 1), as it is already outdated. If the cache

would have to store a new media object on February

7th, 2011, C would be replaced. If more space would

be required, TA would replace A that is not outdated,

but refers to the most future point in time (cf. Fig-

ure 2). Figure 4 illustrates the virtual TA queue two

days later. C is still outdated. However, now B and

D are outdated, too, and/or would be replaced rather

than A.

Algorithm 1 implements the TA cache replace-

ment strategy. It is triggered by new incoming media

For illustration purposes, we use dates as temporal val-

ues throughout the paper. However, the implementation and

the evaluation also take time values into account.

object reference expiration date imp.

C 6.2.2011 1

D 8.2.2011 2

B 8.2.2011 3

A 10.2.2011 4

Figure 4: Virtual TA queue on February 9th, 2011.

objects in case of a full cache. As shown in Fig-

ure 3 and Figure 4, the virtual TA queue is a table.

Formally it is the set TAQ = {(re f (m),exp(m))|m ∈

M } of tuples of the form (re f (m),exp(m)). At this,

re f (m) is a reference to the media object m ∈ M and

exp(m) is the aforementioned expiration date. Fur-

thermore, size(m) returns the memory space required

for caching media object m (i.e., m’s ﬁle size).

Algorithm 1: TA cache replacement.

Precond.: // new object would ﬁt into cache

Input: TAQ // virtual TA queue

n // new media object

now // current date/time

rm // required memory

Output: R // set of discarded objects’ references

01 def TA replacement(TAQ, n, now, rm):

02 R =

0 // reset replacement set

03 f s = rm − FCM // initiate loop variable

04 nmo = FALSE // ﬂag to avoid useless scans

05 while f s > 0 do:

06 date = now; cand = NULL

07 if nmo == FALSE

08 for each (x,y) ∈ TAQ do:

09 if y < date ∧ x /∈ R then:

10 date = y; cand = x ﬁ

11 done

12 ﬁ

13 if cand 6= NULL then:

14 R = R ∪ {cand}

15 f s = f s − size(cand)

16 else // no more outdated objects

17 nmo = T RU E

18 for each (x,y) ∈ TAQ do:

19 if y ≥ date ∧ x /∈ R then:

20 date = y; cand = x ﬁ

21 done

22 R = R ∪ {cand}

23 f s = f s − size(cand)

24 ﬁ

25 done

26 return(R)

Algorithm 1 returns the set R = {re f (m)|m ∈ M }

of references to those media objects that shall be

WEBIST2012-8thInternationalConferenceonWebInformationSystemsandTechnologies

replaced. In a subsequent step, the respective media

objects are removed from cache and queue, and the

new object is cached and added to the queue. The

precondition is checked by comparing size(n) to the

guaranteed (predeﬁned) cache memory space. If a

media object is bigger than the cache, the object is

not cached at all. The basic principle of the algorithm

is as follows: We ﬁrst calculate the memory required

for caching the new object n by subtracting the

amount of free cache memory FCM from the amount

of required memory rm (Line 3) that is determined

using size(n) in the function call. Then we loop

through the TA queue and ﬁrst try to collect enough

memory by marking outdated objects for replacement

(Lines 07-15). If no more outdated objects can

be found and if we need more free space, we start

selecting those objects that expire in the future (Lines

16-24). The algorithm is guaranteed to terminate and

does not require ﬁle access to the cached objects.

Example and Threats to Validity. The following

example (cf. Figure 5) extends the Figure 3 by show-

ing the ﬁle sizes. We assume, that cache memory is

limited to 20MB.

object reference expiration date size

C 6.2.2011 4MB

A 10.2.2011 8MB

D 8.2.2011 3MB

B 8.2.2011 1MB

Figure 5: Example TA replacement on February 7th, 2011.

Anna accesses a 10MB video ﬁle (E) that expires

on March 1st, 2011. Obviously, 10MB is less than

20MB. Hence, the precondition is fulﬁlled. The cache

uses 4 + 8 + 3 + 1 = 16MB. 4MB are free that are

not sufﬁcient for caching the new ﬁle. We have to

free 10 − 4 = 6MB. The TA algorithm ﬁrst selects

outdated objects for replacement. On February 7th,

2011 only C is outdated. Removing C would free

4MB, which is not sufﬁcient either (4MB from C plus

4MB free space sum up to 8MB while 10MB are re-

quired). Hence, TA replacement now selects A for

replacement and, as enough memory was collected,

returns R = {C,A}. Both objects are removed from

cache and queue and E is added. Figure 6 illustrates

the resulting virtual TA queue.

object reference expiration date size

D 8.2.2011 3MB

B 8.2.2011 1MB

E 1.3.2011 10MB

Figure 6: Example TA replacement result on February 7th,

2011.

If Irene would have accessed E two days later (cf.

Figure 4) then either C, B, and D or C and D would

have been removed (depending on the order “within”

TAQ). This issue results from the fact that our naive

algorithm does not intelligently differentiate between

cached objects with the same expiration date. In order

to overcome this issue, we experimented with two dif-

ferent approaches (cf. Section 6). As basic behaviour,

we extended Algorithm 1 in a way that all “candi-

dates” are removed. For this example this means that

C, B, and D are always selected. The second vari-

ant utilises the idea of LRU. If two objects have the

same expiration date, then we ﬁrst select the one that

was least recently used. The LRU counter was simply

added to the TA queue.

Another issue illustrated by the example is the fact

that the new object expires later than the at latest ex-

piring cached object. Actually, this conﬂicts with our

deﬁnition of importance. However, as Irene intention-

ally accessed this new object and as time-based im-

portance is strongly subjective, anyway, we decided

to insert objects independent of their expiration date

into the cache.

5.2 aTA Replacement

The TA replacement considers only external and in-

ternal temporal aspects. Section 4 also introduced ex-

plicit and implicit temporal aspects that result from

users preferences and behaviour. Irene (cf. Sec-

tion 2) might be interested in concerts that take place

within a week from now earliest, while Anna might

be more interested in this week’s cinema information.

Hence, TA cache replacement would reﬂect Anna’s

behaviour but not Irene’s. The aTA replacement strat-

egy also considers such implicit temporal aspects.

The basic idea is to analyse access time frames, so

called “zones”.

As shown in Figure 7, accesses to objects are

grouped, based on their expiration date in relation to

the current time t. In our experiments, we decided to

use six zones, three historical zones and three future

zones. However, the number of zones as well as their

deﬁnition is subject to future research.

During the runtime of the system object accesses

are analysed and counted. The zones in Figure 7 cor-

respond to the following access behaviour:

Zone 1: access to objects expired earlier than one

week before now.

Zone 2: access to objects expired one week but not

earlier than 24 hours before now.

Zone 3: access to objects expired within 24 hours be-

fore now.

TEMPORALASPECTS-BASEDREPLACEMENTINMEDIAOBJECTCACHES

1 2 3

4 5 6

zone i

number of accesses N

per zone

Figure 7: Example access distribution for 6 zones.

Zone 4: access to objects that will expire within 24

hours from now.

Zone 5: access to objects that will expire in one week

but not earlier than 24 hours from now.

Zone 6: access to objects that will expire later than

one week from now.

Using this approach allows to deﬁne an importance

function that reﬂects the user’s behaviour rather than

the static deﬁnition illustrated in Figure 2. However,

objects of the same zone are ordered according to the

TA importance function. The example data shown in

Figure 7, that results from our experiments, illustrates

that this particular user is more interested in media

objects of zone 2 than in those of zone 3. However,

simply counting accesses is not sufﬁcient as the coun-

ters would grow with each access. In order to avoid

this phlegm, we halve and round down all zone coun-

ters after 50 accesses each and therefore ﬂatten the

importance curve (cf. Figure 8).

Figure 8 illustrates a different user behaviour than

Figure 7. Here, the user accesses data that expires

in one week but not earlier than one day from now

more often than those data expiring within the next

24 hours. Therefore, the importance function reﬂects

Irene’s interest in concerts taking place in a week

from now. If the system has to cache new data and

if the cache is full, objects from the least important

zone are replaced according to Algorithm 2. Besides

the virtual TA queue, we need a function l f z that cal-

culates the least frequently used

zone and a function

map(n) that assigns a media object n to a particular

zone. The zones are represented as the set Z = {(i,c)}

of tuples of the form (i,c) with i,c ∈ N. At this, i

is the zone number and c is the (ﬂattened) number

1 2 3

4 5 6

zones i

number of accesses N

per zone

before ﬂattening

after ﬂattening

Figure 8: Smoothing of the aTA importance function.

of accesses to this zone. In our implementation

, the

mapping is done as follows. 24h means one day (24

hours), 168h = 24h · 7 means one week.

map(n) =











1 if exp(n) < now − 168h

2 if now − 168h ≤ exp(n) . ..

··· < now − 24h

3 if now − 24h ≤ exp(n) < now

4 if now ≤ exp(n) ≤ now + 24h

5 if now + 24h < exp(n) . ..

··· ≤ now + 168h

6 if exp(n) > now + 168h

The l f z(Z) functions is a lookup function that returns

the value of i of the tuple (i, c) ∈ Z where ∀(i

) ∈

Z : c ≤ c

holds.

zone (i) number of accesses (c)

1 0 (1)

2 2 (4)

3 5 (10)

4 3 (7)

5 20 (41)

6 5 (10)

Figure 9: Example: zone based access frequency.

Figure 9 corresponds to Figure 8 and presents an

example for the zone set. The rightmost column,

Please note that this approach is not comparable to the

LFU replacement strategy (Tanenbaum, 2007). LFU con-

siders the access frequency of objects rather than the group-

ing of temporal accesses that are independent of the ac-

cessed objects.

Please note that other zone deﬁnitions might be possi-

ble. As Algorithm 2 only uses this function, the algorithm

can be adapted by providing another mapping function.

WEBIST2012-8thInternationalConferenceonWebInformationSystemsandTechnologies

which has been included into the ﬁgure for illustra-

tion purposes only, contains the access values for the

unﬂattened curve. As one can see, the zone impor-

tance does not change if ﬂattening is used.

Algorithm 2: aTA cache replacement.

Precond.: // new object would ﬁt into cache

Input: TAQ // virtual TA queue

n // new media object

now // current date/time

rm // required memory

Output: R // set of discarded objects’ references

01 def aTA replacement(TAQ, n, now, rm):

02 R =

0 // reset replacement set

03 f s = rm − FCM // initiate loop variable

04 CZ =

0 // remember cleaned zones

05 while f s > 0 do:

06 tTAQ =

0 // reset tTAQ

07 cm = 0 // cleanable memory per zone

08 zone = l f z(Z −CZ)

09 for each (x, y) ∈ TAQ do:

10 if map(x) == zone then

11 cm = cm + size(x)

12 tTAQ = tTAQ ∪ {x,y}

13 ﬁ

14 done

15 if tTAQ 6=

0 then

16 R

=TA repl.(tTAQ, n, now, cm)

17 R = R ∪ R

18 f s = f s − cm

19 ﬁ

20 CZ = CZ ∪ {zone}

21 done

22 return(R)

Algorithm 2 works in two stages. In stage 1, a

temporary virtual TA queue tTAQ is generated (Lines

6-14). It represents those objects that belong to

the least recently used zone. Furthermore, stage 1

calculates the amount of memory that can be freed

by replacing all objects from the selected zone. This

value is stored in cm. Stage 2 utilises cm and calles

Algorithm 1 based on tTAQ (Line 16). We iterate

through both stages until enough memory can be

freed. Stage 2 is only entered if the selected zone

contains objects (Line 15).

Example. Given the example discussed in the pre-

vious subsection, the virtual TA queue illustrated in

Figure 5, as well as the zone deﬁnition from Figure 9.

On February, 7th 2011 object C belongs to zone 3,

object A belongs to zone 5, and the objects D and B

belong to zone 4. We need 10MB for caching object

E. Again, we have 4MB free cache memory. Hence,

6MB must be freed. The aTA algorithm ﬁrst deter-

mines the least frequently used zone, which is zone 1

(cf. Figure 9, 0 accesses). As no object belongs to

zone 1, zone 2 (2 accesses) is analysed. This zone

does not contain objects either. Hence, zone 4 (3 ac-

cesses) is used next. Zone 4 contains B (1MB) and

D (3MB). As we have to free 6MB, both objects are

selected using the TA algorithm. The next least im-

portant zone is zone 3 or zone 6. It does not matter,

which one is selected, because both were used equally

often. As zone 6 does not contain any objects, zone 3

is used anyway. Hence, A is selected for replacement,

too.

Under the assumption that the access frequency

history represents Anna’s behaviour, B, C, and D

would be replaced. The resulting virtual TA queue

after this replacement is illustrated in Figure 10.

object reference expiration date size

A 10.2.2011 8MB

E 1.3.2011 10MB

Figure 10: Example aTA replacement result on February

7th, 2011.

5.3 Cache Filters

Explicit temporal aspects are indicators for a manual

individualisation of the caching behaviour, indepen-

dent of the used replacement strategy. Remember the

application scenarios presented in Section 2. Anna is

more interested in events taking place on weekends.

We utilise this information and cache only those me-

dia objects that are relevant for weekend events (cf.

Figure 11).

yes

Mo Tu We

Th Fr

Sa Su Mo Tu We

Th Fr

Sa Su Mo

Figure 11: Example for manually deﬁning cache contents.

6 EVALUATION

The performance of cache replacement strategies is

measured in form of the ratio of the number of cache

hits N

∈ N to the number of requests N

∈ N. A

cache hit occurs if a request can be answered using

cached objects. Otherwise a cache miss indicates that

the requested object is not in the cache. The following

formula is used in order to calculate the cache hit rate

∈ R for the TA replacement strategy:

TEMPORALASPECTS-BASEDREPLACEMENTINMEDIAOBJECTCACHES

Obviously, the value must be in the interval [0,1],

while 1 indicates best performance (all requests are

answered by the cache) and 0 indicates worst perfor-

mance (no requests are answered by the cache). We

evaluated our strategies in comparison to the LRU

strategy. The respective LRU cache hit rate is de-

noted as H

LRU

. According to this naming convention,

aTA

denotes the cache hit rate for the aTA-algorithm.

TA+LRU

names the cache hit ratio for the combina-

tion of TA and LRU.

6.1 Prototype

As mentioned in Section 1, the importance of cached

data is of a subjective nature. Therefore, evaluation

is subject to a user study. We implemented the mo-

bile information system introduced in Section 2 using

the WebOS plattform (Allen, 2009) (cf. Figure 12).

The WeIS-client requests data from a server running

CouchDB (Anderson et al., 2010). We manually ﬁlled

the server database and cleaned up the respective ex-

piration dates. The prototype also covers cache co-

herency mechanisms that are out of the scope of the

paper but necessary for future applications.

Figure 12: Screenshot of the WeIS evaluation prototype.

6.2 Experimental Setup

We implemented four different caches that ran in par-

allel and utilised one of the following replacement

strategies:

LRU: A standard least recently used replacement

strategie is used as reference implementation as.

TA: The TA implementation is based on Algorithm 1

presented in Section 5.1. As an extension, we de-

cided to replace all media objects with the same

expiration date if more alternatives exist.

TA+LRU: The TA+LRU implementation is based on

Algorithm 1 presented in Section 5.1, too. In the

case that media objects have the same expiration

date we aditionally used LRU in order to select

the replacement candidate.

aTA: The aTA+LRU implementation is based on the

Algorithm 2 presented in Section 5.2. In stage 2,

the TA+LRU implementation is used.

We conducted a rather small user study with 5 users

that used WeIS for 14 days. During this time and due

to privacy issues, we asked the users to actively sub-

mit at least 10 generated reports through the app. In

order to guarantee that replacement is used, we lim-

ited cache memory to 512 KBytes. Furthermore, in a

preprocessing phase, we guaranteed comparable ob-

ject sizes among provided media objects. This was

necessary as larger objects would cleanup the “small”

cache completely. However, in productive environ-

ments, the capacity of modern smartphones super-

sedes this assumption as well as the cache memory

limit.

6.3 Evaluation Results

The numbers shown in Figure 13 illustrate that our

test persons used the app differently. Especially, there

was no improvement over a default LRU replacement

for user

and user

. However, Figure 14 shows that

TA, TA+LRU, as well as aTA performed just as bad

as or even better than LRU. The relatively low cache

hit rates result from the fact, that users were free to

use the system and mostly accessed uncached data.

However, the trend shows, that our temporal approach

reﬂects user behaviour better than the access statistic

used in LRU.

Figure 15 supports this hypothesis, too. It shows

the average cache hit rate in relation to the mea-

sured LRU result. Interestingly, TA is better than

TA+LRU. Please remember that our TA implemen-

tation replaces all replacement candidates in case of

equal expiration dates. Due to the nature of the

used application scenario, this means that all objects

WEBIST2012-8thInternationalConferenceonWebInformationSystemsandTechnologies

user

LRU 59 44 8 23 11

TA 70 53 9 23 11

TA+LRU 70 47 9 23 11

aTA 56 54 9 23 11

Σ 266 215 56 94 69

Figure 13: Absolute number of cache hits.

user

LRU

.222 .205 .143 .245 .160

.263 .247 .161 .245 .160

TA+LRU

.263 .219 .161 .245 .160

aTA

.211 .251 .161 .245 .160

Figure 14: Cache hit rates.

(images, texts, videos) belonging to the same event

are replaced at once. In other words, this means

that more memory is freed in one replacement step.

Hence, TA+LRU keeps rather unimportant data in

cache at the point of replacement. As the importance

value might change over time, these remaining ob-

jects might become more important than those that

are more frequently used. If a replacement request

selects a more frequently accessed but, from a tempo-

ral point of view, less important media object, cache

hit ratio decreases with upcoming requests.

replacement cache hit rate in relation to LRU

strategy

LRU 100%

TA 110.36%

TA+LRU 107.49%

aTA 105.44%

Figure 15: Cache hit rate improvement in relation to LRU.

Furthermore, we analysed the number of cached

objects for user

. As illustrated in Figure 16, starting

with report 5, less objects remain in cache with LRU.

The reason for this results from the fact that LRU kept

certain larger ﬁles in cache that were replaced right

before this report. TA in contrast replaced those ﬁles

in an earlier stage at the point the corresponding event

information was replaced. However, a more detailed

analysis of this behaviour is subject to future research

and requires a more detailed and more frequent data

acquisition with more users.

Figure 17 compares the TA implementation to the

aTA implementation using the reports of user

. Obvi-

ously, TA outperforms aTA. The reason for this is the

fact that aTA requires some time to learn the user’s be-

haviour. Starting with report 6, one can see a perma-

nent increasing aTA cache hit rate. Hence, the learned

2 4

8 10

report

number of cached media objects

LRU

TA+LRU

aTA

Figure 16: Cache space usage of user

importance function becomes better. However, aTA

must adapt the zones to the current user behaviour. If

it changes, aTA cache hit rate would decrease dramat-

ically as the wrong zone would always be selected for

replacement (until the zones are corrected again).

2 4

8 10

0.1

0.2

0.3

0.4

0.5

report

cache hit rate

aTA

Figure 17: Development of the cache hit rates for user

7 SUMMARY, CONCLUSIONS

AND OUTLOOK

In this paper, we introduced two new cache replace-

ment strategies, TA and aTA. Both utilise temporal

aspects of accessing media objects in order to select

the replacement candidate(s) in case of a full cache.

TEMPORALASPECTS-BASEDREPLACEMENTINMEDIAOBJECTCACHES

We ﬁrst analysed and classiﬁed the temporal aspects.

Then we used them in order to deﬁne importance

functions describing the temporal aspect-based (sub-

jective) importance of media objects. Explicit and im-

plicit temporal aspects that result from the media ob-

jects themselves were used for the TA replacement.

Implicit temporal aspects that result from user’s be-

haviour were used for aTA. As the evaluation results

show, both approaches outperform a standard LRU re-

placement. However, we are aware of the fact that

there are plenty of other replacement strategies we

should analyse in comparison to TA and aTA. How-

ever, before doing this, we will take a deeper look

into the zone deﬁnition and the zone numbers of aTA.

Furthermore, we plan to use dynamic zone deﬁnitions

instead of ﬁxed ones.

A more general research question results from the

data preprocessing. For our experiments, we man-

ually tagged the media objects. In order to adapt

our approaches to other applications, one has to ﬁnd

proper ways to harvest time and date information

from arbitrary media objects. Our prototype contains

rudimentary support for Websites. However, not all

dates mentioned on a Website have to be expiration

dates. Another open issue is the support for durations

of validity. We currently only support expiration dates

and assume that validity starts at the ﬁrst time an ob-

ject is accessed before caching. A more detailed study

on time/date ranges could improve temporal aspect

based caching approaches.

Last but not least, we are planning a more detailed

evaluation including some of the mentioned future

works and involving more users for a longer period of

time. Therefore, we are working on porting WeIS to

Apple’s iOS and Google’s Android platform and also

include the cache ﬁltering. The latter was not eval-

uated during the experiments presented in this paper,

but might improve the cache hit ratio dramatically if

used in a proper way.

ACKNOWLEDGEMENTS

We want to thank all students of the Mobile Me-

dia Group of the Media Department at the Bauhaus-

Universit

at Weimar, who participated in the con-

ducted evaluation.

REFERENCES

Allen, M. (2009). Palm webOS. O’Reilly Media, 1st edi-

tion. http://shop.oreilly.com/product/9780596155261

.do.

Anderson, J. C., Lehnardt, J., and Slater,

N. (2010). CouchDB: The Deﬁnitive

Guide. O’Reilly Media, 1st edition.

http://guide.couchdb.org/editions/1/en/index.html.

Bennett, B. T. and Kruskal, V. J. (1975). LRU stack pro-

cessing. IBM Journal of Research and Development,

19:353–357.

Calsavara, A. (2003). The least semantically related cache

replacement algorithm. In LANC’03 Proceedings of

the 2003 IFIP/ACM Latin America conference on To-

wards a Latin American agenda for network research,

pages 21–34. ACM Press.

opfner, H., Mansour, E., and Nicklas, D. (2009). Review

of Data Management Mechanisms on Mobile Devices.

it – information technology, 51(2):79–84.

Lienhart, R. (1999). Abstracting home video automatically.

In Proceedings of the seventh ACM international con-

ference on Multimedia (Part 2), pages 37–40. ACM

Press.

Mart

ınez-Barco, P., Saquete, E., and Mu

noz, R. (2002). A

Grammar-Based System to Solve Temporal Expres-

sions in Spanish Texts. In Ranchhod, E. and Mamede,

N. J., editors, Advances in Natural Language Process-

ing — Proceedings of the Third International Confer-

ence, PorTAL 2002, volume 2389 of Lecture Notes in

Computer Science, pages 709–719, Berlin / Heidel-

berg. Springer.

Morita, M., Lethelier, E., Yacoubi, A. E., Bortolozzi, F.,

and Sabourin, R. (2000). An HMM-based Approach

for Date Recognition. In Proceedings of the Fourth

IAPR International Workshop on Document Analysis

Systems (DAS 2000), December 10-13, 2000, Rio de

Janeiro, Brazil, pages 233–244.

Podlipnig, S. and B

osz

ormenyi, L. (2003). A Survey of

Web Cache Replacement Strategies. ACM Computing

Surveys, 35(4):374–398.

Ren, Q. and Dunham, M. H. (2000). Using semantic

caching to manage location dependent data in mobile

computing. In Proceedings of the 6th Annual Inter-

national Conference on Mobile Computing and Net-

working, pages 210–221. ACM Press.

Romano, S. and ElAarag, H. (2008). A quantitative study of

recency and frequency based web cache replacement

strategies. In Ahmad, A. and Bragg, A., editors, CNS

’08: Proceedings of the 11th communications and net-

working simulation symposium, pages 70–78. ACM

Press.

Saquete, E. and Mart

ınez-Barco, P. (2000). Grammar

speciﬁcation for the recognition of temporal expres-

sions. In MT 2000: Online-Proceedings of the In-

ternational Conference on Machine Translation and

Multulingual Applications in the new Millenium. The

British Computer Society. paper 21, available online:

http://www.mt-archive.info/BCS-2000-Saquete.pdf.

Tanenbaum, A. S. (2007). Modern Operating Systems.

Prentice Hall, Upper Saddle River, NJ, USA, 3rd edi-

tion. http://www.pearsonhighered.com/product?

ISBN=0136006639.

Wang, J. (1999). A survey of web caching schemes for the

Internet. ACM SIGCOMM Computer Communication

Review, 29(5):36–46.

WEBIST2012-8thInternationalConferenceonWebInformationSystemsandTechnologies