affected (than LRU) by the change in ordering. Unlike
in the outer-user situation, the LRU cache shows no
indication of saturation.
We repeated the experiment with a third job
ordering in which all user-item pairs were
processed in random order. The aggregated results of
multiple runs turned out to be almost identical to the
outer-item results.
6 CONCLUSIONS
In this work we set out to find caching strategies
that allow in-memory user-based collaborative filter-
ing algorithms to store intermediate user-user simi-
larity results. First, we showed that user similarities
are not equally important, as some are used consid-
erably more than others during the recommendation
calculation process. We tried predicting this usage
frequency upfront by applying aggregation operators
on the number of ratings, but ultimately succeeded in
accurately calculating this value by determining the
cardinality of the ‘inverse intersection’ of the set of
rated items.
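As an illustration of the usage-frequency idea, the following Python sketch counts how often each user-user similarity is requested under a simplified job model and compares that with a count derived directly from the rated-item sets. The job model (a prediction for every unrated user-item pair, using every user who rated the item as a neighbour), the reading of the 'inverse intersection' as the set of items rated by exactly one of the two users, the toy data, and all names (`ratings`, `usage_by_enumeration`, `usage_by_sets`) are assumptions made for this sketch; they are not definitions taken from this section.

```python
from collections import Counter
from itertools import combinations

# Toy data (hypothetical): user -> set of rated item ids.
ratings = {
    "u1": {1, 2, 3},
    "u2": {2, 3, 4},
    "u3": {1, 5},
    "u4": {4, 5, 6},
}

def usage_by_enumeration(ratings):
    """Count similarity lookups by walking through every prediction job."""
    items = set().union(*ratings.values())
    usage = Counter()
    for user, rated in ratings.items():
        for item in items - rated:              # only unrated items are predicted
            for other, other_rated in ratings.items():
                if other != user and item in other_rated:
                    usage[frozenset((user, other))] += 1
    return usage

def usage_by_sets(ratings):
    """Same count, obtained directly from the rated-item sets."""
    usage = Counter()
    for a, b in combinations(ratings, 2):
        count = len(ratings[a] ^ ratings[b])    # items rated by exactly one of the two
        if count:
            usage[frozenset((a, b))] = count
    return usage

by_jobs = usage_by_enumeration(ratings)
by_sets = usage_by_sets(ratings)
```

Under these assumptions both functions return the same counts, which is what would make the usage frequency computable upfront, before any recommendation job is run.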
We then presented two caching strategies: a basic
LRU (least recently used) cache and a novel SMART
caching approach which acted like a priority cache
that incorporated the knowledge about usage fre-
quency of user-user similarities. A number of experi-
ments were run on the MovieLens dataset to compare
the performance (execution time speedup) of each of
these caches against a no-cache baseline and under
varying cache sizes and job execution orderings.
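The two cache types can be sketched in Python as follows. This is a minimal illustration, not the implementation evaluated here: the class names, the `compute` callback, and in particular `PrioritySimilarityCache` (a simple stand-in that admits only the pairs with the highest predicted usage) are hypothetical, and the actual SMART approach may differ in how it ranks and evicts entries.

```python
from collections import OrderedDict

class LRUSimilarityCache:
    """Least-recently-used cache for symmetric user-user similarities."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, u, v, compute):
        key = (u, v) if u <= v else (v, u)      # sim(u, v) == sim(v, u)
        if key in self.entries:
            self.entries.move_to_end(key)       # mark as most recently used
            self.hits += 1
            return self.entries[key]
        self.misses += 1
        value = compute(u, v)
        if self.capacity > 0:
            self.entries[key] = value
            if len(self.entries) > self.capacity:
                self.entries.popitem(last=False)    # evict least recently used
        return value


class PrioritySimilarityCache:
    """Hypothetical stand-in for a usage-aware cache: only the pairs with
    the highest predicted usage are ever stored."""

    def __init__(self, capacity, predicted_usage):
        # predicted_usage: dict mapping (u, v) with u <= v to an expected lookup count
        ranked = sorted(predicted_usage, key=predicted_usage.get, reverse=True)
        self.admitted = set(ranked[:capacity])
        self.entries = {}

    def get(self, u, v, compute):
        key = (u, v) if u <= v else (v, u)
        if key in self.entries:
            return self.entries[key]
        value = compute(u, v)
        if key in self.admitted:                # low-usage pairs are never cached
            self.entries[key] = value
        return value
```

Both classes take the similarity function as a callback, so the same recommendation loop can be run against either cache or against no cache at all.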
Our results showed that the order in which user-
item recommendation values are calculated can dra-
matically impact the LRU cache performance and
therefore also the total execution time. Optimal re-
sults were obtained when calculating the recommen-
dation values of each user for every item sequen-
tially before moving on to the next user (outer-user
strategy). In that situation the LRU-enhanced UBCF
algorithm ran between 5 and 6 times faster than the
no-cache baseline and required a cache size of only
0.2%, whereas the SMART approach required a cache
size of 60% to obtain similar results.
For a random (and outer-item) job execution ordering,
on the other hand, the SMART approach came out
best in terms of stability and performance.
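The effect of job ordering on an LRU cache can be illustrated with a small replay in Python. The setting is a deliberately artificial assumption (six users, five items, every user acting as a neighbour for every item), and the names `lru_hits`, `neighbours`, `outer_user` and `outer_item` are hypothetical; the sketch does not reproduce the MovieLens experiments, it only shows why the two orderings stress a small LRU cache differently.

```python
from collections import OrderedDict

def lru_hits(jobs, neighbours, capacity):
    """Replay prediction jobs and count LRU hits on user-user similarity lookups."""
    cache, hits = OrderedDict(), 0
    for user, item in jobs:
        for other in neighbours[item]:
            if other == user:
                continue
            key = (min(user, other), max(user, other))   # symmetric similarity
            if key in cache:
                cache.move_to_end(key)                   # refresh recency
                hits += 1
            else:
                cache[key] = True
                if len(cache) > capacity:
                    cache.popitem(last=False)            # evict least recently used
    return hits

# Hypothetical toy setting: every user is a neighbour for every item.
users = range(6)
items = range(5)
neighbours = {item: list(users) for item in items}

outer_user = [(u, i) for u in users for i in items]   # all items of one user first
outer_item = [(u, i) for i in items for u in users]   # all users of one item first

capacity = 5   # room for one user's similarities to the other five users
hits_outer_user = lru_hits(outer_user, neighbours, capacity)
hits_outer_item = lru_hits(outer_item, neighbours, capacity)
```

In the outer-user ordering the same handful of similarities is reused for consecutive jobs, so a cache holding one user's neighbourhood is enough; in the outer-item ordering each job needs a different user's similarities, and a cache of that size evicts most entries before they are requested again.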
Although this work focussed mainly on in-memory
algorithms, these results may also be
useful in other situations where caching strategies
can be applied to user-based collaborative filtering
algorithms (e.g., caching similarity values to reduce
database access).
7 FUTURE WORK
As future work we would like to investigate whether the
obtained results generalize to other datasets and
recommendation algorithms. We also intend to improve
the speedup by further refining the job execution
ordering and by evaluating other cache
algorithms such as LFU and ARC.
ACKNOWLEDGEMENTS
The described research activities were funded by a
PhD grant to Simon Dooms from the Agency for
Innovation by Science and Technology (IWT Vlaanderen).
REFERENCES
Herlocker, J., Konstan, J. A., and Riedl, J. (2002). An em-
pirical analysis of design choices in neighborhood-
based collaborative filtering algorithms. Inf. Retr.,
5(4):287–310.
Jannach, D., Zanker, M., Felfernig, A., and Friedrich, G.
(2011). Recommender Systems: An Introduction.
Cambridge University Press.
Park, S.-T., Pennock, D., Madani, O., Good, N., and
DeCoste, D. (2006). Naïve filterbots for robust cold-start
recommendations. In Proc. ACM SIGKDD Conf. on
Knowledge discovery and data mining (KDD 2006),
pages 699–705, New York, NY, USA.
Peralta, V. (2007). Extraction and integration of movielens
and imdb data. Technical report,
Laboratoire PRiSM, Université de Versailles, France.
Qasim, U. (2011). Active Caching For Recommender Sys-
tems. PhD thesis, New Jersey Institute of Technology,
New Jersey.
Qasim, U., Oria, V., fang Brook Wu, Y., Houle, M. E., and
Özsu, M. T. (2009). A partial-order based active cache
for recommender systems. In Proc. ACM Conf.
Recommender systems (RecSys 2009), pages 209–212.
Resnick, P., Iacovou, N., Suchak, M., Bergstrom, P.,
and Riedl, J. (1994). Grouplens: an open architecture
for collaborative filtering of netnews. In
Proc. ACM Conf. on Computer supported cooperative
work (CSCW '94), pages 175–186, New York,
NY, USA. ACM.
Seth, S. and Kaiser, G. (2011). Towards using cached
data mining for large scale recommender systems. In
Proc. Conf. Data Engineering and Internet Technol-
ogy (DEIT 2011).
WEBIST 2013 - 9th International Conference on Web Information Systems and Technologies