WAVELET-BASED MUSIC RECOMMENDATION

Claudio Biancalana, Fabio Gasparetti, Alessandro Micarelli, Alfonso Miola and Giuseppe Sansonetti

Department of Computer Science and Automation, Roma Tre University, Via della Vasca Navale 79, Rome, Italy

Keywords:

Wavelet, Music Recommendation, Signal Processing.

Abstract:

Recommender Systems provide suggestions for items (e.g., movies or songs) to be of use to a user. They must

take into account information to deliver more useful (perceived) recommendations. Current music recom-

mender takes an initial input of a song and plays music with similar characteristics, or music that other users

have listened to along with the input song. Listening behaviors in terms of temporal information associated to

ratings or playbacks are usually ignored. We propose a recommender that predicts the most rated songs that

a given user is likely to play in the future analyzing and comparing user listening habits by means of signal

processing techniques.

1 INTRODUCTION

Recommender systems provide suggestions based on

user preferences in order to recommend items likely

to be of interest to a user. It is obvious that user pref-

erences are inﬂuenced by the current context, such as

the currenttime of the day, mood, or current activities.

Nevertheless, a few recommender systems explicitly

include this information in the preference models.

A special group of recommender systems are the

ones based on the collaborative approach (Resnick

et al., 1994; Shardanand and Maes, 1995; Breese

et al., 1998). The system generates recommendations

using only information about rating proﬁles for dif-

ferent users. Collaborative systems locate peer users

with a rating history similar to the current user and

generate recommendations using this neighborhood.

Collaborative ﬁltering (CF) systems have been

successful in several recommender systems. The

availability of large datasets and additional informa-

tion that is easy collectable from the web, makes this

task interesting.

There are several issues that do not allow us to

directly apply the traditional CF approach for music

recommendation. The space of possible items (i.e.,

tracks) can be very large and, similarly, the user space

can also be enormous. Often user ratings are not

available or they cover only a small subset of the user

library of songs. Moreover, when new users enter to

the system or new songs are added to the global li-

brary, it is not possible to provide any recommenda-

tion to them due to the lack of any preference infor-

mation (the so known cold-start problem). There is no

chance to use taxonomies or ontologies to represent

the new items and facilitate the clustering as happens

in different domains (e.g., (Acampora et al., 2010a;

Micarelli et al., 2009)) Content-based approaches col-

lect information describing the items and then, based

on the user preferences, they predict which tracks the

user might enjoy (see for example the Pandora ser-

vice

). The key component of this approach is the

similarity function among the songs. Nevertheless,

there is a strong limitation of the highlevel descriptors

that can be automatically extracted from the tracks

(Celma, 2010).

One more relevant issue that traditional CF ap-

proachesdo not take into considerationis the listening

behavior of the user in terms of temporal information.

The timestamp of an item (i.e., when the song song

is played) is an important factor for the recommenda-

tion algorithm. Usually, the prediction function treats

the older items as less relevant than the new ones, but

any further reasoning about the temporal information

is simply ignored.

In this paper, we discuss a recommendation ap-

proach based on signal processing. In particular, a

traditional CF approach is enhanced considering an

improved similarity function between users. The user

listening habits are represented by signals. Wavelet

theory is used to study the related time-frequency rep-

resentations of signals and draw similarity between

listening behaviors. Signal processing techniques are

www.pandora.com

399

Gasparetti F., Biancalana C., Micarelli A., Miola A. and Sansonetti G..

WAVELET-BASED MUSIC RECOMMENDATION.

DOI: 10.5220/0003940903990402

In Proceedings of the 8th International Conference on Web Information Systems and Technologies (WEBIST-2012), pages 399-402

ISBN: 978-989-8565-08-2

 2012 SCITEPRESS (Science and Technology Publications, Lda.)

not employed to extract features from the songs, but

for representing and comparing those behaviors in or-

der to group similar users together. This is the novelty

of the approach in comparison with the current litera-

ture.

The rest of this paper is organized as follows. Sec-

tion 2 brieﬂy introduces some related studies on mu-

sic recommendation. Section 3 details our proposed

approaches. Last, in Section 4 a brief account of the

testbed we are developing for the evaluation is given.

Conclusions close the discussion.

2 RELATED WORK

Many algorithms have been developed to address

the personalized recommendation problem. Content-

based approaches aim at including different sources

of information (Semeraro et al., 2009; Groh and

Ehmig, 2007; Micarelli et al., 2006) or better mod-

elling the user interests (Gasparetti and Micarelli,

2007). User-based collaborative ﬁltering (CF) is

widely used, and the main idea is to ﬁnd the items

liked by other people with similar taste. Different

from the user-based CF, the item-based CF recom-

mends the items which are similar with the user’s

collected items (Schafer et al., 2007). Context-aware

high-level frameworks (e.g., (Acampora et al., 2010b;

Gaeta et al., 2009)) are not easily adaptable to this

speciﬁc domain because of the peculiar characteris-

tic of the items. For example, in (Biancalana et al.,

2011a) the authors devise a neural network context-

awarerecommender extracting differentfeatures from

point of interests. In the music scenario, techniques

that automatically extract features from the played

songs are not easily conceivable.

As for music recommendation, the most compres-

sive survey on the literature is to be found in (Celma,

2010). The author groups the recommendation ap-

proachesin four categories: (1) collaborativeﬁltering,

based on explicit or implicit feedbacks; (2) content-

based ﬁltering, by means of manual or automatic fea-

ture extraction; (3) context-based ﬁltering, the take

advantage of potential user tags associated to each

single song; and (4) hybrid approaches that combine

more then one of the above-mentioned ones.

To the best of our knowledge, there are currently

no attempts to include temporal behavior in user

habits in the music recommendation task. A prelim-

inary attempt has been suggest for the movie domain

in (Biancalana et al., 2011b). The proposed approach

can be categorized as context-based, where the simi-

larity of different songs is evaluated according to the

implicit listening behavior that the user exhibits.

3 WAVELET-BASED

RECOMMENDATION

Traditional user-based CF approaches relies on sim-

ilar users which have similar rating patterns, that is,

the prediction of a rating r

u,s

by user u for the track

track

is evaluated as an aggregate of the rating of

some other users for the same item track

. We call

these similar users neighbors. If a user v is similar to

a user u, we say that v is a neighbor of u. User-based

algorithms generate a prediction for a track track

analyzing ratings for track

from users in u’s neigh-

borhood.

In order to draw the distance (or similarity) be-

tween two users, the Pearson correlation coefﬁcient is

usually employed (Resnick et al., 1994):

sim(u, v) =

∑

s∈S

u,v

u,s

− r

)(r

v,s

− r

)

∑

s∈S

u,v

u,s

− r

)

∑

s∈S

u,v

v,s

− r

)

(1)

where S

u,v

denotes the set of co-rated items between

u and v, r

u,s

is the rating of the user u for the item s,

and r

is the average of the ratings of the user u.

Pearson correlation ranges from 1.0 for users with

perfect agreement to -1.0 for users with perfect dis-

agreement. In this way, it is possible to generate a

prediction of rating for the user u and the item s as

follows:

pred(u, s) = r

∑

v∈NNu

sim(u, v)(r

v,s

− r

)

∑

v∈NNu

sim(u, v)

(2)

where NNu is the set of users in the u’s neighborhood.

The proposed recommendation approach is en-

hanced considering a user similarity function that

analyzes contextual factors that are included in the

data collected during the normal usage of the recom-

mender system. In particular, the timestamp associ-

ated to playbacks.

In our recommender we employ Discrete wavelet

transforms (DWT). The basic principles of wavelet

theory were put forth in a paper by Gabor in 1945

(Gabor, 1946). In comparison with the Fourier trans-

form, wavelets are localized in both time (or location)

and frequency instead of just frequency. A wavelet

is a function used to represent a time signal into dif-

ferent scale components. Usually one can assign a

frequency range to each scale component. Each scale

component can then be studied with a resolution that

matches its scale. The DWT is computed by suc-

cessive lowpass and highpass ﬁltering of the discrete

time-domain signal as shown in Fig. 1. This is called

the Mallat pyramid algorithm, a computationally efﬁ-

cient method of implementing the wavelet transform.

WEBIST2012-8thInternationalConferenceonWebInformationSystemsandTechnologies

400

Figure 1: At each decomposition level, the half band ﬁlters

produce signals spanning only half the frequency band.

The input signal is assumed to be a set of discrete-

time samples, i.e., a sequence x[n], where n is an inte-

ger. Whereas the basis function of the Fourier trans-

form is a sinusoid, the wavelet basis is a set of func-

tions. In our approach we decide to employ the popu-

lar Haar wavelets.

For an input represented by a list of 2n numbers,

the Haar wavelet transform may be considered to sim-

ply pair up input values, storing the difference and

passing the sum. This process is repeated recursively,

pairing up the sums to provide the next scale: ﬁnally

resulting in 2n− 1 differences and one ﬁnal sum. In

essence, the obtained decomposition can be thought

of as representing a frequency decomposition of the

input.

The algorithm to compute the similarity between

two users is based on the comparison of the Wavelet

transforms obtained by the two signals related to the

listening behavior of the users. In particular, the in-

put dataset is composed of users, songs and tuples

< u

,track

,timestamp > that represent tracks of a

library L listened by users in a given moment. The

signal is built in the following way:

Algorithm 1: Similarity between users u and v.

for all track

∈ L do

u,k

] ← number of times the song track

has been

played by the user u in the time interval t

< t < t

δT

u,k

] ← number of times the song track

has been

played by the user v in the time interval t

< t < t

+δT

end for

for all track

∈ L do

u,k

← discrete Haar Wavelet transform of signal vv

u,k

v,k

← discrete Haar Wavelet transform of signal vv

v,k

end for

sim

u,v

← Euclidean distance between the two vectors w

and w

Two users are considered similar if they listen to

the same songs in the same time of the day. If the two

users listen to the same songs in a given period of time

but this two periods do not coincide, e.g., the user u

played some songs in January and v played the same

songs in March, traditional comparison metrics return

low similarity between the two users.

The Euclidean distance between the coefﬁcients

of the two wavelets allows us to ignore potential time

shifts between listening behaviors. Moreover, it takes

into account the frequency of items, i.e., the times

a song has been played in a given period. In this

way, we are able to recognize similarities between

user habits analyzing different scales or approxima-

tions of the input signals produced by the wavelet tree

decomposition. This is a well-known characteristic of

the wavelet theory that is exploit several times in the

Content-based image retrieval domain.

4 HOW TO EVALUATE

We are currently devising a testbed that include

enough real usage data extracted by a public domain

datasets, i.e., Last.fm Dataset - 1k

, in order to com-

pare the performance with traditional recommender

approaches by means of standard evaluation mea-

sures. That dataset contains tuples in the following

form:

< user, timestamp, artist, song > (3)

collected from Last.fm website that correspond to

one song played by a speciﬁc user. There are 992

unique users and more than 19 Million entries, there-

fore enough data to represent user listening habits.

The pair < artist, song > is easily mapped to the track

variable used in the wavelet-based recommender. We

will looking to standard measures of prediction be-

tween the recommended songs and the actual songs

played by the users.

5 CONCLUSIONS

In this paper, we have discussed a recommender ap-

proach based on signals related to the user listening

behavior.

Further work will be done along several research

directions. Some factors that should be included in

the recommendation process are the novelty of songs

and the user authority. New songs have a higher po-

tential of being interesting than old songs. Moreover,

some collaborative approaches have tried to diversify

the ratings from users, identifying more authoritative

last.fm

WAVELET-BASEDMUSICRECOMMENDATION

401

users that should be taken more into consideration

when predictions have to be suggested. Serendipity

and negative preferences are further factors that mu-

sic recommenders should include in their predictive

analysis (Iaquinta et al., 2008; Musto et al., 2011).

REFERENCES

Acampora, G., Gaeta, M., and Loia, V. (2010a). Exploring

e-learning knowledge through ontological memetic

agents. Comp. Intell. Mag., 5:66–77.

Acampora, G., Gaeta, M., Loia, V., and Vasilakos, A. V.

(2010b). Interoperable and adaptive fuzzy services for

ambient intelligence applications. ACM Trans. Auton.

Adapt. Syst., 5:8:1–8:26.

Biancalana, C., Flamini, A., Gasparetti, F., Micarelli, A.,

Millevolte, S., and Sansonetti, G. (2011a). Enhanc-

ing traditional local search recommendations with

context-awareness. In Konstan, J. A., Conejo, R.,

Marzo, J. L., and Oliver, N., editors, User Model-

ing, Adaption and Personalization - 19th International

Conference, UMAP 2011, Girona, Spain, July 11-15,

2011. Proceedings, volume 6787 of Lecture Notes in

Computer Science, pages 335–340. Springer.

Biancalana, C., Gasparetti, F., Micarelli, A., Miola, A., and

Sansonetti, G. (2011b). Context-aware movie recom-

mendation based on signal processing and machine

learning. In Proceedings of the 2nd Challenge on

Context-Aware Movie Recommendation, CAMRa ’11,

pages 5–10, New York, NY, USA. ACM.

Breese, J. S., Heckerman, D., and Kadie, C. M. (1998). Em-

pirical analysis of predictive algorithms for collabora-

tive ﬁltering. In Cooper, G. F. and Moral, S., editors,

Proceedings of the 14

Conference on Uncertainty in

Artiﬁcial Intelligence, pages 43–52.

Celma, . (2010). Music Recommendation and Discovery -

The Long Tail, Long Fail, and Long Play in the Digital

Music Space. Springer.

Gabor, D. (1946). Theory of communication. J. Inst. Elect.

Eng., 93:429–457.

Gaeta, A., Gaeta, M., and Ritrovato, P. (2009). A grid based

software architecture for delivery of adaptive and per-

sonalised learning experiences. Personal Ubiquitous

Comput., 13:207–217.

Gasparetti, F. and Micarelli, A. (2007). Personalized search

based on a memory retrieval theory. International

Journal of Pattern Recognition and Artiﬁcial Intel-

ligence (IJPRAI): Special Issue on Personalization

Techniques for Recommender Systems and Intelligent

User Interfaces, 21(2):207–224.

Groh, G. and Ehmig, C. (2007). Recommendations in taste

related domains: collaborative ﬁltering vs. social ﬁl-

tering. In GROUP ’07: Proceedings of the 2007 inter-

national ACM conference on Supporting group work,

pages 127–136, New York, NY, USA. ACM.

Iaquinta, L., de Gemmis, M., Lops, P., Semeraro, G., Fi-

lannino, M., and Molino, P. (2008). Introducing

serendipity in a content-based recommender system.

In Xhafa, F., Herrera, F., Abraham, A., K¨oppen, M.,

and Ben´ıtez, J. M., editors, International Conference

on Hybrid Intelligent Systems (HIS 2008, pages 168–

173. IEEE Computer Society.

Micarelli, A., Gasparetti, F., and Biancalana, C. (2006).

Intelligent search on the internet. In Stock, O. and

Schaerf, M., editors, Reasoning, Action and Inter-

action in AI Theories and Systems, volume 4155 of

Lecture Notes in Computer Science, pages 247–264.

Springer.

Micarelli, A., Sciarrone, F., and Gasparetti, F. (2009). A

case-based approach to adaptive hypermedia naviga-

tion. IJWLTT, 4(1):35–53.

Musto, C., Semeraro, G., Lops, P., and de Gemmis, M.

(2011). Random indexing and negative user prefer-

ences for enhancing content-based recommender sys-

tems. In Huemer, C. and Setzer, T., editors, E-

Commerce and Web Technologies, volume 85 of Lec-

ture Notes in Business Information Processing, pages

270–281. Springer.

Resnick, P., Iacovou, N., Suchak, M., Bergstrom, P., and

Riedl, J. (1994). Grouplens: an open architecture

for collaborative ﬁltering of netnews. In Proceedings

of the 1994 ACM conference on Computer supported

cooperative work, CSCW ’94, pages 175–186, New

York, NY, USA. ACM.

Schafer, J. B., Frankowski, D., Herlocker, J., and Sen, S.

(2007). Collaborative ﬁltering recommender systems.

In Brusilovsky, P., Kobsa, A., and Nejdl, W., edi-

tors, The Adaptive Web: Methods and Strategies of

Web Personalization, volume 4321 of Lecture Notes

in Computer Science, pages 291–324. Springer Berlin,

Heidelberg, Berlin, Heidelberg, and New York.

Semeraro, G., Lops, P., Basile, P., and de Gemmis, M.

(2009). Knowledge infusion into content-based rec-

ommender systems. In Bergman, L. D., Tuzhilin, A.,

Burke, R. D., Felfernig, A., and Schmidt-Thieme, L.,

editors, Proceedings of the 2009 ACM Conference on

Recommender Systems, RecSys 2009, pages 301–304.

ACM.

Shardanand, U. and Maes, P. (1995). Social information ﬁl-

tering: algorithms for automating word of mouth. In

Proceedings of the SIGCHI conference on Human fac-

tors in computing systems, CHI ’95, pages 210–217,

New York, NY, USA. ACM Press/Addison-Wesley

Publishing Co.

WEBIST2012-8thInternationalConferenceonWebInformationSystemsandTechnologies

402