A UNIFYING VIEW OF CONTEXTUAL ADVERTISING AND RECOMMENDER SYSTEMS

Giuliano Armano and Eloisa Vargiu

Dept. of Electrical and Electronic Engineering, University of Cagliari, Cagliari, Italy

Keywords:

Contextual advertising, Recommender systems, Information Retrieval.

Abstract:

From a general perspective, contextual advertising can be viewed as a kind of Web recommendation, aimed at embedding into a Web page the most relevant textual ads available for it. In fact, the task of suggesting an advertisement is a particular case of recommending an item (the advertisement) to a user (the web page), and vice versa. We envision that bringing ideas from contextual advertising could help in building novel recommender systems with improved performance, and vice versa. To this end, we propose a unifying view of contextual advertising and recommender systems. In particular, we suggest: (i) a way to build a recommender system inspired by a generic solution typically adopted to solve contextual advertising tasks and (ii) a way to realize a collaborative contextual advertising system a la mode of collaborative filtering.

1 INTRODUCTION

Let us note in advance that, in the literature, the term “context” refers to “keywords used in search engines” in the area of contextual advertising, and to “events which modify the user behavior” in the area of recommender systems. In this paper we always adhere to the former interpretation. Therefore, we are not interested in context-aware recommender systems as in (Adomavicius and Tuzhilin, 2008), (Abbar et al., 2009), and (Ramaswamy et al., 2009).

As discussed in (Broder et al., 2007), contextual advertising is an interplay of four players: (i) the advertiser, which provides the supply of ads; (ii) the publisher, which is the owner of the web pages on which the advertising is displayed; (iii) the ad network, which, as mediator between advertiser and publisher, is in charge of selecting the ads to put in the pages; and (iv) users, who visit the web pages of the publisher and interact with the ads. Similarly, a recommendation task may be described as an interplay of four players: (i) the recommender, which provides the supply of items to be recommended; (ii) the publisher, which is the owner of the web pages on which items are displayed for recommendation; (iii) the recommender system, which, as a mediator between recommender and publisher, is in charge of selecting the items to be recommended to a specific user; and (iv) users, who visit the web pages of the publisher/recommender and interact with the suggested items.

Figure 1: A contextual advertising system.

Although contextual advertising and recommender systems have usually been studied separately, they could be regarded as isomorphic structures, in which a task on one side corresponds to a task on the other side. For instance, the task of suggesting an advertisement to a web page could be viewed as the task of recommending an item (the advertisement) to a user (the web page), and vice versa.

Starting from this insight, in this paper we propose a unifying view presenting two novel approaches: a content-based recommender system devised a la mode of a contextual advertising system, and a collaborative contextual advertising system devised a la mode of a recommender system. To the best of our knowledge, this is the first attempt to combine the two approaches.

Armano G. and Vargiu E.

A UNIFYING VIEW OF CONTEXTUAL ADVERTISING AND RECOMMENDER SYSTEMS.

DOI: 10.5220/0003097204630466

In Proceedings of the International Conference on Knowledge Discovery and Information Retrieval (KDIR-2010), pages 463-466

ISBN: 978-989-8425-28-7

© 2010 SCITEPRESS (Science and Technology Publications, Lda.)

2 A RECOMMENDER SYSTEM A LA MODE OF CONTEXTUAL ADVERTISING

2.1 A Typical Contextual Advertising System

In our view, a generic system for contextual advertising could be designed as depicted in Figure 1.

Pre-processor. Its main purpose is to transform an HTML document (a web page or an advertisement) into an easy-to-process plain-text document, while preserving important information. In particular, the main goal is to preserve the blocks of the original HTML document, while removing HTML tags and stop-words. Information about which phrases are part of the anchor text of hypertext links could also be preserved.

Text summarizer. Text summarization techniques are divided into extractive and non-extractive. Since the input of a contextual advertising system is an HTML document, such systems typically rely on the former. In particular, extraction-based techniques are applied to the blocks that form a web page, e.g., the title of the web page, the first paragraph, and the paragraph with the highest title-word count. The text summarizer outputs a vector representation of the original HTML document (web page or advertisement) in terms of a bag of words (BoW).

Classifier. To alleviate possible harmful effects of summarization, both page excerpts and advertisements are classified according to a given taxonomy (Anagnostopoulos et al., 2007). The corresponding classification-based features (CF) are then used in conjunction with the original BoW.

Matcher. It suggests ads to the web page according to a similarity score based on both BoW and CF.
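A minimal sketch of how such a matcher might combine the two feature sets is given below. It assumes that both BoW and CF are stored as sparse term-to-weight dictionaries and that the two cosine similarities are mixed through a convex combination; the parameter `alpha`, the dictionary keys `bow` and `cf`, and the function names are illustrative assumptions, not part of the system described above.

```python
import math


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def match_score(page, ad, alpha=0.5):
    """Assumed scoring rule: alpha weights the CF similarity against BoW."""
    cf_sim = cosine(page["cf"], ad["cf"])
    bow_sim = cosine(page["bow"], ad["bow"])
    return alpha * cf_sim + (1.0 - alpha) * bow_sim


def rank_ads(page, ads, alpha=0.5):
    """Return the candidate ads sorted by decreasing match score."""
    return sorted(ads, key=lambda ad: match_score(page, ad, alpha), reverse=True)
```

With `alpha = 0` the matcher degenerates to pure keyword matching, while `alpha = 1` relies on the taxonomy only; a suitable value would have to be tuned experimentally.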

2.2 The Proposed Recommender System

As depicted in Figure 2, our proposal for building a recommender system involves two steps: user profiling and recommendation (Addis et al., 2010). Given a user, her/his profile is generated from the corresponding user history, i.e., from the set of documents s/he rated as relevant.

Figure 2: The proposed recommender system.

User Proﬁling

The user profiler is composed of four main modules: statistical document analyzer, semantic analyzer, semantic net handler, and profiler (Addis et al., 2009).

Statistical document analyzer. While analyzing the documents rated as relevant by the user, this module creates the BoW, which collects all terms contained in the input documents, suitably weighted. The statistical document analyzer removes from the BoW all non-informative words, such as prepositions, conjunctions, pronouns, and very common verbs, using a stop-word list. Subsequently, it calculates the weight of each term by adopting the TFIDF measure, computing an overall TFIDF that considers all documents in the user history. Furthermore, the weights resulting from TFIDF undergo a cosine normalization. To reduce the dimensionality of the space, only the first N terms of the BoW are retained. The optimal value of N must be determined experimentally. Hereinafter, the set of terms stored in the BoW will be called features. This module corresponds to the pre-processor and to the text summarizer adopted in the generic contextual advertising solution described previously.
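The steps above can be sketched as follows. This is only an illustration under several assumptions of ours: whitespace tokenization, a toy stop-word list, an "overall" TFIDF obtained by summing the per-document TFIDF of each term, and "the first N terms" interpreted as the N terms with the highest weight. The function name `build_bow` is hypothetical.

```python
import math
from collections import Counter

# Toy stop-word list (assumption); a real system would use a much longer one.
STOP_WORDS = {"the", "a", "of", "and", "in", "is", "to"}


def build_bow(documents, n_terms):
    """Return the n_terms highest-weighted terms with normalized TFIDF weights."""
    tokenized = [
        [t for t in doc.lower().split() if t not in STOP_WORDS]
        for doc in documents
    ]
    # Document frequency: number of documents in which each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    n_docs = len(tokenized)

    weights = Counter()
    for tokens in tokenized:
        for term, tf in Counter(tokens).items():
            # Assumed aggregation: sum the per-document TFIDF of each term.
            weights[term] += tf * math.log(n_docs / df[term])

    # Cosine normalization: scale the weight vector to unit length.
    norm = math.sqrt(sum(w * w for w in weights.values()))
    if norm > 0.0:
        weights = {t: w / norm for t, w in weights.items()}

    top = sorted(weights.items(), key=lambda kv: kv[1], reverse=True)[:n_terms]
    return dict(top)
```

Note that, with this choice of IDF, a term occurring in every document of the user history receives a zero weight, so other IDF variants may be preferable for very small histories.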

Semantic analyzer. This module creates the bag of synsets (BoS), which collects all synsets related to the selected features. To this end, the semantic analyzer queries an online lexical database (e.g., WordNet (Miller, 1995)). After synset extraction, the semantic analyzer assigns to each synset a weight according to the TFIDF of all related terms. This module corresponds to a text summarizer based on semantic information. In fact, a semantic approach can also be adopted in contextual advertising to improve the performance of the text summarization task.
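As an illustration, the construction of the BoS might look like the sketch below. To keep it self-contained, the lexical database is replaced by a toy dictionary with made-up synset identifiers, and the weight of a synset is assumed to be the sum of the TFIDF weights of its related terms; both choices are ours and are not prescribed by the description above.

```python
from collections import defaultdict

# Toy stand-in for a lexical database lookup (hypothetical synset identifiers).
TOY_LEXICON = {
    "car": ["car.n.01"],
    "automobile": ["car.n.01"],
    "bank": ["bank.n.01", "bank.n.02"],
}


def build_bos(bow, lexicon):
    """Map weighted terms to weighted synsets.

    bow: dict term -> TFIDF weight
    lexicon: dict term -> list of synset identifiers
    """
    bos = defaultdict(float)
    for term, weight in bow.items():
        # Terms unknown to the lexicon contribute to no synset.
        for synset in lexicon.get(term, []):
            # Assumed rule: a synset accumulates the weights of its terms.
            bos[synset] += weight
    return dict(bos)
```

Synonyms such as "car" and "automobile" end up reinforcing the same synset, which is the main benefit of moving from terms to synsets.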

Semantic net handler. This module aims to (i) build the semantic net from the BoS and (ii) extract its most relevant nodes. First, a semantic net is built in the form of a graph, whose nodes are the synsets belonging to the BoS and whose edges are semantic relations between synsets. Four kinds of semantic relations are taken into account: hyponymy and its inverse (hyperonymy), and meronymy and its inverse (holonymy). The semantic net handler is also in charge of pruning the network by dropping irrelevant nodes, identified according to their weight and to the number of connections with other nodes.
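The pruning step could be sketched as follows. The exact pruning rule is not specified above, so this example assumes that a node is kept only if both its weight and its number of connections reach a threshold; the thresholds `min_weight` and `min_degree` and the function name are hypothetical.

```python
def prune_net(weights, edges, min_weight=0.1, min_degree=1):
    """Drop weakly weighted or poorly connected nodes from a semantic net.

    weights: dict synset -> weight
    edges: list of (synset, synset) pairs, one per semantic relation
    """
    # Count the connections of each node, ignoring edges to unknown nodes.
    degree = {node: 0 for node in weights}
    for a, b in edges:
        if a in degree and b in degree:
            degree[a] += 1
            degree[b] += 1

    # Assumed rule: keep a node only if it passes both thresholds.
    kept = {
        node
        for node, weight in weights.items()
        if weight >= min_weight and degree[node] >= min_degree
    }
    kept_edges = [(a, b) for a, b in edges if a in kept and b in kept]
    return kept, kept_edges
```

Other criteria, such as a combined score of weight and degree, would fit the description equally well.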

Profiler. This module extracts the user profile. To this end, it exploits a given taxonomy (e.g., WordNet Domains Hierarchy (Magnini and Cavaglià, 2000)) and associates the proper category to each selected node. Considering the selected nodes, together with their weights, the profiler is able to identify the real interests of the user in terms of the given taxonomy. In particular, the user profile is represented as a set of pairs $\langle c, w \rangle$, where $c$ is a category and $w$ the corresponding weight in [0, 1]. The semantic net handler and the profiler correspond to the classifier adopted in the generic contextual advertising solution previously described.

Recommendation

Once the user profile has been generated, the system can rank a new item $i$ to determine whether it could be of interest for a specific user $u$. This can be done by comparing the vector-based representations of $i$ and $u$, say $V(i)$ and $V(u)$. In particular, the textual information of an item $i$ can be processed in a way similar to profile extraction: a set of categories of the given taxonomy, with the corresponding relevance ratios, is computed for the item, and the cosine similarity between $V(u)$ and $V(i)$ is evaluated. Items obtaining a score greater than 0.5 are proposed to the user. The recommender thus corresponds to the matcher adopted in the generic solution for contextual advertising, previously described.
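The recommendation step can be illustrated with the short sketch below, which assumes that both the user profile and each item are available as category-to-weight dictionaries over the same taxonomy. The 0.5 threshold comes from the text; the function names and the data layout are our own assumptions.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two category-weight dicts."""
    dot = sum(w * v.get(c, 0.0) for c, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def recommend(profile, items, threshold=0.5):
    """Return (item name, score) pairs whose score exceeds the threshold.

    profile: dict category -> weight for the user
    items: dict item name -> dict category -> weight
    """
    scored = [
        (name, cosine_similarity(profile, categories))
        for name, categories in items.items()
    ]
    selected = [(name, score) for name, score in scored if score > threshold]
    return sorted(selected, key=lambda pair: pair[1], reverse=True)
```

For instance, a profile dominated by sport and travel would retain a trekking guide and discard a cookbook, whose categories do not overlap with the profile.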

3 A CONTEXTUAL ADVERTISING SYSTEM A LA MODE OF COLLABORATIVE FILTERING

3.1 A Typical Collaborative Recommender System

In our view, a generic system for collaborative filtering could be designed as depicted in Figure 3.

Figure 3: A collaborative recommender system.

The recommendation problem can be formulated as follows: let $U$ be the set of all users and $I$ be the set of all possible items that can be recommended (e.g., books, movies, and restaurants). Let $f$ be a utility function that measures the usefulness of item $i$ to user $u$, i.e., $f : U \times I \to R$, where $R$ is a totally ordered set (e.g., non-negative integers or real numbers within a given range). Then, for each user $u \in U$, we want to choose the item $i \in I$ that maximizes $f(u, i)$. In recommender systems, $f$ is typically represented by ratings and is initially defined only on the items previously rated by the users. For example, in an application for recommending books (e.g., Amazon.com), users initially rate a subset of books that they have read. This information is stored in the user repository, as sketched in Figure 3.
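In symbols, the selection step just described can be written compactly as follows; the name $i^{*}_{u}$ for the item chosen for user $u$ is a notational assumption of ours.

```latex
\forall u \in U, \qquad i^{*}_{u} = \arg\max_{i \in I} f(u, i)
```

Since $f$ is initially known only for the rated items, the practical problem is to estimate it on the remaining user-item pairs before taking the maximum.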

Peer user extractor. The main purpose of the peer user extractor is, given a user u, to detect her/his “peers”. Peer users are other users that have similar preferences and tastes. The underlying idea is that only the items most liked by the “peers” of user u are recommended to her/him.

Item-user analyzer. This module analyzes the ratings of the peer users and builds the corresponding user-item rating matrix, in which each row corresponds to a user (u and her/his peer users), each column corresponds to an item, and each cell contains the rating given by that user to that item. Ratings are typically specified on a scale of 1 to 5.

Matcher. The main purpose of this module is to find, starting from the user-item rating matrix, the set of items to be recommended to u. Each corresponding score is calculated by taking into account the ratings provided by the peer users.
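The three modules above can be illustrated with a small user-based collaborative filtering sketch. It assumes that peers are the k users with the highest cosine similarity to u, and that the score of an unseen item is the similarity-weighted mean of the peers' ratings; these are common choices, but they are assumptions of ours rather than the only option compatible with the description.

```python
import math


def cosine(r_u, r_v):
    """Cosine similarity between two sparse rating dicts (item -> rating)."""
    dot = sum(r * r_v.get(item, 0.0) for item, r in r_u.items())
    norm_u = math.sqrt(sum(r * r for r in r_u.values()))
    norm_v = math.sqrt(sum(r * r for r in r_v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def find_peers(ratings, user, k=2):
    """Peer user extractor: the k users most similar to the given user."""
    sims = {
        other: cosine(ratings[user], other_ratings)
        for other, other_ratings in ratings.items()
        if other != user
    }
    return sorted(sims.items(), key=lambda kv: kv[1], reverse=True)[:k]


def recommend(ratings, user, k=2):
    """Matcher: score the items rated by the peers but not by the user."""
    numerator, denominator = {}, {}
    for peer, sim in find_peers(ratings, user, k):
        if sim <= 0.0:
            continue
        for item, rating in ratings[peer].items():
            if item in ratings[user]:
                continue
            numerator[item] = numerator.get(item, 0.0) + sim * rating
            denominator[item] = denominator.get(item, 0.0) + sim
    scores = {item: numerator[item] / denominator[item] for item in numerator}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

Here the nested dictionary `ratings` plays the role of the user-item rating matrix, with missing cells simply absent.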

3.2 The Proposed Contextual Advertising System

Our idea of a collaborative contextual advertising system consists in suggesting ads to a web page p by exploiting the “collaboration” of p with its peer pages. Figure 4 depicts the proposed high-level architecture.

Figure 4: The proposed contextual advertising system.

Inlink extractor. This module finds, given a page p, its peer pages. In our opinion, suitable peer pages could be all the inlinks of p, i.e., all pages that link to p. First, this module creates the bag of inlinks (BoI), which collects all the inlinks of a given page. To this end, the Google AJAX Search API or existing tools (such as Page Inlink Analyzer) could be used. This module corresponds to the peer user extractor previously described.

Item-advertising analyzer. First, this module parses all the extracted inlinks and, for each inlink i, extracts the corresponding list of ads, weighting them according to their position in i. Then, the module builds the inlink-advertising matrix, whose generic element $w_{ij}$ reports the weight of advertisement j in inlink i. This module corresponds to the item-user analyzer previously described.
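A possible construction of the inlink-advertising matrix is sketched below. The position-based weighting is not specified above, so the example assumes a reciprocal-rank scheme (1, 1/2, 1/3, ...) in which ads shown earlier weigh more; the function name and the input layout are also assumptions.

```python
def build_inlink_ad_matrix(inlink_ads):
    """Build the inlink-advertising matrix as a nested dict.

    inlink_ads: dict inlink URL -> list of ad identifiers in display order
    Returns: dict inlink -> dict ad -> weight (0.0 when the ad is absent)
    """
    all_ads = sorted({ad for ads in inlink_ads.values() for ad in ads})
    matrix = {}
    for inlink, ads in inlink_ads.items():
        row = {ad: 0.0 for ad in all_ads}
        for position, ad in enumerate(ads):
            # Assumed weighting: reciprocal of the 1-based position.
            # If an ad appears twice, its best position is kept.
            row[ad] = max(row[ad], 1.0 / (position + 1))
        matrix[inlink] = row
    return matrix
```

Each row of the result plays the role that a user's ratings play in the user-item rating matrix of Section 3.1.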

Matcher. This module suggests ads to the web page according to a similarity score. In principle, any similarity measure can be adopted, e.g., correlation-based, cosine-based, or rating-based. This module corresponds to the matcher of the typical collaborative recommender system previously described.
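One way such a matcher might work is sketched below. It assumes that a similarity between the page p and each of its inlinks has already been computed by some measure (the dictionary `peer_similarity` is hypothetical), and that the score of an ad is the similarity-weighted mean of its weights across the inlinks, in analogy with the collaborative matcher of Section 3.1.

```python
def rank_ads(matrix, peer_similarity, top_k=3):
    """Rank ads for a page from the inlink-advertising matrix.

    matrix: dict inlink -> dict ad -> weight
    peer_similarity: dict inlink -> similarity between the page and the inlink
    """
    scores = {}
    for inlink, row in matrix.items():
        sim = peer_similarity.get(inlink, 0.0)
        for ad, weight in row.items():
            scores[ad] = scores.get(ad, 0.0) + sim * weight

    # Normalize by the total similarity so that scores stay in the weight range.
    total = sum(peer_similarity.get(inlink, 0.0) for inlink in matrix)
    if total > 0.0:
        scores = {ad: score / total for ad, score in scores.items()}

    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:top_k]
```

Setting all similarities to the same value reduces the score to a plain average over the inlinks, which may be a reasonable baseline when no page-to-page similarity is available.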

4 CONCLUSIONS AND FUTURE DIRECTIONS

In this paper we proposed a unifying view of contextual advertising and recommender systems. To the best of our knowledge, this is the first attempt to combine these research fields.

As for future directions, we are currently setting up experiments to validate the content-based recommender system illustrated in Section 2. Furthermore, we are starting the implementation of the collaborative contextual advertising system illustrated in Section 3.

ACKNOWLEDGEMENTS

This work has been partially supported by Hoplo srl.

We wish to thank, in particular, Ferdinando Licheri

and Roberto Murgia for their useful suggestions.

Google AJAX Search API: http://code.google.com/apis/ajaxsearch/

Page Inlink Analyzer: http://ericmiraglia.com/inlink/

REFERENCES

Abbar, S., Bouzeghoub, M., and Lopez, S. (2009). Context-aware recommender systems: A service-oriented approach. In 3rd International Workshop on Personalized Access, Profile Management, and Context Awareness in Databases.

Addis, A., Armano, G., Giuliani, A., and Vargiu, E. (2010). A recommender system based on a generic contextual advertising approach. In Proceedings of ISCC’10: IEEE Symposium on Computers and Communications, pages 859–861.

Addis, A., Armano, G., and Vargiu, E. (2009). Profiling users to perform contextual advertising. In Proceedings of the 10th Workshop dagli Oggetti agli Agenti (WOA 2009).

Adomavicius, G. and Tuzhilin, A. (2008). Context-aware recommender systems. In RecSys ’08: Proceedings of the 2008 ACM conference on Recommender systems, pages 335–336, New York, NY, USA. ACM.

Anagnostopoulos, A., Broder, A. Z., Gabrilovich, E., Josifovski, V., and Riedel, L. (2007). Just-in-time contextual advertising. In CIKM ’07: Proceedings of the sixteenth ACM conference on information and knowledge management, pages 331–340, New York, NY, USA. ACM.

Broder, A., Fontoura, M., Josifovski, V., and Riedel, L. (2007). A semantic approach to contextual advertising. In SIGIR ’07: Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval, pages 559–566, New York, NY, USA. ACM.

Magnini, B. and Cavaglià, G. (2000). Integrating subject field codes into WordNet. In Gavrilidou M., Crayannis G., Markantonatu S., Piperidis S. and Stainhaouer G. (Eds.) Proceedings of LREC-2000, Second International Conference on Language Resources and Evaluation, pages 1413–1418.

Miller, G. A. (1995). WordNet: A lexical database for English. Commun. ACM, 38(11):39–41.

Ramaswamy, L., Deepak, P., Polavarapu, R., Gunasekera, K., Garg, D., Visweswariah, K., and Kalyanaraman, S. (2009). Caesar: A context-aware, social recommender system for low-end mobile devices. Mobile Data Management, IEEE International Conference on, 0:338–347.
