TOURISM-KM

A Variant of MMKP Applied to the Tourism Domain

Romain Picot-Clémente, Florence Mendes, Christophe Cruz and Christophe Nicolle

Laboratoire Le2i, UFR Sciences et Techniques, Université de Bourgogne, B.P. 47870, 21078 Dijon, France

Keywords:

Knowledge management, Simulated annealing, Knapsack problem, Decision support.

Abstract:

We address an original real-world problem from the tourism field. We describe a model of the problem and propose a first approach that combines knowledge management and operational research methods. Our algorithms have been implemented to produce tourism solutions that are not unique for a given request but take the preferences of the tourist into account and provide a personalized solution. We report computational results obtained on real-world instances.

1 INTRODUCTION

The tourism sector has changed considerably in recent years, thanks to innovative services accessible via the Internet and smartphones. The tourism offer is so abundant that customers may feel lost if the information is not clearly ordered. We present a method that helps tourists plan their stay, using knowledge management techniques to take into account their preferences and their particular circumstances (travelling as a couple, with children, with a dog, etc.). Our goal is to suggest an offer that best matches the expectations of the customer. To this end, we simultaneously take into account the adequacy between the tourism offer and the customer's specifications, the interest of the customer in this offer, and the geographical distance between the different elements of the offer.

In this paper, we present the resolution of a hard holiday scheduling problem by combining a metaheuristic based on simulated annealing with a semantic modelling of tourism knowledge focused on users. This work is part of the CHECKSEM project (CHECKSEM, 2011), which aims to provide a comprehensive set of knowledge management methods and tools focused on the user. Some of these methods have already been applied successfully in several domains such as civil engineering (Cruz and Nicolle, 2006) or archaeology (Karmacharya et al., 2009).

Our paper is organized as follows. In section 2, we briefly present the concepts related to the domain models, the definition of a valid solution to the best-combination problem, and how the quality of a solution is measured. In section 3, we present a stochastic algorithm based on simulated annealing. Finally, we discuss the effectiveness of our approach and opportunities for further developments.

2 PROBLEM DEFINITION

The proposed system consists of four layers: a goals model, a domain model, a user model and an adaptation model. The user model represents the user in the system; it consists of the goals of the user for his future stay. The domain model is a domain ontology in which all the tourism knowledge useful to the application is defined. The goals model contains all possible goals of users; these goals are linked to the domain model by first-order rules. This layered modelling is detailed in (Picot-Clemente et al., 2010). In the following paragraphs, we define only the concepts that are necessary to introduce our optimization problem.

2.1 Items and Weights

A tourist is represented in our system by a user profile, composed of goals from the goals model and defining the objectives of the associated user. User profiles are stored in a users model. In the following, we consider only one user, assuming that the process that produces a solution can be run independently for each user of the system.

Let Ty be the set of possible activity types in the system. For example, restaurant, museum, and hotel are standard activity types.


Picot-Clémente R., Mendes F., Cruz C. and Nicolle C..

TOURISM-KM - A Variant of MMKP Applied to the Tourism Domain.

DOI: 10.5220/0003758704210426

In Proceedings of the 1st International Conference on Operations Research and Enterprise Systems (ICORES-2012), pages 421-426

ISBN: 978-989-8425-97-3

© 2012 SCITEPRESS (Science and Technology Publications, Lda.)

Let I be the set of all the elements that can be sup-

plied by the system in the ﬁnal recommendation. An

item i ∈ I is deﬁned by a vector composed of a name

i.name, a type i.type ∈ Ty, and geographical coordi-

nates i.x and i.y.

$$ i = \langle i.name,\ i.type,\ i.x,\ i.y \rangle \tag{1} $$

The domain model represents the domain knowl-

edge, using an ontology composed of concepts, re-

lations between concepts, and individuals. Valid

items are integrated into the ontology as individuals.

For convenience in this paper we reduce the domain

model to the set I of valid items:

$$ I = \{ i_1, \ldots, i_{nbi} \} \tag{2} $$

The weight of an item i ∈ I, denoted $w_i$, represents the interest of this item for the user. The weights are computed by propagating weights through the ontological domain model, following the different goals defined by the user (see (Picot-Clemente et al., 2010) for further details).

2.2 Patterns and Combinations

When asking for a holiday proposal, the user chooses

a solution pattern, which deﬁnes the types of items

authorized in a solution.

$$ P = \left( ty_1, ty_2, \ldots, ty_{NbTy} \;\middle|\; \forall\, 1 \le j \le NbTy,\ ty_j \in Ty \right) \tag{3} $$

A valid combination $C_k$ for a pattern P is a set of items from the domain model, chosen under the constraints given by the combination pattern.

$$ C_k = \left( i_1, \ldots, i_{Ni} \;\middle|\; \forall\, 1 \le j \le Ni,\ i_j.type = ty_j,\ ty_j \in P \right), \quad 1 \le k \le NbC \tag{4} $$
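As an illustration of definitions (1), (3) and (4), the following minimal sketch represents items and checks a combination against a pattern. The names `Item` and `is_valid_combination` and the coordinates are hypothetical and chosen only for this example; the position-by-position type check is one possible reading of (4).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    """An item as in Eq. (1): a name, a type from Ty, and coordinates."""
    name: str
    type: str
    x: float
    y: float


def is_valid_combination(combination, pattern):
    """Return True if the j-th item has the j-th type of the pattern.

    Assumption: Eq. (4) is read as a position-by-position type match.
    """
    return len(combination) == len(pattern) and all(
        item.type == ty for item, ty in zip(combination, pattern)
    )


# Hypothetical data: the coordinates below are made up for illustration.
pattern = ("Museum", "Restaurant", "Hotel")
combo = (
    Item("Louvre museum", "Museum", 2.3376, 48.8606),
    Item("Restaurant du Palais", "Restaurant", 2.3400, 48.8640),
    Item("Hotel Jacques", "Hotel", 2.3500, 48.8700),
)
print(is_valid_combination(combo, pattern))  # True
```

A combination with a missing item, or with the items in a different type order, is rejected by this check.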

All valid combinations for P can be ordered from $C_1$ to $C_{NbC}$. The weight of a combination $C_k$, denoted $W_k$, equals the average weight of the items of the combination. It represents the global interest of the user for the combination, without considering the distance between items.

$$ W_k = \frac{1}{|C_k|} \sum_{j=0}^{|C_k|-1} w_{i_j} \tag{5} $$

where the sum runs over the items $i_j$ of $C_k$.

In addition to the main combination pattern, the user has the possibility to add more requirements by defining subcombination patterns. Example: let $p_1$ = (Museum, Hotel) and $p_2$ = (Restaurant, Hotel) be two subcombination patterns corresponding to the main pattern P = (Museum, Restaurant, Hotel). The combination $C_3$ = (Louvre museum, Restaurant du Palais, Hotel Jacques) generates two subcombinations corresponding to the two subpatterns: $C_{3,1}$ = (Louvre museum, Hotel Jacques) and $C_{3,2}$ = (Restaurant du Palais, Hotel Jacques).
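The example above can be reproduced with a short sketch. It assumes that each type occurs only once in the main pattern, so that a subpattern selects items by type; the function name `subcombination` is hypothetical.

```python
def subcombination(combination, subpattern):
    """Select the items of a combination whose types appear in a subpattern.

    Assumption: each type occurs once in the combination, so a type
    identifies exactly one item. `combination` is a tuple of (name, type).
    """
    by_type = {ty: name for name, ty in combination}
    return tuple(by_type[ty] for ty in subpattern)


C3 = (
    ("Louvre museum", "Museum"),
    ("Restaurant du Palais", "Restaurant"),
    ("Hotel Jacques", "Hotel"),
)
p1 = ("Museum", "Hotel")
p2 = ("Restaurant", "Hotel")

print(subcombination(C3, p1))  # ('Louvre museum', 'Hotel Jacques')
print(subcombination(C3, p2))  # ('Restaurant du Palais', 'Hotel Jacques')
```

When a pattern contains several items of the same type, a subpattern would need to identify positions rather than types; that case is not covered by this sketch.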

2.3 Dispersion and Relevance of a

Solution

Finding the best combination of items for a given user requires introducing geographic properties to distinguish and compare solutions. Indeed, the weights of the items alone are not sufficient to evaluate the quality of a combination. In this section we define a method that evaluates the relevance of a combination by taking into account the distance between its items.

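For a given combination $C_{j,u}$, the dispersion $\sigma(C_{j,u})$ equals the standard deviation of the coordinates of the combination items.

$$ \sigma(C_{j,u}) = \sqrt{ \frac{1}{|C|} \sum_{i=0}^{|C|-1} \left[ \left( i.x - \frac{\sum_{i'=0}^{|C|-1} i'.x}{|C|} \right)^{2} + \left( i.y - \frac{\sum_{i'=0}^{|C|-1} i'.y}{|C|} \right)^{2} \right] } \tag{6} $$

where |C| is the number of items in $C_{j,u}$.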
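A minimal sketch of the dispersion computation follows. It assumes that (6) is the usual population standard deviation of the item positions around their centroid (squared deviations in x and y, averaged, then square-rooted); the function name `dispersion` and the sample points are hypothetical.

```python
import math


def dispersion(points):
    """Standard deviation of 2-D positions around their centroid.

    Assumption: Eq. (6) averages the squared x and y deviations over the
    |C| items and takes the square root. `points` is a list of (x, y).
    """
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    squared = sum((x - mean_x) ** 2 + (y - mean_y) ** 2 for x, y in points)
    return math.sqrt(squared / n)


# Two items 10 units apart: each lies 5 units from the centroid (3, 4).
print(dispersion([(0.0, 0.0), (6.0, 8.0)]))  # 5.0
```

Items located at the same place give a dispersion of 0, and the value grows as the items spread out.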

For each combination, the dispersion thus quantifies the geographical distance between the combination items. The same dispersion can be perceived differently by two users, so we define a dispersion tolerance that depends on the user concerned and on the combination pattern. The dispersion tolerance, denoted Tol(P), is a number representing the tolerance of the user in terms of geographical distance between items for pattern P.

The moderated dispersion assigns the same dispersion value to combinations that lie in the same order of distance, according to the dispersion tolerance of the user:

$$ \sigma_{mod}(C_k) = \left\lceil \frac{\sigma(C_k)}{Tol(P)} \right\rceil \tag{7} $$

In this way, two combinations whose dispersion values differ only by a few meters will not be distinguished.
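The effect of the ceiling in (7) can be seen on a small numerical sketch. The function name `moderated_dispersion` and the values (a tolerance of 500 and dispersions of 1200, 1203 and 1600, in the same distance unit) are hypothetical.

```python
import math


def moderated_dispersion(sigma, tolerance):
    """Eq. (7): dispersion expressed in whole multiples of the tolerance."""
    return math.ceil(sigma / tolerance)


# Dispersions that differ by a few units fall in the same class,
# while a clearly larger dispersion falls in the next class.
print(
    moderated_dispersion(1200.0, 500.0),
    moderated_dispersion(1203.0, 500.0),
    moderated_dispersion(1600.0, 500.0),
)  # 3 3 4
```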

As for combinations, each subcombination is associated with a dispersion value and a moderated dispersion value. The relevance of a combination $C_k$ is an aggregate of $W_k$ and the moderated dispersions of $C_k$ and its subcombinations:

$$ \Re(C_k) = \frac{W_k}{\sigma_{mod}(C_k) + \sum_{l=0}^{S-1} \sigma_{mod}(C_{k,l})} \tag{8} $$


where S is the number of subpatterns associated to

the pattern P.
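The following sketch shows one way to aggregate these quantities. It assumes that relevance (8) is the combination weight divided by the sum of the moderated dispersions of the combination and of its S subcombinations, so that a higher weight and a lower dispersion both increase relevance. The function name `relevance` and the sample values are hypothetical.

```python
def relevance(weight, sigma_mod, sub_sigma_mods):
    """Relevance of a combination under the assumed reading of Eq. (8).

    weight          -- average item weight W_k of the combination
    sigma_mod       -- moderated dispersion of the combination
    sub_sigma_mods  -- moderated dispersions of its S subcombinations
    """
    denominator = sigma_mod + sum(sub_sigma_mods)
    if denominator == 0:
        # All items share one location: the ratio is undefined as written.
        raise ValueError("total moderated dispersion is zero")
    return weight / denominator


# Hypothetical values: W_k = 650, sigma_mod = 3, two subcombinations (2, 1).
print(relevance(650.0, 3, [2, 1]))
```

With this reading, halving the total moderated dispersion doubles the relevance, which matches the stated aim of favouring geographically compact combinations of interesting items.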

Definition 1. Tourist Problem: Given a user, a set of items and patterns to compose a valid combination of items, the problem of finding the best combination of items consists in finding, for this user, a combination with maximal relevance:

$$ \text{maximise } \Re(C_k) \quad \text{with } 1 \le k \le NbC \tag{9} $$

2.4 Links with other Optimization

Problems

This problem can be viewed as a Set Packing problem, as in (Avella et al., 2006). The authors propose a simplified variant of the tourist problem, which they call the Intelligent Tourist Problem, and solve it with an LP-based heuristic. Their model includes items and item weights that depend on user preferences, but considers only one type of item (tourism activities). It therefore does not follow any combination pattern; instead, an item is added to a combination if the time period required by the activity matches a free time period of the tourist.

Our problem has similarities with a combinatorial optimization problem called the Knapsack Problem (Karp, 1972). The knapsack problem derives its name from the problem faced by someone who is constrained by a fixed-size knapsack and must fill it with the most useful items. Each item has a weight and a value. The items put in the knapsack must maximise the total value without exceeding a given maximum weight. A common formulation of this problem, in which each item is distinct from the others, is the 0-1 Knapsack Problem:

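$$ \text{maximise } \sum_{i=1}^{n} v_i x_i \tag{10} $$

$$ \text{subject to } \sum_{i=1}^{n} w_i x_i \le W, \quad x_i \in \{0, 1\} \tag{11} $$

where n is the number of items, and $v_i$ and $w_i$ are respectively the value and the weight of item i. W is the maximum weight of the knapsack. $x_i$ equals 1 if item i is put into the knapsack, 0 otherwise.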
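For small instances with integer weights, formulation (10)-(11) can be solved exactly by the classical dynamic programming recurrence. The sketch below is given only to make the formulation concrete; it is not the method used for the tourist problem, and the function name `knapsack_01` and the sample data are hypothetical.

```python
def knapsack_01(values, weights, capacity):
    """Maximum total value of a 0-1 knapsack (integer weights assumed).

    best[c] holds the best value reachable with total weight at most c.
    Capacities are scanned downwards so each item is used at most once.
    """
    best = [0] * (capacity + 1)
    for value, weight in zip(values, weights):
        for c in range(capacity, weight - 1, -1):
            best[c] = max(best[c], best[c - weight] + value)
    return best[capacity]


# Three items, capacity 50: the best choice is the second and third items.
print(knapsack_01([60, 100, 120], [10, 20, 30], 50))  # 220
```

The running time is proportional to the number of items times the capacity, which is why this exact approach does not scale to the large, multi-constraint variants discussed next.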

This problem has been proven to be NP-complete (Lagoudakis, 1996). Several variants of the knapsack problem have been studied: the multidimensional knapsack problem, the quadratic knapsack problem, etc. (see (Martello and Toth, 1990) for a survey). Our tourism problem can be viewed as a variant of the multi-choice multi-dimensional knapsack problem (MMKP). In the standard MMKP, items are grouped according to their type and exactly one representative of each type has to be chosen. In our problem, several items of the same type can be chosen, depending on the combination and subcombination patterns. MMKP is NP-hard, so applying an exact method to solve it would not be efficient, especially for a real-time decision-making application (Chen et al., 1999). Recent heuristic approaches use various techniques, such as Local Search (Hifi et al., 2004; Hifi et al., 2006), Tabu Search and Ant Colony Optimization (Lau and Lim, 2004), or reductions of the search space (Akbar et al., 2006).

In the next part, we propose a simulated annealing

algorithm to ﬁnd a good solution in reasonable time.

3 SIMULATED ANNEALING

ALGORITHM

In order to find the best combination of items from the domain model for a given user, we use a stochastic algorithm called simulated annealing (Kirkpatrick et al., 1983), which is inspired by the annealing process used in metallurgy: a material is heated and then cooled slowly so that it settles into a low-energy state.

3.1 Algorithm Description

The simulated annealing algorithm used here is based on this principle. At the beginning, the algorithm chooses an initial random combination of individuals following a given combination pattern (for instance, a combination consisting of a hotel, two restaurants and two activities). This combination has an energy $\varepsilon_0$, called the initial energy. The energy represents the quality of a combination and is based on its relevance:

$$ \varepsilon(C_k) = \frac{1}{\Re(C_k)} \tag{12} $$

The lower the energy, the better the combination. A variable T, called temperature, decreases in steps over time. At each temperature level, a certain number of elementary random changes are tested on the current combination. A cost df is associated with each modification; it is defined as the difference between the energy of the combination after the modification and its energy before. A negative cost means that the modified combination has a lower energy than the previous one (and is thus better by definition); it is then kept. Conversely, a positive cost represents a bad


change. Nevertheless, it can be kept with a given probability (the acceptance rate) that depends on the current temperature level and on the cost. The higher the temperature, the higher this probability. Thus, over time, the number of accepted changes decreases as the temperature decreases, until no change is accepted any longer. The system is then said to be frozen, and the current combination becomes the final combination presented to the user. The acceptance rate is defined in (13), where $T_k$ is the temperature at level k, k ∈ N.

$$ \text{acceptance rate} = e^{-\frac{df}{T_k}} \tag{13} $$

The temperature decrease is achieved through a

geometric decay at each level:

$$ T_k = g(T_{k-1}) = coef \cdot T_{k-1} = coef^{k} \cdot T_0 \tag{14} $$

where k is the current level and 0 < coef < 1.
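The two formulas can be explored numerically with the short sketch below. It assumes the usual Metropolis form of the acceptance rate, exp(-df / T_k), for (13); the function names and the sample values (T_0 = 100, coef = 0.6, df = 5) are hypothetical.

```python
import math


def acceptance_rate(df, temperature):
    """Probability of keeping a modification of positive cost df (Eq. 13)."""
    return math.exp(-df / temperature)


def temperature_at(level, t0, coef):
    """Temperature at a given level under geometric decay (Eq. 14)."""
    return (coef ** level) * t0


# The same bad move (df = 5) becomes less and less likely to be accepted
# as the temperature decays from T_0 = 100 with coef = 0.6.
for level in range(0, 10, 3):
    t = temperature_at(level, 100.0, 0.6)
    print(level, round(t, 3), round(acceptance_rate(5.0, t), 4))
```

A coefficient closer to 1 flattens this decay and leaves more time for exploration at each temperature range, at the price of a longer run.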

Algorithm 1: Simulated annealing algorithm.

Input: I items and their weights for the current user, combination patterns
Output: a valid combination with minimal energy

C_0 = initial valid solution with energy ε_0
T = T_0
k = 0
while the system is not frozen do
    k++
    Obtain C_k by an elementary transformation of C_{k−1}
    Compute energy ε_k
    df = ε_k − ε_{k−1}
    if df > 0 then
        use an acceptance rate to randomly accept or refuse the modified solution as current solution;
    else
        the modified solution is accepted as current solution;
    end if
    the temperature is lowered;
end while
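The loop of Algorithm 1 can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the implementation used in the experiments: the acceptance rate is taken as exp(-df / T), the temperature is lowered after a fixed number of trials per level (as described in section 3.2), the system is considered frozen after `patience` consecutive refusals, and the toy energy and neighbourhood functions are hypothetical stand-ins for the energy (12) and for an elementary transformation of a combination.

```python
import math
import random


def simulated_annealing(initial, energy, neighbour, t0, coef=0.6,
                        iterations_per_level=2000, patience=2000, seed=0):
    """Generic simulated annealing loop (sketch, assumptions in lead-in)."""
    rng = random.Random(seed)
    current, current_energy = initial, energy(initial)
    temperature = t0
    refused_in_a_row = 0
    while refused_in_a_row < patience:           # system not frozen
        for _ in range(iterations_per_level):
            candidate = neighbour(current, rng)   # elementary transformation
            candidate_energy = energy(candidate)
            df = candidate_energy - current_energy
            if df <= 0 or rng.random() < math.exp(-df / temperature):
                current, current_energy = candidate, candidate_energy
                refused_in_a_row = 0
            else:
                refused_in_a_row += 1
                if refused_in_a_row >= patience:
                    break
        # Geometric decay; the floor is only a numerical guard (assumption).
        temperature = max(temperature * coef, 1e-12)
    return current, current_energy


# Hypothetical toy problem: minimise (x - 7)^2 over the integers.
def toy_energy(x):
    return (x - 7) ** 2


def toy_neighbour(x, rng):
    return x + rng.choice((-1, 1))


result = simulated_annealing(50, toy_energy, toy_neighbour, t0=10.0,
                             iterations_per_level=50, patience=200)
print(result)
```

For the tourist problem, `neighbour` would typically replace one item of the combination by another item of the same type, so that every candidate remains valid for the pattern.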

3.2 Algorithm Parameters

The arbitrary choice of its various parameters is the main disadvantage of the simulated annealing algorithm. First, we have to set the initial temperature $T_0$. If $T_0$ is too high, the first modifications will all be accepted regardless of the quality of the solutions, which is a waste of time. Conversely, if $T_0$ is too small, the exploration of the solution space will be too limited. We have to find an intermediate value. One way to find this value consists in generating a number of cost-increasing modifications and computing their mean cost $df_{mean}$. An initial acceptance rate is then chosen (we use 0.9 in our experiments), and the value of $T_0$ is computed by inverting (13):

$$ T_0 = \frac{-df_{mean}}{\ln(\text{acceptance rate})} \tag{15} $$
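A minimal sketch of this calibration is given below. It assumes that (15) is obtained by solving exp(-df_mean / T_0) = acceptance rate for T_0; the function name `initial_temperature` and the sample costs are hypothetical.

```python
import math


def initial_temperature(positive_costs, acceptance_rate=0.9):
    """Initial temperature that accepts an average bad move at the given rate.

    Assumption: T_0 = -df_mean / ln(acceptance_rate), i.e. the inverse of
    the acceptance formula exp(-df / T) evaluated at the mean cost.
    """
    df_mean = sum(positive_costs) / len(positive_costs)
    return -df_mean / math.log(acceptance_rate)


# Hypothetical sample of costs measured on random cost-increasing moves.
t0 = initial_temperature([10.0, 20.0, 30.0], acceptance_rate=0.9)
print(round(t0, 2))
# Sanity check: a move of mean cost 20 is accepted with probability 0.9.
print(round(math.exp(-20.0 / t0), 6))
```

Because ln(0.9) is small in magnitude, the resulting temperature is roughly ten times the mean cost, so almost every early modification is accepted.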

Then we have to choose the temperature decrease coefficient and the number of iterations at each step. The decrease coefficient lies strictly between 0 and 1: the higher the coefficient, the slower the decrease. We chose a value of 0.6, which gives good results in reasonable time. The number of iterations at each step determines the number of changes attempted at each temperature step. When this number is reached, we consider that the system is in statistical equilibrium and we lower the temperature to the next step. In our experiments, this number was fixed at 2000. Finally, the stopping criterion is important. We say that the system is frozen when no more changes are accepted. In practice, some researchers choose to stop the algorithm when the system reaches a fixed temperature or when a maximal number of steps has been exceeded. We choose to stop the execution when no change has been accepted during a fixed number of consecutive modifications (2000).

3.3 Experimental Results

In order to justify the use of this simulated annealing algorithm instead of a simpler algorithm such as Hill-Climbing, benchmarks were run on real and random datasets, using the simulated annealing algorithm described in the previous section and a Hill-Climbing algorithm. The Hill-Climbing algorithm used is presented in Algorithm 2.

Algorithm 2: Hill-Climbing algorithm.

Input: I items and their weights for the current user, combination patterns
Output: a valid combination with minimal energy

R = initial valid solution randomly chosen
repeat
    Obtain S by an elementary transformation of R
    if ε_S < ε_R then
        R = S
    end if
until X transformations without a change or the time limit is reached
return R
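Algorithm 2 can be sketched in Python as follows. The sketch keeps only the "X transformations without a change" stopping rule (the time limit is omitted), and the toy landscape and function names are hypothetical. The example is chosen so that the search gets trapped in a local minimum, which is the behaviour that simulated annealing is meant to avoid.

```python
import random


def hill_climbing(initial, energy, neighbour, max_stale=200, seed=0):
    """Accept only strictly improving transformations (sketch).

    Stops after `max_stale` consecutive transformations without a change;
    the time limit of Algorithm 2 is left out of this sketch.
    """
    rng = random.Random(seed)
    current, current_energy = initial, energy(initial)
    stale = 0
    while stale < max_stale:
        candidate = neighbour(current, rng)
        candidate_energy = energy(candidate)
        if candidate_energy < current_energy:
            current, current_energy = candidate, candidate_energy
            stale = 0
        else:
            stale += 1
    return current, current_energy


# Hypothetical landscape: index 1 is a local minimum (energy 3),
# index 5 is the global minimum (energy 0).
LANDSCAPE = [5, 3, 4, 6, 2, 0, 1]


def toy_energy(index):
    return LANDSCAPE[index]


def toy_neighbour(index, rng):
    step = rng.choice((-1, 1))
    return min(max(index + step, 0), len(LANDSCAPE) - 1)


print(hill_climbing(0, toy_energy, toy_neighbour))  # (1, 3)
```

Starting from index 0, the search moves to index 1 and stops there, because both neighbours have a higher energy, even though a better solution exists further away.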


The real dataset includes 3724 items, composed of activities, 1008 restaurants and 727 accommodations. The random dataset includes 30000 items composed of 10000 activities, 10000 restaurants and 10000 accommodations. The item coordinates are defined randomly within the bounding rectangle of a French department. The item weights are also defined randomly, between 0 and 1000.

Table 1 compares the results obtained by the Simulated Annealing algorithm (SA) with those obtained by the Hill-Climbing algorithm (HC). It shows the average energies and the average computation times obtained on the real and the random datasets. These averages are computed over 100 runs of each algorithm.

Table 1: Simulated annealing vs. Hill-Climbing.

        Random Dataset             Real Dataset
        Energy ∗10    Time (ms)    Energy ∗10    Time (ms)
SA      199.56        239          169.05        285
HC      812.26        8            310.31        9

The real tourism application requires combinations to be generated in near real-time. This time constraint corresponds to an arbitrary tolerance of 500 ms. Since the computation times of both methods are below this threshold, both are adequate in this respect. However, Table 1 shows that the average energy of the combinations produced by simulated annealing is better (lower) than that of the combinations produced by Hill-Climbing: 199.56 versus 812.26 on the random dataset and 169.05 versus 310.31 on the real dataset. This difference in relevance is reflected in the propositions made to the user. This is why the use of the simulated annealing algorithm is justified for the targeted near real-time application.

4 CONCLUSIONS AND FUTURE

WORK

This paper presented the resolution of a hard holiday scheduling problem by combining a metaheuristic based on simulated annealing with a semantic modelling of tourism knowledge focused on users. To find the best combination of items from the domain model for a given user, we used a simulated annealing algorithm. This work is part of a more generic project that aims to build a tourism recommender system. It combines Semantic Web technologies (mainly ontologies), the model of adaptive hypermedia systems, and combinatorial algorithms in order to provide recommendations. A recommendation is a combination of items formed according to a semantic pattern defined with the help of a domain ontology. This research project was developed in cooperation with a French tourism company called Côte d'Or Tourisme. Since June 2011, a smartphone application has been available free of charge on the Apple store and the Android market. We are now working to improve our results by using a multi-objective approach. One of the main difficulties of this improvement will be to obtain workable results despite the very short execution time allowed for the smartphone application.

REFERENCES

Akbar, M., Rahman, M., Kaykobad, M., Manning, E.,

and Shoja, G. (2006). Solving the multidimensional

multiple-choice knapsack problem by constructing

convex hulls. Computers and Operations Research,

33(5):1259–1273.

Avella, P., D'Auria, B., and Salerno, S. (2006). A LP-based heuristic for a time-constrained routing problem. European Journal of Operational Research.

CHECKSEM (2011). http://www.checksem.fr.

Chen, L., Khan, S., Li, K., and Manning, E. (1999).

Building an adaptive multimedia system using the

utility model. Lecture Notes in Computer Science,

1586:289–298.

Cruz, C. and Nicolle, C. (2006). Active3D: Vector of collaboration, between sharing and data exchange. INFOCOMP, Journal of Computer Science, 5(3):1–8.

Hiﬁ, M., Michrafy, M., and Sbihi, A. (2004). Heuristic

algorithms for the multiple-choice multidimensional

knapsack problem. Journal of Operational Research

Society, 55(12):1323–1332.

Hiﬁ, M., Michrafy, M., and Sbihi, A. (2006). A reactive

local search-based algorithm for the multiple-choice

multi-dimensional knapsack problem. Computational

Optimization and Applications, 33:271–285.

Karmacharya, A., Cruz, C., Boochs, F., and Marzani, F. (2009). ArchaeoKM: toward a better archaeological spatial datasets. In Computer Applications and Quantitative Methods in Archaeology (CAA), Williamsburg, Virginia, USA.

Karp, R. M. (1972). Reducibility among combinatorial problems. Complexity of Computer Computations, pages 85–103.

Kirkpatrick, S., Gelatt, C., and Vecchi, M. (1983). Opti-

mization by simulated annealing. Science.

Lagoudakis, M. (1996). The 0-1 knapsack problem: An introductory survey. Technical report, The Center for Advanced Computer Studies, University of Southwestern Louisiana.

Lau, H. and Lim, M. (2004). Multi-period multi-dimensional knapsack problem and its applications to available-to-promise. In Proceedings of the International Symposium on Scheduling (ISS), Hyogo, Japan, pages 94–99.

Martello, S. and Toth, P. (1990). Knapsack problems: algo-

rithms and computer implementations. John Wiley &

Sons, Inc., New York, NY, USA.

Picot-Clemente, R., Cruz, C., and Nicolle, C. (2010). A se-

mantic based recommender system using a simulated

annealing algorithm. In Fourth International Confer-

ence on Advances in Semantic Processing.
