TOURISM-KM
A Variant of MMKP Applied to the Tourism Domain
Romain Picot-Clémente, Florence Mendes, Christophe Cruz and Christophe Nicolle
Laboratoire Le2i, UFR Sciences et Techniques, Université de Bourgogne, B.P. 47870, 21078 Dijon, France
Keywords:
Knowledge management, Simulated annealing, Knapsack problem, Decision support.
Abstract:
We are interested in an original real-world problem from the tourism field. We describe a model of the problem and propose a first approach that combines knowledge management and operational research methods. Our algorithms have been implemented to produce tourism solutions that are not unique for a given request, but that take the preferences of the tourist into account and provide a personalized solution. We report computational results obtained on real-world instances.
1 INTRODUCTION
Tourism has changed considerably over the last few years, thanks to innovative services accessible via the internet and smartphones. The tourism offer is so abundant that customers may feel lost if the information is not clearly ordered. We present a method to help tourists plan their stay, which uses knowledge management techniques to take into account their preferences and their specific circumstances (travelling as a couple, with children, with a dog, etc.). Our goal is to suggest an offer that best matches the expectations of the customer. To this end, we take into account simultaneously the adequacy between the tourism offer and the customer's specifications, the customer's interest in this offer, and the geographical distance between the different elements of the offer.
In this paper, we present the resolution of a hard holiday scheduling problem by combining a metaheuristic based on simulated annealing with a semantic modelling of tourism knowledge focused on users. This work takes place within the CHECKSEM project (CHECKSEM, 2011), which aims to provide a comprehensive set of knowledge management methods and tools focused on the user. Some of these methods have already been applied successfully in several domains, such as civil engineering (Cruz and Nicolle, 2006) and archaeology (Karmacharya et al., 2009).
Our paper is organized as follows. In Section 2, we briefly present the concepts related to the domain models, the definition of a valid solution to the best-combination problem, and how to measure the quality of a solution. In Section 3, we present a stochastic algorithm based on simulated annealing. Finally, we discuss the effectiveness of our approach and opportunities for further developments.
2 PROBLEM DEFINITION
The proposed system consists of four layers: a model of goals, a domain model, a user model and an adaptation model. The user model represents the user in the system; it consists of the goals of the user for his future stay. The domain model is a domain ontology in which all the tourism knowledge useful for the application is defined. The goals model contains all possible goals of users; they are linked with the domain model by first-order rules. This layered modelling is detailed in (Picot-Clemente et al., 2010). In the following paragraphs, we define only the concepts that are necessary to introduce our optimization problem.
2.1 Items and Weights
A tourist is represented in our system by a user profile, composed of goals from the goals model and defining the objectives of the associated user. User profiles are stored in a users model. In the following, we consider only one user, assuming that the process that produces a solution can be run independently for each user of the system.
Let $Ty$ be the set of possible activity types in the system. For example, restaurant, museum and hotel are standard activity types.
Picot-Clémente R., Mendes F., Cruz C. and Nicolle C..
TOURISM-KM - A Variant of MMKP Applied to the Tourism Domain.
DOI: 10.5220/0003758704210426
In Proceedings of the 1st International Conference on Operations Research and Enterprise Systems (ICORES-2012), pages 421-426
ISBN: 978-989-8425-97-3
Copyright © 2012 SCITEPRESS (Science and Technology Publications, Lda.)
Let $I$ be the set of all the elements that can be supplied by the system in the final recommendation. An item $i \in I$ is defined by a vector composed of a name $i.name$, a type $i.type \in Ty$, and geographical coordinates $i.x$ and $i.y$:
$$ i = \langle i.name,\; i.type,\; i.x,\; i.y \rangle \tag{1} $$
The domain model represents the domain knowledge, using an ontology composed of concepts, relations between concepts, and individuals. Valid items are integrated into the ontology as individuals. For convenience, in this paper we reduce the domain model to the set $I$ of valid items:
$$ I = \{ i_1, \ldots, i_{nbi} \} \tag{2} $$
The weight of an item $i \in I$, denoted $w_i$, represents the interest of this item for the user. The weights are computed by propagating weights through the ontological domain model, following the different goals defined by the user (see (Picot-Clemente et al., 2010) for further details).
2.2 Patterns and Combinations
When asking for a holiday proposal, the user chooses a solution pattern, which defines the types of items authorized in a solution:
$$ P = (ty_1, ty_2, \ldots, ty_{NbTy} \mid 1 \le j \le NbTy,\; ty_j \in Ty) \tag{3} $$
A valid combination $C_k$ for a pattern $P$ is a set of items from the domain model, chosen under the constraints given by the combination pattern:
$$ C_k = (i_1, i_2, \ldots, i_N), \quad \forall j,\; 1 \le j \le N:\; i_j.type = ty_j,\; ty_j \in P, \quad 1 \le k \le NbC \tag{4} $$
All valid combinations for $P$ can be ordered from $C_1$ to $C_{NbC}$. The weight of a combination $C_k$, denoted $W_{C_k}$, equals the average weight of the items of the combination. It represents the global interest of the user for the combination, without considering the distance between items:
$$ W_{C_k} = \frac{\sum_{j=1}^{N} w_{i_j}}{N} \tag{5} $$
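For illustration, equations (1), (4) and (5) can be sketched in Python as follows. This is a minimal sketch and not the implementation of the system: the class and function names, the coordinates and the weights are hypothetical, and weights are simply looked up by item name.

```python
from dataclasses import dataclass
from typing import Mapping, Sequence


@dataclass(frozen=True)
class Item:
    """An item i = <name, type, x, y>, as in equation (1)."""
    name: str
    type: str
    x: float
    y: float


def is_valid(combination: Sequence[Item], pattern: Sequence[str]) -> bool:
    """Equation (4): the j-th item must have the j-th type of the pattern."""
    return len(combination) == len(pattern) and all(
        item.type == ty for item, ty in zip(combination, pattern)
    )


def combination_weight(combination: Sequence[Item],
                       weights: Mapping[str, float]) -> float:
    """Equation (5): average weight of the items of the combination."""
    return sum(weights[item.name] for item in combination) / len(combination)


# Hypothetical coordinates and weights, for illustration only.
louvre = Item("Louvre museum", "Museum", 2.0, 1.0)
palais = Item("Restaurant du Palais", "Restaurant", 2.0, 3.0)
jacques = Item("Hotel Jacques", "Hotel", 5.0, 1.0)
weights = {"Louvre museum": 900.0,
           "Restaurant du Palais": 600.0,
           "Hotel Jacques": 300.0}

pattern = ("Museum", "Restaurant", "Hotel")
combination = (louvre, palais, jacques)
```

With these assumed values, `is_valid(combination, pattern)` holds and `combination_weight(combination, weights)` is the mean of the three weights.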
In addition to the main combination pattern, the user has the possibility to add more requirements by defining subcombination patterns. Example: let $p_1 = \langle \text{Museum}, \text{Hotel} \rangle$ and $p_2 = \langle \text{Restaurant}, \text{Hotel} \rangle$ be two subcombination patterns, corresponding to the main pattern $P = \langle \text{Museum}, \text{Restaurant}, \text{Hotel} \rangle$. The combination $C_3 = \langle \text{Louvre museum}, \text{Restaurant du Palais}, \text{Hotel Jacques} \rangle$ can generate two subcombinations corresponding to the two subpatterns: $C_{3,1} = \langle \text{Louvre museum}, \text{Hotel Jacques} \rangle$ and $C_{3,2} = \langle \text{Restaurant du Palais}, \text{Hotel Jacques} \rangle$.
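The example above can be reproduced with a short Python sketch. It assumes, as a simplification that the text does not state, that a subcombination keeps every item of the combination whose type appears in the subpattern; the function name and the `(name, type)` representation are hypothetical.

```python
def subcombination(combination, subpattern):
    """Keep the (name, type) items whose type appears in the subpattern.

    Assumption: selection is by type membership, so if a type occurs
    several times in the main pattern, all matching items are kept.
    """
    wanted = set(subpattern)
    return tuple(item for item in combination if item[1] in wanted)


c3 = (
    ("Louvre museum", "Museum"),
    ("Restaurant du Palais", "Restaurant"),
    ("Hotel Jacques", "Hotel"),
)
p1 = ("Museum", "Hotel")
p2 = ("Restaurant", "Hotel")

c3_1 = subcombination(c3, p1)  # museum and hotel
c3_2 = subcombination(c3, p2)  # restaurant and hotel
```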
2.3 Dispersion and Relevance of a Solution
Finding the best combination of items for a given user requires introducing geographic properties to distinguish and compare solutions. Indeed, the weight of the items is not sufficient to evaluate the quality of a combination. In this section we define a method to evaluate the relevance of a combination by taking into account the distance between items.
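For a given combination $C_{j,u}$, the dispersion $\sigma(C_{j,u})$ equals the standard deviation of the coordinates of the combination items. Writing $C$ for $C_{j,u}$:
$$ \sigma(C) = \sqrt{ \frac{ \sum_{i \in C} \left[ \left( i.x - \frac{\sum_{i' \in C} i'.x}{|C|} \right)^2 + \left( i.y - \frac{\sum_{i' \in C} i'.y}{|C|} \right)^2 \right] }{|C|} } \tag{6} $$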
For each combination, the dispersion thus quantifies the geographical distance between the combination items. The same dispersion can be perceived differently by two users, so we define a dispersion tolerance that depends on the user concerned and on the combination pattern. The dispersion tolerance, denoted $Tol(P)$, is a number representing the tolerance of the user in terms of geographical distance between items for pattern $P$.
The moderated dispersion is used to assign the same dispersion value to combinations that are in the same order of distance, according to the dispersion tolerance of the user:
$$ \sigma_{mod}(C_k) = \frac{\sigma(C_k)}{Tol(P)} \tag{7} $$
In this way, two combinations whose dispersion values differ by only a few meters will not be distinguished.
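As an illustration, the dispersion and the moderated dispersion can be computed as in the following Python sketch. It reads equation (6) as the root mean squared distance of the items to their centroid, which is one interpretation of "standard deviation of the coordinates", and it applies equation (7) as a plain division; the function names and the sample points are hypothetical.

```python
import math


def dispersion(points):
    """Root mean squared distance of the (x, y) points to their centroid."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    squared = sum((x - mean_x) ** 2 + (y - mean_y) ** 2 for x, y in points)
    return math.sqrt(squared / n)


def moderated_dispersion(points, tolerance):
    """Dispersion divided by the user's dispersion tolerance Tol(P)."""
    return dispersion(points) / tolerance


# Four hypothetical items placed on the corners of a 4 x 3 rectangle:
# every corner is at distance 2.5 from the centroid (2, 1.5).
corners = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0)]
sigma = dispersion(corners)
sigma_mod = moderated_dispersion(corners, tolerance=5.0)
```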
As for combinations, each subcombination is associated with a dispersion value and a moderated dispersion value. The relevance of a combination $C_k$, denoted $\mathrm{Rel}(C_k)$, is an aggregate of $W_{C_k}$ and of the moderated dispersions of $C_k$ and its subcombinations:
$$ \mathrm{Rel}(C_k) = \frac{W_{C_k}}{\sigma_{mod}(C_k) + \sum_{l=0}^{S-1} \sigma_{mod}(C_{k,l})} \tag{8} $$
where S is the number of subpatterns associated to
the pattern P.
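A small numerical sketch of equation (8) is given below in Python. The function name and the sample values are hypothetical, and returning infinity when the denominator is zero (all items at the same place) is an assumption of this sketch; the text does not say how that case is handled.

```python
import math


def relevance(weight, sigma_mod, sub_sigma_mods):
    """Combination weight divided by the sum of moderated dispersions.

    weight:          W_{C_k}, the average item weight
    sigma_mod:       moderated dispersion of the combination
    sub_sigma_mods:  moderated dispersions of its S subcombinations
    """
    denominator = sigma_mod + sum(sub_sigma_mods)
    if denominator == 0:
        return math.inf  # assumption: a zero dispersion is treated as ideal
    return weight / denominator


# Hypothetical values: the same combination with and without subpatterns.
without_subpatterns = relevance(600.0, 0.5, [])
with_two_subpatterns = relevance(600.0, 0.5, [0.25, 0.25])
```

With these values, adding two subpatterns doubles the denominator and halves the relevance, which shows how subcombinations penalise spread-out groups of items.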
Definition 1. Tourist Problem: Given a user, a set of items and patterns to compose a valid combination of items, the problem of finding the best combination of items consists in finding, for each user, a combination $C_k$ with maximal relevance $\mathrm{Rel}(C_k)$:
$$ \text{maximise } \mathrm{Rel}(C_k), \quad \text{with } 1 \le k \le NbC \tag{9} $$
2.4 Links with other Optimization Problems
This problem can be viewed as a Set Packing problem, as in (Avella et al., 2006). The authors propose a simplified variant of the tourist problem, which they call the Intelligent Tourist Problem, solved with an LP-based heuristic. Their model includes items and weights for the items depending on user preferences, but they only take into account one type of item (tourism activities). They therefore do not follow any combination pattern, but add an item to a combination if the time period required by the activity matches a free time period of the tourist.
Our problem has similarities with a combinatorial optimization problem called the Knapsack Problem (Karp, 1972). The knapsack problem derives its name from the problem faced by someone who is constrained by a fixed-size knapsack and must fill it with the most useful items. Each item has a weight and a value. The items put in the knapsack have to maximise the total value without exceeding a given maximum weight. A common formulation of this problem, in which each item is different from the others, is as follows (0-1 Knapsack Problem):
$$ \text{maximise } \sum_{i=1}^{n} v_i x_i \tag{10} $$
$$ \text{subject to } \sum_{i=1}^{n} w_i x_i \le W, \quad x_i \in \{0, 1\} \tag{11} $$
where $v_i$ and $w_i$ are respectively the value and the weight of item $i$, $W$ is the maximum weight of the knapsack, and $x_i$ equals 1 if item $i$ is put into the knapsack, 0 otherwise.
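For small instances with integer weights, formulation (10)-(11) can be solved exactly by the classical dynamic programming recurrence, sketched below in Python. It is given only to make the formulation concrete; it is not the method used for the tourist problem, and the function name and the instance are hypothetical.

```python
def knapsack_01(values, weights, capacity):
    """Maximum total value of a 0-1 knapsack with integer weights."""
    # best[c] is the best value achievable with total weight at most c.
    best = [0] * (capacity + 1)
    for value, weight in zip(values, weights):
        # Capacities are scanned downwards so each item is used at most once.
        for c in range(capacity, weight - 1, -1):
            best[c] = max(best[c], best[c - weight] + value)
    return best[capacity]


# Three items and a knapsack of capacity 50: the best choice is the
# second and third items (weight 20 + 30, value 100 + 120).
optimum = knapsack_01(values=[60, 100, 120], weights=[10, 20, 30], capacity=50)
```

The running time grows with the product of the number of items and the capacity, which is one reason heuristics are preferred when the instance is large or the response time is short.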
This problem has been proven to be NP-complete (Lagoudakis, 1996). Several variants of the knapsack problem have been studied: the multidimensional knapsack problem, the quadratic knapsack problem, etc. (see (Martello and Toth, 1990) for a survey). Our tourism problem can be viewed as a variant of the multi-choice multi-dimensional knapsack problem (MMKP). In standard MMKP, items are grouped according to their type and only one representative of each type has to be chosen. In our problem, several items of the same type can be chosen, depending on the combination and subcombination patterns. MMKP is NP-hard, so applying an exact method to solve it would not be efficient, especially for a real-time decision-making application (Chen et al., 1999). Recent heuristic approaches use various techniques, such as Local Search (Hifi et al., 2004; Hifi et al., 2006), Tabu Search and Ant Colony Optimization (Lau and Lim, 2004), or reductions of the search space (Akbar et al., 2006).
In the next section, we propose a simulated annealing algorithm to find a good solution in reasonable time.
3 SIMULATED ANNEALING ALGORITHM
In order to solve the problem of finding the best combination of items from the domain model for a given user, we suggest the use of a stochastic algorithm called simulated annealing (Kirkpatrick et al., 1983), which is inspired by a method used in the steel industry.
3.1 Algorithm Description
The simulated annealing algorithm used here is based on this principle. At the beginning, the algorithm chooses an initial random combination of individuals following a given combination pattern (for instance, a combination consisting of a hotel, two restaurants and two activities). This combination has an energy $\varepsilon_0$, called the initial energy. The energy represents the quality of a combination and is based on its relevance $\mathrm{Rel}(C_k)$:
$$ \varepsilon(C_k) = \frac{1}{\mathrm{Rel}(C_k)} \tag{12} $$
The lower the energy, the better the combination. A variable $T$, called the temperature, decreases in steps over time. At each temperature level, a certain number of elementary random changes are tested on the current combination. A cost $df$ is associated with each modification; it is defined as the difference between the energy of the combination after the modification and its energy before. A negative cost means that the modified combination has a lower energy than the previous one (and is thus better by definition); it is then kept. Conversely, a positive cost represents a bad
change. Nevertheless, it can be kept with a given probability (the acceptance rate $T_a$) depending on the current temperature level and on the cost. The higher the temperature, the higher this probability. Thus, over time, the number of changes accepted decreases as the temperature decreases, until no change is accepted any longer. The system is then said to be frozen, and the current combination becomes the final combination presented to the user. The acceptance rate is defined in (13), where $T_k$ is the temperature at level $k$, $k \in \mathbb{N}$:
$$ T_a = e^{-\frac{df}{T_k}} \tag{13} $$
The temperature decrease is achieved through a geometric decay at each level:
$$ T_k = g(T_{k-1}) = coef \times T_{k-1} = coef^{k} \times T_0 \tag{14} $$
where $k$ is the current level and $0 < coef < 1$.
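The acceptance rate (13) and the geometric cooling schedule (14) translate directly into code. The Python sketch below uses hypothetical function names and sample values; it only illustrates how the probability of keeping a bad change shrinks as the temperature decreases.

```python
import math


def acceptance_rate(df, temperature):
    """Probability of keeping a change of positive cost df at temperature T_k."""
    return math.exp(-df / temperature)


def temperature_at(level, t0, coef):
    """Temperature after `level` geometric decreases from t0 (0 < coef < 1)."""
    return coef ** level * t0


# The same bad change (df = 1) is kept less and less often as T decreases.
rates = [acceptance_rate(1.0, temperature_at(k, t0=10.0, coef=0.6))
         for k in range(5)]
```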
Algorithm 1: Simulated annealing algorithm.
Input: I items and their weights for the current user, combination patterns
Output: a valid combination with minimal energy
  C_0 = initial valid solution with energy ε_0
  T = T_0
  k = 0
  while the system is not frozen do
    k = k + 1
    Obtain C_k by an elementary transformation of C_{k-1}
    Compute energy ε_k
    df = ε_k - ε_{k-1}
    if df > 0 then
      use an acceptance rate to randomly accept or refuse the modified solution as current solution;
    else
      the modified solution is accepted as current solution;
    end if
    the temperature is lowered;
  end while
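A runnable Python sketch of this scheme is given below. It is not the implementation used in the system: it follows the textual description (several changes tested per temperature level, geometric decay, stop after a fixed number of proposals without change) rather than the per-iteration cooling of the listing, and the `max_levels` safeguard, the function names and the toy instance are all assumptions made for the example.

```python
import math
import random


def simulated_annealing(initial, energy, neighbour, t0, coef=0.6,
                        steps_per_level=2000, patience=2000,
                        max_levels=200, rng=None):
    """Return the current solution once the system is frozen (t0 must be > 0)."""
    rng = rng or random.Random()
    current, e_current = initial, energy(initial)
    temperature = t0
    stale = 0  # consecutive proposals that left the energy unchanged
    for _ in range(max_levels):  # safeguard, an assumption of this sketch
        for _ in range(steps_per_level):
            candidate = neighbour(current, rng)
            e_candidate = energy(candidate)
            df = e_candidate - e_current
            # Changes with df <= 0 are always kept; changes with df > 0 are
            # kept with probability exp(-df / T).
            if df <= 0 or rng.random() < math.exp(-df / temperature):
                current, e_current = candidate, e_candidate
                stale = stale + 1 if df == 0 else 0
            else:
                stale += 1
            if stale >= patience:
                return current  # frozen
        temperature *= coef  # geometric decay
    return current


# Toy instance (hypothetical data): (weight, position) candidates per slot.
CANDIDATES = [
    [(500.0, 0.0), (900.0, 10.0), (700.0, 2.0)],
    [(800.0, 9.0), (400.0, 1.0), (600.0, 3.0)],
    [(300.0, 5.0), (950.0, 11.0), (650.0, 2.5)],
]


def toy_energy(state):
    """Small when the chosen items are close together and heavily weighted."""
    chosen = [CANDIDATES[slot][index] for slot, index in enumerate(state)]
    mean_weight = sum(w for w, _ in chosen) / len(chosen)
    positions = [p for _, p in chosen]
    return (1.0 + max(positions) - min(positions)) / mean_weight


def toy_neighbour(state, rng):
    """Elementary transformation: redraw the item of one random slot."""
    slot = rng.randrange(len(state))
    changed = list(state)
    changed[slot] = rng.randrange(len(CANDIDATES[slot]))
    return tuple(changed)


result = simulated_annealing((0, 0, 0), toy_energy, toy_neighbour, t0=0.01,
                             steps_per_level=200, patience=200,
                             rng=random.Random(0))
```

The `energy` and `neighbour` callables keep the loop independent of the tourism model; in the setting of this paper, the energy would be the inverse of the relevance and a neighbour would replace one item by another item of the same type.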
3.2 Algorithm Parameters
The main disadvantage of the simulated annealing algorithm is that its various parameters have to be set arbitrarily. First, we have to establish the initial temperature $T_0$. If $T_0$ is too high, the first modifications will all be accepted regardless of the quality of the solutions, which is a waste of time. Conversely, if $T_0$ is too small, the exploration of the solution space will be too limited. We have to find an intermediate value. One way to find this value consists in generating some cost-increasing alterations and computing their mean variation $df_{mean}$. An initial acceptance rate $T_a$ is chosen (we use 0.9 in our experiments), and the value of $T_0$ is computed as:
$$ T_0 = \frac{df_{mean}}{\ln \frac{1}{T_a}} \tag{15} $$
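Equation (15) is obtained by solving $T_a = e^{-df_{mean}/T_0}$ for $T_0$. The Python sketch below applies it to a list of sampled cost increases; the function name and the sample costs are hypothetical.

```python
import math


def initial_temperature(uphill_costs, acceptance_rate=0.9):
    """T_0 such that a change of mean cost is accepted at the given rate."""
    if not uphill_costs:
        raise ValueError("at least one sampled cost is required")
    df_mean = sum(uphill_costs) / len(uphill_costs)
    return df_mean / math.log(1.0 / acceptance_rate)


# Hypothetical sampled costs of cost-increasing alterations.
t0 = initial_temperature([1.0, 3.0], acceptance_rate=0.9)
```

By construction, a change whose cost equals the mean (2.0 here) is accepted with probability 0.9 at temperature `t0`.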
Then we have to choose the temperature decrease coefficient and the number of iterations at each step. The decrease coefficient lies between 0 and 1. The higher the coefficient, the slower the decrease. A value of 0.6 has been chosen, which allows us to obtain good results in reasonable time. The number of iterations at each step determines the number of changes allowed at each temperature step. When this number is reached, we say that the system is in statistical equilibrium and we lower the temperature to a new step. In our experiments, this number has been fixed at 2000. Finally, the stopping criterion is important. We say that the system is frozen when no more changes are accepted. In practice, some researchers choose to stop the algorithm when the system reaches a fixed temperature or when a maximal number of steps has been exceeded. We choose to stop the execution when no change has been made during a fixed number of modifications (2000).
3.3 Experimental Results
In order to justify the use of this simulated annealing algorithm instead of a simpler algorithm such as Hill-Climbing, some benchmarks have been carried out on real and random datasets, using the simulated annealing algorithm described in the previous section and a Hill-Climbing algorithm. The Hill-Climbing algorithm used is presented in Algorithm 2.
Algorithm 2: Hill-Climbing algorithm.
Input: I items and their weights for the current user, combination patterns
Output: a valid combination with minimal energy
  R = initial valid solution randomly chosen
  repeat
    Obtain S by an elementary transformation of R
    if ε_S < ε_R then
      R = S
    end if
  until X transformations without a change or the time limit is reached
  return R
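For comparison, the Hill-Climbing scheme can be sketched in Python as follows. The sketch omits the time limit and keeps only the "X transformations without a change" criterion (called `patience` here); the function names and the one-dimensional toy problem are hypothetical.

```python
import random


def hill_climbing(initial, energy, neighbour, patience=2000, rng=None):
    """Keep a transformed solution only when it strictly lowers the energy."""
    rng = rng or random.Random()
    best, e_best = initial, energy(initial)
    stale = 0  # consecutive transformations that were not kept
    while stale < patience:
        candidate = neighbour(best, rng)
        e_candidate = energy(candidate)
        if e_candidate < e_best:
            best, e_best = candidate, e_candidate
            stale = 0
        else:
            stale += 1
    return best


def toy_energy(x):
    """One-dimensional toy energy with a single minimum at x = 7."""
    return (x - 7) ** 2


def toy_neighbour(x, rng):
    """Elementary transformation: move one step left or right."""
    return x + rng.choice((-1, 1))


result = hill_climbing(0, toy_energy, toy_neighbour, patience=50,
                       rng=random.Random(0))
```

Because worse solutions are never kept, this procedure stops in the first local minimum it reaches, which is the behaviour that simulated annealing is meant to avoid.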
The real dataset includes 3724 items, composed of activities, 1008 restaurants and 727 accommodations. The random dataset includes 30000 items, composed of 10000 activities, 10000 restaurants and 10000 accommodations. The item coordinates are defined randomly within the bounding rectangle of a French department. The item weights are also defined randomly, between 0 and 1000.
Table 1 compares the results obtained by the Simulated Annealing algorithm (SA) with those obtained by the Hill-Climbing algorithm (HC). It shows the average energy values and the average computation times obtained on the real and the random datasets. These averages are computed over 100 runs of the algorithms.
Table 1: Simulated annealing vs. Hill-Climbing.
        Random Dataset                 Real Dataset
        Energy (10^6)    Time (ms)     Energy (10^6)    Time (ms)
SA      199.56           239           169.05           285
HC      812.26           8             310.31           9
For the real tourist application, the combinations must be generated in near real time. This time constraint corresponds to an arbitrary tolerance of 500 ms. Given that the computation times of both methods are below this threshold, the two methods are adequate. However, the comparative table shows that the average energy of the combinations given by simulated annealing is better (lower) than that given by Hill-Climbing. This difference in relevance is reflected in the proposals made to the user. This is why the use of the simulated annealing algorithm is justified for the near-real-time application we target.
4 CONCLUSIONS AND FUTURE WORK
This paper presented the resolution of a hard holiday scheduling problem by combining a metaheuristic based on simulated annealing with a semantic modelling of tourism knowledge focused on users. In order to solve the problem of finding the best combination of items from the domain model for a given user, we used a simulated annealing algorithm. This work is part of a more generic project which aims to build a tourism recommender system. It combines Semantic Web technologies (mainly ontologies), the model of adaptive hypermedia systems, and combinatorial algorithms in order to provide recommendations. A recommendation is a combination of items formed according to a semantic pattern defined with the help of a domain ontology. This research project was developed in cooperation with a French tourism company called Côte d'Or Tourisme. Since June 2011, a smartphone application has been available to users free of charge on the Apple store and the Android market. We are now working to improve our results by using a multi-objective approach. One of the main difficulties of this improvement will be obtaining workable results despite the very short execution time allowed for the smartphone application.
REFERENCES
Akbar, M., Rahman, M., Kaykobad, M., Manning, E., and Shoja, G. (2006). Solving the multidimensional multiple-choice knapsack problem by constructing convex hulls. Computers and Operations Research, 33(5):1259–1273.
Avella, P., D'Auria, B., and Salerno, S. (2006). A LP-based heuristic for a time-constrained routing problem. European Journal of Operational Research.
CHECKSEM (2011). http://www.checksem.fr.
Chen, L., Khan, S., Li, K., and Manning, E. (1999).
Building an adaptive multimedia system using the
utility model. Lecture Notes in Computer Science,
1586:289–298.
Cruz, C. and Nicolle, C. (2006). Active3D: Vector of collaboration, between sharing and data exchange. INFOCOMP, Journal of Computer Science, 5(3):1–8.
Hifi, M., Michrafy, M., and Sbihi, A. (2004). Heuristic algorithms for the multiple-choice multidimensional knapsack problem. Journal of Operational Research Society, 55(12):1323–1332.
Hifi, M., Michrafy, M., and Sbihi, A. (2006). A reactive local search-based algorithm for the multiple-choice multi-dimensional knapsack problem. Computational Optimization and Applications, 33:271–285.
Karmacharya, A., Cruz, C., Boochs, F., and Marzani, F.
(2009). Archaeokm : toward a better archaeological
spatial datasets. In Computer Applications and Quan-
titative Methods in Archaeology (CAA), Williamsburg,
Virginia, USA.
Karp, R. M. (1972). Reducibility among combinatorial problems. Complexity of Computer Computations, pages 85–103.
Kirkpatrick, S., Gelatt, C., and Vecchi, M. (1983). Opti-
mization by simulated annealing. Science.
Lagoudakis, M. (1996). The 0-1 knapsack problem: An introductory survey. Technical report, The Center for Advanced Computer Studies, University of Southwestern Louisiana.
Lau, H. and Lim, M. (2004). Multi-period multi-dimensional knapsack problem and its applications to available-to-promise. In Proceedings of the International Symposium on Scheduling (ISS), Hyogo, Japan, pages 94–99.
Martello, S. and Toth, P. (1990). Knapsack problems: algo-
rithms and computer implementations. John Wiley &
Sons, Inc., New York, NY, USA.
Picot-Clemente, R., Cruz, C., and Nicolle, C. (2010). A se-
mantic based recommender system using a simulated
annealing algorithm. In Fourth International Confer-
ence on Advances in Semantic Processing.