matter of fact, pays no regard to the contents of
the items in question. Instead, CF works only on the
users’ ratings of the items, which is known as the
strong point of this approach. Because of that, CF
does not run into problems such as how to analyze
the richness of items’ contents. However, this is
also a weak point of CF, because CF can make
unexpected recommendations in some situations, in
which items are considered suitable for users but in
fact do not relate to the users’ profiles. The
problem becomes more serious when too many items
are unrated, which turns the rating matrix into a
sparse one containing many missing values. To
alleviate this weakness of CF, two techniques have
been used for improvement:
- Combining CF and CBF. This technique is divided
into two stages. First, it applies CBF to set up a
complete rating matrix; then CF is used to make
predictions for recommendations. This technique
improves the precision of predictions, but it
consumes more time because the first stage acts as
a filtering or pre-processing step and requires the
content of items to be fully represented. The
technique therefore requires both the items’
content matrix and the rating matrix.
- Compressing the rating matrix into a
representative model, which is then used to
predict all the missing data for recommendations.
This is the model-based approach to CF. Note that
there are two common approaches to CF: the
memory-based approach and the model-based
approach. The model-based approach applies
statistical and machine learning methods to mine
the rating matrix. The result of this mining task
is the above-mentioned model.
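The two-stage combination described in the first item above can be sketched as follows. This is only a minimal illustration under stated assumptions, not the implementation of any cited work: the function names `cosine`, `fill_with_cbf`, and `predict_with_cf`, the toy data, the use of cosine similarity, and the weighted-average prediction rule are all hypothetical choices. Missing ratings are represented by `None`, item features are assumed to be non-negative, and ratings are assumed to be positive.

```python
from math import sqrt

def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a))
    nb = sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def fill_with_cbf(ratings, item_features):
    """Stage 1 (CBF): replace each missing rating (None) with an average of the
    same user's known ratings, weighted by item-content similarity."""
    all_known = [r for row in ratings for r in row if r is not None]
    global_mean = sum(all_known) / len(all_known)
    filled = []
    for row in ratings:
        known = [j for j, r in enumerate(row) if r is not None]
        new_row = []
        for i, r in enumerate(row):
            if r is not None:
                new_row.append(float(r))  # known ratings are kept unchanged
                continue
            weights = [cosine(item_features[i], item_features[j]) for j in known]
            total = sum(weights)
            if total > 0:
                new_row.append(sum(w * row[j] for w, j in zip(weights, known)) / total)
            elif known:
                # No similar rated item: fall back to the user's mean rating.
                new_row.append(sum(row[j] for j in known) / len(known))
            else:
                # User rated nothing: fall back to the global mean rating.
                new_row.append(global_mean)
        filled.append(new_row)
    return filled

def predict_with_cf(filled, user, item):
    """Stage 2 (CF): user-based prediction on the completed matrix, weighting
    every other user's rating of `item` by rating-row cosine similarity."""
    num = den = 0.0
    for v, row in enumerate(filled):
        if v == user:
            continue
        sim = cosine(filled[user], row)
        num += sim * row[item]
        den += sim
    return num / den if den else filled[user][item]

# Toy data (assumed): 3 users x 4 items, None marks an unrated item.
ratings = [
    [5, 4, None, 1],
    [4, None, 5, 1],
    [1, 2, 1, None],
]
# Toy item-content matrix (assumed): 4 items x 2 binary content features.
item_features = [[1, 0], [1, 0], [1, 1], [0, 1]]

filled = fill_with_cbf(ratings, item_features)
prediction = predict_with_cf(filled, user=0, item=2)
print(filled[0][2], prediction)
```

The sketch makes the cost of the combination visible: stage 1 needs both the content matrix and the rating matrix, and it must visit every missing cell before any CF prediction can be made.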
Although the model-based approach does not give
results as precise as the combination approach, it
can handle huge databases and sparse matrices.
Moreover, it can respond to a user’s request
immediately by making predictions on the
representative model through an instant inference
mechanism. This paper therefore focuses on a
model-based approach to CF based on Bayesian
network inference. Many other studies apply
Bayesian networks (BN) to CF. Miyahara and Pazzani
(2000) propose the Simple Bayesian Classifier for
CF. Suppose rating values range over the integers
{1, 2, 3, 4, 5}; then there is a
set of 5 respective classes {c_1, c_2, c_3, c_4,
c_5}. The
Simple Bayesian Classifier uses the naïve Bayesian
classification method (Miyahara & Pazzani, 2000, p.
4) to determine which class a given user belongs to.
As mentioned in (Su & Khoshgoftaar, 2009, p. 9), the
NB-ELR algorithm is an improvement on the Simple
Bayesian Classifier that combines naïve Bayesian
classification with extended logistic regression
(ELR). ELR is a gradient-ascent, discriminative
parameter-learning algorithm that maximizes log
conditional likelihood (Su & Khoshgoftaar, 2009, p.
9). The NB-ELR algorithm achieves high
classification accuracy on
both complete and incomplete data. Langseth (2009)
assumes that there is a linear mapping from the
latent space of users and items to the numerical
rating scale. Such a mapping, which conforms to the
full joint distribution over all ratings, constructs
a BN. The parameters of the joint distribution are
learned from training data and are then used for
predicting active users’ ratings. According to
(Campos, et al., 2010), the hybrid recommender
model is a BN that includes three main kinds of
nodes: feature nodes, item nodes, and user nodes.
Each feature node represents an attribute of an
item. Active users’ ratings are dependent on these
nodes.
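To make the naïve Bayesian classification over the five rating classes concrete, the following is a minimal sketch and not the cited authors’ exact formulation. It assumes that each item rated by the active user is one training example, that the class is the active user’s rating, and that the features are the other users’ ratings of the same item. The Laplace smoothing, the names `train_naive_bayes` and `predict_class`, and the toy data are hypothetical choices made for illustration.

```python
from collections import Counter, defaultdict
from math import log

CLASSES = [1, 2, 3, 4, 5]  # rating classes c_1 .. c_5

def train_naive_bayes(ratings, active):
    """Count classes and per-(other user, class) feature values.

    Assumption: each item rated by `active` is one training example whose
    class is the active user's rating and whose features are the other
    users' ratings of that item."""
    class_counts = Counter()
    feature_counts = defaultdict(Counter)  # (other_user, class) -> rating counts
    for item, label in ratings[active].items():
        class_counts[label] += 1
        for other, their in ratings.items():
            if other != active and item in their:
                feature_counts[(other, label)][their[item]] += 1
    return class_counts, feature_counts

def predict_class(ratings, active, item, model):
    """Return the rating class with the highest smoothed log posterior."""
    class_counts, feature_counts = model
    total = sum(class_counts.values())
    best, best_score = None, float("-inf")
    for c in CLASSES:
        # Laplace-smoothed log prior of class c.
        score = log((class_counts[c] + 1) / (total + len(CLASSES)))
        for other, their in ratings.items():
            if other == active or item not in their:
                continue  # missing ratings are simply skipped
            counts = feature_counts[(other, c)]
            # Laplace-smoothed log likelihood of the observed rating.
            score += log((counts[their[item]] + 1)
                         / (sum(counts.values()) + len(CLASSES)))
        if score > best_score:
            best, best_score = c, score
    return best

# Toy data (assumed): user -> {item: rating}; "alice" has not rated "i5".
ratings = {
    "alice": {"i1": 5, "i2": 5, "i3": 1, "i4": 1},
    "bob":   {"i1": 5, "i2": 4, "i3": 1, "i4": 2, "i5": 5},
    "carol": {"i1": 4, "i2": 5, "i3": 2, "i4": 1, "i5": 4},
}

model = train_naive_bayes(ratings, "alice")
predicted = predict_class(ratings, "alice", "i5", model)
print(predicted)
```

The sketch also shows why such a classifier counts as model-based: the count tables are built once from the rating data, and each later prediction only looks up and combines stored counts.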
In general, other research focuses on
classification based on BN, discovering latent
variables, and predicting active users’ ratings,
whereas this research focuses on using BN to model
users’ purchase patterns and on taking advantage of
the inference mechanism of BN. It is a promising
approach because it opens a new point of view on
the recommendation domain. In section 2 I propose
an idea for a model-based CF algorithm based on
Bayesian network. Section 3 describes the
enhancement of the method. Section 4 presents the
evaluation and its results. Section 5 is the
conclusion.
2 A NEW CF ALGORITHM BASED ON BAYESIAN NETWORK
The basic idea of model-based CF is to find an
optimal inference model that can respond in real
time. In addition, the sparse matrix and black
sheep problems are important issues that need to
be solved. I propose a new model-based CF