A Multi-Factor Approach to Measure User Preference Similarity in
Neighbor-Based Recommender Systems
Ho Thi Hoang Vy¹,², Tiet Gia Hong¹,², Vu Thi My Hang¹,², Cuong Pham-Nguyen¹,² and Le Nguyen Hoai Nam¹,²,*
¹ Faculty of Information Technology, University of Science, Ho Chi Minh City, Vietnam
² VietNam National University, Ho Chi Minh City, Vietnam
Keywords: Preference Similarity, Collaborative Filtering, Recommender System.
Abstract: Neighbor-based Collaborative filtering is one of the commonly applied techniques in recommender systems.
It is highly appreciated for its interpretability and ease of implementation. The effectiveness of neighbor-
based collaborative filtering depends on the selection of a user preference similarity measure to identify
neighbor users. In this paper, we propose a user preference similarity measure named Multi-Factor Preference
Similarity (MFPS). The distinctive feature of our proposed method is its efficient combination of the four key
factors in determining user preference similarity: rating commodity, rating usefulness, rating details, and
rating time. Our experiments have demonstrated that the combination of these factors in our proposed method
has achieved good results on both experimental datasets: Movielens 100K and Personality-2018.
1 INTRODUCTION
The number of online shoppers worldwide is rapidly
increasing. It is expected that the online shopping
industry will continue to experience rapid growth in
the near future. To increase their chances of attracting
customers to their online stores, businesses should
strive to understand their users' needs and improve
their user experience. Recommender systems can be
applied to online businesses to provide beneficial
recommendations for both suppliers and consumers,
reducing the time spent searching and selecting items
(Schafer et al., 2001; Jannach et al., 2019).
Collaborative filtering is a commonly used type
of recommender system that can be classified into
two classes: neighbor-based and model-based.
Model-based collaborative filtering collects feedback
from users and uses a machine learning model to
predict user preferences. Neighbor-based
collaborative filtering is an easy-to-implement
approach that generates interpretable
recommendations (Schafer et al., 2007; Shen et al.,
2013; Zhang et al., 2014; Ricci et al., 2015). It
searches for users with similar preferences to an
active user, also known as neighbors of the active
user, and suggests items to the active user based on
those neighbors. Users with greater similarity exhibit
more similar preferences.
* Corresponding author
The main focus of a neighbor-based collaborative
filtering recommender system is to assess the
similarity between users to find the neighbor sets.
One of the highly effective methods for this task is the
Jaccard similarity measure (Ricci et al., 2015; Jain et
al., 2020; Fkih et al., 2021). It relies only on the
number of items that both related users have rated.
However, this criterion is too coarse: it ignores the
rating values, the rating times, and whether a neighbor
can actually contribute to a prediction. This limits the
performance of neighbor-based collaborative
filtering recommender systems that use Jaccard.
With the above observation, in this paper, we propose
an improved preference similarity measure based on
Jaccard, namely Multi-Factor Preference Similarity
(MFPS). The contributions of MFPS are as follows:
• To provide recommendations for an active user, it is necessary to predict his/her unknown preferences by aggregating neighbors' observed preferences. Nevertheless, due to sparse data, there are often not enough observed preferences of neighbors to achieve an accurate prediction. Hence, in MFPS, a user is considered similar to another user when their observed preferences not only exhibit similarity but also significantly contribute to predicting each other's unknown preferences.
• Users' preferences are expressed in detail through ratings corresponding to many different states: strongly dislike, dislike, neutral, like, and strongly like. Moreover, time plays a significant role in shaping user preferences: the more recent a preference, the greater its significance. Therefore, we take into account both the rating details and the rating time in MFPS.

Vy, H., Hong, T., Hang, V., Pham-Nguyen, C. and Nam, L.
A Multi-Factor Approach to Measure User Preference Similarity in Neighbor-Based Recommender Systems.
DOI: 10.5220/0012135500003541
In Proceedings of the 12th International Conference on Data Science, Technology and Applications (DATA 2023), pages 532-539
ISBN: 978-989-758-664-4; ISSN: 2184-285X
Copyright © 2023 by SCITEPRESS - Science and Technology Publications, Lda. Under CC license (CC BY-NC-ND 4.0)
The content of this paper will be presented in the
following sections: Section 1 introduces an overview
of our research; in Section 2, we review related
similarity measures; Section 3 outlines our objectives
in this paper; in Section 4, we propose improvements;
the proposed method is implemented and experiments
are conducted in Section 5; finally, conclusions and
future directions are discussed in Section 6. Table 1
presents the symbols that we will use in the next
sections.
Table 1: The used symbols.
Symbol | Description
$u = \{u_1, u_2, ..., u_m\}$ | The set of users
$i = \{i_1, i_2, ..., i_n\}$ | The set of items
$r_{u,i} \neq *$ | Observed rating of user $u$ for item $i$
$r_{u,i} = *$ | Unknown rating of user $u$ for item $i$
$sim(u,v)$ | Preference similarity between user $u$ and user $v$
$N_{u,i}$ | The neighbor set of user $u$ for item $i$
$I_u$ | The set of items rated by user $u$
$\delta$ | Liking threshold
$t_{u,i}$ | The time when user $u$ rated item $i$
$k$ | The size of the neighbor set
$\alpha$ | Influence coefficient of the time difference
2 RELATED WORKS
2.1 Problem Definition
The task of recommending items is based on users' previous item preferences, which are represented by the observed ratings $r_{u,i} \neq *$, with $u = \{u_1, u_2, ..., u_m\}$ as the set of users and $i = \{i_1, i_2, ..., i_n\}$ as the set of items. The ratings $r_{u,i}$ have values from 1 to 5 corresponding to {strongly dislike, dislike, neutral, like, strongly like}. The ratings $r_{u,i} = *$ represent the unknown ratings, meaning that the user has not experienced the item yet. For an active user, the recommendation system needs to predict his/her unknown ratings based on the known ratings (Ricci et al., 2015).
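In the example shown in Fig. 1, the active user has not yet experienced two of the items, so the two corresponding ratings are unknown ($r_{u,i} = *$). Therefore, to make recommendations for this user, the system needs to predict these two ratings. The items with the highest predicted ratings for the active user will be selected for recommendation.

User | Item 0 | Item 1 | Item 2 | Item 3 | Item 4
0 | 4 | 3 | 5 | 4 | -
1 | 5 | 3 | - | - | 4
2 | 4 | 3 | 3 | 4 | -
3 | 2 | 1 | - | - | 3
4 | 4 | 2 | - | - | -
Figure 1: The user-item rating matrix (a dash marks an unknown rating, $r_{u,i} = *$).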
2.2 Neighbor-Based Recommender
Systems
Neighborhood-based recommender systems operate
based on the assumption that when a user 𝑢 needs to
purchase an item 𝑖, he/she can consult opinions from
other users who have previously experienced 𝑖 .
Therefore, they will search for a set of users with
similar preferences to 𝑢, called the neighbors of 𝑢, to
analyze the neighbors' opinions and help 𝑢 make better
decisions (Ricci et al., 2015; Fkih et al., 2021).
With such an assumption, the advantage of
neighbor-based recommender systems is their high
interpretability. For instance, to interpret the
recommendation result of item 𝑖 for user 𝑢, the
system will visualize the proportion of user 𝑢's
neighbors in each rating category (like and
dislike). Using this statistical analysis, if the
proportion of neighbors who like item 𝑖 is high, it will
be easy to persuade user 𝑢 to decide to choose this
item. More specifically, the process of predicting an
unknown rating of a user 𝑢 for an item 𝑖 is
implemented as follows:
Step 1: Measure the similarity between user $u$ and each remaining user in the system, denoted by $sim(u,v)$ where $v = 1, \dots, m$.
Step 2: Identify the set of users who have rated item $i$. Within this set, the $k$ users with the highest similarity to user $u$ will be selected, denoted as the neighbor set $N_{u,i}$.
Step 3: Calculate the rating of user $u$ for item $i$ by aggregating the observed ratings of the neighbors for item $i$, as follows:
$$r_{u,i} = \mu_u + \frac{\sum_{v \in N_{u,i}} sim(u,v) \cdot (r_{v,i} - \mu_v)}{\sum_{v \in N_{u,i}} |sim(u,v)|} \tag{1}$$
where $\mu_u$ and $\mu_v$ are the average ratings of users $u$ and $v$, respectively.
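The three steps can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the nested-dictionary layout, the function names `mean_rating` and `predict_rating`, and the choice to return `None` when no neighbor is available are all assumptions made for the example. The similarities are taken as given (Step 1), so any similarity measure can be plugged in.

```python
from typing import Dict, Optional

# Hypothetical data layout: user -> {item: observed rating}
Ratings = Dict[str, Dict[str, float]]


def mean_rating(ratings: Ratings, user: str) -> float:
    """Average of all observed ratings of `user` (mu_u in Eq. 1)."""
    values = list(ratings[user].values())
    return sum(values) / len(values)


def predict_rating(ratings: Ratings, sims: Dict[str, float],
                   u: str, item: str, k: int) -> Optional[float]:
    """Predict r_{u,item} from the k most similar users who rated `item`.

    `sims` maps every other user v to sim(u, v) (Step 1 is assumed done).
    """
    # Step 2: keep only users who rated the item, then take the top-k by similarity.
    candidates = [v for v in ratings
                  if v != u and v in sims and item in ratings[v]]
    neighbors = sorted(candidates, key=lambda v: sims[v], reverse=True)[:k]

    # Step 3: similarity-weighted average of the neighbors' mean-centered ratings.
    denom = sum(abs(sims[v]) for v in neighbors)
    if denom == 0:
        return None  # assumption: no usable neighbor means no prediction
    num = sum(sims[v] * (ratings[v][item] - mean_rating(ratings, v))
              for v in neighbors)
    return mean_rating(ratings, u) + num / denom


if __name__ == "__main__":
    toy = {
        "a": {"x": 4, "y": 2},
        "b": {"x": 4, "z": 5},
        "c": {"x": 1, "z": 2},
    }
    print(predict_rating(toy, {"b": 0.8, "c": 0.2}, "a", "z", k=2))
```

Mean-centering each neighbor's rating before averaging compensates for users who systematically rate higher or lower than others.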
2.3 User Similarity Measure
As presented in section 2.2, a neighbor-based rating
prediction relies on the opinions of neighbors. The
accuracy of the neighbor sets depends entirely on the
selection of an appropriate similarity measure (Zhang
et al., 2014; Fkih et al., 2021).
Some commonly used similarity measures
include Cosine (COS), Pearson Correlation (COR),
Mean Squared Difference (MSD), and Jaccard (Jain
et al., 2020; Fkih et al., 2021). Many studies have
analyzed their drawbacks and proposed improved
versions by incorporating side information.
Regarding the Jaccard similarity measure, numerous
variations of it have been investigated (Sun et al.,
2012; Liu et al., 2014; Liang et al., 2015; Ayub et al.,
2018; Bag et al., 2019). The original Jaccard only considers the number of items rated in common by the two relevant users ($|I_u \cap I_v|$) as follows:
$$sim(u,v)^{JAC} = \frac{|I_u \cap I_v|}{|I_u \cup I_v|} \tag{2}$$
where $I_u$ and $I_v$ are the sets of items rated by user $u$ and user $v$.
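As a small illustration, the original Jaccard measure reduces to two set operations. The function name `jaccard` and the convention of returning 0 when neither user has rated anything are assumptions of this sketch.

```python
def jaccard(items_u: set, items_v: set) -> float:
    """Jaccard similarity: |I_u ∩ I_v| / |I_u ∪ I_v|."""
    union = items_u | items_v
    if not union:
        return 0.0  # assumption: two users without ratings are not similar
    return len(items_u & items_v) / len(union)


if __name__ == "__main__":
    # Two of the four distinct items are rated by both users.
    print(jaccard({"i1", "i2", "i3"}, {"i2", "i3", "i4"}))
```

Note that the rating values never enter the computation, which is the limitation the rest of the paper addresses.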
The Sorensen-Dice coefficient (SDC) (Verma et al., 2020) improves the original Jaccard by adding a quantity equal to the number of common ratings to both the numerator and denominator as follows:
$$sim(u,v)^{SDC} = \frac{|I_u \cap I_v| + |I_u \cap I_v|}{|I_u \cup I_v| + |I_u \cap I_v|} = \frac{2\,|I_u \cap I_v|}{|I_u| + |I_v|} \tag{3}$$
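The description above (adding the number of common ratings to both the numerator and the denominator of Jaccard) can be sketched as follows. The function name `sorensen_dice` and the zero-division convention are assumptions of this example; the simplified form uses the identity $|I_u \cup I_v| + |I_u \cap I_v| = |I_u| + |I_v|$.

```python
def sorensen_dice(items_u: set, items_v: set) -> float:
    """Sorensen-Dice coefficient: 2 * |I_u ∩ I_v| / (|I_u| + |I_v|)."""
    total = len(items_u) + len(items_v)
    if total == 0:
        return 0.0  # assumption: two users without ratings are not similar
    return 2 * len(items_u & items_v) / total


if __name__ == "__main__":
    u, v = {"i1", "i2", "i3"}, {"i2", "i3", "i4"}
    common, union = len(u & v), len(u | v)
    # Same value whether computed from the simplified or the "Jaccard plus common" form.
    print(sorensen_dice(u, v), (common + common) / (union + common))
```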
Relevant Jaccard (Bag et al., 2019) incorporates MSD into Jaccard to improve its specificity in the following manner:
$$sim(u,v)^{RJMSD} = sim(u,v)^{RJ} \times MSD(u,v) \tag{4}$$
Similarly, Proximity-Significance-Singularity (PSS) (Liu et al., 2014) is also integrated into Jaccard as follows:
$$sim(u,v)^{JPSS} = PSS(u,v) \times sim(u,v)^{JAC} \tag{5}$$
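Ayub et al. (2020) proposed an improvement to Jaccard by using the ratio of the number of pairs of equal ratings to the total number of common ratings, as follows:
$$sim(u,v) = \frac{|N(u,v)|}{|I_u \cap I_v|} \tag{6}$$
where $N(u,v)$ is the set of commonly rated items to which user $u$ and user $v$ gave equal ratings.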
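A minimal sketch of this ratio is given below, assuming each user's ratings are stored as an item-to-rating dictionary. The function name `equal_rating_ratio` is hypothetical, and returning 0 when there are no common items is a convention chosen for the example.

```python
def equal_rating_ratio(r_u: dict, r_v: dict) -> float:
    """Share of commonly rated items on which both users gave the same rating."""
    common = r_u.keys() & r_v.keys()
    if not common:
        return 0.0  # assumption: no common items means no evidence of similarity
    equal = [i for i in common if r_u[i] == r_v[i]]
    return len(equal) / len(common)


if __name__ == "__main__":
    r_u = {"a": 4, "b": 3, "c": 5}
    r_v = {"a": 4, "b": 2, "d": 1}
    # Common items are a and b; only a has equal ratings.
    print(equal_rating_ratio(r_u, r_v))
```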
3 MOTIVATION
The preference similarity computation step aims to
identify the set of users with the most similar
preferences to an active user. However, in practice,
several users in this set lack the necessary rating
information to accurately predict unknown ratings of
the active user. In the user-item rating matrix of Fig. 2, the active user has 3 common ratings with a first candidate neighbor but only 2 common ratings with a second one. Therefore, the Jaccard similarity between the active user and the first candidate is higher than that between the active user and the second candidate. However, the first candidate has not yet experienced the two items whose ratings must be predicted for the active user, so this candidate cannot support those predictions. On the contrary, even though the second candidate is less similar to the active user, he/she has experienced both items, so the active user can rely on this rating information to make decisions on them.
To address this issue, it is necessary to revise the
concept of preference similarity used in similarity
measures in general and the Jaccard similarity
measures in particular. Specifically, in this paper, we
aim to incorporate the usefulness of a user into the
similarity formula. The concept of usefulness of a
user refers to his/her ability to provide rating
information for predicting unknown ratings of the
other user. In that case, the more support two users
provide for each other's rating prediction, the higher
their similarity. Details will be presented in section
4.2.
Figure 2: The rating usefulness.
For better similarity computation, in sections 4.3-4.4, we delve into the details of the common ratings of the related users rather than just focusing on their quantity as in the original Jaccard formulation. For example, in Fig. 3, two users have provided 4 common ratings. However, the first user likes three of these items and dislikes the fourth, while the second user is the opposite. On the other hand, although the first user and a third user have only 2 common ratings, they both completely like those items. It is clear that the similarity between the first and third users must be greater than the similarity between the first and second users.
Figure 3: The rating details.
4 OUR PROPOSED METHOD
The main objective of this section is to propose Multi-Factor Preference Similarity (MFPS), an improved Jaccard similarity denoted by $sim(u,v)^{MFPS}$. In the following, we will analyze the important factors defined in the MFPS formula: rating commodity, rating usefulness, rating details, and rating time.
4.1 Rating Commodity
Following the fundamental principle of the original Jaccard, the MFPS similarity between a user $u$ and a user $v$ ($sim(u,v)^{MFPS}$) should be proportional to the number of items that they both have rated ($c = |I_u \cap I_v|$), as follows:
$$sim(u,v)^{MFPS} \propto c = |I_u \cap I_v| \tag{7}$$
4.2 Rating Usefulness
As explained in section 3, we consider how user $v$ contributes to the rating prediction of user $u$. If user $v$ has rated a large number of items that user $u$ has not yet rated ($s = |I_v - I_u|$), then user $v$ contributes more to the rating prediction of user $u$. This idea can be expressed as follows:
$$sim(u,v)^{MFPS} \propto s = |I_v - I_u| \tag{8}$$
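The rating commodity and rating usefulness factors are both plain set counts, as the following sketch shows. The function names are hypothetical. The usefulness factor is not symmetric: it counts what $v$ can offer to $u$, so swapping the arguments generally gives a different value.

```python
def rating_commodity(items_u: set, items_v: set) -> int:
    """c: number of items rated by both users."""
    return len(items_u & items_v)


def rating_usefulness(items_u: set, items_v: set) -> int:
    """s: number of items rated by v that u has not rated yet."""
    return len(items_v - items_u)


if __name__ == "__main__":
    items_u = {"i1", "i2", "i3"}
    items_v = {"i2", "i3", "i4", "i5"}
    print(rating_commodity(items_u, items_v))   # items i2 and i3
    print(rating_usefulness(items_u, items_v))  # items i4 and i5
    print(rating_usefulness(items_v, items_u))  # item i1 only
```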
4.3 Rating Details
In addition to depending on the number of commonly rated items, the MFPS similarity between user $u$ and user $v$ also directly relates to the number of items that $u$ and $v$ both like or both dislike ($d$). Specifically, this idea is described as follows:
$$sim(u,v)^{MFPS} \propto d = |\{i \in (I_u \cap I_v) \wedge r_{u,i} > \delta \wedge r_{v,i} > \delta\}| + |\{i \in (I_u \cap I_v) \wedge r_{u,i} < \delta \wedge r_{v,i} < \delta\}| \tag{9}$$
where $\delta$ is the liking threshold on the rating scale.
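The rating details factor can be sketched as a count over the commonly rated items. The function name `rating_details` and the example ratings are illustrative assumptions; the ratings only mimic the kind of situation discussed in section 3 (opposite tastes on many common items versus shared tastes on few). With strict inequalities, a rating exactly equal to $\delta$ counts as neither a like nor a dislike.

```python
def rating_details(r_u: dict, r_v: dict, delta: float) -> int:
    """d: number of common items that both users like or both users dislike."""
    common = r_u.keys() & r_v.keys()
    both_like = sum(1 for i in common if r_u[i] > delta and r_v[i] > delta)
    both_dislike = sum(1 for i in common if r_u[i] < delta and r_v[i] < delta)
    return both_like + both_dislike


if __name__ == "__main__":
    # Illustrative ratings, not taken from the paper's figures.
    first = {"i1": 4, "i2": 4, "i3": 5, "i4": 2}
    second = {"i1": 2, "i2": 2, "i3": 2, "i4": 5}
    third = {"i1": 4, "i2": 4}
    print(rating_details(first, second, delta=3))  # four common items, no agreement
    print(rating_details(first, third, delta=3))   # two common items, full agreement
```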
4.4 Rating Time
User preferences may change over time. Therefore, the closer the times at which user $u$ and user $v$ perform their ratings, the more similar their preferences are. To model the similarity based on the rating time, we use the formula proposed in (Zhang et al., 2014). Specifically, it is as follows:
$$sim(u,v)^{MFPS} \propto t = e^{-\alpha \sum_{i \in I_u \cap I_v} |t_{u,i} - t_{v,i}|} \tag{10}$$
where $\alpha$ is the influence coefficient of the time difference, which falls within the range [0, 1]; $t_{u,i}$ and $t_{v,i}$ respectively represent the times when user $u$ and user $v$ rated item $i$.
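A possible sketch of the time factor is shown below. It assumes an exponential-decay form in which the absolute rating-time differences are summed over the commonly rated items and scaled by $\alpha$; this reading, the function name `rating_time`, and the use of plain numbers as timestamps are assumptions of the example and should be checked against the original formula.

```python
import math


def rating_time(t_u: dict, t_v: dict, alpha: float) -> float:
    """t: exp(-alpha * sum of |t_{u,i} - t_{v,i}| over commonly rated items)."""
    common = t_u.keys() & t_v.keys()
    total_gap = sum(abs(t_u[i] - t_v[i]) for i in common)
    return math.exp(-alpha * total_gap)


if __name__ == "__main__":
    # Hypothetical timestamps (e.g., days since some reference date).
    t_u = {"a": 0, "b": 10}
    t_v = {"a": 5, "b": 10}
    print(rating_time(t_u, t_v, alpha=0.1))  # total gap of 5 -> exp(-0.5)
    print(rating_time(t_u, t_u, alpha=0.1))  # identical rating times -> 1.0
```

Under this form the factor lies in (0, 1]: it equals 1 when both users rated every common item at the same time and shrinks as the accumulated time gap grows.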
4.5 Multi-Factor Preference Similarity
(MFPS)
The similarity between two users in neighbor-based
recommender systems is typically defined between 0
and 1. As this value approaches 1, the two users are
more similar, and vice versa. To comply with this
criterion, similar to the approach in (Bag et al., 2019),
we will utilize the sigmoid function in MFPS as
follows:
$$sim(u,v)^{MFPS} = \frac{1}{1 + 1/x} \tag{11}$$
where $x$ is a factor proportional to $sim(u,v)^{MFPS}$, i.e., rating commodity, rating usefulness, rating details, and rating time. Therefore, the final formula of MFPS is implemented as follows:
$$sim(u,v)^{MFPS} = \frac{1}{1 + \frac{1}{c} + \frac{1}{s} + \frac{1}{d} + \frac{1}{t}} \tag{12}$$
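Putting the four factors together, the final formula can be sketched as below. This is an illustrative reading, not the authors' code: the function name `mfps`, the dictionary inputs, and the exponential-decay form assumed for the time factor are choices made for the example. The formula is undefined when a factor is zero; here such pairs are given similarity 0, which is an assumption (it matches the limit of the expression as a factor tends to zero).

```python
import math


def mfps(r_u: dict, r_v: dict, t_u: dict, t_v: dict,
         delta: float, alpha: float) -> float:
    """MFPS sketch: 1 / (1 + 1/c + 1/s + 1/d + 1/t).

    r_* map items to ratings, t_* map the same items to rating times.
    """
    items_u, items_v = set(r_u), set(r_v)
    common = items_u & items_v

    c = len(common)                      # rating commodity
    s = len(items_v - items_u)           # rating usefulness of v for u
    d = sum(1 for i in common            # rating details
            if (r_u[i] > delta and r_v[i] > delta)
            or (r_u[i] < delta and r_v[i] < delta))
    # Rating time, assuming an exponential decay over the summed time gaps.
    t = math.exp(-alpha * sum(abs(t_u[i] - t_v[i]) for i in common))

    # Assumption: a zero factor makes its reciprocal infinite, so similarity is 0.
    if c == 0 or s == 0 or d == 0 or t == 0.0:
        return 0.0
    return 1.0 / (1.0 + 1.0 / c + 1.0 / s + 1.0 / d + 1.0 / t)


if __name__ == "__main__":
    r_u = {"a": 4, "b": 5, "c": 2}
    r_v = {"a": 5, "b": 4, "c": 1, "d": 3, "e": 4}
    t_u = {"a": 1, "b": 2, "c": 3}
    t_v = {"a": 1, "b": 2, "c": 3, "d": 4, "e": 5}
    # c = 3, s = 2, d = 3, t = 1, so the result is 1 / (19/6) = 6/19.
    print(mfps(r_u, r_v, t_u, t_v, delta=3, alpha=0.01))
```

Because the usefulness factor depends on the direction, `mfps(u, v)` and `mfps(v, u)` can differ under this sketch.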
5 EXPERIMENT
5.1 Datasets
In this study, we conducted experiments on two
datasets, Movielens and Personality-2018. Detailed
information on the two datasets is provided in Table
2.
Table 2: Two experimental datasets.
Datasets | Description | Rating scale
MovieLens | 943 users, 1682 movies, 100,000 ratings | [1, …, 5]
Personality 2018 | 1819 users, 35195 movies, 1028752 ratings | [0.5, …, 5]
5.2 Measurement
In this study, we use the F1-score to evaluate the
recommendation performance. F1-score is a
combination of two metrics: precision and recall.
Precision is the proportion of the recommendation set that is accurate, while recall is the proportion of the truth set that is recommended. The recommendation set consists of the items whose predicted ratings are greater than the liking threshold $\delta$. The truth set includes the items whose testing ratings are greater than the liking threshold $\delta$. An accurate recommendation is an item that belongs to both sets. Specifically, F1-score is calculated as follows:
$$F1\text{-}score = \frac{2 \times precision \times recall}{precision + recall} \tag{13}$$
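The measurement can be sketched as follows for a single set of test items. The function name `f1_score`, the dictionary inputs, and the convention of returning 0 when there is no accurate recommendation are assumptions of this example; how the scores are aggregated over users is not specified here.

```python
def f1_score(predicted: dict, actual: dict, delta: float) -> float:
    """F1-score of recommendations obtained by thresholding predicted ratings."""
    recommended = {i for i, r in predicted.items() if r > delta}
    truth = {i for i, r in actual.items() if r > delta}
    hits = len(recommended & truth)
    if hits == 0:
        return 0.0  # assumption: avoids division by zero when nothing matches
    precision = hits / len(recommended)
    recall = hits / len(truth)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    predicted = {"a": 4.2, "b": 3.9, "c": 2.0, "d": 4.5}
    actual = {"a": 5, "b": 2, "c": 4, "d": 4}
    # Recommended: a, b, d. Truth: a, c, d. Two hits, so precision = recall = 2/3.
    print(f1_score(predicted, actual, delta=3.5))
```

For the personal threshold, `delta` would be replaced by the average rating of the user whose items are being evaluated.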
5.3 Experiment Setup
In this section, we implement the user preferences
similarity methods shown in Table 3. The F1-score
results of the above methods will be compared with
our proposed method, MFPS presented in section 4.
These comparisons will be conducted on the testing
ratings, which account for 20% of the total ratings in
the experimental datasets.
5.4 Experiment Results and Discussion
Figures 4-9 depict the F1-score results on the experimental datasets. We observed how the F1-score changes across liking thresholds $\delta \in$ {3.5, 4, 4.5, personal}, where personal denotes the average rating of each user, and across sizes of the neighbor set $k \in$ {5, 30, 50}. It can be seen that our proposed MFPS similarity measure achieves F1-score results that are comparable to, and in some cases better than, those of the other similarity measures.
Table 3: Similarity measures in the experiment.
Similarity measures | Denotation
Cosine Similarity (Verma, 2020) | COS
Pearson's Correlation (Verma, 2020) | COR
Constrained Pearson's Correlation (Verma, 2020) | CPC
Jaccard Similarity (Verma, 2020) | JAC
Sorensen-Dice coefficient (Verma, 2020) | SDC
Mean Square Distance (Verma, 2020) | MSD
Jaccard Mean Square Distance (Bobadilla, 2010) | JMSD
Jaccard Proximity-Significance-Singularity (Liu, 2014) | JPSS
Relevant Jaccard (Bag, 2019) | RJ
Relevant Jaccard Mean Square Distance (Bag, 2019) | RJMSD
Jaccard Uniform Operator Distance (Sun HF, 2012) | JUOD
JacLMHUOD (Lee, 2017) | JLMHUOD
Triangle Multiplying Jaccard (Fkih, 2021) | TMJ
JACLMH (Lee, 2017) | JACLMH
Rating Jaccard - Rating Preference Behavior (Ayub, 2020) | RAJRPB
Rating Jaccard (Ayub, 2018) | RAJ
New Heuristic Similarity Model (Liu, 2014) | NHSM
Figure 4: F1-score in the Movielens dataset at the size of
the neighbor set 𝑘 =5 and liking thresholds 𝛿 ={3.5, 4, 4.5,
and personal - the average rating of each user}.
All methods achieved the highest F1-score results when the liking threshold $\delta$ was set to personal, i.e. the average rating of each user. This is because some users tend to rate more critically than others. Therefore, using a single fixed liking threshold $\delta$ for all users would not be appropriate.
[Figure 4 plot: F1-score (vertical axis) for each similarity measure (horizontal axis) at K = 5, with one series per liking threshold: 3.5, 4, 4.5, and personal.]
When the size of the neighbor set $k$ increases, the F1-score results also increase because more neighbors are used in the rating prediction process. At the largest value of $k$, i.e. 50, and the best value of $\delta$, i.e. personal, our method MFPS achieved its best F1-score results: 0.75949 in the Movielens dataset and 0.76912 in the Personality-2018 dataset.
Figure 5: F1-score in the Movielens dataset at the size of
the neighbor set 𝑘 =30 and liking thresholds 𝛿 ={3.5, 4, 4.5,
and personal - the average rating of each user}.
Figure 6: F1-score in the Movielens dataset at the size of
the neighbor set 𝑘 =50 and liking thresholds 𝛿 ={3.5, 4, 4.5,
and personal - the average rating of each user}.
Figure 7: F1-score in the Personality-2018 dataset at the
size of the neighbor set 𝑘=5 and liking thresholds 𝛿 ={3.5,
4, 4.5, and personal - the average rating of each user}.
Figure 8: F1-score in the Personality-2018 dataset at the
size of the neighbor set 𝑘=30 and liking thresholds 𝛿 ={3.5,
4, 4.5, and personal - average rating of each user}.
Figure 9: F1-score in the Personality-2018 dataset at the
size of the neighbor set 𝑘=50 and liking thresholds 𝛿 ={3.5,
4, 4.5, and personal - average rating of each user}.
Table 4 presents the average F1-score of each
similarity measure across both experimental datasets
at the optimal parameters (the size of the neighbor set
𝑘 is 50 and the liking threshold 𝛿 is personal).
According to this table, the top 3 best methods are
MFPS, RJ, and NHSM. Our proposed similarity
measure MFPS achieves the highest average F1-
score. It can be seen that combined methods
consistently produce better results compared to
traditional methods. This finding further reinforces
the idea of combining multiple factors in proposing
similarity measures.
Figure 10 illustrates the F1-score results of our proposed method MFPS when fixing $k$ at 5 and the liking threshold $\delta$ at personal, and gradually decreasing the influence coefficient of the time difference $\alpha$ across a range of powers of 10. As $\alpha$ decreases, the F1-score results of MFPS increase. This can be explained as follows: in the movie recommendation domain, the time factor plays a significant role in determining user preferences. Therefore, reducing $\alpha$ implies placing more importance on the time factor in the process of computing user preference similarity.
[Plots for Figures 5-9: F1-score (vertical axis) for each similarity measure (horizontal axis) at K = 30 and K = 50 for Movielens and at K = 5, K = 30, and K = 50 for Personality-2018, with one series per liking threshold: 3.5, 4, 4.5, and personal.]
Table 4: The average F1-score of each similarity measure across both experimental datasets at the optimal parameters (the size of the neighbor set $k$ is 50 and the liking threshold $\delta$ is personal). The top 3 best methods are MFPS, RJ, and NHSM.
Similarity measures | Movielens | Personality | Average F1-score
MFPS | 0.75949 | 0.76912 | 0.76431
RJ | 0.75741 | 0.76562 | 0.76152
NHSM | 0.76002 | 0.75510 | 0.75756
JPSS | 0.75975 | 0.75246 | 0.75610
RJMSD | 0.75797 | 0.75119 | 0.75458
JLMHUOD | 0.75259 | 0.75415 | 0.75337
JUOD | 0.75278 | 0.75372 | 0.75325
JACLMH | 0.75452 | 0.75021 | 0.75237
TMJ | 0.74888 | 0.74558 | 0.74723
CTJ | 0.74847 | 0.74307 | 0.74577
SDC | 0.74701 | 0.74070 | 0.74386
JAC | 0.74717 | 0.73974 | 0.74346
RAJRPB | 0.71652 | 0.69756 | 0.70704
RAJ | 0.71420 | 0.69962 | 0.70691
MSD | 0.59985 | 0.59398 | 0.59691
COR | 0.52590 | 0.59984 | 0.56287
CPC | 0.56375 | 0.43058 | 0.49716
COS | 0.47280 | 0.42764 | 0.45022
Figure 10: F1-score of MFPS as the influence coefficient of time difference $\alpha$ decreases across a range of powers of 10.
6 CONCLUSIONS
In this paper, we have proposed a similarity measure named MFPS using the Jaccard principle. The distinctive feature of MFPS is an effective combination of four key factors in determining the preference similarity between two users: rating commodity, rating usefulness, rating details, and rating time. We conducted experiments on two datasets, Movielens 100K and Personality-2018. The experimental results showed that MFPS achieved the highest average F1-score across the two datasets: it produced the best result on Personality-2018 and a result close to those of the best methods on Movielens 100K.
In reality, user preferences are expressed
through not only ratings but also reviews, user
actions, and item descriptions that they are interested
in. Therefore, in the future, we will aim to combine
these factors into MFPS to enhance its effectiveness.
However, incorporating too much information may
increase the computational cost of calculating user
similarity. Therefore, it is necessary to design an
efficient implementation approach for the proposed
similarity measure.
ACKNOWLEDGEMENTS
This research is funded by University of Science,
VNUHCM under grant number CNTT 2022-06.
REFERENCES
Ayub, M., Ghazanfar, M. A., Khan, T., & Saleem, A.
(2020). An effective model for Jaccard coefficient to
increase the performance of collaborative
filtering. Arabian Journal for Science and
Engineering, 45(12), 9997-10017.
Ayub, M., Ghazanfar, M. A., Maqsood, M., & Saleem, A.
(2018, January). A Jaccard base similarity measure to
improve performance of CF based recommender
systems. In 2018 International conference on
information networking (ICOIN) (pp. 1-6). IEEE.
Bag, S., Kumar, S. K., & Tiwari, M. K. (2019). An efficient
recommendation generation using relevant Jaccard
similarity. Information Sciences, 483, 53-64.
Bobadilla, J., Serradilla, F., & Bernal, J. (2010). A new
collaborative filtering metric that improves the
behavior of recommender systems. Knowledge-Based
Systems, 23(6), 520-528.
Fkih, F. (2022). Similarity measures for Collaborative
Filtering-based Recommender Systems: Review and
experimental comparison. Journal of King Saud
University-Computer and Information Sciences, 34(9),
7645-7669.
Jain, G., Mahara, T., & Tripathi, K. N. (2020). A survey of
similarity measures for collaborative filtering-based
recommender system. In Soft Computing: Theories and
Applications: Proceedings of SoCTA 2018 (pp. 343-
352). Springer Singapore.
Jannach, D., & Jugovac, M. (2019). Measuring the business
value of recommender systems. ACM Transactions on
Management Information Systems (TMIS), 10(4), 1-23.
Lee, S. (2017). Improving jaccard index for measuring
similarity in collaborative filtering. In Information
Science and Applications 2017: ICISA 2017 8 (pp. 799-
806). Springer Singapore.
Liang, S., Ma, L., & Yuan, F. (2015). A singularity-based
user similarity measure for recommender
systems. International journal of innovative computing
information and control, 11(5), 1629-1638.
Liu, H., Hu, Z., Mian, A., Tian, H., & Zhu, X. (2014). A
new user similarity model to improve the accuracy of
collaborative filtering. Knowledge-based systems, 56,
156-166.
Ricci, F., Rokach, L., & Shapira, B. (2015). Recommender
systems: introduction and challenges. Recommender
systems handbook, 1-34.
Schafer, J. B., Frankowski, D., Herlocker, J., & Sen, S.
(2007). Collaborative filtering recommender
systems. The adaptive web: methods and strategies of
web personalization, 291-324.
Schafer, J. B., Konstan, J. A., & Riedl, J. (2001). E-
commerce recommendation applications. Data mining
and knowledge discovery, 5, 115-153.
Sharma, R., Gopalani, D., & Meena, Y. (2017, February).
Collaborative filtering-based recommender system:
Approaches and research challenges. In 2017 3rd
international conference on computational intelligence
& communication technology (CICT) (pp. 1-6). IEEE.
Shen, J., Wei, Y., & Yang, Y. (2013). Collaborative
filtering recommendation algorithm based on two
stages of similarity learning and its optimization. IFAC
Proceedings Volumes, 46(13), 335-340.
Sun, H. F., Chen, J. L., Yu, G., Liu, C. C., Peng, Y., Chen,
G., & Cheng, B. (2012). JacUOD: a new similarity
measurement for collaborative filtering. Journal of
Computer Science and Technology, 27(6), 1252.
Verma, V., & Aggarwal, R. K. (2020). A comparative
analysis of similarity measures akin to the Jaccard index
in collaborative recommendations: empirical and
theoretical perspective. Social Network Analysis and
Mining, 10, 1-16.
Zhang, R., Liu, Q. D., & Wei, J. X. (2014, November).
Collaborative filtering for recommender systems.
In 2014 Second International Conference on Advanced
Cloud and Big Data (pp. 301-308). IEEE.
Zhang, X., He, K., Wang, J., Wang, C., Tian, G., & Liu, J.
(2014, June). Web service recommendation based on
watchlist via temporal and tag preference fusion.
In 2014 IEEE International Conference on Web
Services (pp. 281-288). IEEE.