Privacy and Fairness in Recommender Systems via Adversarial Training
of User Representations
Yehezkel S. Resheff (1), Yanai Elazar (2,3), Moni Shahar (1) and Oren Sar Shalom (3)
(1) Intuit Tech Futures, Israel
(2) Bar Ilan University, Israel
(3) Intuit, Israel
Keywords: Privacy, Representations, Information Leakage.
Abstract:
Latent factor models for recommender systems represent users and items as low dimensional vectors. Privacy
risks of such systems have previously been studied mostly in the context of recovery of personal information
in the form of usage records from the training data. However, the user representations themselves may be used
together with external data to recover private user information such as gender and age. In this paper we show
that user vectors calculated by a common recommender system can be exploited in this way. We propose
the privacy-adversarial framework to eliminate such leakage of private information, and study the trade-off
between recommender performance and leakage both theoretically and empirically using a benchmark dataset.
An advantage of the proposed method is that it also helps guarantee fairness of results, since all implicit
knowledge of a set of attributes is scrubbed from the representations used by the model, and thus can’t enter
into the decision making. We discuss further applications of this method towards the generation of deeper and
more insightful recommendations.
1 INTRODUCTION
With the increasing popularity of digital content con-
sumption, recommender systems have become a ma-
jor influencer of online behavior, from what news we
read and what movies we watch, to what products we
buy. Recommender systems have revolutionized the
way items are chosen across multiple domains, mov-
ing users from the previous active search scenario to
the more passive selection of presented content as we
know it today.
A recommender system needs to fulfill two objec-
tives in order to be able to supply relevant recommen-
dations that will be helpful to users. Namely, it has
to accurately model both the users and the items. An
implication of the first objective is that recommenders
aim to reveal users’ preferences and desires, and even to learn when to suggest different items
(Adomavicius and Tuzhilin, 2015), and how many
times to recommend a given item. In order to be ef-
fective, a system must have an extensive view of a
user, embodied as a representation that includes a sub-
stantial amount of private information (which may be
either implicitly or explicitly encoded).
Indeed, in recent years we have seen a plethora
of advanced methods applied to improve the person-
alized recommendation problem. Furthermore, mod-
ern Collaborative Filtering approaches are becoming
increasingly complex, not only in algorithmic
terms but also in their ability to process and use ad-
ditional data. Arguably, demographic data is the most
valuable source of information for user modeling. As
such, it was used from the early days of recommen-
dation systems (Pazzani, 1999). New deep learn-
ing based state of the art methods also utilize de-
mographic information (Covington et al., 2016; Zhao
et al., 2014) in order to generate better and more rele-
vant predictions.
While this in-depth modeling of users holds great
value, it may also pose severe privacy threats. One
major privacy concern, especially when private in-
formation is explicitly used during training, is recov-
ery of records from the training data by an attacker.
This aspect of privacy has previously been studied in
the context of recommender systems using the frame-
work of differential privacy (see Section 2 below). A
related but different threat, which to the best of our
knowledge has not yet been addressed in this context,
is the recovery of private information that was not ex-
plicitly present in the training data. In the implicit pri-
vate information attack, either the recommendations
or the user representations within a recommender sys-
tem are used together with external information to
uncover private information about the users.
For example, an attacker with access to the gender
of some users could use their representations and a
supervised learning approach to infer the gender of
all users. In an extreme case, it would suffice for an
attacker to glance at a random individual’s computer
screen, see what content is being suggested to them by
the recommendation system, and immediately infer
a wealth of demographic, financial, and other private information
pertaining to the individual.
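To make the attack concrete, the following is a minimal sketch (ours, not taken from any deployed system) of such an inference attack; the array names and files are hypothetical, and we assume the attacker has obtained the user vectors together with gender labels for a subset of users.

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Hypothetical inputs: user_vecs[i] is the latent vector of user i,
# known_idx indexes the users whose gender the attacker already knows.
user_vecs = np.load("user_vectors.npy")      # shape: (n_users, dim)
known_idx = np.load("known_idx.npy")         # shape: (n_known,)
known_gender = np.load("known_gender.npy")   # values in {0, 1}

# Fit a supervised read-out on the known users and check how well it generalizes.
clf = LogisticRegression(max_iter=1000)
print("read-out accuracy:",
      cross_val_score(clf, user_vecs[known_idx], known_gender, cv=5).mean())

# Infer the gender of every remaining user from their representation alone.
clf.fit(user_vecs[known_idx], known_gender)
unknown = np.setdiff1d(np.arange(len(user_vecs)), known_idx)
inferred_gender = clf.predict(user_vecs[unknown])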
In this paper we introduce the threat of implicit
private information leakage in recommender systems.
We show the existence of the leakage both theoret-
ically, and experimentally using a standard recom-
mender and benchmark dataset. Finally, we pro-
pose the privacy-adversarial method of constructing
a recommender from which the target private infor-
mation cannot be read out, using the adversarial train-
ing method (Ganin et al., 2016). We show the ability
of this method to conceal private information and the
trade-off between recommender performance and pri-
vate information leakage.
The contributions of this paper are as follows:
- We describe and formalize the threat of implicit information leakage in machine learning models in general, and recommenders in particular.
- We suggest the privacy-adversarial method (an adaptation of adversarial training to our setting) to eliminate the leakage, and validate the method in an extensive set of experiments.
- Finally, we discuss the tangential issue of fairness in machine learning, and suggest that the privacy-adversarial method we use in this paper for the sake of privacy is capable of fixing some fairness problems previously discussed in the context of general machine learning models.
2 RELATED WORK
Since the release of the Netflix Prize dataset (Bennett
et al., 2007), a large body of work has shown that users’ raw historical usage may reveal private information about them (e.g., (Weinsberg et al., 2012; Narayanan
and Shmatikov, 2008)). That is, given historical usage
of an anonymized user, one can infer the user’s de-
mographics or even their identity. However, we argue
that even the users’ representations may reveal private
information, without access to the training data (and
in fact, the same is true of the recommendations presented to the user: the actual recommendations, which can be intercepted or seen by a third party, include private information that we may wish to conceal).
Privacy has also been studied recently in the con-
text of recommender systems (Berlioz et al., 2015;
Nikolaenko et al., 2013; McSherry and Mironov,
2009; Friedman et al., 2016; Liu et al., 2015; Shen
and Jin, 2014). This growing body of work has been
concerned for the most part with differentially private
recommender systems, and always with the aim of
guaranteeing that the actual records used to train the
system are not recoverable. Unlike these works, we
are concerned with the leakage of private information
(such as demographics: age, gender, etc.) that was
not directly present during training, but was implic-
itly learned by the system in the process of generating
a useful user representation. These two ideas are not
mutually exclusive, and may be combined to achieve
a better privacy preserving recommender.
The problem of implicit private information stud-
ied here is closely related to the one studied in (Zemel
et al., 2013). In their work, they look for represen-
tations that achieve both group fairness and individ-
ual fairness in classification tasks. Individual fairness
means that two persons having a similar representa-
tion should be treated similarly. Group fairness means
that given a group we wish to protect (some proper
subset S ⊂ P of the population P), the proportion of
people positively classified in S equals that in the en-
tire population. They achieved this goal by solving an
optimization problem to learn an intermediate repre-
sentation that obfuscates the membership in the pro-
tected group. In both cases the aim is to achieve good
results on the respective predictive tasks, while using
a representation that is agnostic to some aspects of the
implicit structure of the domain.
Several works apply adversarial training for the
purpose of creating a representation free of a specific
attribute (Beutel et al., 2017; Xie et al., 2017; Zhang
et al., 2018; Elazar and Goldberg, 2018) (in fact, the
original purpose of the method of adversarial training
can also be seen as such). To the best of our knowl-
edge, the current paper represents the first application
of these ideas in the domain of recommender systems.
Furthermore, while all the aforementioned applica-
tions experiment with solely one feature at a time, in
this work we aim to create a representation free of
multiple demographic features.
3 PRIVACY AND USER REPRESENTATIONS
User representations in recommendation systems are
designed to encode the relevant information to deter-
mine user-item affinity. As such, to the extent that
item affinity is dependent on certain user character-
istics (such as demographics), the optimal user rep-
resentations must include this information as well in
order to have the necessary predictive power with re-
spect to recommendations. We formalize this intu-
ition using an information theoretic approach:
Theorem 1. Let $\hat{v} = f(p_u, q_i)$ be an estimator of an outcome variable $v$ associated with a pair $(p_u, q_i)$ of user and item representations (we use $p$ to denote users, and $q$ to denote items throughout the paper), and let $d$ be any variable associated with users (such as age, gender, or marital status, to name a few):
$$I(v; \hat{v}) > H[v|d] \;\Longrightarrow\; I(\hat{v}; d) > 0$$
Proof.
$$H[v] = H[v|d] + I(v; d) = H[v|\hat{v}] + I(v; \hat{v})$$
Rearranging and using $I(v; \hat{v}) > H[v|d]$ we have:
$$I(v; d) > H[v|\hat{v}]$$
and therefore:
$$I(v; d) + I(v; \hat{v}) > H[v|\hat{v}] + H[v] - H[v|\hat{v}] = H[v] \;\Longrightarrow\; I(\hat{v}; d) > 0$$
To see how the final step follows, suppose on the contrary that $I(\hat{v}; d) = 0$; then from the above and by the chain rule for information we have:
$$H[v] < I(v; d) + I(v; \hat{v}) = I(v; d) + I(v; \hat{v}|d) = I(v; d, \hat{v})$$
in contradiction to the $H[x] \geq I(x; \cdot)$ relation of entropy and mutual information.
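As a toy numerical illustration of Theorem 1 (our own construction, not part of the paper), consider a binary attribute d that strongly determines an outcome v, and an accurate estimator of v; estimating the mutual information empirically shows that the predictions leak the attribute:

import numpy as np
from sklearn.metrics import mutual_info_score

rng = np.random.default_rng(0)
n = 100_000
d = rng.integers(0, 2, n)                          # user attribute, e.g. gender
v = (d ^ (rng.random(n) < 0.1)).astype(int)        # outcome depends strongly on d
v_hat = (v ^ (rng.random(n) < 0.05)).astype(int)   # accurate estimator of v

# Here I(v; v_hat) exceeds H[v|d], and as the theorem predicts I(v_hat; d) > 0:
# the predictions themselves carry information about the protected attribute.
print("I(v; v_hat) =", mutual_info_score(v, v_hat))   # roughly 0.5 nats
print("I(v_hat; d) =", mutual_info_score(v_hat, d))   # clearly positive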
Theorem 1 asserts that the predictions $\hat{v}$ must contain information about any relevant variable associ-
ated with users, if they are better than a certain thresh-
old determined by the relevance of the variable. For
example, if age is a strong determinant of the movies
a user is likely to want to watch, then by looking at
the recommendations for a user we will be able to
extract some information about said user’s age. Next,
we show that the same is true about the user repre-
sentations used by the recommender, once the item
representations are fixed (i.e. after learning):
Theorem 2. Let $\hat{v} = f(p_u, q_i) = g(p_u)$ be an estimator of outcome variable $v$, and $d$ a variable associated with users; then $I(p_u; d) \geq I(\hat{v}; d)$.
Proof. $\hat{v}$ is a function of $p_u$ alone, and so by the data processing inequality we have $\forall d: I(p_u; d) \geq I(\hat{v}; d)$.
Corollary 1. As a result of Theorems 1 and 2, for any meaningful recommendation system and user characteristic there will be information leakage between the user representation and the characteristic. Specifically, for any system $\hat{v} = f(p_u, q_i)$ and user characteristic $d$ we have that:
$$I(p_u; d) \geq I(\hat{v}; d) > 0$$
This assertion, that we cannot have both per-
fect privacy and performance in our setting, naturally
leads to the question of the trade-off between the ex-
tent of information leakage, and the performance of
the recommendation system. While the precise point
selected on this trade-off curve is likely to be deter-
mined by the use-case, it would be reasonable to as-
sume that in any case we will not want to sacrifice
privacy unless we gain in performance. This can be
understood as a Pareto optimality requirement on the
multi-objective defined by the system and privacy ob-
jectives:
Definition 1. The privacy-acceptable subset of a family $S$ of recommendation systems of the form $f : [n_{\text{users}}] \times [n_{\text{items}}] \to \mathbb{R}_+$, with respect to a recommendation loss $l$ and a privacy target $h$, is the Pareto front in $S$ of the multi-objective $(l, h)$.
While the method described in the rest of this
paper does not directly address the issue of select-
ing a privacy-acceptable system, we show that our
method is able to dramatically reduce information
leakage while maintaining the majority of system per-
formance. Future work will focus on methods and an-
alytical tools in the spirit of Theorem 1 to assert that
for a given system with a certain performance, there
does not exist a system (in the family under consider-
ation) with at least equal performance and better pri-
vacy.
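As an illustration of Definition 1, a minimal sketch (ours) of selecting the Pareto front among candidate systems follows; the candidate tuples and their measured values are hypothetical, and lower is taken to be better for both the recommendation loss l and the leakage measure h.

def pareto_front(candidates):
    """candidates: list of (name, l, h) with recommendation loss l and
    measured privacy leakage h; lower is better for both objectives.
    Returns the privacy-acceptable subset: systems not dominated by any other."""
    front = []
    for name, l, h in candidates:
        dominated = any((l2 <= l and h2 <= h) and (l2 < l or h2 < h)
                        for _, l2, h2 in candidates)
        if not dominated:
            front.append((name, l, h))
    return front

# Hypothetical measurements for three trained systems:
print(pareto_front([("lambda=0", 0.30, 0.08),
                    ("lambda=0.1", 0.31, 0.01),
                    ("lambda=10", 0.35, 0.01)]))
# -> keeps lambda=0 and lambda=0.1; lambda=10 is dominated by lambda=0.1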
3.1 Privacy-Adversarial Recommendation Systems
In the previous section we showed that for any rec-
ommendation system where the user representations
capture enough information to make good predictions, these rep-
resentations must reveal information about any perti-
nent user characteristic. In this section we describe a
method to remove such information from the user representations, in a way that allows selecting a point along the trade-off curve between performance and information leakage.
The method we use borrows the key idea from
domain-adversarial training (Ganin et al., 2016),
where the aim is to learn a representation that is
agnostic to the domain from which the example is
drawn. Adapted to the problem at hand, this method
enables us to construct a user representation from
which the private information can not be read out.
We start with an arbitrary latent factor recommen-
dation system (which we will assume is trained using
a gradient method). An additional readout construct is then attached to the user vectors, the output of which
is the private field(s) we wish to censor. During train-
ing we follow two goals: (a) we would like to change
the recommender parameters to optimize the original
system objective, and (b) we would like to change
the user vectors only, in order to harm the readout
of the private information, while optimizing the read-
out parameters themselves with respect to the private
information readout target. This is achieved by ap-
plication of the gradient reversal trick introduced in
(Ganin et al., 2016), leading to the following update
rule for user representation $p_u$:
$$p_u \leftarrow p_u - \alpha \left[ \frac{\partial\, \text{loss}_{\text{recsys}}}{\partial p_u} - \lambda \sum_i \frac{\partial\, \text{loss}_{\text{demographics}_i}}{\partial p_u} \right] \qquad (1)$$
where $\alpha$ and $\lambda$ are the general learning rate and the adversarial training learning rate, respectively; $\text{loss}_{\text{recsys}}$ is the recommendation system loss, and $\text{loss}_{\text{demographics}_i}$ is the loss for the $i$-th demographic field prediction task.
Two special cases are noteworthy. First, for λ = 0
this formulation reduces back to the regular recom-
mendation system. Second, setting λ < 0 we get the
multi-task setting where we are trying to achieve both
the recommendation and the demographic prediction
tasks simultaneously.
The gradient descent update for the rest of the recommendation system parameters (namely, the item representations and biases) is done by the regular $\theta \leftarrow \theta - \alpha\, \partial \text{loss}_{\text{recsys}} / \partial \theta$ update rule. Likewise, the parameters of the demographic field readouts are optimized in the same way with respect to their respective classification objectives.
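To make the training procedure concrete, the following is a minimal PyTorch-style sketch (ours, not the authors' implementation; the recsys and readouts interfaces are assumptions). The gradient reversal layer is the identity in the forward pass and multiplies gradients by -lam in the backward pass, so a single backward pass over the combined loss reproduces update rule (1) for p_u, while the item parameters and the readout parameters receive their ordinary gradients.

import torch
import torch.nn.functional as F

class GradReverse(torch.autograd.Function):
    """Gradient reversal layer (Ganin et al., 2016): identity forward,
    gradient multiplied by -lam on the way back."""
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lam * grad_output, None

def grad_reverse(x, lam=1.0):
    return GradReverse.apply(x, lam)

def training_step(recsys, readouts, batch, demo_targets, optimizer, lam=0.1):
    # Assumed interfaces: recsys.loss(batch) is the recommendation loss and
    # recsys.user_vectors(batch) returns the p_u of the users in the batch;
    # readouts[i] is the readout head for the i-th censored demographic field.
    optimizer.zero_grad()
    loss = recsys.loss(batch)
    p_u = recsys.user_vectors(batch)
    for readout, target in zip(readouts, demo_targets):
        loss = loss + F.cross_entropy(readout(grad_reverse(p_u, lam)), target)
    # p_u receives d(loss_recsys)/dp_u - lam * sum_i d(loss_demo_i)/dp_u,
    # exactly as in update rule (1).
    loss.backward()
    optimizer.step()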
4 A NOTE ABOUT FAIRNESS
The issue of fairness of models is a long standing de-
bate in the scientific community and beyond. In some cases, aspects of fairness are mandated by legislation
or social norms. For instance, when modeling risk for
the purpose of loans, in most countries the use of cer-
tain characteristics (such as gender or race) would be
strictly prohibited. However, while these protected
variables are not explicitly entered into the model,
how can we be sure that they did not affect the out-
come via correlations with variables which were in-
deed included?
Many methods have been devised to address this
question and guarantee fairness (see Section 2 for a
brief summary of some of these lines of work). The
ultimate solution to the problem of fairness is to be
able to guarantee that a set of variables did not have
an effect on the outcome of a model. In order to be
able to give such a guarantee we should be able to
assert that the information content of the forbidden
variables was not present in the model.
It is easy to see why just excluding protected vari-
ables from the model is not enough. Suppose for in-
stance we wish to exclude gender from our model in
order to have equality in the outcome of the model
with respect to this attribute. Namely, we want the
distributions of outcomes to be the same for all gen-
ders. The first step is to exclude this variable from
the model. However, including other seemingly be-
nign attributes such as occupation will allow gender
to be implicitly included in the model (since presum-
ably these variables are dependent, i.e. share infor-
mation that will allow the model to ’guess’ the gender
anyway).
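The following synthetic example (ours, not from the paper) makes this point concrete: a model trained without the protected attribute but with a correlated proxy still produces outcomes that differ sharply across the protected groups.

import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 50_000
gender = rng.integers(0, 2, n)
occupation = (gender ^ (rng.random(n) < 0.2)).astype(int)   # a noisy proxy for gender
outcome = (gender ^ (rng.random(n) < 0.1)).astype(int)      # historical outcome is gender-biased

# The model never sees gender, only the seemingly benign proxy ...
model = LogisticRegression().fit(occupation.reshape(-1, 1), outcome)
pred = model.predict(occupation.reshape(-1, 1))

# ... yet its decisions still differ sharply between the two groups.
print("positive rate, gender 0:", pred[gender == 0].mean())   # about 0.2
print("positive rate, gender 1:", pred[gender == 1].mean())   # about 0.8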
The method of privacy-adversarial training of rec-
ommender systems presented in this paper is a step
towards fairness in machine learning models. By ap-
plying privacy-adversarial training with respect to a
set of protected attributes, we ensure that the repre-
sentation of the individual does not include these at-
tributes, and they therefore can’t affect the outcome.
However, we note that the method does not come with
a provable guarantee that this information was indeed
scrubbed from the model. Furthermore, in a recent
study (Elazar and Goldberg, 2018) it has been shown
that a similar technique is able to eliminate demo-
graphic classification during training in an NLP task,
but the text representations produced by the model
still contain demographic information that can be ex-
tracted via different classifiers. In the next section we
provide evidence that at least in the context of recom-
mendation systems the method of privacy-adversarial
training works well, and that the resulting represen-
tations do not contain information about gender and
age that can be read out even by an additional classifier
(Table 1). That being said, it is important to remem-
ber that for critical issues such as privacy and fairness
we would ideally want provability of the properties of
the method.
5 RESULTS
Data. The experiments in this section were con-
ducted on the MovieLens 1M dataset (Harper and
Konstan, 2016). This extensively studied dataset (see
for example (Miller et al., 2003; Chen et al., 2010;
Jung, 2012; Peralta, 2007)) includes 1,000,209 rat-
ings from 6,040 users, on 3,706 movies. In addition,
demographic information in the form of age and gen-
der is provided for each user. Gender (male/female)
is skewed towards male with 71.7% in the male cate-
gory. Age is divided into 7 groups (0-18, 18-25, 25-
35, 35-45, 45-50, 50-56, 56-inf) with 34.7% in the
most popular age group, being 25-35. This means that
when absolutely no personal data is given about an ar-
bitrary user, the prediction accuracy of gender and age
group cannot exceed 71.7% and 34.7% respectively.
Recommendation System. We use the Bayesian
Personalized Ranking (BPR) recommendation system
(Rendle et al., 2009), a natural choice for ranking
tasks. The model is modified with adversarial de-
mographic prediction by appending a linear readout
structure from the user-vectors to each of the demo-
graphic targets (binary gender and 7-category age).
The gradient reversal layer (GRL) (Ganin et al., 2016)
is applied between the user-vectors and each of the de-
mographic readout structures, so that effectively dur-
ing training the user vectors are optimized with re-
spect to the recommendation task, but de-optimized
with respect to the demographic predictions. At the same
time, the demographic readout is optimized to use the
current user representation to predict gender and age.
The result of this scheme is a user representation that
is good for recommendation but not good for demo-
graphic prediction (i.e. is purged of the information
we do not want it to contain). We note that the same
method could be applied to any type of recommenda-
tion system which includes user representations and
is trained using a gradient based method.
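A simplified sketch of such a model follows (ours, not the exact experimental code); it reuses the hypothetical grad_reverse helper from the sketch in Section 3.1 and assumes one sampled negative item per observed interaction.

import torch
import torch.nn as nn
import torch.nn.functional as F

class PrivacyAdversarialBPR(nn.Module):
    def __init__(self, n_users, n_items, dim=10, n_age_groups=7, lam=0.1):
        super().__init__()
        self.user_emb = nn.Embedding(n_users, dim)    # p_u
        self.item_emb = nn.Embedding(n_items, dim)    # q_i
        self.item_bias = nn.Embedding(n_items, 1)
        self.gender_readout = nn.Linear(dim, 2)       # linear demographic read-outs
        self.age_readout = nn.Linear(dim, n_age_groups)
        self.lam = lam

    def score(self, u, i):
        return (self.user_emb(u) * self.item_emb(i)).sum(-1) + self.item_bias(i).squeeze(-1)

    def loss(self, u, pos, neg, gender, age):
        # BPR pairwise loss: the observed item should outrank the sampled negative.
        bpr = -F.logsigmoid(self.score(u, pos) - self.score(u, neg)).mean()
        # Demographic read-outs sit behind the gradient reversal layer, so the
        # user vectors are de-optimized for predicting gender and age.
        p_u = grad_reverse(self.user_emb(u), self.lam)
        adv = (F.cross_entropy(self.gender_readout(p_u), gender)
               + F.cross_entropy(self.age_readout(p_u), age))
        return bpr + adv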
Evaluation. Recommendation systems were eval-
uated using a hold out set. For each user in the
MovieLens 1M Dataset, the final movie that they
watched was set aside for testing, and never seen dur-
ing training. The fraction of users for whom this
held out movie was in the top-k recommendation (Ko-
ren, 2008) is reported as model accuracy (we use
k = 10). Private information in the user representa-
tions was evaluated using both a neural-net predictor
Table 1: Verification of the inability to predict demographic fields from user representations trained in the privacy-adversarial method. Results in this table are given for representations of size 10 with λ = 1.

classifier                    gender    age
largest class baseline         71.70    34.70
softmax neural net readout     71.70    34.47
SVM (linear; C=0.1)            71.71    34.62
SVM (linear; C=1)              71.71    34.59
SVM (linear; C=10)             71.71    34.59
SVM (RBF kernel)               71.71    29.20
Decision Tree                  64.00    25.44
Random Forest                  69.80    29.20
Gradient Boosting              71.79    33.48
Table 2: Gender prediction from user representations. First column corresponds to the regular recommendation system, and the following columns to privacy-adversarial training with the prescribed value of λ. Rows correspond to the size of user and item representations. The final row contains the naïve baseline of predicting the largest class.

size / λ     0       .01     .1      1       10
10           76.97   74.21   71.33   71.70   71.60
20           77.55   74.26   71.50   72.30   71.03
50           77.80   74.34   72.24   86.00   74.26
naïve               ···   71.70%   ···
of the same form used during adversarial training, and
a host of standard classifiers (SVMs with various parameters and kernels, Decision Trees, Random Forest; see Table 1). The rest of the results are shown for the original neural classifier with a cross-validation procedure. Results are reported as accuracy.
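A sketch of this evaluation protocol follows (ours; the score matrix and held-out indices are assumed to be available).

import numpy as np

def hit_rate_at_k(scores, held_out, seen, k=10):
    """scores:   (n_users, n_items) predicted affinities, e.g. p_u . q_i + b_i
    held_out: held_out[u] is the index of the last movie watched by user u
    seen:     seen[u] is the set of items user u interacted with during training
    Returns the fraction of users whose held-out movie is in their top-k list."""
    hits = 0
    for u, target in enumerate(held_out):
        user_scores = scores[u].copy()
        user_scores[list(seen[u])] = -np.inf      # never recommend already-seen items
        top_k = np.argpartition(-user_scores, k)[:k]
        hits += int(target in top_k)
    return hits / len(held_out)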
Results-privacy. Private demographic information
does indeed exist in user representations in the stan-
dard recommendation system. Gender prediction (Ta-
ble 2, λ = 0 column) increases with size of user rep-
resentation to 77.8% (recall 71.7% are Male). Like-
wise, age bracket prediction also increases with size
of user representation and reaches 44.90% (largest
category is 34.7%). These results serve as the base-
line against which the adversarial training models are tested. Our aim in the privacy-adversarial setting will be to reduce the classification results down to the baseline, reflecting user representations that have been purged of this private information.
Results-privacy-adversarial. In the privacy-
adversarial setting, overall prediction results for both
gender and age are diminished to the desired level of
the largest class. With λ = .1, for example, age pre-
Table 3: Age prediction from user representations. First column corresponds to the regular recommendation system, and the following columns to privacy-adversarial training with the prescribed value of λ. Rows correspond to the size of user and item representations. The final row contains the naïve baseline of predicting the largest class.

size / λ     0       .01     .1      1       10
10           41.29   36.28   34.29   34.47   34.31
20           44.16   36.34   33.97   35.70   38.01
50           44.90   36.46   34.62   70.27   52.70
naïve               ···   34.70%   ···
diction is eliminated completely (reducing effectively
to the 34.7% baseline) for all sizes of representation,
and likewise for gender with representations of sizes 10 and 20. For size 50 we see some residual predictive
power, though it is highly reduced relative to the
regular recommendation system.
For the large representation (size 50) and large values of λ (in the range λ ≥ 1) we see an interesting phenomenon of reversal of the effect, with the demographic readout sometimes well above the regular recommendation system (e.g., when the embedding size is 50 and λ = 1, gender prediction reaches 86.0%). We suspect this happens due to the relatively high learning rate, which causes the system to diverge.
With respect to the trade-off between system per-
formance and privacy, results indicate (Table 4) that smaller user representations (size 10) are preferable for this small dataset. We see some degradation
with adversarial training, but nevertheless we are able
to eliminate private information almost entirely with
representations of size 10 and λ = .1 while sacrific-
ing only a small proportion of performance (accu-
racy@10 of 2.88% instead of the 3.05% for the regu-
lar system, gender information gap of 0.37% and age
information gap of 0.41% from the majority group).
Together, these results show the existence of the
privacy leakage in a typical recommendation system,
and the ability to eliminate it with privacy-adversarial
training while harming the overall system perfor-
mance only marginally.
Table 4: Recommendation system performance (accuracy@10) with privacy-adversarial training. First column corresponds to the regular recommendation system, and the following columns to privacy-adversarial training with the prescribed value of λ. Rows correspond to the size of user and item representations.

size / λ     0      .01    .1     1      10
10           3.05   2.76   2.88   2.43   2.67
20           3.00   2.68   2.04   2.20   2.07
50           2.65   2.38   2.22   2.22   2.15
6 CONCLUSIONS
In this paper we discuss information leakage from
user representations of latent factor recommender systems, and show that private demographic information can be read out even when it is not used in the training data. We adapt the adversarial training frame-
work in the context of privacy in recommender sys-
tems. An adversarial component is appended to the
model for each of the demographic variables we want
to obfuscate, so that the learned user representations
are optimized in a way that precludes predicting these
variables. We show that the proposed framework has
the desired privacy preserving effect, while having a
minimal overall adverse effect on recommender per-
formance, when using the correct value of the trade-
off parameter λ. Our experiments show that this value
should be determined for a given dataset, since values
too large lead to instability of the adversarial com-
ponent. But in any case, as suggested by (Elazar and
Goldberg, 2018), when concerned with sensitive features, one should verify the amount of information in the representation with additional post-hoc classifiers.
The adversarial method can be used to obfuscate
any private variable known during training (in this
paper we discuss categorical variables, but the gen-
eralization to the continuous case is trivial). While
at first glance this may be seen as a shortcoming of
the approach, it is interesting to note that it would be
inherently infeasible to force the representation not
to include any factor implicitly associated with item
choice. Clearly, in such a case there would be no
information left to drive recommendations. The in-
tended use of the method is rather to hide a small set
of protected variables known during training, while
using the rest of the implicit information in the usage
data to drive recommendations.
An interesting topic for further research is the
amount of private information that is available in the
top-k recommendations themselves. Since the sole
reason private demographic information is present in
the user representations is to help drive recommenda-
tions, it stands to reason that it would be possible to
design a method of reverse-engineering in the form
of a readout from the actual recommendations. Such
a leakage, to the extent that it indeed exists, would
have much further reaching practical implications for
privacy and security of individuals.
Another topic for further research is the use of
privacy-adversarial training to boost the personaliza-
tion and specificity of recommendations. By elimi-
nating the demographic (or other profile related) in-
formation, suggested items are coerced out of stereo-
typical templates related to coarse profiling. It is our
hope that user testing will confirm that this leads to
deeper and more meaningful user models, and overall
higher quality recommendations.
REFERENCES
Adomavicius, G. and Tuzhilin, A. (2015). Context-aware
recommender systems. In Recommender systems
handbook, pages 191–226. Springer.
Bennett, J., Lanning, S., et al. (2007). The netflix prize.
In Proceedings of KDD cup and workshop, volume
2007, page 35. New York, NY, USA.
Berlioz, A., Friedman, A., Kaafar, M. A., Boreli, R., and
Berkovsky, S. (2015). Applying differential privacy
to matrix factorization. In Proceedings of the 9th
ACM Conference on Recommender Systems, pages
107–114. ACM.
Beutel, A., Chen, J., Zhao, Z., and Chi, E. H. (2017). Data
decisions and theoretical implications when adver-
sarially learning fair representations. arXiv preprint
arXiv:1707.00075.
Chen, Y., Harper, F. M., Konstan, J., and Li, S. X. (2010).
Social comparisons and contributions to online com-
munities: A field experiment on movielens. American
Economic Review, 100(4):1358–98.
Covington, P., Adams, J., and Sargin, E. (2016). Deep
neural networks for youtube recommendations. In
Proceedings of the 10th ACM Conference on Recom-
mender Systems, pages 191–198. ACM.
Elazar, Y. and Goldberg, Y. (2018). Adversarial removal
of demographic attributes from text data. In Proceed-
ings of the 2018 Conference on Empirical Methods in
Natural Language Processing.
Friedman, A., Berkovsky, S., and Kaafar, M. A. (2016).
A differential privacy framework for matrix factoriza-
tion recommender systems. User Modeling and User-
Adapted Interaction, 26(5):425–458.
Ganin, Y., Ustinova, E., Ajakan, H., Germain, P.,
Larochelle, H., Laviolette, F., Marchand, M., and
Lempitsky, V. (2016). Domain-adversarial training of
neural networks. The Journal of Machine Learning
Research, 17(1):2096–2030.
Harper, F. M. and Konstan, J. A. (2016). The movielens
datasets: History and context. ACM Transactions on
Interactive Intelligent Systems (TiiS), 5(4):19.
Jung, J. J. (2012). Attribute selection-based recommenda-
tion framework for short-head user group: An empiri-
cal study by movielens and imdb. Expert Systems with
Applications, 39(4):4049–4054.
Koren, Y. (2008). Factorization meets the neighborhood:
a multifaceted collaborative filtering model. In Pro-
ceedings of the 14th ACM SIGKDD international con-
ference on Knowledge discovery and data mining,
pages 426–434. ACM.
Liu, Z., Wang, Y.-X., and Smola, A. (2015). Fast differen-
tially private matrix factorization. In Proceedings of
the 9th ACM Conference on Recommender Systems,
pages 171–178. ACM.
McSherry, F. and Mironov, I. (2009). Differentially private
recommender systems: Building privacy into the net-
flix prize contenders. In Proceedings of the 15th ACM
SIGKDD international conference on Knowledge dis-
covery and data mining, pages 627–636. ACM.
Miller, B. N., Albert, I., Lam, S. K., Konstan, J. A., and
Riedl, J. (2003). Movielens unplugged: experiences
with an occasionally connected recommender system.
In Proceedings of the 8th international conference on
Intelligent user interfaces, pages 263–266. ACM.
Narayanan, A. and Shmatikov, V. (2008). Robust de-
anonymization of large sparse datasets. In Security
and Privacy, 2008. SP 2008. IEEE Symposium on,
pages 111–125. IEEE.
Nikolaenko, V., Ioannidis, S., Weinsberg, U., Joye, M., Taft,
N., and Boneh, D. (2013). Privacy-preserving ma-
trix factorization. In Proceedings of the 2013 ACM
SIGSAC conference on Computer & communications
security, pages 801–812. ACM.
Pazzani, M. J. (1999). A framework for collaborative,
content-based and demographic filtering. Artificial in-
telligence review, 13(5-6):393–408.
Peralta, V. (2007). Extraction and integration of movielens
and imdb data. Laboratoire Prisme, Université de Versailles, Versailles, France.
Rendle, S., Freudenthaler, C., Gantner, Z., and Schmidt-
Thieme, L. (2009). Bpr: Bayesian personalized rank-
ing from implicit feedback. In Proceedings of the
twenty-fifth conference on uncertainty in artificial in-
telligence, pages 452–461. AUAI Press.
Shen, Y. and Jin, H. (2014). Privacy-preserving person-
alized recommendation: An instance-based approach
via differential privacy. In Data Mining (ICDM), 2014
IEEE International Conference on, pages 540–549.
IEEE.
Weinsberg, U., Bhagat, S., Ioannidis, S., and Taft, N.
(2012). Blurme: Inferring and obfuscating user gen-
der based on ratings. In Proceedings of the sixth ACM
conference on Recommender systems, pages 195–202.
ACM.
Xie, Q., Dai, Z., Du, Y., Hovy, E., and Neubig, G. (2017).
Controllable invariance through adversarial feature
learning. In Advances in Neural Information Process-
ing Systems, pages 585–596.
Zemel, R., Wu, Y., Swersky, K., Pitassi, T., and Dwork, C.
(2013). Learning fair representations. In International
Conference on Machine Learning, pages 325–333.
Zhang, B. H., Lemoine, B., and Mitchell, M. (2018).
Mitigating unwanted biases with adversarial learning.
arXiv preprint arXiv:1801.07593.
Zhao, X. W., Guo, Y., He, Y., Jiang, H., Wu, Y., and
Li, X. (2014). We know what you want to buy: a
demographic-based system for product recommenda-
tion on microblogs. In Proceedings of the 20th ACM
SIGKDD international conference on Knowledge dis-
covery and data mining, pages 1935–1944. ACM.