Privacy and Fairness in Recommender Systems via Adversarial Training
of User Representations
Yehezkel S. Resheff (1), Yanai Elazar (2,3), Moni Shahar (1) and Oren Sar Shalom (3)
(1) Intuit Tech Futures, Israel
(2) Bar Ilan University, Israel
(3) Intuit, Israel
Keywords: Privacy, Representations, Information Leakage.
Abstract:
Latent factor models for recommender systems represent users and items as low dimensional vectors. Privacy
risks of such systems have previously been studied mostly in the context of recovery of personal information
in the form of usage records from the training data. However, the user representations themselves may be used
together with external data to recover private user information such as gender and age. In this paper we show
that user vectors calculated by a common recommender system can be exploited in this way. We propose
the privacy-adversarial framework to eliminate such leakage of private information, and study the trade-off
between recommender performance and leakage both theoretically and empirically using a benchmark dataset.
An advantage of the proposed method is that it also helps guarantee fairness of results, since all implicit
knowledge of a set of attributes is scrubbed from the representations used by the model, and thus can’t enter
into the decision making. We discuss further applications of this method towards the generation of deeper and
more insightful recommendations.
1 INTRODUCTION
With the increasing popularity of digital content con-
sumption, recommender systems have become a ma-
jor influencer of online behavior, from what news we
read and what movies we watch, to what products we
buy. Recommender systems have revolutionized the
way items are chosen across multiple domains, mov-
ing users from the previous active search scenario to
the more passive selection of presented content as we
know it today.
A recommender system needs to fulfill two objec-
tives in order to be able to supply relevant recommen-
dations that will be helpful to users. Namely, it has
to accurately model both the users and the items. An
implication of the first objective is that recommenders
aim to reveal users’ preferences and desires, and even to learn when to suggest different items
(Adomavicius and Tuzhilin, 2015), and how many
times to recommend a given item. In order to be ef-
fective, a system must have an extensive view of a
user, embodied as a representation that includes a sub-
stantial amount of private information (which may be
either implicitly or explicitly encoded).
Indeed, in recent years we have seen a plethora
of advanced methods applied to improve the person-
alized recommendation problem. Furthermore, mod-
ern Collaborative Filtering approaches are becoming
increasingly complex, not only in algorithmic
terms but also in their ability to process and use ad-
ditional data. Arguably, demographic data is the most
valuable source of information for user modeling. As
such, it was used from the early days of recommen-
dation systems (Pazzani, 1999). New deep learn-
ing based state of the art methods also utilize de-
mographic information (Covington et al., 2016; Zhao
et al., 2014) in order to generate better and more rele-
vant predictions.
While this in-depth modeling of users holds great
value, it may also pose severe privacy threats. One
major privacy concern, especially when private in-
formation is explicitly used during training, is recov-
ery of records from the training data by an attacker.
This aspect of privacy has previously been studied in
the context of recommender systems using the frame-
work of differential privacy (see Section 2 below). A
related but different threat, which to the best of our
knowledge has not yet been addressed in this context,
is the recovery of private information that was not ex-
plicitly present in the training data. In the implicit pri-
vate information attack, either the recommendations
or the user representations within a recommender sys-
tem are used together with external information to
uncover private information about the users.
For example, an attacker with access to the gender
of some users could use their representations and a
supervised learning approach to infer the gender of
all users. In an extreme case, it would suffice for an
attacker to glance at a random individual’s computer
screen, see what content is being suggested to them by
the recommendation system, and immediately infer
a wealth of demographic, financial, and other private information
pertaining to the individual.
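To make the attack concrete, the following is a minimal sketch (ours, not taken from any deployed system) of such an inference attack; the array names and files are hypothetical, and we assume the attacker has obtained the user vectors together with gender labels for a subset of users.

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Hypothetical inputs: user_vecs[i] is the latent vector of user i,
# known_idx indexes the users whose gender the attacker already knows.
user_vecs = np.load("user_vectors.npy")      # shape: (n_users, dim)
known_idx = np.load("known_idx.npy")         # shape: (n_known,)
known_gender = np.load("known_gender.npy")   # values in {0, 1}

# Fit a supervised read-out on the known users and check how well it generalizes.
clf = LogisticRegression(max_iter=1000)
print("read-out accuracy:",
      cross_val_score(clf, user_vecs[known_idx], known_gender, cv=5).mean())

# Infer the gender of every remaining user from their representation alone.
clf.fit(user_vecs[known_idx], known_gender)
unknown = np.setdiff1d(np.arange(len(user_vecs)), known_idx)
inferred_gender = clf.predict(user_vecs[unknown])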
In this paper we introduce the threat of implicit
private information leakage in recommender systems.
We show the existence of the leakage both theoret-
ically, and experimentally using a standard recom-
mender and benchmark dataset. Finally, we pro-
pose the privacy-adversarial method of constructing
a recommender from which the target private infor-
mation cannot be read out, using the adversarial train-
ing method (Ganin et al., 2016). We show the ability
of this method to conceal private information and the
trade-off between recommender performance and pri-
vate information leakage.
The contributions of this paper are as follows:
- We describe and formalize the threat of implicit information leakage in machine learning models in general, and recommenders in particular.
- We suggest the privacy-adversarial method (an adaptation of adversarial training to our setting) to eliminate the leakage, and validate the method in an extensive set of experiments.
- Finally, we discuss the tangential issue of fairness in machine learning, and suggest that the privacy-adversarial method we use in this paper for the sake of privacy is capable of fixing some fairness problems previously discussed in the context of general machine learning models.
2 RELATED WORK
Since the release of the Netflix Prize dataset (Bennett
et al., 2007), a large body of work has shown that users’ raw historical usage may reveal private information about them (e.g., (Weinsberg et al., 2012; Narayanan
and Shmatikov, 2008)). That is, given historical usage
of an anonymized user, one can infer the user’s de-
mographics or even their identity. However, we argue
that even the users’ representations may reveal private
information, without access to the training data (and
in fact, the same is true of the recommendations presented to the user: the actual recommendations, which can be intercepted or seen by a third party, include private information that we may wish to conceal).
Privacy has also been studied recently in the con-
text of recommender systems (Berlioz et al., 2015;
Nikolaenko et al., 2013; McSherry and Mironov,
2009; Friedman et al., 2016; Liu et al., 2015; Shen
and Jin, 2014). This growing body of work has been
concerned for the most part with differentially private
recommender systems, and always with the aim of
guaranteeing that the actual records used to train the
system are not recoverable. Unlike these works, we
are concerned with the leakage of private information
(such as demographics: age, gender, etc.) that was
not directly present during training, but was implic-
itly learned by the system in the process of generating
a useful user representation. These two ideas are not
mutually exclusive, and may be combined to achieve
a better privacy preserving recommender.
The problem of implicit private information stud-
ied here is closely related to the one studied in (Zemel
et al., 2013). In their work, they look for represen-
tations that achieve both group fairness and individ-
ual fairness in classification tasks. Individual fairness
means that two persons having a similar representa-
tion should be treated similarly. Group fairness means
that given a group we wish to protect (some proper
subset S ⊂ P of the population P), the proportion of
people positively classified in S equals that in the en-
tire population. They achieved this goal by solving an
optimization problem to learn an intermediate repre-
sentation that obfuscates the membership in the pro-
tected group. In both cases the aim is to achieve good
results on the respective predictive tasks, while using
a representation that is agnostic to some aspects of the
implicit structure of the domain.
Several works apply adversarial training for the
purpose of creating a representation free of a specific
attribute (Beutel et al., 2017; Xie et al., 2017; Zhang
et al., 2018; Elazar and Goldberg, 2018) (in fact, the
original purpose of the method of adversarial training
can also be seen as such). To the best of our knowl-
edge, the current paper represents the first application
of these ideas in the domain of recommender systems.
Furthermore, while all the aforementioned applica-
tions experiment with solely one feature at a time, in
this work we aim to create a representation free of
multiple demographic features.
3 PRIVACY AND USER REPRESENTATIONS
User representations in recommendation systems are
designed to encode the relevant information to deter-
mine user-item affinity. As such, to the extent that
item affinity is dependent on certain user character-
istics (such as demographics), the optimal user rep-
resentations must include this information as well in
order to have the necessary predictive power with re-
spect to recommendations. We formalize this intu-
ition using an information theoretic approach:
Theorem 1. Let $\hat{v} = f(p_u, q_i)$ be an estimator of an outcome variable $v$ associated with a pair $(p_u, q_i)$ of user and item representations (we use $p$ to denote users, and $q$ to denote items throughout the paper), and let $d$ be any variable associated with users (such as age, gender, or marital status, to name a few):
$$I(v; \hat{v}) > H[v|d] \;\Longrightarrow\; I(\hat{v}; d) > 0$$
Proof.
$$H[v] = H[v|d] + I(v; d) = H[v|\hat{v}] + I(v; \hat{v})$$
Rearranging and using $I(v; \hat{v}) > H[v|d]$ we have:
$$I(v; d) > H[v|\hat{v}]$$
and therefore:
$$I(v; d) + I(v; \hat{v}) > H[v|\hat{v}] + H[v] - H[v|\hat{v}] = H[v] \;\Longrightarrow\; I(\hat{v}; d) > 0$$
To see how the final step follows, suppose on the contrary that $I(\hat{v}; d) = 0$; then from the above and by the chain rule for information we have:
$$H[v] < I(v; d) + I(v; \hat{v}) = I(v; d) + I(v; \hat{v}|d) = I(v; d, \hat{v})$$
in contradiction to the $H[x] \geq I(x; \cdot)$ relation of entropy and mutual information.
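As a toy numerical illustration of Theorem 1 (our own construction, not part of the paper), consider a binary attribute d that strongly determines an outcome v, and an accurate estimator of v; estimating the mutual information empirically shows that the predictions leak the attribute:

import numpy as np
from sklearn.metrics import mutual_info_score

rng = np.random.default_rng(0)
n = 100_000
d = rng.integers(0, 2, n)                          # user attribute, e.g. gender
v = (d ^ (rng.random(n) < 0.1)).astype(int)        # outcome depends strongly on d
v_hat = (v ^ (rng.random(n) < 0.05)).astype(int)   # accurate estimator of v

# Here I(v; v_hat) exceeds H[v|d], and as the theorem predicts I(v_hat; d) > 0:
# the predictions themselves carry information about the protected attribute.
print("I(v; v_hat) =", mutual_info_score(v, v_hat))   # roughly 0.5 nats
print("I(v_hat; d) =", mutual_info_score(v_hat, d))   # clearly positive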
Theorem 1 asserts that the predictions $\hat{v}$ must contain information about any relevant variable associ-
ated with users, if they are better than a certain thresh-
old determined by the relevance of the variable. For
example, if age is a strong determinant of the movies
a user is likely to want to watch, then by looking at
the recommendations for a user we will be able to
extract some information about said user’s age. Next,
we show that the same is true about the user repre-
sentations used by the recommender, once the item
representations are fixed (i.e. after learning):
Theorem 2. Let $\hat{v} = f(p_u, q_i) = g(p_u)$ be an estimator of outcome variable $v$, and $d$ a variable associated with users; then $I(p_u; d) \geq I(\hat{v}; d)$.
Proof. $\hat{v}$ is a function of $p_u$ alone, and so by the data processing inequality we have $\forall d: I(p_u; d) \geq I(\hat{v}; d)$.
Corollary 1. As a result of Theorems 1 and 2, for any meaningful recommendation system and user characteristic there will be information leakage between the user representation and the characteristic. Specifically, for any system $\hat{v} = f(p_u, q_i)$ and user characteristic $d$ we have that:
$$I(p_u; d) \geq I(\hat{v}; d) > 0$$
This assertion, that we cannot have both per-
fect privacy and performance in our setting, naturally
leads to the question of the trade-off between the ex-
tent of information leakage, and the performance of
the recommendation system. While the precise point
selected on this trade-off curve is likely to be deter-
mined by the use-case, it would be reasonable to as-
sume that in any case we will not want to sacrifice
privacy unless we gain in performance. This can be
understood as a Pareto optimality requirement on the
multi-objective defined by the system and privacy ob-
jectives:
Definition 1. The privacy-acceptable subset of a family $S$ of recommendation systems of the form $f : [n_{\text{users}}] \times [n_{\text{items}}] \to \mathbb{R}_+$, with respect to a recommendation loss $l$ and a privacy target $h$, is the Pareto front in $S$ of the multi-objective $(l, h)$.
While the method described in the rest of this
paper does not directly address the issue of select-
ing a privacy-acceptable system, we show that our
method is able to dramatically reduce information
leakage while maintaining the majority of system per-
formance. Future work will focus on methods and an-
alytical tools in the spirit of Theorem 1 to assert that
for a given system with a certain performance, there
does not exist a system (in the family under consider-
ation) with at least equal performance and better pri-
vacy.
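As an illustration of Definition 1, a minimal sketch (ours) of selecting the Pareto front among candidate systems follows; the candidate tuples and their measured values are hypothetical, and lower is taken to be better for both the recommendation loss l and the leakage measure h.

def pareto_front(candidates):
    """candidates: list of (name, l, h) with recommendation loss l and
    measured privacy leakage h; lower is better for both objectives.
    Returns the privacy-acceptable subset: systems not dominated by any other."""
    front = []
    for name, l, h in candidates:
        dominated = any((l2 <= l and h2 <= h) and (l2 < l or h2 < h)
                        for _, l2, h2 in candidates)
        if not dominated:
            front.append((name, l, h))
    return front

# Hypothetical measurements for three trained systems:
print(pareto_front([("lambda=0", 0.30, 0.08),
                    ("lambda=0.1", 0.31, 0.01),
                    ("lambda=10", 0.35, 0.01)]))
# -> keeps lambda=0 and lambda=0.1; lambda=10 is dominated by lambda=0.1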
3.1 Privacy-Adversarial Recommendation Systems
In the previous section we showed that for any rec-
ommendation system where the user representations
capture enough information to make good predictions, these rep-
resentations must reveal information about any perti-
nent user characteristic. In this section we describe a
method to remove such information from the user representations, in a way that allows selecting a point along the trade-off curve between performance and information leakage.
The method we use borrows the key idea from
domain-adversarial training (Ganin et al., 2016),
where the aim is to learn a representation that is
agnostic to the domain from which the example is
drawn. Adapted to the problem at hand, this method
enables us to construct a user representation from
which the private information can not be read out.
We start with an arbitrary latent factor recommen-
dation system (which we will assume is trained using
a gradient method). An additional readout construct is then attached to the user vectors, the output of which
is the private field(s) we wish to censor. During train-
ing we follow two goals: (a) we would like to change
the recommender parameters to optimize the original
system objective, and (b) we would like to change
the user vectors only, in order to harm the readout
of the private information, while optimizing the read-
out parameters themselves with respect to the private
information readout target. This is achieved by ap-
plication of the gradient reversal trick introduced in
(Ganin et al., 2016), leading to the following update
rule for user representation $p_u$:
$$p_u \leftarrow p_u - \alpha \left[ \frac{\partial\, \text{loss}_{\text{recsys}}}{\partial p_u} - \lambda \sum_i \frac{\partial\, \text{loss}_{\text{demographics}_i}}{\partial p_u} \right] \qquad (1)$$
where $\alpha$ and $\lambda$ are the general learning rate and the adversarial training learning rate, respectively; $\text{loss}_{\text{recsys}}$ is the recommendation system loss, and $\text{loss}_{\text{demographics}_i}$ is the loss for the $i$-th demographic field prediction task.
Two special cases are noteworthy. First, for λ = 0
this formulation reduces back to the regular recom-
mendation system. Second, setting λ < 0 we get the
multi-task setting where we are trying to achieve both
the recommendation and the demographic prediction
tasks simultaneously.
The gradient descent update for the rest of the recommendation system parameters (namely, the item representations and biases) is done by the regular $\theta \leftarrow \theta - \alpha\, \partial \text{loss}_{\text{recsys}} / \partial \theta$ update rule. Likewise, the parameters of the demographic field readouts are optimized in the same way with respect to their respective classification objectives.
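To make the training procedure concrete, the following is a minimal PyTorch-style sketch (ours, not the authors' implementation; the recsys and readouts interfaces are assumptions). The gradient reversal layer is the identity in the forward pass and multiplies gradients by -lam in the backward pass, so a single backward pass over the combined loss reproduces update rule (1) for p_u, while the item parameters and the readout parameters receive their ordinary gradients.

import torch
import torch.nn.functional as F

class GradReverse(torch.autograd.Function):
    """Gradient reversal layer (Ganin et al., 2016): identity forward,
    gradient multiplied by -lam on the way back."""
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lam * grad_output, None

def grad_reverse(x, lam=1.0):
    return GradReverse.apply(x, lam)

def training_step(recsys, readouts, batch, demo_targets, optimizer, lam=0.1):
    # Assumed interfaces: recsys.loss(batch) is the recommendation loss and
    # recsys.user_vectors(batch) returns the p_u of the users in the batch;
    # readouts[i] is the readout head for the i-th censored demographic field.
    optimizer.zero_grad()
    loss = recsys.loss(batch)
    p_u = recsys.user_vectors(batch)
    for readout, target in zip(readouts, demo_targets):
        loss = loss + F.cross_entropy(readout(grad_reverse(p_u, lam)), target)
    # p_u receives d(loss_recsys)/dp_u - lam * sum_i d(loss_demo_i)/dp_u,
    # exactly as in update rule (1).
    loss.backward()
    optimizer.step()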
4 A NOTE ABOUT FAIRNESS
The issue of fairness of models is a long standing de-
bate in the scientific community and beyond. In some cases, aspects of fairness are mandated by legislation
or social norms. For instance, when modeling risk for
the purpose of loans, in most countries the use of cer-
tain characteristics (such as gender or race) would be
strictly prohibited. However, while these protected
variables are not explicitly entered into the model,
how can we be sure that they did not affect the out-
come via correlations with variables which were in-
deed included?
Many methods have been devised to address this
question and guarantee fairness (see Section 2 for a
brief summary of some of these lines of work). The
ultimate solution to the problem of fairness is to be
able to guarantee that a set of variables did not have
an effect on the outcome of a model. In order to be
able to give such a guarantee we should be able to
assert that the information content of the forbidden
variables was not present in the model.
It is easy to see why just excluding protected vari-
ables from the model is not enough. Suppose for in-
stance we wish to exclude gender from our model in
order to have equality in the outcome of the model
with respect to this attribute. Namely, we want the
distributions of outcomes to be the same for all gen-
ders. The first step is to exclude this variable from
the model. However, including other seemingly be-
nign attributes such as occupation will allow gender
to be implicitly included in the model (since presum-
ably these variables are dependent, i.e. share infor-
mation that will allow the model to ’guess’ the gender
anyway).
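The following synthetic example (ours, not from the paper) makes this point concrete: a model trained without the protected attribute but with a correlated proxy still produces outcomes that differ sharply across the protected groups.

import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 50_000
gender = rng.integers(0, 2, n)
occupation = (gender ^ (rng.random(n) < 0.2)).astype(int)   # a noisy proxy for gender
outcome = (gender ^ (rng.random(n) < 0.1)).astype(int)      # historical outcome is gender-biased

# The model never sees gender, only the seemingly benign proxy ...
model = LogisticRegression().fit(occupation.reshape(-1, 1), outcome)
pred = model.predict(occupation.reshape(-1, 1))

# ... yet its decisions still differ sharply between the two groups.
print("positive rate, gender 0:", pred[gender == 0].mean())   # about 0.2
print("positive rate, gender 1:", pred[gender == 1].mean())   # about 0.8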
The method of privacy-adversarial training of rec-
ommender systems presented in this paper is a step
towards fairness in machine learning models. By ap-
plying privacy-adversarial training with respect to a
set of protected attributes, we ensure that the repre-
sentation of the individual does not include these at-
tributes, and they therefore can’t affect the outcome.
However, we note that the method does not come with
a provable guarantee that this information was indeed
scrubbed from the model. Furthermore, in a recent
study (Elazar and Goldberg, 2018) it has been shown
that a similar technique is able to eliminate demo-
graphic classification during training in an NLP task,
but the text representations produced by the model
still contain demographic information that can be ex-
tracted via different classifiers. In the next section we
provide evidence that at least in the context of recom-
mendation systems the method of privacy-adversarial
training works well, and that the resulting represen-
tations do not contain information about gender and
age that can be read out even by an additional classifier
(Table 1). That being said, it is important to remem-
ber that for critical issues such as privacy and fairness
we would ideally want provability of the properties of
the method.
5 RESULTS
Data. The experiments in this section were con-
ducted on the MovieLens 1M dataset (Harper and
Konstan, 2016). This extensively studied dataset (see
for example (Miller et al., 2003; Chen et al., 2010;
Jung, 2012; Peralta, 2007)) includes 1,000,209 rat-
ings from 6,040 users, on 3,706 movies. In addition,
demographic information in the form of age and gen-
der is provided for each user. Gender (male/female)
is skewed towards male with 71.7% in the male cate-
gory. Age is divided into 7 groups (0-18, 18-25, 25-
35, 35-45, 45-50, 50-56, 56-inf) with 34.7% in the
most popular age group, being 25-35. This means that
when absolutely no personal data is given about an ar-
bitrary user, the prediction accuracy of gender and age
group cannot exceed 71.7% and 34.7% respectively.
Recommendation System. We use the Bayesian
Personalized Ranking (BPR) recommendation system
(Rendle et al., 2009), a natural choice for ranking
tasks. The model is modified with adversarial de-
mographic prediction by appending a linear readout
structure from the user-vectors to each of the demo-
graphic targets (binary gender and 7-category age).
The gradient reversal layer (GRL) (Ganin et al., 2016)
is applied between the user-vectors and each of the de-
mographic readout structures, so that effectively dur-
ing training the user vectors are optimized with re-
spect to the recommendation task, but de-optimized
with respect to the demographic predictions. At the same
time, the demographic readout is optimized to use the
current user representation to predict gender and age.
The result of this scheme is a user representation that
is good for recommendation but not good for demo-
graphic prediction (i.e. is purged of the information
we do not want it to contain). We note that the same
method could be applied to any type of recommenda-
tion system which includes user representations and
is trained using a gradient based method.
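A simplified sketch of such a model follows (ours, not the exact experimental code); it reuses the hypothetical grad_reverse helper from the sketch in Section 3.1 and assumes one sampled negative item per observed interaction.

import torch
import torch.nn as nn
import torch.nn.functional as F

class PrivacyAdversarialBPR(nn.Module):
    def __init__(self, n_users, n_items, dim=10, n_age_groups=7, lam=0.1):
        super().__init__()
        self.user_emb = nn.Embedding(n_users, dim)    # p_u
        self.item_emb = nn.Embedding(n_items, dim)    # q_i
        self.item_bias = nn.Embedding(n_items, 1)
        self.gender_readout = nn.Linear(dim, 2)       # linear demographic read-outs
        self.age_readout = nn.Linear(dim, n_age_groups)
        self.lam = lam

    def score(self, u, i):
        return (self.user_emb(u) * self.item_emb(i)).sum(-1) + self.item_bias(i).squeeze(-1)

    def loss(self, u, pos, neg, gender, age):
        # BPR pairwise loss: the observed item should outrank the sampled negative.
        bpr = -F.logsigmoid(self.score(u, pos) - self.score(u, neg)).mean()
        # Demographic read-outs sit behind the gradient reversal layer, so the
        # user vectors are de-optimized for predicting gender and age.
        p_u = grad_reverse(self.user_emb(u), self.lam)
        adv = (F.cross_entropy(self.gender_readout(p_u), gender)
               + F.cross_entropy(self.age_readout(p_u), age))
        return bpr + adv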
Evaluation. Recommendation systems were eval-
uated using a hold out set. For each user in the
MovieLens 1M Dataset, the final movie that they
watched was set aside for testing, and never seen dur-
ing training. The fraction of users for whom this
held out movie was in the top-k recommendation (Ko-
ren, 2008) is reported as model accuracy (we use
k = 10). Private information in the user representa-
tions was evaluated using both a neural-net predictor
Table 1: Verification of the inability to predict demographic fields from user representations trained in the privacy-adversarial method. Results in this table are given for representations of size 10 with λ = 1.

classifier                    gender    age
largest class baseline         71.70    34.70
softmax neural net readout     71.70    34.47
SVM (linear; C=0.1)            71.71    34.62
SVM (linear; C=1)              71.71    34.59
SVM (linear; C=10)             71.71    34.59
SVM (RBF kernel)               71.71    29.20
Decision Tree                  64.00    25.44
Random Forest                  69.80    29.20
Gradient Boosting              71.79    33.48
Table 2: Gender prediction from user representations. First column corresponds to the regular recommendation system, and the following columns to privacy-adversarial training with the prescribed value of λ. Rows correspond to the size of user and item representations. The final row contains the naïve baseline of predicting the largest class.

size / λ     0       .01     .1      1       10
10           76.97   74.21   71.33   71.70   71.60
20           77.55   74.26   71.50   72.30   71.03
50           77.80   74.34   72.24   86.00   74.26
naïve               ···   71.70%   ···
of the same form used during adversarial training, and
a host of standard classifiers (SVMs with various parameters and kernels, Decision Trees, Random Forest; see Table 1). The rest of the results are shown for the original neural classifier with a cross-validation procedure. Results are reported as accuracy.
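A sketch of this evaluation protocol follows (ours; the score matrix and held-out indices are assumed to be available).

import numpy as np

def hit_rate_at_k(scores, held_out, seen, k=10):
    """scores:   (n_users, n_items) predicted affinities, e.g. p_u . q_i + b_i
    held_out: held_out[u] is the index of the last movie watched by user u
    seen:     seen[u] is the set of items user u interacted with during training
    Returns the fraction of users whose held-out movie is in their top-k list."""
    hits = 0
    for u, target in enumerate(held_out):
        user_scores = scores[u].copy()
        user_scores[list(seen[u])] = -np.inf      # never recommend already-seen items
        top_k = np.argpartition(-user_scores, k)[:k]
        hits += int(target in top_k)
    return hits / len(held_out)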
Results-privacy. Private demographic information
does indeed exist in user representations in the stan-
dard recommendation system. Gender prediction (Ta-
ble 2, λ = 0 column) increases with size of user rep-
resentation to 77.8% (recall 71.7% are Male). Like-
wise, age bracket prediction also increases with size
of user representation and reaches 44.90% (largest
category is 34.7%). These results serve as the base-
line against which the adversarial training models are tested. Our aim in the privacy-adversarial setting will be to reduce the classification results down to the baseline, reflecting user representations that have been purged of this private information.
Results-privacy-adversarial. In the privacy-
adversarial setting, overall prediction results for both
gender and age are diminished to the desired level of
the largest class. With λ = .1, for example, age pre-
Table 3: Age prediction from user representations. First column corresponds to the regular recommendation system, and the following columns to privacy-adversarial training with the prescribed value of λ. Rows correspond to the size of user and item representations. The final row contains the naïve baseline of predicting the largest class.

size / λ     0       .01     .1      1       10
10           41.29   36.28   34.29   34.47   34.31
20           44.16   36.34   33.97   35.70   38.01
50           44.90   36.46   34.62   70.27   52.70
naïve               ···   34.70%   ···
diction is eliminated completely (reducing effectively
to the 34.7% baseline) for all sizes of representation,
and likewise for gender with representations of sizes 10 and 20. For size 50 we see some residual predictive
power, though it is highly reduced relative to the
regular recommendation system.
For the large representation (size 50) and large values of λ (in the range λ ≥ 1) we see an interesting phenomenon of reversal of the effect, with the demographic readout sometimes well above the regular recommendation system (e.g., when the embedding size is 50 and λ = 1, gender prediction reaches 86.0%). We suspect this happens due to the relatively high learning rate, which causes the system to diverge.
With respect to the trade-off between system per-
formance and privacy, results indicate (Table 4) that smaller user representations (size 10) are preferable for this small dataset. We see some degradation
with adversarial training, but nevertheless we are able
to eliminate private information almost entirely with
representations of size 10 and λ = .1 while sacrific-
ing only a small proportion of performance (accu-
racy@10 of 2.88% instead of the 3.05% for the regu-
lar system, gender information gap of 0.37% and age
information gap of 0.41% from the majority group).
Together, these results show the existence of the
privacy leakage in a typical recommendation system,
and the ability to eliminate it with privacy-adversarial
training while harming the overall system perfor-
mance only marginally.
Table 4: Recommendation system performance (accuracy@10) with privacy-adversarial training. First column corresponds to the regular recommendation system, and the following columns to privacy-adversarial training with the prescribed value of λ. Rows correspond to the size of user and item representations.

size / λ     0      .01    .1     1      10
10           3.05   2.76   2.88   2.43   2.67
20           3.00   2.68   2.04   2.20   2.07
50           2.65   2.38   2.22   2.22   2.15
6 CONCLUSIONS
In this paper we discuss information leakage from
user representations of latent factor recommender systems, and show that private demographic information can be read out even when it is not used in the training data. We adapt the adversarial training frame-
work in the context of privacy in recommender sys-
tems. An adversarial component is appended to the
model for each of the demographic variables we want
to obfuscate, so that the learned user representations
are optimized in a way that precludes predicting these
variables. We show that the proposed framework has
the desired privacy preserving effect, while having a
minimal overall adverse effect on recommender per-
formance, when using the correct value of the trade-
off parameter λ. Our experiments show that this value
should be determined for a given dataset, since values
too large lead to instability of the adversarial com-
ponent. But in any case, as suggested by (Elazar and
Goldberg, 2018), when concerned with sensitive features, one should verify the amount of information in the representation with additional post-hoc classifiers.
The adversarial method can be used to obfuscate
any private variable known during training (in this
paper we discuss categorical variables, but the gen-
eralization to the continuous case is trivial). While
at first glance this may be seen as a shortcoming of
the approach, it is interesting to note that it would be
inherently infeasible to force the representation not
to include any factor implicitly associated with item
choice. Clearly, in such a case there would be no
information left to drive recommendations. The in-
tended use of the method is rather to hide a small set
of protected variables known during training, while
using the rest of the implicit information in the usage
data to drive recommendations.
An interesting topic for further research is the
amount of private information that is available in the
top-k recommendations themselves. Since the sole
reason private demographic information is present in
the user representations is to help drive recommenda-
tions, it stands to reason that it would be possible to
design a method of reverse-engineering in the form
of a readout from the actual recommendations. Such
a leakage, to the extent that it indeed exists, would
have much further reaching practical implications for
privacy and security of individuals.
Another topic for further research is the use of
privacy-adversarial training to boost the personaliza-
tion and specificity of recommendations. By elimi-
nating the demographic (or other profile related) in-
formation, suggested items are coerced out of stereo-
typical templates related to coarse profiling. It is our
hope that user testing will confirm that this leads to
deeper and more meaningful user models, and overall
higher quality recommendations.
REFERENCES
Adomavicius, G. and Tuzhilin, A. (2015). Context-aware
recommender systems. In Recommender systems
handbook, pages 191–226. Springer.
Bennett, J., Lanning, S., et al. (2007). The netflix prize.
In Proceedings of KDD cup and workshop, volume
2007, page 35. New York, NY, USA.
Berlioz, A., Friedman, A., Kaafar, M. A., Boreli, R., and
Berkovsky, S. (2015). Applying differential privacy
to matrix factorization. In Proceedings of the 9th
ACM Conference on Recommender Systems, pages
107–114. ACM.
Beutel, A., Chen, J., Zhao, Z., and Chi, E. H. (2017). Data
decisions and theoretical implications when adver-
sarially learning fair representations. arXiv preprint
arXiv:1707.00075.
Chen, Y., Harper, F. M., Konstan, J., and Li, S. X. (2010).
Social comparisons and contributions to online com-
munities: A field experiment on movielens. American
Economic Review, 100(4):1358–98.
Covington, P., Adams, J., and Sargin, E. (2016). Deep
neural networks for youtube recommendations. In
Proceedings of the 10th ACM Conference on Recom-
mender Systems, pages 191–198. ACM.
Elazar, Y. and Goldberg, Y. (2018). Adversarial removal
of demographic attributes from text data. In Proceed-
ings of the 2018 Conference on Empirical Methods in
Natural Language Processing.
Friedman, A., Berkovsky, S., and Kaafar, M. A. (2016).
A differential privacy framework for matrix factoriza-
tion recommender systems. User Modeling and User-
Adapted Interaction, 26(5):425–458.
Ganin, Y., Ustinova, E., Ajakan, H., Germain, P.,
Larochelle, H., Laviolette, F., Marchand, M., and
Lempitsky, V. (2016). Domain-adversarial training of
neural networks. The Journal of Machine Learning
Research, 17(1):2096–2030.
Harper, F. M. and Konstan, J. A. (2016). The movielens
datasets: History and context. ACM Transactions on
Interactive Intelligent Systems (TiiS), 5(4):19.
Jung, J. J. (2012). Attribute selection-based recommenda-
tion framework for short-head user group: An empiri-
cal study by movielens and imdb. Expert Systems with
Applications, 39(4):4049–4054.
Koren, Y. (2008). Factorization meets the neighborhood:
a multifaceted collaborative filtering model. In Pro-
ceedings of the 14th ACM SIGKDD international con-
ference on Knowledge discovery and data mining,
pages 426–434. ACM.
Liu, Z., Wang, Y.-X., and Smola, A. (2015). Fast differen-
tially private matrix factorization. In Proceedings of
the 9th ACM Conference on Recommender Systems,
pages 171–178. ACM.
McSherry, F. and Mironov, I. (2009). Differentially private
recommender systems: Building privacy into the net-
flix prize contenders. In Proceedings of the 15th ACM
SIGKDD international conference on Knowledge dis-
covery and data mining, pages 627–636. ACM.
Miller, B. N., Albert, I., Lam, S. K., Konstan, J. A., and
Riedl, J. (2003). Movielens unplugged: experiences
with an occasionally connected recommender system.
In Proceedings of the 8th international conference on
Intelligent user interfaces, pages 263–266. ACM.
Narayanan, A. and Shmatikov, V. (2008). Robust de-
anonymization of large sparse datasets. In Security
and Privacy, 2008. SP 2008. IEEE Symposium on,
pages 111–125. IEEE.
Nikolaenko, V., Ioannidis, S., Weinsberg, U., Joye, M., Taft,
N., and Boneh, D. (2013). Privacy-preserving ma-
trix factorization. In Proceedings of the 2013 ACM
SIGSAC conference on Computer & communications
security, pages 801–812. ACM.
Pazzani, M. J. (1999). A framework for collaborative,
content-based and demographic filtering. Artificial in-
telligence review, 13(5-6):393–408.
Peralta, V. (2007). Extraction and integration of movielens
and imdb data. Laboratoire Prisme, Université de Versailles, Versailles, France.
Rendle, S., Freudenthaler, C., Gantner, Z., and Schmidt-
Thieme, L. (2009). Bpr: Bayesian personalized rank-
ing from implicit feedback. In Proceedings of the
twenty-fifth conference on uncertainty in artificial in-
telligence, pages 452–461. AUAI Press.
Shen, Y. and Jin, H. (2014). Privacy-preserving person-
alized recommendation: An instance-based approach
via differential privacy. In Data Mining (ICDM), 2014
IEEE International Conference on, pages 540–549.
IEEE.
Weinsberg, U., Bhagat, S., Ioannidis, S., and Taft, N.
(2012). Blurme: Inferring and obfuscating user gen-
der based on ratings. In Proceedings of the sixth ACM
conference on Recommender systems, pages 195–202.
ACM.
Xie, Q., Dai, Z., Du, Y., Hovy, E., and Neubig, G. (2017).
Controllable invariance through adversarial feature
learning. In Advances in Neural Information Process-
ing Systems, pages 585–596.
Zemel, R., Wu, Y., Swersky, K., Pitassi, T., and Dwork, C.
(2013). Learning fair representations. In International
Conference on Machine Learning, pages 325–333.
Zhang, B. H., Lemoine, B., and Mitchell, M. (2018).
Mitigating unwanted biases with adversarial learning.
arXiv preprint arXiv:1801.07593.
Zhao, X. W., Guo, Y., He, Y., Jiang, H., Wu, Y., and
Li, X. (2014). We know what you want to buy: a
demographic-based system for product recommenda-
tion on microblogs. In Proceedings of the 20th ACM
SIGKDD international conference on Knowledge dis-
covery and data mining, pages 1935–1944. ACM.