Using Demographic Features for the Prediction of Basic Human Values

Underlying Stakeholder Motivation

Adam Szekeres, Pankaj Shivdayal Wasnik and Einar Arthur Snekkenes

Department of Information Security and Communication Technology, Norwegian University of Science and Technology,

Keywords:

Information Security Risk Management, Stakeholder Motivation, Psychological Perspective, Motivational

Proﬁles.

Abstract:

Human behavior plays a signiﬁcant role within the domain of information security. The Conﬂicting Incentives

Risk Analysis (CIRA) method focuses on stakeholder motivation to analyze risks resulting from the actions of

key decision makers. In order to enhance the real-world applicability of the method, it is necessary to charac-

terize relevant stakeholders by their motivational proﬁle, without relying on direct psychological assessment

methods. Thus, the main objective of this study was to assess the utility of demographic features (which are

observable in any context) for deriving stakeholder motivational profiles. To this end, this study utilized the

European Social Survey, which is a high-quality international database comprising representative

samples from 23 European countries. The predictive performances of a pattern-matching algorithm and a

machine-learning method are compared to establish the ﬁndings. Our results show that demographic features

are marginally useful for predicting stakeholder motivational proﬁles. These ﬁndings can be utilized in settings

where interaction between a stakeholder and an analyst is limited, and the results provide a solid

baseline for other methods that focus on different classes of observable features for predicting stakeholder

motivational proﬁles.

1 INTRODUCTION

Information security is considered to be a highly

technical domain, where research on the human element receives relatively little attention, given the involvement and impact of individuals on the system’s safety

and security. However, “...people are responsible for

stealing passwords, committing intellectual property

crimes, skimming ﬁnancial accounts, selling informa-

tion to competitors, breaking into databases, cyber-

snooping, and committing a host of other offenses

against organizations and their systems. Ironically,

the disciplines that assess, evaluate, and solve human

based problems have not been an integral part of the

information security measures used to protect data...”

(Gudaitis, 1998). It is suggested that there is a need

for synthesis between various disciplines in order to

improve on the attempts that aim to protect against

threats to information systems. More than a decade

later, Greitzer and Hohimer (Greitzer and Hohimer,

2011) concluded that insider threats ranked among

the most problematic cyber-security challenges that

threaten government and industry information infras-

tructures. Furthermore, they identiﬁed that there were

no systematic methods that provided a complete and

effective approach to preventing undesirable actions

(e.g. data leakage, espionage, and sabotage).

More recent incidents (e.g. using technical exper-

tise and insider privileges to reprogram Smart Meters

(Krebs, 2012), cheating with emission rates (Arora,

2017), ﬁnancial misreporting (Kulik et al., 2008), cre-

ating abusive websites (Franklin, 2014), etc.) also call

for methods that incorporate intentional, deliberate

human behavior into risk assessments. While the spe-

ciﬁc details of the enumerated incidents vary greatly,

they are still united by some common features:

• It is possible to identify a person or a group who

had a strong motivation to take certain actions.

• It is possible to identify a person or a group who

suffered the consequences of those actions but

who were unintentionally exposed to those trans-

actions.

Such situations are recognized in the economic liter-

ature as negative externalities (Liebowitz and Margo-

lis, 1994) and the concept has been applied within the

domain of information security, where motivated actors have the potential to exert a negative influence on a large number of other stakeholders who have little influence on the outcome of those actions (Anderson and Moore, 2009).

Szekeres, A., Wasnik, P. and Snekkenes, E.

Using Demographic Features for the Prediction of Basic Human Values Underlying Stakeholder Motivation.

DOI: 10.5220/0007694203770389

In Proceedings of the 21st International Conference on Enterprise Information Systems (ICEIS 2019), pages 377-389

ISBN: 978-989-758-372-8

Assessing stakeholder motivation could be the key

to preparing against such events, since motivation is a

central concept in understanding human behavior; it

aims to answer the question concerning why people

do the things they do (Forbes, 2011). During the past

centuries, researchers have generated a vast number

of theoretical constructs and systems which vary in

the level of the analysis (e.g. instincts, biologically

determined drives, needs, social and cognitive moti-

vations), the scope (e.g. general principles vs. task-

speciﬁc motivations), and the terminology. Through

describing stakeholder motivation we can enable the

prediction of future behaviors and check whether the

likely behavior is in alignment with the goals of other

affected stakeholders. However, people are not ex-

pected to cooperate in any analysis that aims to assess

their motivations for risk-analysis purposes. There-

fore, the main goal of the present study is to contribute

to the information security risk management literature

by investigating the utility of demographic features

for deriving stakeholder motivational proﬁles in con-

texts where no direct interaction between the subject

and analyst is assumed.

Following the Problem Statement and Research

Questions, Section 2 describes the risk analysis

method under development and its connection to the

theory of basic human values. Section 3 explains how

a publicly available high-quality dataset was utilized

in the study, which is followed by describing the re-

sults in Section 4. Section 5 provides an overview of

the conducted work, and Section 6 concludes with di-

rections for future work.

1.1 Problem Statement

The main objective of this work is to investigate how

stakeholder motivation can be predicted by utiliz-

ing publicly observable individual characteristics (e.g.

demographic variables). The end goal is the develop-

ment of a predictive model that can be utilized by an

observer to derive the motivational proﬁle of a pre-

viously unknown subject by collecting and aggregat-

ing various forms of publicly observable features con-

nected to the subject.

1.2 Research Questions

To address the problem statement, the following re-

search questions have been formulated:

1. To what extent can demographic features be uti-

lized to construct stakeholder motivational pro-

ﬁles?

2. How well do different predictive models perform

in terms of inferring stakeholder motivational pro-

ﬁles?

2 RELATED WORK

This section provides an overview of the risk-analysis

method under development, the motivational theory,

and the related constructs that were included in the

study.

2.1 Conﬂicting Incentives Risk Analysis

The importance of understanding stakeholder moti-

vation is emphasized within the Conﬂicting Incen-

tives Risk Analysis (CIRA) method (Rajbhandari and

Snekkenes, 2013). This method identiﬁes the stake-

holders (i.e. individuals), the actions that can be taken

by the stakeholders, as well as the consequences of

these actions. A stakeholder is a physical person who

has some interest in the outcomes of his actions. The

procedure identiﬁes two types of stakeholders: the

Strategy owner (the person who is capable of execut-

ing an action) and the Risk owner (whose perspective

is taken, i.e. the person at risk). Each stakeholder's motivation is modeled on the concept of utility, which entails the consideration of the benefit of the action performed from the perspective of the stakeholder. This

cumulative utility encompasses several utility factors,

each representing aspects of life considered impor-

tant by the corresponding stakeholders. Two types of

risks are identiﬁed in the method: Threat risk refers

to the perceived decrease in the total utility of the Risk owner, and Opportunity risk refers to the lack of potential increase in utility because the Strategy owner is not motivated enough to take actions that would be beneficial for the Risk owner. Therefore, risk is conceptualized as a misalignment of incentives between

these two classes of stakeholders, and risk identiﬁca-

tion is about uncovering activities that would be ben-

eﬁcial for the Strategy owner, and potentially harmful

for the Risk owner, or vice versa (Snekkenes, 2013).

In this sense, Threat risk closely resembles the concept

of moral hazard; it captures a wide range of behav-

iors that are beneﬁcial for one party and detrimental

for another (i.e. the Strategy owner inflicting negative

externalities on the Risk owner) (Dembe and Boden,

2000). Previous work explored the feasibility of in-

ferring key stakeholders’ motivational proﬁles based

on the linguistic analysis of interviews given by inac-

cessible subjects (Szekeres and Snekkenes, 2018).
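The distinction between Threat risk and Opportunity risk can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the `Strategy` structure, the weighted-sum aggregation of utility factors, and the sign-based classification are not taken from the CIRA publications, which should be consulted for the actual procedure.

```python
from dataclasses import dataclass


@dataclass
class Strategy:
    """One action available to the Strategy owner (hypothetical structure)."""
    name: str
    strategy_owner_delta: float  # change in the Strategy owner's total utility
    risk_owner_delta: float      # change in the Risk owner's total utility


def total_utility(factors: dict, weights: dict) -> float:
    """Assumed aggregation: weighted sum over the utility factors."""
    return sum(weights[name] * value for name, value in factors.items())


def classify(strategy: Strategy) -> str:
    """Label a strategy by the misalignment of incentives it represents."""
    if strategy.strategy_owner_delta > 0 and strategy.risk_owner_delta < 0:
        # The Strategy owner gains while the Risk owner loses.
        return "threat risk"
    if strategy.strategy_owner_delta <= 0 and strategy.risk_owner_delta > 0:
        # The Risk owner would gain, but the Strategy owner is not motivated.
        return "opportunity risk"
    return "no conflict"


example = Strategy("sell customer data", strategy_owner_delta=2.0, risk_owner_delta=-3.0)
label = classify(example)
```

In this toy setting the example strategy is labeled a threat risk, because the two utility changes point in opposite directions in favor of the Strategy owner.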

2.2 Theory of Basic Human Values

The theory of basic human values, developed by

Schwartz (Schwartz, 1994), identifies ten distinct values that are universally recognized across various cultures, and it provides a unified and comprehensive

view on human motivation. The theory incorporates

several previous approaches that emphasized the cen-

trality of values in human behavior (e.g. Hofstede and

Rokeach on cultural differences (Schwartz, 1992)).

Values both represent desirable end-goals and pre-

scribe desirable ways of acting. Schwartz summarizes

the six core features that characterize values:

• “Values are beliefs linked to affect.

• Values refer to desirable goals that motivate ac-

tions.

• Values transcend speciﬁc actions and situations.

• Values serve as standards or criteria.

• Values are ordered by importance.

• The relative importance of multiple values guide

actions.”

Furthermore, all of the ten distinct values in the

theory encapsulate one of the three key motivational

aspects that are grounded in the universal require-

ments of human existence: the needs of individuals

as biological organisms, the requisites of coordinated

social interaction, and the survival and welfare needs

of groups. Values guide behavior, given that the de-

cision context, or situation activates the relevant val-

ues. The ten values form a circular structure that cap-

tures a motivational continuum, where adjacent val-

ues are compatible with each other, while opposing

values are in conﬂict. The ten values are grouped un-

der four higher dimensions, as represented by Figure

1 (Schwartz, 2012).
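The circular structure can be represented as a simple data structure in which compatibility corresponds to a small distance on the circle and conflict to a large one. The sketch below is only an illustration under stated assumptions: the list order follows the commonly cited arrangement of the ten values, Conformity and Tradition (which share one wedge in the theory) are given neighbouring positions, and Hedonism is assigned to Openness to change although it is also related to Self-enhancement.

```python
# Positions around the circle. Conformity and Tradition share one wedge in the
# theory and are given neighbouring slots here as a simplification.
CIRCLE = [
    "Self-Direction", "Stimulation", "Hedonism", "Achievement", "Power",
    "Security", "Conformity", "Tradition", "Benevolence", "Universalism",
]

# Assumed mapping of the ten values to the four higher dimensions.
HIGHER_DIMENSION = {
    "Self-Direction": "Openness to change",
    "Stimulation": "Openness to change",
    "Hedonism": "Openness to change",  # assumption: also related to Self-enhancement
    "Achievement": "Self-enhancement",
    "Power": "Self-enhancement",
    "Security": "Conservation",
    "Conformity": "Conservation",
    "Tradition": "Conservation",
    "Benevolence": "Self-transcendence",
    "Universalism": "Self-transcendence",
}


def circular_distance(a: str, b: str) -> int:
    """Number of steps between two values along the shorter arc of the circle."""
    step = abs(CIRCLE.index(a) - CIRCLE.index(b))
    return min(step, len(CIRCLE) - step)


def are_adjacent(a: str, b: str) -> bool:
    """Adjacent values are expected to be motivationally compatible."""
    return circular_distance(a, b) == 1
```

With this representation, Self-Direction and Universalism are neighbours because the list wraps around, while Power and Universalism sit at the maximum distance and are therefore expected to conflict.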

Figure 1: Circular value structure, with 4 higher dimensions comprising the 10 basic human values.

Goldberg, Sweeney, Merenda, and Hughes (Goldberg et al., 1998) describe how one of the most enduring topics in the history of psychometrics is the strength of association between group and individual differences, and the many controversies centered around the issue of how various demographically defined groups differ in terms of important human attributes. In their study, they investigated the associations between the Big Five personality traits and four demographic variables (i.e. gender, age, education, and ethnic status). The study concluded that most demographic-personality associations are of trivial size, with an average correlation of 0.08 (across the four demographic variables and the five personality dimensions included in the study). Although these results are not directly comparable to value-demographic associations, they nevertheless provide some initial insights into the strength of associations between demographic features and psychological variables. Schwartz (Schwartz, 2007) discusses the reciprocal relationship between value priorities and life circumstances and provides empirical

evidence on the hypothetical relationships. Choices

guided by values inﬂuence the life circumstances, but

certain life circumstances (e.g. the type of profes-

sion, raising children, etc.) also affect the possibility

of, and constraints placed upon, enacting particular

choices. People tend to adapt their values to ﬁt into

their life circumstances by upgrading the importance

of values that are readily attainable, while downgrad-

ing the importance of values of which the pursuit

is blocked. As people’s demographic variables (e.g.

age, gender, education, income level, etc.) largely

impact the circumstances to which they are exposed,

these differences are expected to have a direct effect

on the value priorities. Based on the value system’s

structure, the following subsections present validated

and hypothesized relationships between demographic

variables and value priorities based on (Schwartz,

2007).

2.2.1 Age

Due to the general decline of physical strength and

cognitive abilities, aging is expected to increase the

importance of Security values, as the capacity to deal

with change declines. Therefore, the opposing Stim-

ulation value might decrease in importance as novelty

and risk are viewed as increasingly threatening. Conformity and Tradition values might increase in importance, while Hedonism could potentially decrease due

to the dulling of the senses. Achievement and Power

values may also decrease in importance since older

people become less able to perform demanding tasks

and obtain social approval.

2.2.2 Life Stages

In early adulthood people are primarily concerned

with establishing themselves within the domains of

work and family. The pursuit of Achievement and

Stimulation values comes at the expense of the Se-

curity, Conformity, and Tradition values. Later, the

motivation shifts to preserving the status already at-

tained, both in the professional and in the family do-

mains. The possibility of radical change narrows and

responsibilities constrain the opportunities for risk-

taking. Taking these factors into consideration, it is

expected that people in their middle adulthood ex-

press a stronger preference for values encompassed

in the Conservation category. At later stages, close to

retirement, the opportunities for expressing Achieve-

ment, Power, Stimulation, and Hedonism values fur-

ther decrease.

2.2.3 Gender

In a cross-cultural, large scale study, Schwartz and

Rubel investigated gender differences in value pri-

orities (Schwartz and Rubel, 2005). The ﬁndings

suggest that men attribute more importance to Self-

enhancement and Openness to change values than

women do, while for Self-transcendence values, the

reverse is true. The differences are generally small,

and account for less variance than age and culture do,

for example.

2.2.4 Education

An explanation for the association between the level

of education and the values is offered in (Schwartz,

2007). According to the hypothesis, education requires intellectual openness and flexibility, which are associated with Self-direction values. Challenging existing views and norms can be linked to a lower importance assigned to Conservation values, as they promote conformity and tradition. Furthermore, there might be a positive correlation with Achievement values, as performance and meeting external standards are increasingly important as the level of education rises.

2.2.5 Country

The challenges faced by nations in organizing hu-

man activities are similar, but nations differ in the

importance they attribute to certain values (Schwartz,

2013). When values are analyzed at the societal level,

three bipolar dimensions can be identiﬁed based on

the alternative resolutions to each of the problems af-

fecting all societies: Embeddedness vs. Autonomy

(affective and intellectual), Hierarchy vs. Egalitari-

anism, and Mastery vs. Harmony. The importance

assigned by various countries to the previous dimen-

sions gives rise to eight distinct cultural regions, rep-

resenting vague differences among cultures: Western

Europe, East-Central Europe, Eastern Europe, Latin

America, English-Speaking, Confucian, South-East

Asia, and Africa-Middle East.

2.2.6 Occupation

Another study by Knafo and Sagiv (Knafo and Sagiv,

2004) investigated the relationship between values

and occupational choices. The survey-based study

showed that the 32 occupations under investigation

clustered according to the motivational proﬁles of the

individuals within the profession, and that these clus-

ters ﬁt well into Holland’s work typology. Universal-

ism values negatively correlated with the Enterprising

work environment, while Social environments correlated positively with both Universalism and Benevolence values, and correlated negatively with Power

and Achievement values. Artistic work environments

correlated negatively with Conformity values while

the Investigative environments correlated positively

with Openness to change values.

These results suggest that there are meaningful

and detectable differences among various groups of

people. However, to our knowledge, there is no ex-

isting study that investigates how well the motiva-

tional proﬁle can be predicted when solely based upon

demographic features. Therefore, this study aims

to establish predictive models from a high-quality

database that contains representative samples from 23

European countries.

3 MATERIALS AND METHODS

3.1 Sample and Procedure

The European Social Survey (ESS), round 8, edition

2.0, (N.A., 2018) served as the main source of an-

swers to the research questions. The high-quality cu-

mulative dataset contains individual-level data from

23 countries (Austria, Belgium, the Czech Repub-

lic, Estonia, Finland, France, Germany, Hungary, Ice-

land, Ireland, Israel, Italy, Lithuania, the Netherlands,

Norway, Poland, Portugal, the Russian Federation,

Slovenia, Spain, Sweden, Switzerland, and the United

Kingdom), gathered using strict probability sampling

ICEIS 2019 - 21st International Conference on Enterprise Information Systems

380

methods. The survey’s main objectives are to moni-

tor and interpret changing public attitudes in Europe,

to investigate relevant societal issues, and to establish

social indicators across Europe. The original dataset

contains a total of (n = 44 387) individual respondents

with 536 variables. The ESS has been conducted every two years since 2001 across many European countries. The survey consists of two main parts:

• The core module - covers a wide range of top-

ics (e.g. politics, social trust, household, socio-

demographics, human values, etc.) that largely

remain the same in each round to allow for lon-

gitudinal observations.

• The rotating module - increases the scope of the

survey by focusing on speciﬁc topics between dif-

ferent times of administration (e.g. immigration,

economic morality, justice, democracy, climate

change, etc.)

3.2 Measures

In order to address the research questions, the fol-

lowing preparation procedures were conducted on the

original cumulative dataset. In the ﬁrst step, the

complete list of variables ($N_{vars} = 536$) was screened

and then it was sorted into four main categories (de-

mographics, attitudes, behaviors, and others). The

next step focused on identifying the demographic at-

tributes that met the inclusion criteria (i.e. the predic-

tor variables should be publicly observable and easily

identiﬁable by an observer). This resulted in a list of

demographic variables being included in the present

analysis ($N_{vars} = 14$), accompanying the basic human

values. Table 1 contains the list of independent variables selected for the analysis. We aimed at maximizing the number of subjects with valid responses; therefore, the next step was to investigate the number of missing values in the sample. Since our objective was to analyze the predictability of the motivational profiles of individuals who are actively employed, we used listwise deletion of subjects with missing values on any of the remaining variables. The listwise removal of data is justified by the fact that most of the missing data was attributed to four variables associated with employment relations (the last four variables in Table 1), with a not-applicable label (e.g. the not actively working age-group), which contributed a total of 7255 subjects with missing data, while the remaining missing data (n = 385) was distributed among the ten other independent variables (with the labels: refusal, do not know, no answer, not available). While it was not possible to determine whether the data was missing at random, completely at random, or not at random for the remaining small number of cases, the number was judged small enough to permit deletion without introducing a notable bias into the models. Additionally, the 89 levels of the variable "Type of industry working in" were grouped according to the NACE rev. 2 section codes, resulting in 21 higher-level groups (Eurostat, 2008), which provides larger groups within occupational categories.

Table 1: List of observable features used as predictors.

Feature                                          Categorical (Y/N)   Number of categories
Country                                          Y                   23
Gender                                           Y                   2
Age                                              N                   -
Domicile                                         Y                   5
Belonging to religion                            Y                   2
Belonging to a minority ethnic group             Y                   2
Number of people living in the same household    N                   -
Living with partner                              Y                   2
Ever had a divorce                               Y                   2
Highest level of education                       N                   -
Employment relation                              Y                   3
Supervising others at work                       Y                   2
Type of industry working in (NACE rev.2)         Y                   21
Type of organization working for                 Y                   6

The ESS dataset contains raw responses for the Hu-

man Values Scale, which is a 21-item survey instru-

ment designed for self-assessment. In order to com-

pute ground-truth scores from the raw item-level re-

sponses, we followed the procedures described in the

accompanying manual (Schwartz, 2016). Finally, all

dependent variables (the ten basic values) were normalized to a range of [0-1] through the following method: $X' = \frac{X - X_{min}}{X_{max} - X_{min}}$, since it provides a linear transformation and keeps the relationships among the original data (Patro and Sahu, 2015).
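The two preparation steps described in this section, listwise deletion and min-max normalization, can be sketched as follows. This is a minimal illustration rather than the procedure actually run on the ESS data: the record layout, the column names, and the `MISSING` label set are hypothetical, and only the general technique is taken from the text.

```python
# Labels treated as missing (assumed spellings, based on the labels named in the text).
MISSING = {"refusal", "do not know", "no answer", "not available", "not applicable"}


def listwise_delete(rows, columns):
    """Keep only the rows that hold a valid entry in every listed column."""
    return [
        row for row in rows
        if all(row.get(c) is not None and row.get(c) not in MISSING for c in columns)
    ]


def min_max_normalize(values):
    """Linearly rescale values to [0, 1]: (x - min) / (max - min)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("min-max scaling is undefined for constant data")
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical toy records; 'power' stands in for one computed value score.
rows = [
    {"age": 34, "supervises": "yes", "power": 1.5},
    {"age": 71, "supervises": "not applicable", "power": -0.5},
    {"age": 52, "supervises": "no", "power": 0.5},
    {"age": None, "supervises": "no", "power": 0.0},
]
kept = listwise_delete(rows, ["age", "supervises"])
normalized_power = min_max_normalize([row["power"] for row in kept])
```

Two of the four toy records survive the deletion step, and their scores are mapped onto the endpoints of the unit interval, which shows that the transformation preserves the ordering of the original data.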

4 RESULTS

This section describes the experiments conducted on

the ESS dataset and the results obtained from two dif-

ferent types of analytic techniques. All subjects with

valid responses on the 14 features were included in

the ﬁnal analyses (n = 36 747): 48.5% of the subjects

were males and the mean age of all respondents was

50.41 years (SD = 17.55). Furthermore, the database

was randomized and divided into three sets:

• Training set: 60%

• Development set: 20%

• Testing set: 20%
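The final sample size follows from the figures reported in Section 3.2 (44 387 respondents, minus 7255 and 385 subjects with missing data), and the three-way split can be sketched as below. The helper name `split_indices`, the seed, and the rounding of the set sizes are assumptions for illustration; the text does not state how the partition was implemented.

```python
import random


def split_indices(n, train=0.6, dev=0.2, seed=0):
    """Shuffle the row indices and cut them into train/dev/test partitions."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    n_train = int(n * train)
    n_dev = int(n * dev)
    return (
        indices[:n_train],
        indices[n_train:n_train + n_dev],
        indices[n_train + n_dev:],  # the remainder forms the testing set
    )


# Sample size after listwise deletion, using the counts reported in the text.
n_final = 44_387 - 7_255 - 385
train_idx, dev_idx, test_idx = split_indices(n_final)
```

Under these rounding assumptions, the 36 747 subjects are divided into 22 048 training, 7349 development, and 7350 testing cases.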

4.1 Multiple Linear Regression Approach

Several multiple linear regressions (LRs) were con-

ducted to identify the most suitable set of features that

can be utilized for predicting the human value scores

based on the observable features presented in Table

1. This part of the analysis was conducted using IBM

SPSS 25’s automatic linear modeling module, which

includes supervised merging of the categories, out-

lier detection, and several feature-selection methods

(Yang, 2013). For each of the ten basic values, the

ﬁrst step involved the assessment of the maximum

possible predictive accuracy by using all the features,

which aided us in providing an estimate of the highest

potential accuracy achievable. Next, predictors were

entered into the models using the forward stepwise

selection algorithm. At each step, variables not yet

included in the model were tested for inclusion until

no variables met the inclusion criteria, using a limit

of 4 as the maximum number of effects in the ﬁnal

model. This reflects a decision to trade off a marginal

improvement in accuracy for a simpler model with

lower costs in terms of data collection. The procedure

resulted in two models for each of the ten values, as

shown in Table 2. Performance was measured by the $R^2$ (coefficient of determination), ranging between 0 and 1, which is a well-established, common measure of the success of predicting the dependent variable from the independent variables (Nagelkerke et al., 1991). Formula: $R^2 = 1 - \frac{SS_{res}}{SS_{tot}}$, where $SS_{res}$ is the sum of the squared residuals and $SS_{tot}$ is the total sum of squares.

This procedure enabled us to assess the observable

features' utility in terms of predicting the ten basic

values, and to identify an optimal set of features that

can sufﬁciently cover all the basic human values con-

sidering the added utility of each feature relative to

what is already included in the model.
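The forward stepwise idea with a cap on the number of effects can be sketched in plain Python. This is not the SPSS automatic linear modeling procedure used in the study: the entry rule below (a hypothetical `min_gain` threshold on the increase in $R^2$), the ordinary least-squares solver, and the toy data are all assumptions chosen to keep the example self-contained, and only numeric predictors are handled.

```python
def fit_ols(X, y):
    """Least-squares coefficients (intercept first) via the normal equations."""
    rows = [[1.0] + list(r) for r in X]
    k = len(rows[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * t for r, t in zip(rows, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting (assumes no collinearity).
    for col in range(k):
        pivot = max(range(col, k), key=lambda idx: abs(A[idx][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for idx in range(k):
            if idx != col:
                factor = A[idx][col] / A[col][col]
                A[idx] = [a - factor * b for a, b in zip(A[idx], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


def r_squared(X, y, beta):
    """R^2 = 1 - SS_res / SS_tot for a fitted linear model."""
    mean_y = sum(y) / len(y)
    pred = [beta[0] + sum(b * v for b, v in zip(beta[1:], r)) for r in X]
    ss_res = sum((t - p) ** 2 for t, p in zip(y, pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y)
    return 1.0 - ss_res / ss_tot


def forward_stepwise(columns, y, max_features=4, min_gain=0.01):
    """Greedily add the predictor that raises R^2 the most, up to max_features."""
    selected, best_r2 = [], 0.0
    while len(selected) < max_features:
        scores = {}
        for name in columns:
            if name not in selected:
                trial = selected + [name]
                X = [list(v) for v in zip(*(columns[c] for c in trial))]
                scores[name] = r_squared(X, y, fit_ols(X, y))
        if not scores:
            break
        name, r2 = max(scores.items(), key=lambda kv: kv[1])
        if r2 - best_r2 < min_gain:  # hypothetical stopping rule
            break
        selected.append(name)
        best_r2 = r2
    return selected, best_r2


# Toy data: the target depends on age and education only.
columns = {
    "age": [20, 30, 40, 50, 60, 70, 25, 35],
    "education": [3, 5, 2, 6, 4, 1, 7, 2],
    "household": [1, 4, 2, 3, 5, 2, 1, 3],
}
y = [0.8 - 0.005 * a + 0.02 * e for a, e in zip(columns["age"], columns["education"])]
selected, best_r2 = forward_stepwise(columns, y)
```

On this toy data the procedure picks age first and education second, then stops because the third candidate adds almost nothing, which mirrors the trade-off between accuracy and the cost of collecting additional features.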

Table 2: Statistics of $R^2$ values for the Linear Regression approach. In the last column, values in parentheses represent the number of features used in the final model.

Value             Max possible $R^2$   Final $R^2$
Achievement       0.23                 0.16 (2)
Benevolence       0.22                 0.16 (2)
Conformity        0.17                 0.11 (2)
Hedonism          0.22                 0.18 (2)
Power             0.24                 0.18 (1)
Security          0.20                 0.12 (3)
Self-Direction    0.16                 0.09 (3)
Stimulation       0.16                 0.09 (2)
Tradition         0.24                 0.14 (4)
Universalism      0.18                 0.13 (3)

Figure 2 presents each dependent variable with the best set of demographic variables, i.e. those that account for the largest amount of explained variance (see the 'Final $R^2$' column of Table 2 for the corresponding models). The colored bars represent demographic features that were included in the final models, and their length represents the amount of variance explained by the corresponding variable. The white bars represent the amount of unexplained variance for each value, and as such, they express the amount of remaining uncertainty regarding a subject's motivational profile. Figure 4 and Figure 5 in the Appendix provide the details of all the final regression models for each of the ten values.

Figure 2: Feature importance for predicting the 10 basic human values from observable features by the LR approach relative to unexplained variance expressed in terms of $R^2$ scores. (Stacked horizontal bars, one per value, on a 0%-100% axis; legend: Country, Age, Religion, Gender, Education, NACE, Employment relation, Unexplained Variance.)

4.2 Machine Learning Approach

This experiment utilized a machine learning (ML) ap-

proach for the prediction of the same set of basic

human values. The regression models were trained

using the H2O.ai API, which is an open-source ML

platform (H2O.ai, 2018b). The Distributed Random

Forest (DRF) regression algorithm was chosen for

building models for each of the ten values separately,

since the algorithm can properly handle categorical

variables with several levels (H2O.ai, 2018a), and

also provides useful internal estimates of error, cor-

relation, and variable importance metrics (Breiman,

2001). Furthermore, when given a training dataset,

the DRF creates a forest of classification (or regression) trees instead of a single tree.

4.2.1 DRF Training

During the training stage, the models were trained

using a 5-fold cross validation procedure to obtain

the final model of the training set. Table 3 presents the mean and the standard deviation of the root-mean-square error (RMSE) and $R^2$ scores across the five folds.

Table 3: Mean and SD of RMSE and $R^2$ for 5-fold cross-validation training.

Dependent Variable   RMSE Mean   RMSE SD   $R^2$ Mean   $R^2$ SD
Achievement          0.128       0.0002    0.141        0.0090
Benevolence          0.098       0.0009    0.126        0.0096
Conformity           0.127       0.0005    0.097        0.0041
Hedonism             0.106       0.0005    0.139        0.0134
Power                0.120       0.0004    0.159        0.0033
Security             0.112       0.0004    0.109        0.0095
Self-Direction       0.113       0.0013    0.072        0.0034
Stimulation          0.114       0.0008    0.074        0.0064
Tradition            0.104       0.0006    0.122        0.0092
Universalism         0.102       0.0008    0.106        0.0070

The RMSE scores indicate the absolute ﬁt of the

model as it is the square root of the variance of the

residuals in the prediction model. As such it is a

good measure of the model’s predictive accuracy. The

RMSE can be interpreted as the standard deviation

of the unexplained variance and it has the same unit

as the dependent variable (Grace-Martin, 2008). The

models were tuned on the hyperparameter ’number of

trees’ using the development set. The hyperparame-

ter tuning favoured a higher number of trees. How-

ever, increasing the number of trees beyond 50 did

not result in a signiﬁcant improvement in terms of the

RMSE. Therefore, for all of the ten models, 50-tree solutions were selected.
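The way per-fold RMSE scores are summarized by a mean and a standard deviation can be sketched as follows. The study trained Distributed Random Forest models with H2O; the stand-in model below simply predicts the training mean, so the example only illustrates the cross-validation bookkeeping, not the DRF algorithm. The function names and the interleaved fold assignment are assumptions.

```python
import math
import statistics


def rmse(y_true, y_pred):
    """Root-mean-square error between observed and predicted values."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def k_fold_rmse(X, y, fit, k=5):
    """Return the mean and SD of the held-out RMSE over k folds.

    fit(X_train, y_train) must return a predict(X) callable.
    """
    folds = [list(range(i, len(y), k)) for i in range(k)]  # interleaved folds
    scores = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in range(len(y)) if i not in held]
        predict = fit([X[i] for i in train], [y[i] for i in train])
        predictions = predict([X[i] for i in held_out])
        scores.append(rmse([y[i] for i in held_out], predictions))
    return statistics.mean(scores), statistics.stdev(scores)


def fit_mean(X_train, y_train):
    """Stand-in model (not a random forest): always predicts the training mean."""
    mean_y = sum(y_train) / len(y_train)
    return lambda X: [mean_y] * len(X)


# Toy data with a repeating pattern so that every fold looks alike.
X = [[i] for i in range(35)]
y = [(i % 7) / 10 for i in range(35)]
mean_rmse, sd_rmse = k_fold_rmse(X, y, fit_mean)
```

A small standard deviation across folds, as in Table 3, indicates that the error estimate does not depend much on which part of the training data was held out.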

4.2.2 DRF Testing

In the testing phase, the accuracy of the trained models was verified using the testing set. Table 4 reports the RMSE scores for each variable, with additional comparisons against random guessing and against guessing the mean value of each dependent variable. This part of the experiment enabled an assessment of the model's superiority over these simple guessing baselines.

Table 4: RMSE score comparison for each variable between Machine Learning model (ML), Mean Guessing (MG), and random guessing (RG).

Dependent Variable   ML       MG       RG
Achievement          0.1282   0.1376   0.1393
Benevolence          0.0974   0.1046   0.1485
Conformity           0.1267   0.1328   0.1454
Hedonism             0.1056   0.1133   0.1134
Power                0.1195   0.1293   0.1293
Security             0.1134   0.1195   0.1515
Self-Direction       0.1146   0.1180   0.1303
Stimulation          0.1144   0.1182   0.1244
Tradition            0.1031   0.1100   0.1445
Universalism         0.1017   0.1081   0.1086
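The mean-guessing baseline links the two performance metrics used in this paper: when the mean is computed on the same data that is being scored, the RMSE of mean guessing equals the standard deviation of the target, so that $R^2 = 1 - (RMSE_{model} / RMSE_{MG})^2$. The sketch below demonstrates this identity on toy numbers. How the baselines in Table 4 were actually computed is not stated in the text, so the reported values need not satisfy the identity exactly.

```python
import math


def rmse(y_true, y_pred):
    """Root-mean-square error between observed and predicted values."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def r2_from_rmse(rmse_model, rmse_mean_guess):
    """R^2 implied by a model's RMSE relative to the mean-guessing RMSE."""
    return 1.0 - (rmse_model / rmse_mean_guess) ** 2


# Toy target values and hypothetical model predictions.
y = [0.2, 0.4, 0.6, 0.8]
model_predictions = [0.3, 0.4, 0.6, 0.7]
mean_guess = [sum(y) / len(y)] * len(y)

rmse_model = rmse(y, model_predictions)
rmse_mean = rmse(y, mean_guess)
r2 = r2_from_rmse(rmse_model, rmse_mean)
```

Read this way, a model whose RMSE is only slightly below the mean-guessing RMSE necessarily has a small $R^2$, which matches the pattern in Tables 4 and 5.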

Furthermore, Figure 3 reports the mean impor-

tance of the features across all of the ten basic hu-

man values based on the average contribution of each

feature to the overall explained variance. Since these

scores represent the average contributions across all

of the values, it should be noted that some values can

be predicted with higher accuracy than others, and the

cost of obtaining certain demographic features should

be considered during data collection.

[Figure 3 (pie chart): explained variance 11.01%, unexplained variance 88.99%. Contribution of each feature to the explained variance: Age, 2.09%; IndustryTypeNACE2Reduced, 1.78%; Country, 1.40%; Domicile, 0.91%; LevelOfEducation, 0.91%; NumberOfPplLivingInHouse, 0.79%; TypeOfOrganisation, 0.75%; BelongToReligion, 0.41%; SupervisionOfOthers, 0.37%; LivesWithPartner, 0.37%; Gender, 0.34%; EverHadDivorce, 0.33%; EmploymentRelation, 0.32%; BelongToEthnicGroup, 0.22%.]

Figure 3: Mean feature importance for predicting the 10 basic human values from observable features by ML approach.

4.3 Comparison of Approaches

Finally, a comparison between the predictive performance of the two approaches is presented in Table 5, across all of the dependent variables in terms of both the R² and RMSE scores. Since R² is relatively straightforward to interpret as the percentage of variability in the dependent variable explained by the independent variables, this measure of goodness of fit is used for the comparison. For both approaches, the predictability of Power is the highest, implying that Power can be predicted with the highest accuracy from the available set of demographic variables. On the other hand, Self-Direction and Stimulation are at the lowest end of predictability, which indicates that demographic features are less useful for inferring these particular values. While the LR approach shows slightly better performance than the ML approach in terms of R² scores across all of the dependent variables, both data-analytic approaches converge on similar overall results in terms of predictive performance, which further consolidates the findings.
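As a minimal sketch of the two goodness-of-fit measures compared here, the following Python snippet computes R² and RMSE under their standard textbook definitions. The function names and the toy data are hypothetical and are not taken from the study; the exact computation performed by the tools used in the paper may differ (e.g. an adjusted R²).

```python
import math


def r_squared(y_true, y_pred):
    """Share of variance in y_true explained by y_pred: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)
    return 1.0 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Root mean squared error, expressed in the units of the dependent variable."""
    squared_errors = [(y - p) ** 2 for y, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical toy data (NOT from the ESS dataset): normalized value scores.
y_true = [0.2, 0.4, 0.6, 0.8]
y_pred = [0.3, 0.4, 0.5, 0.8]

# A constant guess equal to the sample mean serves as a reference point.
mean_guess = [sum(y_true) / len(y_true)] * len(y_true)

print("model:      R2 = %.3f, RMSE = %.4f" % (r_squared(y_true, y_pred), rmse(y_true, y_pred)))
print("mean guess: R2 = %.3f, RMSE = %.4f" % (r_squared(y_true, mean_guess), rmse(y_true, mean_guess)))
```

Under this definition a constant prediction equal to the sample mean yields an R² of zero, which is why R² reads naturally as the share of variance explained beyond a mean guess, whereas RMSE has to be judged against the scale of the dependent variable.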

Table 5: Predictive performance comparison of machine learning (ML) and linear regression (LR) approaches in terms of R² and RMSE scores.

Dependent Variable   ML R²   ML RMSE   LR R²   LR RMSE
Achievement          0.13    0.128     0.16    0.127
Benevolence          0.14    0.097     0.16    0.095
Conformity           0.09    0.127     0.11    0.126
Hedonism             0.12    0.106     0.18    0.104
Power                0.15    0.120     0.18    0.118
Security             0.08    0.113     0.12    0.113
Self-Direction       0.07    0.115     0.09    0.113
Stimulation          0.08    0.114     0.09    0.114
Tradition            0.12    0.103     0.14    0.102
Universalism         0.11    0.102     0.13    0.101
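The comparison summarized in Table 5 can be checked with a few lines of Python. The sketch below only re-encodes the published scores; the variable names are illustrative assumptions and no part of the original analysis pipeline.

```python
# (ML R2, ML RMSE, LR R2, LR RMSE) per dependent variable, copied from Table 5.
TABLE_5 = {
    "Achievement":    (0.13, 0.128, 0.16, 0.127),
    "Benevolence":    (0.14, 0.097, 0.16, 0.095),
    "Conformity":     (0.09, 0.127, 0.11, 0.126),
    "Hedonism":       (0.12, 0.106, 0.18, 0.104),
    "Power":          (0.15, 0.120, 0.18, 0.118),
    "Security":       (0.08, 0.113, 0.12, 0.113),
    "Self-Direction": (0.07, 0.115, 0.09, 0.113),
    "Stimulation":    (0.08, 0.114, 0.09, 0.114),
    "Tradition":      (0.12, 0.103, 0.14, 0.102),
    "Universalism":   (0.11, 0.102, 0.13, 0.101),
}

# Values for which the LR approach reports a strictly higher R2 than the ML approach.
lr_wins = [name for name, (ml_r2, _, lr_r2, _) in TABLE_5.items() if lr_r2 > ml_r2]

# Average R2 per approach across the ten basic human values.
mean_ml_r2 = sum(row[0] for row in TABLE_5.values()) / len(TABLE_5)
mean_lr_r2 = sum(row[2] for row in TABLE_5.values()) / len(TABLE_5)

# Most and least predictable value according to the ML approach.
best_ml = max(TABLE_5, key=lambda name: TABLE_5[name][0])
worst_ml = min(TABLE_5, key=lambda name: TABLE_5[name][0])

print(f"LR has the higher R2 for {len(lr_wins)} of {len(TABLE_5)} values")
print(f"mean R2: ML = {mean_ml_r2:.3f}, LR = {mean_lr_r2:.3f}")
print(f"ML approach: most predictable = {best_ml}, least predictable = {worst_ml}")
```

The mean ML score of roughly 0.11 is in line with the 11.01% explained variance reported in Figure 3.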

5 DISCUSSION

The main objective of this study was to assess the util-

ity of demographic features in predicting stakeholder

motivation, operationalized as the basic human val-

ues. We have shown through a set of experiments how

these observable attributes can be utilized for predict-

ing a subject’s motivational proﬁle. The results sug-

gest that the overall predictability of these psycholog-

ical variables from demographic features is relatively

low, but that the usefulness of such assessments is

highly dependent on the context in which the results

are to be used. In cases where no prior information

is available, even a slight reduction in uncertainty can

be signiﬁcant and worth the effort of gathering addi-

tional, easily observable features.

A study by Kosinski, Stillwell, and Graepel

(Kosinski et al., 2013) has demonstrated how a set

of psychological constructs (the Big 5 traits) can be

predicted from online behavioral traces. Firstly, the

study showed that certain differences can be expected

among the Big 5 traits in their level of predictability:

Openness (r = 0.43), Extraversion (r = 0.40), Neuroticism and Agreeableness (r = 0.3), and Conscientiousness (r = 0.29), covering a range between 8.41% and 18.49% in terms of R². Considering that the

present study only relied on demographic features, the

level of predictability matched closely, even though

behavioral features might convey a lot more informa-

tion about latent traits. Furthermore, the aforemen-

tioned study compared the predictive accuracy ob-

tainable from observable features, to the predictive

accuracy achievable by administering the same psy-

chometric instrument for the same respondent at two

points in time. The correlation between these scores

(test-retest reliability) varies between r = 0.55-0.75,

indicating a possible upper bound in terms of the pre-

dictability of relatively stable psychological traits by

standard, validated instruments.
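The explained-variance range quoted above follows from squaring the reported correlation coefficients. The short Python sketch below reproduces that arithmetic; the dictionary and function names are illustrative, while the r values are those cited from Kosinski et al. (2013) in the text.

```python
# Correlations between predicted and measured Big 5 scores, as cited in the text.
BIG5_R = {
    "Openness": 0.43,
    "Extraversion": 0.40,
    "Neuroticism": 0.30,
    "Agreeableness": 0.30,
    "Conscientiousness": 0.29,
}


def r_to_r2_percent(r):
    """Convert a correlation coefficient into explained variance in percent."""
    return round(r * r * 100, 2)


r2_percent = {trait: r_to_r2_percent(r) for trait, r in BIG5_R.items()}
lowest = min(r2_percent.values())
highest = max(r2_percent.values())

for trait, value in r2_percent.items():
    print(f"{trait}: r = {BIG5_R[trait]:.2f}, R2 = {value:.2f}%")
print(f"range: {lowest:.2f}% to {highest:.2f}%")
```

Applying the same conversion to the test-retest correlations of r = 0.55 to 0.75 gives roughly 30% to 56% of shared variance, which puts the possible upper bound mentioned above on the same scale as the R² scores in Table 5.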

The experiments conducted with the ML approach

established that the model’s performance is superior

to random guessing, as well as educated guessing (e.g.

a guess of the group means), and that the LR ap-

proach had a higher level of performance when us-

ing different combinations of predictor variables, but

also that most of these differences are only marginal.

The differences could be attributed to the automated

data preparation in the case of the LR approach,

which shows the implementation’s additional useful-

ness during the analysis of complex survey data.

In sum, country, age, and type of industry one is

working for are the most important features that can

be easily obtained and used for the prediction of the

majority of basic values from the available set of fea-

tures included in the ESS dataset. Therefore, identi-

ﬁcation and inclusion of other demographic features

(which might be more difﬁcult to obtain) do not nec-

essarily provide additional predictive utility. This is

important knowledge for an analyst when consider-

ing the cost-beneﬁt of gathering a greater amount of

descriptive data with the intention of achieving higher

accuracy. In order to identify potentially more useful

predictor variables, further studies will focus on fea-

tures that reﬂect previous choices in a subject’s his-

tory.
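To make the cost-benefit argument concrete, the following Python sketch ranks the mean feature importances reported in Figure 3 and accumulates their contribution. The data are copied from the figure; the variable names and the idea of a cumulative ranking are illustrative assumptions, not part of the original analysis.

```python
# Mean contribution of each feature to the explained variance (in %), from Figure 3.
FEATURE_IMPORTANCE = {
    "Age": 2.09,
    "IndustryTypeNACE2Reduced": 1.78,
    "Country": 1.40,
    "Domicile": 0.91,
    "LevelOfEducation": 0.91,
    "NumberOfPplLivingInHouse": 0.79,
    "TypeOfOrganisation": 0.75,
    "BelongToReligion": 0.41,
    "SupervisionOfOthers": 0.37,
    "LivesWithPartner": 0.37,
    "Gender": 0.34,
    "EverHadDivorce": 0.33,
    "EmploymentRelation": 0.32,
    "BelongToEthnicGroup": 0.22,
}

# Rank features from most to least important.
ranked = sorted(FEATURE_IMPORTANCE.items(), key=lambda item: item[1], reverse=True)
total = sum(FEATURE_IMPORTANCE.values())

# Cumulative explained variance when features are added in order of importance.
cumulative = []
running = 0.0
for name, importance in ranked:
    running += importance
    cumulative.append((name, running, running / total))

top3_names = [name for name, _, _ in cumulative[:3]]
top3_share = cumulative[2][2]

for name, explained, share in cumulative:
    print(f"{name:28s} cumulative = {explained:5.2f}%  share = {share:6.1%}")
```

With these figures, the three most important features together account for about 5.27 percentage points, or close to half of the variance explained by all fourteen features, which illustrates why collecting the remaining features adds comparatively little.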

5.1 Legal and Ethical Considerations

It should be noted that there are important legal and

ethical aspects when human subjects are involved

both in research and in the real-world application of

the described proﬁling method. For this reason it is

necessary to outline and separate the conditions un-

der which the method’s application can be considered

ethical or legal. While the distinction between law

and ethics is often unclear, they are fundamentally

different (Hvinden et al., 2016). Both are normative,

but ethical norms are formulated as guidelines rather

than as prescriptions and prohibitions. Ethics is a col-

lection of fundamental concepts and guidelines that

informs individuals about desirable actions in certain

situations. Legislation, on the other hand, refers to

a systematic body of rules and regulations in writ-

ten form that aim to govern the behavior of individu-

als within the boundaries of a particular organization

(e.g. country) and unlawful activities are penalized

and sanctioned. The difference between ethics and

law is also expressed in the corresponding documents.

Ethical guidelines (e.g. the Guidelines for Re-

search Ethics in the Social Sciences, Humanities,

Law, and Theology (Hvinden et al., 2016)) developed

for conducting research with human participants re-

quire: respect for human dignity, privacy, safeguard-

ing against harm, compliance with the duty to inform,

and the obtaining of the participant’s consent, espe-

cially in cases where sensitive personal data is col-

lected. There are also exceptions from the main rule

concerning informed consent e.g. observation in pub-

lic arenas, public ﬁgures, if the research does not in-

volve direct contact with the participants, and in cases

where information cannot be provided before the re-

search is initiated because it would affect the out-

comes of the experiment. These exceptions must be

justified by proving that they add value to the research

and by demonstrating the lack of alternative options.

Laws vary with time and across territories; there-

fore, it is crucial to have an up-to-date and contextual

understanding of the legal regulations concerning any

activity. Different laws have been developed for the

collection and protection of personal data across na-

tions. Member states of the European Union (EU) and

the European Economic Area (EEA) have opted for

an all-encompassing regulation named the European

General Data Protection Regulation (GDPR) (Euro-

pean Union, 2016). The GDPR requires that the pro-

cessing of personal (linkable to a person) and sensi-

tive data (health, race or ethnic background, sexuality,

political, or religious beliefs) should be done with free

and informed consent, and that data processors are

required to protect the privacy of respondents, and,

therefore, ensure conﬁdentiality. A different approach

is used by the United States, which implements vari-

ous sector-speciﬁc data protection laws that work to-

gether with state level legislation (e.g. HIPAA, NIST

800-171, the Gramm-Leach-Bliley Act, the Federal

Information Security Management Act) (Coos, 2018).

This overview of the legal and ethical aspects
aims to highlight some important issues that have

to be taken into consideration when it comes to either

the development or the application of any proﬁling

method.

6 CONCLUSIONS

This study aimed at increasing the real-world appli-

cability of the CIRA method that addresses human-

related risks within the domain of information se-

curity. The method focuses on stakeholder motiva-

tion and requires the inference of motivational pro-

ﬁles without direct involvement of the stakeholders.

Therefore, we investigated the usefulness of easily

observable demographic features for inferring stake-

holder motivational proﬁles. By analyzing a high-

quality dataset from representative European sam-

ples, and utilizing various data-analytic approaches,

we showed that demographic features have some lim-

ited usefulness in terms of deriving stakeholder mo-

tivation. While the analysis was limited to respon-

dents from European countries, cultural differences

account for the majority of the explained variance. In sum, these results are useful for characterizing individuals’ motivational profiles, especially when limited access to subjects is assumed, and in cases where

subjects might be motivated to answer dishonestly to

direct questions. While the primary application of

these results is the CIRA method of risk analysis,

other domains could benefit from predicting the motivational profiles of inaccessible subjects, especially where decisions are characterized by trade-offs between various objectives and have great potential impact (e.g. intelligence analysis, operations research, etc.). Future work may expand the analysis to include other regions of the world (e.g. USA, Eastern cultures) to investigate whether the predictability of value profiles is

affected by deeper cultural differences. Finally, these

ﬁndings provide a solid benchmarking baseline for

other future work, which will investigate other classes

of observable features for inferring motivational pro-

ﬁles. More speciﬁcally, observables that represent the

outcome of a conscious decision process (e.g. own-

ership of items, style, etc.) will be analyzed in terms

of their capability to provide insight into the decision-

maker’s value structure.

ACKNOWLEDGEMENTS

The authors would like to thank the anonymous re-

viewers for their constructive comments and sugges-

tions for improving the paper.

This work was partially supported by the project

IoTSec – Security in IoT for Smart Grids, with

number 248113/O70 part of the IKTPLUSS program

funded by the Norwegian Research Council.

REFERENCES

Anderson, R. and Moore, T. (2009). Information security:

where computer science, economics and psychology

meet. Philosophical Transactions of the Royal Soci-

ety of London A: Mathematical, Physical and Engi-

neering Sciences, 367(1898):2717–2727.

Arora, J. (2017). Corporate governance: a farce at Volkswagen? The CASE Journal, 13(6):685–703.

Breiman, L. (2001). Random forests. Machine learning,

45(1):5–32.

Coos, A. (2018). EU vs US: How do their data protection regulations square off?

Dembe, A. E. and Boden, L. I. (2000). Moral hazard:

a question of morality? New Solutions: A Jour-

nal of Environmental and Occupational Health Policy,

10(3):257–279.

European Union (2016). Regulation (EU) 2016/679 of the

European Parliament and of the Council of 27 April

2016 on the protection of natural persons with re-

gard to the processing of personal data and on the

free movement of such data, and repealing Directive

95/46/EC (General Data Protection Regulation). Ofﬁ-

cial Journal of the European Union, L119:1–88.

Eurostat (2008). NACE Rev. 2: Statistical classification of economic activities in the European Community. Office for Official Publications of the European Communities, Luxembourg.

Forbes, D. L. (2011). Toward a uniﬁed model of human

motivation. Review of general psychology, 15(2):85.

Franklin, Z. (2014). Justice for revenge porn victims: Le-

gal theories to overcome claims of civil immunity by

operators of revenge porn websites. California Law

Review, pages 1303–1335.

Goldberg, L. R., Sweeney, D., Merenda, P. F., and

Hughes Jr, J. E. (1998). Demographic variables and

personality: The effects of gender, age, education, and

ethnic/racial status on self-descriptions of personal-

ity attributes. Personality and Individual differences,

24(3):393–403.

Grace-Martin, K. (2008). Assessing the ﬁt of regression

models. [Online; accessed 05-July-2018].

Greitzer, F. L. and Hohimer, R. E. (2011). Modeling hu-

man behavior to anticipate insider attacks. Journal of

Strategic Security, 4(2):25–48.

Gudaitis, T. M. (1998). The missing link in information se-

curity: Three dimensional proﬁling. CyberPsychology

& Behavior, 1(4):321–340.

H2O.ai (2018a). Distributed Random Forest (DRF).

H2O.ai (2018b). H2O.ai.

Hvinden, B., Johanne Bang, K., Fjørtoft, K., Holand, I.,

Johnsen, R., Kolstad, I., Monsen, T., Nevøy, A.,

Sandmo, E., Skilbrei, M.-L., Staksrud, E., Tande,

K. M., Ulleberg, P., Øyum, L., and Enebakk, V.

(2016). Guidelines for research ethics in the social

sciences, humanities, law and theology.

Knafo, A. and Sagiv, L. (2004). Values and work environ-

ment: Mapping 32 occupations. European Journal of

Psychology of Education, 19(3):255–273.

Kosinski, M., Stillwell, D., and Graepel, T. (2013). Pri-

vate traits and attributes are predictable from digital

records of human behavior. Proceedings of the Na-

tional Academy of Sciences, page 201218772.

Krebs, B. (2012). FBI: Smart meter hacks likely to spread.

[Online; accessed 26-June-2018].

Kulik, B. W., O’Fallon, M. J., and Salimath, M. S. (2008). Do competitive environments lead to the rise and spread of unethical behavior? Parallels from Enron. Journal of Business Ethics, 83(4):703–723.

Liebowitz, S. J. and Margolis, S. E. (1994). Network exter-

nality: An uncommon tragedy. Journal of economic

perspectives, 8(2):133–150.

N.A. (2018). European social survey round 8 data. NSD -

Norwegian Centre for Research Data, Norway - Data

Archive and distributor of ESS data for ESS ERIC.

Nagelkerke, N. J. et al. (1991). A note on a general deﬁ-

nition of the coefﬁcient of determination. Biometrika,

78(3):691–692.

Patro, S. and Sahu, K. K. (2015). Normalization: A prepro-

cessing stage. arXiv preprint arXiv:1503.06462.

Rajbhandari, L. and Snekkenes, E. (2013). Using the con-

ﬂicting incentives risk analysis method. In IFIP Inter-

national Information Security Conference, pages 315–

329. Springer.

Schwartz, S. (2016). Computing scores for the 10 human

values.

Schwartz, S. H. (1992). Universals in the content and struc-

ture of values: Theoretical advances and empirical

tests in 20 countries. In Advances in experimental so-

cial psychology, volume 25, pages 1–65. Elsevier.

Schwartz, S. H. (1994). Are there universal aspects in the

structure and contents of human values? Journal of

social issues, 50(4):19–45.

Schwartz, S. H. (2007). Basic human values: theory, meth-

ods, and application. Risorsa Uomo.

Schwartz, S. H. (2012). An overview of the Schwartz theory of basic values. Online Readings in Psychology and Culture, 2(1):11.

Schwartz, S. H. (2013). Culture matters: National value

cultures, sources, and consequences. In Understand-

ing Culture, pages 137–160. Psychology Press.

Schwartz, S. H. and Rubel, T. (2005). Sex differences in

value priorities: Cross-cultural and multimethod stud-

ies. Journal of personality and social psychology,

89(6):1010.

Snekkenes, E. (2013). Position paper: Privacy risk analy-

sis is about understanding conﬂicting incentives. In

IFIP Working Conference on Policies and Research in

Identity Management, pages 100–103. Springer.

Szekeres, A. and Snekkenes, E. A. (2018). Unobtrusive psychological profiling for risk analysis. In Proceedings of the 15th International Joint Conference on e-Business and Telecommunications - Volume 1: SECRYPT, pages 210–220. INSTICC, SciTePress.

Yang, H. (2013). The case for being automatic: introducing the automatic linear modeling (LINEAR) procedure in SPSS Statistics. Multiple Linear Regression Viewpoints, 39(2):27–37.

APPENDIX

Figure 4: Final regression models for each dependent variable (1/2).

Dependent variable   regression   residual              adjusted R²   Intercept
Achievement          11           22,036      390.72    0.16          0.52
Benevolence          13           22,034      329.04    0.16          0.64
Conformity           9            22,038      303.35    0.11          0.51
Hedonism             13           22,034      341.90    0.18          0.60
Power                9            22,038      510.15    0.18          0.46

Unstandardized Beta, Country (coded as):

Achievement: Finland (0): 0.08; United Kingdom (1): 0.15; Lithuania (2): 0.15; Netherlands (3): 0.14; Sweden (4): 0.01; Belgium, Switzerland (5): 0.08; Spain, Poland (6): 0.06; Austria, Estonia, Italy, Russian Federation (7): 0.10; Czech Republic, Ireland (8): 0.11; Iceland, Norway (9): no value listed; Hungary, Slovenia (10): 0.05.

Benevolence: Finland (0): -0.10; United Kingdom (1): -0.04; Lithuania (2): -0.06; Netherlands (3): 0.03; Sweden (4): -0.12; Belgium, Switzerland (5): -0.10; Spain, Poland (6): -0.05; Austria, Estonia, Italy, Russian Federation (7): -0.03; Czech Republic, Ireland (8): -0.02; Iceland, Norway (9): no value listed; Hungary, Slovenia (10): 0.01; Germany, France, Israel, Portugal (11): -0.08.

Conformity: Finland (0): -0.10; United Kingdom (1): -0.03; Lithuania (2): -0.07; Netherlands (3): -0.05; Sweden (4): -0.09; Belgium, Switzerland (5): -0.06; Spain, Poland (6): -0.08; Austria, Estonia, Italy, Russian Federation (7): -0.11; Czech Republic, Ireland (8): no value listed.

Hedonism: Finland (0): -0.01; United Kingdom (1): -0.11; Lithuania (2): -0.07; Netherlands (3): -0.13; Sweden (4): -0.05; Belgium, Switzerland (5): 0.01; Spain, Poland (6): -0.06; Austria, Estonia, Italy, Russian Federation (7): 0.00; Czech Republic, Ireland (8): -0.03; Iceland, Norway (9): no value listed; Hungary, Slovenia (10): -0.04; Germany, France, Israel, Portugal (11): 0.02.

Power: Finland (0): 0.09; United Kingdom (1): 0.13; Lithuania (2): 0.05; Netherlands (3): -0.03; Sweden (4): 0.12; Belgium, Switzerland (5): 0.02; Spain, Poland (6): 0.06; Austria, Estonia, Italy, Russian Federation (7): -0.05; Czech Republic, Ireland (8): -0.02; Iceland, Norway (9): no value listed.

Unstandardized Beta, Gender (coded as):

Benevolence: Male (1): -0.03; Female (2): no value listed.

Unstandardized Beta, Age (the dependent variables these values belong to are not recoverable from the source layout): -0.002, 0.002, -0.002.

Note. reference variable; all SE B < .005; for all included variables p < .05

ICEIS 2019 - 21st International Conference on Enterprise Information Systems

Figure 5: Final regression models for each dependent variable (2/2).

Dependent variable   regression   residual              adjusted R²   Intercept
Security             12           272,374     267.72    0.12          0.51
Self-Direction       11           22,036      212.09    0.09          0.47
Stimulation          6            22,041      361.72    0.09          0.56
Tradition            9            22,038      415.14    0.14          0.54
Universalism         16           22,031      209.70    0.13          0.51

Unstandardized Beta, Country (coded as):

Security: Finland (0): 0.03; United Kingdom (1): 0.10; Lithuania (2): -0.02; Netherlands (3): -0.04; Sweden (4): 0.08; Belgium, Switzerland (5): 0.08; Spain, Poland (6): 0.02; Austria, Estonia, Italy, Russian Federation (7): 0.04; Czech Republic, Ireland (8): 0.05; Iceland, Norway (9): no value listed; Hungary, Slovenia (10): 0.07.

Self-Direction: Finland (0): 0.01; United Kingdom (1): -0.01; Lithuania (2): 0.03; Netherlands (3): 0.06; Sweden (4): 0.06; Belgium, Switzerland (5): -0.04; Spain, Poland (6): 0.04; Austria, Estonia, Italy, Russian Federation (7): 0.04; Czech Republic, Ireland (8): 0.01; Iceland, Norway (9): no value listed.

Stimulation: Finland (0): -0.02; United Kingdom (1): -0.04; Lithuania (2): -0.03; Netherlands (3): -0.01; Sweden (4): -0.04; Belgium, Switzerland (5): no value listed.

Tradition: Finland (0): -0.03; United Kingdom (1): -0.05; Lithuania (2): -0.05; Netherlands (3): -0.01; Sweden (4): -0.02; Belgium, Switzerland (5): -0.03; Spain, Poland (6): no value listed.

Universalism: Finland (0): -0.02; United Kingdom (1): -0.09; Lithuania (2): -0.08; Netherlands (3): -0.04; Sweden (4): -0.05; Belgium, Switzerland (5): -0.04; Spain, Poland (6): -0.10; Austria, Estonia, Italy, Russian Federation (7): -0.01; Czech Republic, Ireland (8): -0.04; Iceland, Norway (9): no value listed; Hungary, Slovenia (10): -0.07.

Unstandardized Beta, Employment Relation (coded as):

Self-Direction: Self-employed (0): 0.04; Employee, Working for own family business (1): no value listed.

Unstandardized Beta, Religion (coded as):

Tradition: Yes (1): 0.04; No (2): no value listed.

Unstandardized Beta, NACE classification of economic activities (coded as):

Universalism:
Mining and quarrying (0): 0.01
Water supply; sewerage, waste management and remediation activities, Information and communication (2): no value listed
Professional, scientific and technical activities, Administrative and support service activities (2): no value listed
Agriculture, forestry and fishing, Manufacturing, Accommodation and food service activities (3): 0.02
Public administration and defence; compulsory social security, Education (4): no value listed
Other service activities, Activities of extraterritorial organisations and bodies (4): no value listed
Human health and social work activities, Arts, entertainment and recreation, Activities of households as employers (5): -0.004
Electricity, gas, steam and air conditioning supply, Wholesale and retail trade; repair of motor vehicles and motorcycles (6): no value listed
Transportation and storage, Financial and insurance activities, Real estate activities (6): no value listed

Remaining coefficients under the Age and Level of education columns (row and column assignment not recoverable from the source layout): 0.01, -0.002, 0.001, -0.01, 0.001, -0.01, 0.03, 0.001, -0.01.

Note. reference variable; all SE B < .005; for all included variables p < .05
