Daily Pain Prediction in Workplace Using Gaussian Processes
Chetanya Puri (1), Stijn Keyaerts (2,3), Maxwell Szymanski (4,5), Lode Godderis (2,3), Katrien Verbert (4), Stijn Luca (6) and Bart Vanrumste (1)
1 eMedia Lab and STADIUS, Department of Electrical Engineering (ESAT), KU Leuven, Belgium
2 Knowledge, Information and Research Center (KIR), Group Idewe (External Service for Prevention and Protection at Work), Leuven, Belgium
3 Department of Public Health and Primary Care, KU Leuven, Leuven, Belgium
4 Department of Computer Science, KU Leuven, Belgium
5 Human-Computer Interaction Research Group, Department of Computer Science, KU Leuven, Belgium
6 Department of Data Analysis and Mathematical Modelling, Ghent University, Ghent, Belgium
Keywords:
Pain Management, Public Health Informatics, Time Series Forecasting, Bayesian Prediction.
Abstract:
Work-related musculoskeletal disorders (MSDs) account for 60% of sickness-related absences and even permanent inability to work in Europe. Long-term impacts of MSDs include "pain chronification", the transition of temporary pain into persistent pain. Preventive pain management can lower the risk of chronic pain. It is therefore important to assess pain appropriately in advance, which can help a person manage their fear of returning to work. In this study, we analysed pain data acquired over time by a smartphone application from a number of participants. We attempt to forecast a person's future pain levels based on his or her prior pain data. Because the data are self-reported, modelling daily pain is challenging owing to the large number of missing values. To model the pain of a test subject, we employ a subset selection strategy that dynamically selects the closest subset of individuals from the training data. The similarity between the test subject and the training subjects is determined via a dynamic time warping-based dissimilarity measure computed on the historical data available up to a given point in time, so that the pain trends of the selected subjects resemble those of the individual of interest. We then employ a Gaussian process regression model to model the pain. We empirically evaluate our model using leave-one-subject-out cross-validation and attain a 20% improvement over state-of-the-art results in early prediction of pain.
1 INTRODUCTION
Musculoskeletal disorders (MSDs) are presently a widespread type of work-related health problem and a leading cause of absenteeism from work across all sectors and occupations. Around 60% of all health-related problems in the European Union (EU) are work-related MSDs, which account for 60% of sickness-related
absences and even permanent inability to work (Com-
munication from the Commission to the European
Parliament, the Council, the European Economic and
Social Committee and the Committee of the Regions,
2017). This creates a financial burden on individuals,
businesses, and society (Kok et al., 2020). Prevention
of MSDs from the outset of a person’s career will al-
low for an extended work life and better job satisfac-
tion (Kim, 2018). MSD prevention can also address
the long-term implications of demographic ageing, as
outlined in the objectives of the Europe 2020 strategy
for smart, sustainable, and inclusive growth. Conse-
quently, MSDs are not only an occupational burden,
but also a public health and societal challenge (Kok
et al., 2020).
Long-term impacts of MSDs include "pain chronification", which is the transformation of
transient pain into permanent pain as a result of recurrent physical strain sustained while doing work-related activities (Morlion et al., 2018). Beyond the physical pain experience, there is a vast amount of evidence on the importance of pain coping strategies, cognitive appraisals (e.g. catastrophizing, high threat values, and fear-avoidance beliefs), and negative emotions and expectations (Moseley and Arntz, 2007), (Nijs et al., 2011), (Edwards et al., 2016).
These factors influence how sensory information is
processed in the spinal cord and the brain. There
are several models that integrate different biopsychosocial factors into the perception of pain, such as the fear-avoidance model (Vlaeyen and Linton, 2000), the avoidance-endurance model (Hasenbring et al., 2012), and the common-sense model (Bunzli et al., 2017). These models illustrate how differently individuals experience pain, which makes pain assessments subjective. Thus, predicting pain in
a personalised manner for early intervention is essen-
tial for preventing pain persistence. It is crucial that
both patients and medical practitioners have the ed-
ucation and abilities necessary to manage pain cor-
rectly (Morlion et al., 2018).
Currently, pain management is done based on the
initial patient evaluation (history, physical examina-
tion) which is followed by prompt treatment based
on the level of the pain (Morlion et al., 2018). This
is especially true for the acute stages of pain. In
cases of chronic pain, evidence suggests that thera-
pies should be directed less by current pain levels and
more by participation in valued activities despite dis-
comfort (McCracken and Eccleston, 2005). There-
fore, appropriately measuring pain in early stages
can aid in pain management by evaluating medica-
tion efficacy, comprehending the complicated rela-
tionship between pain and personal/contextual fac-
tors, and preparing patients and healthcare providers
for a challenging period with flare-ups. Preventative
pain management can also reduce the likelihood of it
becoming chronic. Multiple pain management applications exist, but they are limited to maintaining logs of the level/intensity of the pain (Lalloo et al., 2015).
However, for more successful pain management, it is
essential to accurately estimate the pain in advance,
preferably several days ahead, which can help a per-
son moderate his or her expectations and anxieties
about returning to work. Additionally, pain prediction
can provide healthcare practitioners with a better un-
derstanding of the required treatment and assist with
individualised planning.
This study attempts to forecast the pain experi-
enced by workers from various industries several days
ahead based on the daily recorded history of pain. To
this end, workers were asked to record their daily lev-
els of satisfaction and pain in a smartphone applica-
tion on a scale from 0 (no pain) to 100 (worst possible
pain). Then, we attempt to forecast future pain levels
by modelling individual pain levels recorded until a
certain day. This is technically difficult because the
data is self-reported, and app users did not always in-
dicate their pain levels on a daily basis, resulting in
missing data. Fig. 1 shows what an individual's pain data look like over time. Although the study spans over 300 days, more than 70% of the participants are missing at least 66% of the daily data (< 100 entries).
Figure 1: Two distinct users documented their pain levels over a 300-day period using (a) as few as 22 samples and (b) as many as 228 samples. Both panels plot pain [0-100] against time [days].
The lack of observations in an individual’s time
series data restricts the application of traditional time
series modelling techniques such as the autoregres-
sive integrated moving average model (ARIMA),
which requires uniformly sampled data (Shumway
and Stoffer, 2017).
Authors in (Lötsch and Ultsch, 2018) provide an extensive survey of the application of modern machine learning techniques for the estimation or detection of pain. The majority of the pain experiences discussed in the literature (Lötsch and Ultsch, 2018)
are related to a hospital or post-operative scenario,
rather than persistent workplace-related pain. Fur-
thermore, pain forecasting models that use machine
learning are built on clinical data (e.g. drugs ad-
ministered, patient comorbidity data) collected during
pain experiences post a surgical operation, enriching
the information available for modelling (Tighe et al.,
2015), (Lee et al., 2019).
Deep learning (DL) is a branch of machine learn-
ing that, when given massive volumes of data, may
automatically learn representations from raw data
to achieve a specific objective, such as classifica-
tion or regression (Lai et al., 2018), (Laptev et al.,
2017), (De Brouwer et al., 2019), (Liu et al., 2021).
DL has been used in multiple healthcare related ap-
plications that can predict the health of an individ-
ual from the time series data. For example, detecting
cardiac abnormality (Strodthoff and Wagner, 2020) or
forecasting glucose levels (Li et al., 2019) in individ-
uals. Works such as (Lipton et al., 2016), (Futoma
et al., 2017) developed deep learning techniques that
can address the non-uniformly sampled time-series
data when a large training dataset is available. Tra-
ditional machine learning strategies, however, outper-
form deep learning strategies when training data is in-
sufficient (Makridakis et al., 2018).
In this work, we follow the method proposed
in (Puri et al., 2019), where a subset of the train-
ing data is selected followed by learning a regression
model based on Gaussian processes (GP). Here, we
would like to showcase the efficacy of the subset se-
lection approach followed by GP based regression to
model an individual’s pain measurements until a cer-
tain day. The subset selection approach works by first
selecting individuals from training data that resemble
closely the progression of pain over time to that of
the target individual. The number of individuals to be
selected is chosen dynamically based on the similari-
ties across individuals. The dynamically chosen subset, along with the available data from the target subject, is then used to train a regression model for improved prediction performance. However, directly applying the method of (Puri et al., 2019) does not give the best results owing to the subjective nature of the pain measurements. Hence, we add a pre-processing treatment of the data prior to subset selection and learning the GP model. The need for this is explained as follows:
1. Pain measurements of an individual vary considerably over time. This may be caused by pain persisting over time or by individual days with more stress resulting in more pain. Hence, unlike the generally increasing trend in gestational weight gain (Puri et al., 2019), it is difficult to find a pattern in the pain measurements over time. Thus, there are anomalous instances in the pain measurements that can result in an inaccurate general model.
2. Pain measurements are self-reported and are
highly subjective in nature. This means that indi-
viduals have certain biases to only rate their pain
(scored between [0-100]) around a fixed baseline,
e.g., a person with a baseline reported pain of 20
will seldom report a pain of 80. Thus, scaling
individual pain measurements for modelling is a
necessary step.
The objective of this work is to study whether: (i) it is possible to estimate an individual's pain levels from a small number of non-uniformly collected historical pain measurements; and (ii) given the subjective nature of pain data, previous pain measurements of other individuals in a training dataset can be used to enhance pain prediction.
The main contributions of this paper are:
1. We develop models of daily pain data to forecast
and manage pain level trends over time.
2. We propose a two-step pre-processing strategy to
enhance pain prediction modelling. This is ac-
complished by smoothing the pain time series in
training data and self-normalising the target indi-
vidual’s pain data with the few measurements pro-
vided.
3. We use a subset-selection strategy to generate the
most informative subset of training data for a
given target individual. Individuals in this closest
subset exhibit similar pain trends to the individual
of interest.
4. We devise a model based on the selected subset, using Gaussian processes for multi-step forecasting of pain up to n days ahead in time.
The dataset is described in section 2, followed by the
proposed methodology in section 3. In section 4, we
describe the experiments conducted to generate the
results. Results and their implications are discussed
in greater detail in section 5, and concluding remarks are presented in section 6. Section 7 discusses limitations and potential future directions.
2 DATA
In this study, 340 participants were recruited from
various work sectors. At the start of the study, par-
ticipants were asked about different work-related fac-
tors, their pain complaints, pain-related perceptions,
coping strategies, and other contextual factors (e.g., physical activity and time spent sitting).
From January 2021 to May 2022, they were required to maintain a daily journal in which they recorded their overall pain levels, mood, stress levels, and satisfaction, along with baseline information such as age, gender, height, weight, and industry of employment. The pain levels were recorded on a scale of 0 (best) to 100 (worst) using an mHealth smartphone application (https://www.idewe.be/health-empower). In addition, yes/no questions about specific mood states (sad, angry, happy, fatigued, cheerful) were part of the daily journal.
190 participants were excluded because they did
not record daily pain values at all. In addition, 51
Figure 2: Gender-wise distribution of participants (female, male, other) across working sectors (Healthcare, Food, Government, Education, Offices, Logistics, Industry, Construction, Other). 21 males and 77 females participated in the study, with the majority (62 out of 99) working in the healthcare industry.
more individuals were removed based on the criterion
of not having more than 10 daily pain values recorded,
with more than 2 values separated by 1 week. The re-
maining 99 participants’ data were used to develop
pain prediction models. Fig. 2 presents the gender-
wise distribution of participants in different indus-
tries.
This study was conducted within the context of
the Personal Health Empowerment project, which fo-
cused on investigating and developing new monitor-
ing and treatment options for employees with MSDs.
The PHE project and corresponding studies were ap-
proved by the Social Ethics Commission of KU Leu-
ven (G-2019081713) and carried out according to the
Belgian and international privacy and ethical legis-
lation. The Belgian occupational service for protec-
tion and prevention at work (IDEWE) was responsible
for the recruitment. They distributed the information
about the project amongst their clients and employ-
ees. Interested employees had to provide informed
consent to participate.
3 METHODOLOGY
Let’s assume pain levels measured across time are
available for N subjects as ‘training data’ D =
{(x
1
, y
1
), ...(x
N
, y
N
)}, where x
i
= [t
i
1
t
i
2
t
i
3
··· t
i
m
] rep-
resents the input variable ‘time’ up to a certain day
t
i
m
and y
i
= [y
i
1
y
i
2
y
i
3
·· · y
i
m
i
] represents the output vari-
able ‘pain’ for the i
th
subject, where y
i
k
= y(t
i
k
).
In addition, data from a person of inter-
est, henceforth referred to as the target individ-
ual, are provided till a certain day t
+
d
as S =
{(t
+
1
, y
+
1
), (t
+
2
, y
+
2
), · ·· , (t
+
d
, y
+
d
)}.
We try to learn a mapping f from the training and
target data, such that,
y
+
= f (t
+
) + ε. (1)
where ε N (0, σ
2
) is independent and identically
distributed (i.i.d) gaussian.
Using the learnt model f , the target individual’s
pain measurements are then predicted at time t
+
m
i
as
y(t
+
m
i
) = f (t
+
m
i
).
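To make the notation concrete, here is one possible in-memory layout for the training data $D$ and the target data $S$, sketched in Python; the helper `make_series` and the example values are hypothetical and only illustrate the non-uniform sampling.

```python
import numpy as np

def make_series(days, pain):
    """Bundle one subject's sparse pain series: `days` holds the
    (non-uniform) observation days t_1..t_m and `pain` the
    corresponding self-reported scores y_1..y_m in [0, 100]."""
    days = np.asarray(days, dtype=float)
    pain = np.asarray(pain, dtype=float)
    assert days.shape == pain.shape
    return {"days": days, "pain": pain}

# Training data D: one entry per subject, observed over the full study period.
train_data = [
    make_series([1, 4, 9, 15, 30, 61, 90], [20, 25, 18, 30, 22, 27, 24]),
    make_series([2, 3, 10, 40, 80, 120], [60, 55, 70, 65, 50, 58]),
]

# Target data S: a shorter series, observed only up to day t_d^+ = 12.
target_data = make_series([1, 5, 12], [35, 40, 33])
```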
3.1 Smoothing
Let’s begin by discussing the smoothing opera-
tion. Given a time series in training data y
i
=
[y
i
1
y
i
2
y
i
3
·· · y
i
m
i
], a moving average (MA) of order w
can be used to obtain a smoothed time series
ˆ
y
i
=
[ ˆy
i
1
ˆy
i
2
ˆy
i
3
·· · ˆy
i
m
i
]. This w-MA can be written as
ˆy
i
t
=
1
w
w1
2
j=
w1
2
y
i
t+ j
, (2)
where w is an odd integer. Moreover,
w
2
zeros are
padded to the beginning and end of the given time
series y
i
to obtain same m number of observations
in the derived w-MA time series in eq. 2. If the w-
length time window contains missing observations for
a given non-uniformly sampled time series, just the
available points are used to calculate the moving av-
erage.
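A minimal sketch of this $w$-MA step in Python is given below, under one possible reading of the boundary handling: rather than averaging literal zero padding into the result, only the points that actually fall inside the window are averaged, which mirrors how missing observations are treated.

```python
import numpy as np

def moving_average(pain, w=5):
    """Centred moving average of order w (w odd) over a sparse pain series.
    Near the series boundaries, where a full window is not available,
    only the existing points are averaged."""
    pain = np.asarray(pain, dtype=float)
    half = w // 2
    smoothed = np.empty_like(pain)
    for t in range(len(pain)):
        lo, hi = max(0, t - half), min(len(pain), t + half + 1)
        smoothed[t] = pain[lo:hi].mean()  # average of the available points only
    return smoothed

# Example with the order used later in the experiments (w = 5).
y = np.array([20, 80, 25, 30, 90, 22, 28], dtype=float)
print(moving_average(y, w=5))
```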
3.2 Self-Normalisation
We normalise a given time series with its available individual information. A time series $\mathbf{y}_i = [y^i_1\ y^i_2\ y^i_3 \cdots y^i_{m_i}]$ is normalised using the mean $\mu_{\mathbf{y}_i}$ and standard deviation $\sigma_{\mathbf{y}_i}$, calculated as follows:
$$\mu_{\mathbf{y}_i} = \frac{1}{m_i} \sum_{j=1}^{m_i} y^i_j, \qquad \sigma_{\mathbf{y}_i} = \sqrt{\frac{1}{m_i} \sum_{j=1}^{m_i} \left(y^i_j - \mu_{\mathbf{y}_i}\right)^2}. \qquad (3)$$
The $j$-th observation $\tilde{y}^i_j$ of the normalised time series $\tilde{\mathbf{y}}_i = [\tilde{y}^i_1\ \tilde{y}^i_2\ \tilde{y}^i_3 \cdots \tilde{y}^i_{m_i}]$ is obtained from the time series $\mathbf{y}_i$ as
$$\tilde{y}^i_j = \frac{y^i_j - \mu_{\mathbf{y}_i}}{\sigma_{\mathbf{y}_i}}. \qquad (4)$$
The normalised data can be rescaled to the original scale as $y^i_j = \tilde{y}^i_j \times \sigma_{\mathbf{y}_i} + \mu_{\mathbf{y}_i}$.
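The self-normalisation of eqs. 3-4 and the rescaling of predictions back to the 0-100 scale amount to a per-subject z-score; a small sketch follows (the guard for constant series is an addition not discussed in the text).

```python
import numpy as np

def self_normalise(pain):
    """Normalise a subject's pain series with its own mean and standard
    deviation (eqs. 3-4); also return the constants needed to map
    predictions back to the original 0-100 scale."""
    pain = np.asarray(pain, dtype=float)
    mu, sigma = pain.mean(), pain.std()
    if sigma == 0.0:
        sigma = 1.0  # guard against a constant series
    return (pain - mu) / sigma, mu, sigma

def rescale(pred, mu, sigma):
    """Map normalised predictions back to the original pain scale."""
    return pred * sigma + mu
```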
3.3 Regression
We use Gaussian processes (GP) as they are state-of-the-art time series modelling methods when dealing with missing data. A GP is defined as a set of random variables such that any finite number of them have a joint Gaussian distribution (Rasmussen, 2004). The mapping $f$ from eq. (1) is defined as a GP, $f(t) \sim \mathcal{GP}(m(t), k(t, t'))$, with mean function $m(t)$ and covariance function $k(t, t')$. We assume the data are noisy with i.i.d. Gaussian noise of variance $\sigma^2_n$, and choose a squared exponential kernel as the Gaussian covariance function to model the closeness of two observations,
$$k(t, t') = \sigma^2_f \exp\left(-\frac{(t - t')^2}{2 l^2}\right). \qquad (5)$$
As is evident from eq. 5, the similarity between two observations decreases exponentially as $t$ begins to differ from $t'$, i.e. the similarity is highest when $t = t'$. Thus, when two observations are far apart in time, the kernel considers them more dissimilar than when they are closer together in time.
Let $\acute{\mathbf{y}} = [y^1_1, \cdots, y^1_{m_1}, \cdots, y^N_1, \cdots, y^N_{m_N}]^T$ denote the stacked training observations and $K$ the matrix with entries $K_{p,q} = k(t_p, t_q)$, $t_p, t_q \in D$. We optimise the hyper-parameters $\{\sigma_f, l, \sigma_n\}$ by maximising the marginal likelihood $p(\acute{\mathbf{y}} \mid D; \{\sigma_f, l, \sigma_n\})$ (Rasmussen, 2004). The prediction at a future time $t^+_m$ is given as a Gaussian distribution whose mean $\mu$ and variance $\sigma^2$ are given by
$$\mu(t^+_m) = \mathbf{k}_+^T (K + \sigma^2_n I)^{-1} \acute{\mathbf{y}}, \qquad \sigma^2(t^+_m) = k(t^+_m, t^+_m) - \mathbf{k}_+^T (K + \sigma^2_n I)^{-1} \mathbf{k}_+, \qquad (6)$$
where $\mathbf{k}_+ = [k(t^+_m, t^1_1), \cdots, k(t^+_m, t^N_{m_N})]^T$.

Gaussian process prediction is hampered by the fact that the computational complexity of inference and likelihood evaluation is $O(n^3)$, where $n$ is the input size, making it impractical for bigger data sets. Next, we will explore subset selection, which can minimise computational complexity while enhancing prediction accuracy.
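As a sketch of this regression step, the snippet below fits a GP with a squared-exponential (RBF) kernel plus a white-noise term and returns the predictive mean and standard deviation of eq. 6. The choice of scikit-learn is an assumption for illustration; the paper does not state which GP implementation was used.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel, ConstantKernel

def fit_gp(times, pain):
    """Fit a GP with the squared-exponential kernel of eq. 5 plus i.i.d.
    Gaussian noise; sigma_f, l and sigma_n are optimised by maximising the
    marginal likelihood (scikit-learn does this during fit())."""
    kernel = ConstantKernel(1.0) * RBF(length_scale=10.0) + WhiteKernel(noise_level=0.5)
    gp = GaussianProcessRegressor(kernel=kernel, normalize_y=False)
    X = np.asarray(times, dtype=float).reshape(-1, 1)
    gp.fit(X, np.asarray(pain, dtype=float))
    return gp

def predict_gp(gp, future_days):
    """Predictive mean and standard deviation (eq. 6) at future days."""
    X_new = np.asarray(future_days, dtype=float).reshape(-1, 1)
    mean, std = gp.predict(X_new, return_std=True)
    return mean, std
```

In the pipeline described next, the pooled (time, pain) points of the selected subset and the target's own history would be concatenated before calling `fit_gp`.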
3.4 Subset Selection
We follow the subset selection approach from (Puri et al., 2019) to find a smaller but informative subset $\hat{D}$ of the training data for a given target individual's data. In particular, a subset $\hat{D}$ with the data of $M (\ll N)$ individuals is found from the given training data $D$,
$$\hat{D} = \{(\mathbf{x}_1, \mathbf{y}_1), \cdots, (\mathbf{x}_M, \mathbf{y}_M)\} = \left\{(t^1_1, y^1_1), \cdots, (t^M_1, y^M_1), \cdots, (t^M_{m_M}, y^M_{m_M})\right\}, \qquad (7)$$
such that the individuals selected in the subset are similar to the target individual's pain trend.
Using a subset $\hat{D}$ with the data of $M (\ll N)$ individuals gives a computational advantage over considering all $N$ subjects, as the time complexity of GP training and inference is proportional to the cubic power of the number of observations. Furthermore, if the most informative subset is selected, the prediction capability is improved. This is because, during training, observations from the $M$ subjects with a similar trend in pain are close to each other and have less variability at any given time $t$. Due to inter-subject variances, this variability (at time $t$) is high when all $N$ individuals are considered for training the Gaussian process.
To find the closeness between two time series, we use dynamic time warping (DTW) as the distance metric. DTW was chosen because of its capability to index time series of unequal lengths (Keogh and Ratanamahatana, 2005).
The subset selection is a two-step process in which (i) distances between the target time series and the time series in the training data are calculated, and then (ii) the nearest subset is dynamically selected based on the calculated distances.
Distances between the target data $S = \{(t^+_1, y^+_1), (t^+_2, y^+_2), \ldots, (t^+_d, y^+_d)\}$ and the individual time series in the training data $D$ are calculated using the dynamic time warping (DTW) distance metric. Let us denote the DTW distance between the target time series (denoted by $+$) and the $i$-th time series in the training data by $\lambda_{i+}$. Note that the target data are only available until $t^+_d$, whereas the time series in the training data are present until $t^i_{m_i} (\gg t^+_d)$. Therefore, the time series in the training data are considered only until day $t^+_d$ when calculating the distance $\lambda_{i+}$; if the $i$-th training series has no observation exactly at $t^+_d$, the nearest time point $< t^+_d$ is chosen. The distance vector $\boldsymbol{\Lambda}_+ = [\lambda_{1+}\ \lambda_{2+} \cdots \lambda_{N+}]$ is calculated between the target time series and all the time series in the training data.
Subset selection is performed dynamically based on the distance vector $\boldsymbol{\Lambda}_+$. First, $\boldsymbol{\Lambda}_+$ is sorted in ascending order, which arranges the subjects in order of their closeness to the target subject, $\hat{\boldsymbol{\Lambda}}_+ = [\hat{\lambda}_{1+}\ \hat{\lambda}_{2+} \cdots \hat{\lambda}_{N+}]$, such that $\hat{\lambda}_{k+} \leq \hat{\lambda}_{(k+1)+}$ for $k = 1, 2, \cdots, N-1$. Second, turning points at index $k$ are identified such that
$$\hat{\lambda}_{(k-1)+} - \hat{\lambda}_{(k-2)+} \;\leq\; \hat{\lambda}_{k+} - \hat{\lambda}_{(k-1)+} \;\geq\; \hat{\lambda}_{(k+1)+} - \hat{\lambda}_{k+}.$$
Multiple such turning points can exist at different indexes of the $\hat{\boldsymbol{\Lambda}}_+$ vector. Third, the distance value at the index $k$ where the first turning point occurs, $\hat{\lambda}_{k+}$, is chosen as the distance threshold for the closest subset: the $i$-th time series in $D$ is selected in the subset if $\lambda_{i+} < \hat{\lambda}_{k+}$. Choosing the first turning point enables the dynamic selection of the smallest
[Figure 3 block diagram: training data → moving averaging → subset selection → self-normalisation → Gaussian processes → rescaling, with the target data entering the subset selection and self-normalisation steps.]
Figure 3: An illustration of our methodology. Moving averaging is performed on the training data to smooth it. Target data are available until a day $t^+_d$ (dotted green line). Subset selection is performed on the moving-averaged training data that share a similar temporal pattern with the target observations. Each time series (target or training) is self-normalised with its available observations before being fed to the Gaussian process. A prediction on the target data is made (red dotted line).
and most informative subset. Fig. 3 showcases the processing pipeline, in which the moving-average-based smoothing is applied to the training data before subset selection. Since the pain levels in each individual series are normalised using the subject's own data, all training time series are scaled to the same level prior to being fed into the Gaussian process model. We will observe that this enhances the reliability of the predictions.
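The two steps of the subset selection are sketched below: a plain DTW distance (absolute-difference cost) between the history-limited target series and each training series, followed by the turning-point threshold on the sorted distance vector. The hand-rolled `dtw_distance` stands in for whichever DTW implementation was actually used.

```python
import numpy as np

def dtw_distance(a, b):
    """Classic O(len(a)*len(b)) dynamic time warping distance between
    two 1-D series of possibly unequal length."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    D = np.full((len(a) + 1, len(b) + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[-1, -1]

def select_subset(target, training_series):
    """Return the indices of training subjects whose DTW distance to the
    target lies below the distance at the first turning point of the
    sorted distance vector."""
    dists = np.array([dtw_distance(target, s) for s in training_series])
    d = np.sort(dists)
    g = np.diff(d)                        # consecutive increments of the sorted distances
    threshold = d[-1] + 1.0               # fallback: keep all subjects
    for j in range(1, len(g) - 1):
        if g[j - 1] <= g[j] >= g[j + 1]:  # first local maximum of the increments
            threshold = d[j + 1]          # distance at the turning point
            break
    return np.where(dists < threshold)[0]
```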
4 EXPERIMENTS
We perform leave-one-subject-out (LOSO) cross-
validation to evaluate the performance of our pro-
posed approach. In each iteration, a unique individ-
ual’s data is treated as target data and rest of the sub-
jects’ data are the training data. We first smoothen
the training data and target data using a moving av-
eraging of order five (w = 5). Then, in each itera-
tion, a closest subset is evaluated dynamically with
respect to the target data followed by self-normalising
each time series (target data and selected subset) using
eq. 3 and 4. Note that our subset selection approach
dynamically selects a threshold in each iteration (i.e
for each target data). A GP based regression is per-
formed to forecast the future values for the target sub-
ject. The performance of regression was computed
using Mean Absolute Error (MAE) averaged over N
subjects. MAE for prediction at a time t
h
is given as
MAE(t
h
) =
1
N
N
i=1
|y
pred
(t
i
h
) y
orig
(t
i
h
)|.
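A skeleton of this evaluation loop is sketched below, reusing the hypothetical per-subject dictionaries from the data-layout sketch in section 3. By default it scores the last-value-carried-forward baseline described in section 4.1; the full GP+SS pipeline would be plugged in through the `predictor` argument. Using the last observation inside the horizon window as ground truth is a simplification for illustration.

```python
import numpy as np

def mae(preds, truths):
    """Mean absolute error over subjects at a fixed prediction horizon."""
    return float(np.mean(np.abs(np.asarray(preds) - np.asarray(truths))))

def loso_evaluation(series_list, cutoff_day, horizon=7, predictor=None):
    """Leave-one-subject-out loop: each subject in turn is the target,
    its series is truncated at `cutoff_day`, and the pain `horizon` days
    later is predicted."""
    preds, truths = [], []
    for target in series_list:
        days, pain = target["days"], target["pain"]
        hist_mask = days <= cutoff_day
        future_mask = (days > cutoff_day) & (days <= cutoff_day + horizon)
        if not hist_mask.any() or not future_mask.any():
            continue  # skip subjects without usable history or ground truth
        if predictor is None:
            pred = pain[hist_mask][-1]  # baseline: carry the last value forward
        else:
            pred = predictor(days[hist_mask], pain[hist_mask], cutoff_day + horizon)
        preds.append(pred)
        truths.append(pain[future_mask][-1])  # last observation within the horizon
    return mae(preds, truths)
```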
4.1 State-of-the-Art
Baseline: A baseline was created to judge the per-
formance of the algorithms. This baseline was
created by using the last available value of the tar-
get subject as future prediction of the daily pain
value.
ARIMA: The auto-regressive integrated moving average (ARIMA) model has remained a state-of-the-art time series forecasting approach for uniformly spaced time series samples (Box et al., 2015). Through linear interpolation, uniformity was introduced into the sparsely sampled pain time series of the subject of interest. Then, an ARIMA(p, d, q) model was fit on the uniformly sampled target time series. The optimal autoregressive order (p), degree of differencing (d), and moving average order (q) were found via a grid search following (Shibata, 1976). The learned model was then used to make a multi-step-ahead prediction of pain levels using the optimised hyperparameters (a sketch of this baseline appears after this list).
LSTM: Long short-term memory networks (LSTM) are deep learning techniques that can produce exceptional prediction performance by implementing gates (forget, memory, and output) that regulate the flow of information during training (Hochreiter and Schmidhuber, 1997). We follow a similar approach as with ARIMA, where the available data from a target subject are uniformly sampled by linear interpolation. We evaluate an LSTM network with 10 hidden units, trained with the Adam optimiser to minimise the mean absolute error (Kingma and Ba, 2015).
Maximum-a-Posteriori (MAP) Estimation: An $l$-th order polynomial can be fit to the available target data to estimate the polynomial coefficients $\theta_i$, $i \in \{1, 2, \cdots, l\}$ (Puri et al., 2019). Moreover, subjects from the training data can be used to create priors over the polynomial coefficients to obtain a better estimate, known as the maximum-a-posteriori (MAP) estimate (Puri et al., 2019). We
tested polynomials of different orders (1 to 5) and found that the first-order polynomial produces the lowest mean absolute error in LOSO cross-validation.
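As referenced in the ARIMA item above, a sketch of that baseline follows. The use of statsmodels is an assumed implementation choice, and AIC is used here as a stand-in order-selection criterion in place of the exact procedure of (Shibata, 1976).

```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA  # assumed implementation choice

def arima_baseline(days, pain, horizon=7, max_order=2):
    """Linearly interpolate the sparse series onto a daily grid (days must
    be sorted), grid-search (p, d, q) by AIC, and forecast `horizon`
    days ahead."""
    days = np.asarray(days, dtype=float)
    pain = np.asarray(pain, dtype=float)
    grid = np.arange(days.min(), days.max() + 1)
    daily = np.interp(grid, days, pain)  # uniformly (daily) sampled series
    best_fit, best_aic = None, np.inf
    for p in range(max_order + 1):
        for d in range(2):
            for q in range(max_order + 1):
                try:
                    fit = ARIMA(daily, order=(p, d, q)).fit()
                except Exception:
                    continue  # skip orders that fail to converge
                if fit.aic < best_aic:
                    best_fit, best_aic = fit, fit.aic
    if best_fit is None:
        return pain[-1]  # fall back to the last observed value
    return best_fit.forecast(steps=horizon)[-1]
```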
5 RESULTS & DISCUSSION
In this research, we investigate whether it is possible
to estimate a person’s pain levels using a small num-
ber of non-uniformly collected historical pain mea-
surements. As pain data is subjective and varies
amongst individuals, we also intended to determine
if we might improve pain prediction by incorporat-
ing the prior pain measurements of other individuals
into the training dataset. For this reason, we study the
performance of various algorithms presented in this
paper when predicting pain levels of an individual in
future. In Fig. 4, we present the Mean Absoute Error
(MAE) when predicting the pain 7 days ahead on the
y-axis. On the x-axis in Fig.4, the availability of tar-
get data until a certain day is presented. Subset selec-
Figure 4: Mean absolute error (MAE) measured with respect to the availability of target data (in days). Different combinations of subset selection (SS) followed by Gaussian processes (GP) were performed with the proposed pre-processing components, moving averaging (MA) and/or self-normalisation (SN): GP, GP+SS, GP+SS:MA, and GP+SS:MA+SN.
Subset selection (SS) along with moving averaging (MA) and/or self-normalisation (SN) was performed, and Gaussian processes were used as the regression model. It is evident from Fig. 4 that the performance of subset selection (SS) followed by Gaussian processes (GP) is demonstrably superior to that of Gaussian processes alone. This is a result of the inclusion of an informative subset of participants in training who exhibit a trend in pain comparable to the target data. Additionally, subset selection on the moving-averaged (MA) time series of the training data, followed by self-normalisation and subsequently the Gaussian process, performed the best, particularly when predicting with little available target data.
We hypothesised that pain data is subjective and
that self-reported pain measurements are biased be-
cause individuals can only compare their current pain
feelings to their past pain experiences. Therefore,
self-normalisation with respect to the historical pain
measurements of an individual provides this signifi-
cant performance improvement. In addition, as the
availability of personal pain data increases over time,
so does the accuracy of prediction. We believe that
as more training data becomes available from an in-
dividual, the selected subset will consist of subjects
whose patterns resemble that of the target subject
more closely than when there are only a few data
points. Thus, the variance in the training data available for regression is lower and the prediction improves. This is evident from the decreasing trend in MAE as more training data becomes available. We also tested self-normalisation prior to subset selection and found no significant performance differences. This may be because the DTW-based distance comparison for subset selection depends on the relative differences in distances between time series and therefore selects more or less the same individuals with or without self-normalisation.
Next, we present the comparison of the proposed
approach (GP+SS:MA+SN) with state-of-the-art ap-
proaches presented in section 4.1 when predicting
pain values in [0-100]. Fig. 5 shows that the proposed approach performs best for early prediction using only a few available data points (until day 100).
Figure 5: Comparison of the proposed approach (GP+SS:MA+SN) with the state-of-the-art approaches (Baseline, ARIMA, LSTM, MAP), plotting MAE against the day until which training data are available. When little training data is available (until day 100), the proposed method beats SOTA, and when more training data becomes available, it performs comparably or even better.
The performance becomes comparable (if not bet-
ter) with the state-of-the-art approaches (MAP) as
Daily Pain Prediction in Workplace Using Gaussian Processes
245
more data in time becomes available for a given indi-
vidual. On the basis of a paired t-test with equal variances, the performance differences between the proposed approach and the other SOTA methods are statistically significant at the 5% level of significance (until day 50). We found no statistically significant difference between the proposed method and the MAP-based polynomial estimate when training with data from more than 100 days. Given the simplicity of the dataset, it seems intuitive that, as more pain data become available, simple polynomial-based estimation algorithms will perform better.
We also observed that the state-of-the-art ap-
proaches (except LSTM) perform worse than the
baseline when the availability of individual training
data is limited (at least until day 50). Recall that the baseline simply carries the last observed value of pain forward as the prediction of future values. This is due to the difficulty of modelling sparsely sampled time series with few observations. Our proposed method, on the other hand, overcomes this difficulty by incorporating the subjective nature of the pain experience and by modelling an information-rich selected subset along with the personal data.
6 CONCLUSION
We proposed a novel Gaussian processes estimator
and information-rich preprocessing to model an indi-
vidual’s workplace-related pain experiences. When
time series data is irregularly sampled, the pro-
posed approach outperforms state-of-the-art time-
series forecasting algorithms for early prediction.
This can aid in the development of interventions for
managing pain in the workplace, thereby reducing the
possibility of ‘pain chronification’.
7 LIMITATIONS & FUTURE
WORK
A limitation of our approach is the scalability of Gaussian processes: we believe that considering a large number of subjects ($N > 10^4$) will result in a larger subset (a high value of $M$) of training data, increasing the computational complexity of our method. Sparse GPs are model approximation techniques that, when applied to a large number of subjects, can further reduce this complexity (Rasmussen, 2004).
In the future, we hope to broaden the modality
of the input data in order to obtain more objective
feedback on pain experiences. Finding an associa-
tion of pain with physical activity data measured by a
wearable, for example, could serve as another meaningful feature to improve prediction performance. Similar to the maximum-a-posteriori approach, priors on
the normalisation constants can be generated from
training data and used to adjust the self-normalisation
mean and standard deviation.
ACKNOWLEDGMENT
Chetanya Puri has received funding from the Euro-
pean Union’s Horizon 2020 research and innovation
programme under the Marie Sklodowska-Curie grant
agreement No 766139. This work is part of the re-
search project Personal Health Empowerment (PHE)
with project number HBC.2018.2012, financed by
Flanders Innovation & Entrepreneurship. This pub-
lication reflects only the authors’ view and the REA
is not responsible for any use that may be made of the
information it contains.
REFERENCES
Box, G. E., Jenkins, G. M., Reinsel, G. C., and Ljung, G. M.
(2015). Time series analysis: forecasting and control.
John Wiley & Sons.
Bunzli, S., Smith, A., Schütze, R., Lin, I., and O'Sullivan, P. (2017). Making sense of low back pain and pain-related fear. Journal of Orthopaedic & Sports Physical Therapy, 47(9):628–636.
Communication from the Commission to the European Par-
liament, the Council, the European Economic and So-
cial Committee and the Committee of the Regions
(2017). Safer and Healthier Work for All – Modernisation of the EU Occupational Safety and Health Legislation and Policy. COM.
De Brouwer, E., Simm, J., Arany, A., and Moreau, Y.
(2019). GRU-ODE-bayes: Continuous modeling of
sporadically-observed time series. In Advances in
Neural Information Processing Systems, pages 7379–
7390.
Edwards, R. R., Dworkin, R. H., Sullivan, M. D., Turk,
D. C., and Wasan, A. D. (2016). The role of psychoso-
cial processes in the development and maintenance of
chronic pain. The Journal of Pain, 17(9):T70–T92.
Futoma, J., Hariharan, S., and Heller, K. (2017). Learn-
ing to detect sepsis with a multitask gaussian process
rnn classifier. In International conference on machine
learning, pages 1174–1182. PMLR.
Hasenbring, M. I., Hallner, D., Klasen, B., Streitlein-Böhme, I., Willburger, R., and Rusche, H. (2012).
Pain-related avoidance versus endurance in primary
care patients with subacute back pain: psychological
characteristics and outcome at a 6-month follow-up.
Pain, 153(1):211–217.
Hochreiter, S. and Schmidhuber, J. (1997). Long short-term
memory. Neural computation, 9(8):1735–1780.
Keogh, E. and Ratanamahatana, C. A. (2005). Exact index-
ing of dynamic time warping. Knowledge and infor-
mation systems, 7(3):358–386.
Kim, D. (2018). Effect of musculoskeletal pain of care
workers on job satisfaction. Journal of Physical Ther-
apy Science, 30:164–168.
Kingma, D. P. and Ba, J. (2015). Adam: A method for
stochastic optimization. ICLR.
Kok, J., Vroonhof, P., Snijders, J., Roullis, G., Clarke,
M., Peereboom, K., Dorst, P., Isusi, I., and European
Agency for Safety and Health at Work (2020). Work-
related musculoskeletal disorders : prevalence, costs
and demographics in the EU. Publications Office.
Lai, G., Chang, W.-C., Yang, Y., and Liu, H. (2018). Mod-
eling long-and short-term temporal patterns with deep
neural networks. In The 41st International ACM SI-
GIR Conference on Research & Development in In-
formation Retrieval, pages 95–104.
Lalloo, C., Jibb, L. A., Rivera, J., Agarwal, A., and Stinson,
J. N. (2015). There’s a pain app for that. The Clinical
journal of pain, 31(6):557–563.
Laptev, N., Yosinski, J., Li, L. E., and Smyl, S. (2017).
Time-series extreme event forecasting with neural net-
works at uber. In International Conference on Ma-
chine Learning, volume 34, pages 1–5.
Lee, J., Mawla, I., Kim, J., Loggia, M. L., Ortiz, A., Jung,
C., Chan, S.-T., Gerber, J., Schmithorst, V. J., Ed-
wards, R. R., et al. (2019). Machine learning-based
prediction of clinical pain using multimodal neu-
roimaging and autonomic metrics. Pain, 160(3):550.
Li, K., Liu, C., Zhu, T., Herrero, P., and Georgiou, P. (2019).
GluNet: A deep learning framework for accurate glu-
cose forecasting. IEEE journal of biomedical and
health informatics, 24(2):414–423.
Lipton, Z. C., Kale, D., and Wetzel, R. (2016). Directly
modeling missing data in sequences with RNNs: Im-
proved classification of clinical time series. In Ma-
chine Learning for Healthcare Conference, pages
253–270.
Liu, M., Zeng, A., Xu, Z., Lai, Q., and Xu, Q. (2021).
Time series is a special sequence: Forecasting with
sample convolution and interaction. arXiv preprint
arXiv:2106.09305.
Lötsch, J. and Ultsch, A. (2018). Machine learning in pain research. Pain, 159(4):623.
Makridakis, S., Spiliotis, E., and Assimakopoulos, V.
(2018). Statistical and machine learning forecasting
methods: Concerns and ways forward. PloS one,
13(3):e0194889.
McCracken, L. M. and Eccleston, C. (2005). A prospective
study of acceptance of pain and patient functioning
with chronic pain. Pain, 118(1-2):164–169.
Morlion, B., Coluzzi, F., Aldington, D., Kocot-Kepska, M.,
Pergolizzi, J., Mangas, A. C., Ahlbeck, K., and Kalso,
E. (2018). Pain chronification: what should a non-pain
medicine specialist know? Current Medical Research
and Opinion, 34(7):1169–1178.
Moseley, G. L. and Arntz, A. (2007). The context of a
noxious stimulus affects the pain it evokes. PAIN®,
133(1-3):64–71.
Nijs, J., Van Wilgen, C. P., Van Oosterwijck, J., van Itter-
sum, M., and Meeus, M. (2011). How to explain cen-
tral sensitization to patients with 'unexplained' chronic
musculoskeletal pain: practice guidelines. Manual
therapy, 16(5):413–418.
Puri, C., Kooijman, G., Masculo, F., Van Sambeek, S.,
Den Boer, S., Luca, S., and Vanrumste, B. (2019).
PREgDICT: Early prediction of gestational weight
gain for pregnancy care. In 2019 41st Annual In-
ternational Conference of the IEEE Engineering in
Medicine and Biology Society (EMBC), pages 4274–
4278. IEEE.
Rasmussen, C. E. (2004). Gaussian processes in machine
learning. In Advanced lectures on machine learning,
pages 63–71. Springer.
Shibata, R. (1976). Selection of the order of an au-
toregressive model by akaike’s information criterion.
Biometrika, 63(1):117–126.
Shumway, R. H. and Stoffer, D. S. (2017). Time series anal-
ysis and its applications: with R examples. Springer.
Strodthoff, N., Wagner, P., et al. (2020). Deep learning for
ECG analysis: Benchmarks and insights from PTB-
XL. IEEE Journal of Biomedical and Health Infor-
matics, 25(5):1519–1528.
Tighe, P. J., Harle, C. A., Hurley, R. W., Aytug, H.,
Boezaart, A. P., and Fillingim, R. B. (2015). Teach-
ing a machine to feel postoperative pain: combining
high-dimensional clinical data with machine learning
algorithms to forecast acute postoperative pain. Pain
Medicine, 16(7):1386–1401.
Vlaeyen, J. W. and Linton, S. J. (2000). Fear-avoidance and
its consequences in chronic musculoskeletal pain: a
state of the art. Pain, 85(3):317–332.