Process Mining for Demographic Insights: A Subpopulation Analysis in
Healthcare Pathways
Priya Naguine¹ᵃ, Faiza Bukhsh¹ᵇ, Jeewanie Jayasinghe Arachchige²ᶜ and Rob Bemthuis¹ᵈ
¹ University of Twente, Enschede, The Netherlands
² Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
priyanaguine@hotmail.co.uk, {f.a.bukhsh, r.h.bemthuis}@utwente.nl, j.jayasinghearachchige@vu.nl
Keywords:
Process Mining, Subpopulation Analysis, Healthcare Pathways, Demographic Variations, Frozen Shoulder.
Abstract:
Demographic variations in healthcare pathways are key for delivering effective and equitable patient care. Ex-
amining pathway differences across age and gender groups can help uncover demographic-specific disparities
in care delivery. In this paper, we demonstrate the use of the Process Mining Project Methodology in
Healthcare (PM²HC) for the subpopulation-based analysis of treatment pathways, using process mining techniques.
We validate this methodology through a case study on frozen shoulder treatment using the MIMIC-IV data set.
Key findings reveal distinct procedural sequences for male and female patients, as well as notable age-based
variations in treatment choices and timelines. These insights underscore the influence of demographic factors
on healthcare processes. Expert evaluations further highlight the practicality of the methodology and its poten-
tial to guide targeted interventions that address various patient needs, thus enhancing personalized care. This
work contributes to clinical research and practice by identifying inefficiencies and informing tailored interven-
tions. Future efforts will extend the methodology to other medical conditions and integrate multi-institutional
data for broader applicability. By advancing process mining in healthcare, this research provides insight into
improving patient care and addressing demographic diversity.
1 INTRODUCTION
Process mining techniques have demonstrated their
capabilities to uncover inefficiencies and deviations
by analyzing event logs (van der Aalst, 2011). In
healthcare, these techniques can identify delays in
diagnosis, disparities in treatment effectiveness, and
variations in access to therapies (Huang et al., 2013;
Guzzo et al., 2022). In the domain of process min-
ing, obtaining an accurate representation of patient
care pathways is key. However, this task is inherently
complex and challenging (Mans et al., 2009; de Boer
et al., 2024).
Subpopulation methodologies offer a structured
framework for analyzing variations in healthcare
pathways, providing insight into how demographic
factors, such as age and gender, influence treatment
and outcomes (Campbell, 2013; Partington et al.,
2015; Rademaker et al., 2024; Scholte et al., 2023).
ᵃ https://orcid.org/0000-0002-1225-2040
ᵇ https://orcid.org/0000-0001-5978-2754
ᶜ https://orcid.org/0000-0001-8619-6523
ᵈ https://orcid.org/0000-0003-2791-6070
Improving healthcare delivery increasingly relies on
approaches that address patient diversity. By iso-
lating specific patient groups, subpopulation analy-
sis allows healthcare professionals to identify distinct
care paths, detect inefficiencies, and design personal-
ized interventions that improve patient outcomes and
streamline processes (West et al., 2008; Rotter et al.,
2019; Chen et al., 2023). However, despite the
prevalence of clinical pathways and the widespread
attention they receive (Vanhaecht et al., 2006),
barriers remain, particularly at the implementation
level (Evans-Lacko et al., 2010; Neame et al., 2019).
Focusing on specific patient subgroups simplifies
analytical workflows, yielding insights directly rele-
vant to clinical decision-making. For example, frozen
shoulder (FS) cases often exhibit demographically
driven differences in care processes (Rababah et al.,
2020), influenced by factors such as age, gender, and
health status. Analyzing these variations supports
the development of tailored interventions, enhancing
both the responsiveness and effectiveness of
healthcare delivery. Clinical pathways have been
shown to significantly reduce hospital length of stay
and costs for invasive procedures, though their effect
on readmissions and complications appears limited,
suggesting a need for further refinement in protocol
design (Rotter et al., 2008). Further investigations
into the use of analytically-driven protocols would
allow investigators to identify opportunities for
enhancing the patient’s care path (Neame et al., 2019).
Naguine, P., Bukhsh, F., Arachchige, J. J. and Bemthuis, R.
Process Mining for Demographic Insights: A Subpopulation Analysis in Healthcare Pathways.
DOI: 10.5220/0013289800003929
Paper published under CC license (CC BY-NC-ND 4.0)
In Proceedings of the 27th International Conference on Enterprise Information Systems (ICEIS 2025) - Volume 1, pages 267-277
ISBN: 978-989-758-749-8; ISSN: 2184-4992
Proceedings Copyright © 2025 by SCITEPRESS Science and Technology Publications, Lda.
In this study, we use the Process Mining Project
Methodology in Healthcare (PM²HC) to perform
subpopulation analysis, with FS treatment serving as
our case study. By revealing demographic trends and
treatment pathways, our study demonstrates potential
for informing targeted interventions. A preliminary
evaluation with domain experts underscores its prac-
tical viability, bridging theoretical constructs and clin-
ical application, strengthening the generalizability of
process mining in healthcare.
Developing a validated methodology for subpopu-
lation analysis is key for ensuring both rigor and rele-
vance (Gonzalez and Sol, 2012; Wieringa and Moralı,
2012). The intrinsic complexity and heterogeneity
of healthcare data (Becker et al., 2021; Ma et al.,
2021; Dasaradharami Reddy and Gadekallu, 2023;
Guo and Chen, 2023) demand a structured approach
that yields consistent and reproducible insights, par-
ticularly for subgroup-specific variations. Such a
methodology underpins personalized care by consid-
ering demographic-specific needs, mitigating health
disparities, and improving overall patient outcomes.
The paper is organized as follows. Section 2 pro-
vides an overview of related work. Section 3 details
the methodology. Section 4 presents the findings from
the case study. Section 5 gives a discussion. Finally,
Section 6 concludes this article.
2 RELATED WORK
2.1 Subpopulation Analysis in
Healthcare
In the literature, we observed that most subpopulation
analyses focus on improving care pathways by tailor-
ing interventions to specific demographic factors. For
example, age- and gender-based subpopulations are
frequently considered in efforts to customize health-
care delivery and optimize patient outcomes (Parting-
ton et al., 2015; Scholte et al., 2023; Rademaker et al.,
2024).
Rademaker et al. (2024) introduced a subpopulation
comparison framework for analyzing treatment
procedures across different groups of sepsis
patients. Using process mining techniques, their
approach identifies indicators for improving care
pathways, with age emerging as a significant
demographic factor, an aspect explored in greater
detail throughout this article.
Subpopulation analysis is particularly valuable
for chronic diseases, where patient heterogeneity of-
ten results in varying treatment responses. For ex-
ample, studies in diabetes management have demon-
strated that personalized interventions based on sub-
population characteristics, such as age and blood
pressure, can enhance glycemic control and reduce
complications (Valero-Ramon et al., 2020). Simi-
larly, research in oncology has shown that analyzing
subpopulations based on disease progression enables
more targeted and effective therapies (Amatya et al.,
2021; Alrawabdeh et al., 2023).
One of the main objectives of subpopulation anal-
ysis is to shift away from a one-size-fits-all ap-
proach, facilitating individualized, data-driven clini-
cal decision-making. This analytic method helps to
uncover complex patterns that might be overlooked
in broader analyses. However, several challenges re-
main, such as ensuring access to high-quality data, ad-
dressing biases in subpopulation definitions, and bal-
ancing ethical considerations related to patient strati-
fication (Sohail et al., 2021).
2.2 Process Mining Applications in
Clinical Pathways
Process mining plays a key role in identifying gaps
between intended care protocols and their real-world
execution by mapping actual care processes, thereby
providing actionable insights to optimize care deliv-
ery. This capability is particularly critical in complex
healthcare environments, where the involvement of
multiple stakeholders and unpredictable patient tra-
jectories often result in process fragmentation and
suboptimal outcomes (Aspland et al., 2021).
Key applications of process mining in clini-
cal pathways include evaluating patient flow within
healthcare facilities, assessing compliance with clin-
ical guidelines, and optimizing resource allocation.
For example, van der Aalst (2016) demonstrated the
potential of process mining to reduce waiting times
and enhance care coordination by detecting process
deviations. Furthermore, integrating process mining
with subpopulation analysis enables healthcare practi-
tioners to uncover variations in care pathways that re-
flect underlying differences, facilitating more precise
interventions and targeted resource allocation (Mans
et al., 2009; de Boer et al., 2024).
3 APPLICATION OF PM²HC
The methodology adopted in this research is
PM²HC (Pereira et al., 2020), which comprises six
phases. This section introduces FS as a case study
and elucidates how each of the phases is applied.
3.1 Frozen Shoulder as a Case Example
This study examines different stages of FS and its
subpopulation groups. FS progresses through three
stages: the freezing stage, the frozen stage, and the
thawing stage (Rababah et al., 2020). During the
freezing stage, patients experience nocturnal pain and
restricted shoulder movement. The subsequent stage
is characterized by reduced joint pain but a progres-
sive loss of range of motion. In the final thawing
stage, patients see further pain reduction and a grad-
ual return of mobility.
FS, or adhesive capsulitis, is marked by fibrosis
and rigidity of the glenohumeral joint, leading to a de-
creased range of motion in the shoulder joint (D’Orsi
et al., 2012). This condition is more prevalent in fe-
males than males and typically occurs between the
ages of 40 and 60 (Neviaser and Hannafin, 2010).
Treatment options for FS include conserva-
tive methods, such as physical therapy, and non-
conservative surgical interventions, like capsular
release (Mena-del Horno et al., 2022). Non-
conservative treatments require the admission of FS
patients to the Intensive Care Unit (ICU).
3.2 Planning
In the planning phase, we identified subgroups for
distinct care path investigation. A brief background
study on subpopulation analysis via process mining
in healthcare is discussed in Section 2.
Subgroups were defined based on age and gen-
der, given their significant influence on FS develop-
ment (Koorevaar et al., 2017). We used the MIMIC-
IV database, encompassing data from approximately
300,000 patients admitted to a tertiary academic med-
ical centre in Boston, USA, from 2008 to 2019 (Gold-
berger et al., 2000; Johnson et al., 2021).
3.3 Extraction
We extracted data from the MIMIC-IV
database (Johnson et al., 2023), which classifies
patient diagnoses at ICU discharge using
International Classification of Diseases (ICD)
Version 9 and 10 codes. The initial task was to
identify ICD codes related to FS in the
“D_ICD_DIAGNOSES” table using the keywords
“frozen shoulder” and “adhesive capsulitis” in the
“long_title” column. The corresponding ICD codes,
versions, and diagnoses are detailed in Table 1.
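The keyword search described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the rows below are a toy stand-in for the dictionary table rather than real database content, the column names icd_code and icd_version are assumed, the two rows with codes X0001 and X0002 are invented to show non-matches, and the helper find_fs_codes is hypothetical.

```python
# Toy stand-in for a few rows of D_ICD_DIAGNOSES.
# The first three rows follow Table 1; the last two are made-up non-matching rows.
D_ICD_DIAGNOSES = [
    {"icd_code": "7260", "icd_version": 9, "long_title": "Adhesive capsulitis of shoulder"},
    {"icd_code": "M7501", "icd_version": 10, "long_title": "Adhesive capsulitis of right shoulder"},
    {"icd_code": "M7502", "icd_version": 10, "long_title": "Adhesive capsulitis of left shoulder"},
    {"icd_code": "X0001", "icd_version": 10, "long_title": "Illustrative unrelated diagnosis"},
    {"icd_code": "X0002", "icd_version": 9, "long_title": "Illustrative shoulder complaint"},
]

# Keywords named in the text.
KEYWORDS = ("frozen shoulder", "adhesive capsulitis")


def find_fs_codes(rows, keywords=KEYWORDS):
    """Return (icd_code, icd_version) pairs whose long_title contains any keyword."""
    matches = []
    for row in rows:
        title = row["long_title"].lower()  # case-insensitive match
        if any(keyword in title for keyword in keywords):
            matches.append((row["icd_code"], row["icd_version"]))
    return matches


fs_codes = find_fs_codes(D_ICD_DIAGNOSES)
print(fs_codes)
```

Matching on the full phrase rather than on "shoulder" alone keeps unrelated shoulder diagnoses out of the code list, which is why the invented X0002 row is not selected.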
Subsequently, we identified all patients diagnosed
with conditions listed in Table 1 from the
“DIAGNOSES_ICD” table, where the “subject_id” and
“hadm_id” uniquely identify a patient and a patient’s
hospital admission, respectively. Note that a patient
may receive multiple FS-related diagnoses during a
single hospitalization.
To construct individual tables for each subgroup,
we extracted the “anchor_age” and “gender” of the
patients from the “PATIENTS” table. The
“D_ICD_PROCEDURES” and “PROCEDURES_ICD” tables
were used to identify the procedures performed on
patients in each subgroup. We filtered the procedures
to include only those pertinent to the diagnosis and
treatment of FS, based on keywords associated with
FS treatment options: shoulder, steroid, arthroscopy,
magnetic resonance imaging, rotator cuff, physical
therapy, range of motion testing, and injection of
insulin. The inclusion of insulin injections is
particularly relevant due to the common association
between FS and diabetes in affected patients
(Zreik et al., 2016).
The “D_ICD_DIAGNOSES”, “DIAGNOSES_ICD”,
“D_ICD_PROCEDURES”, and “PROCEDURES_ICD” tables
were important because they contain the patients’
diagnostic and procedural data, which is key for
hospital billing and is endorsed by PM²HC for its
reliability (Pereira et al., 2020).
For the application of the process mining algo-
rithm, we defined cases, events, start times, and end
times. In both subgroup process comparison and bot-
tleneck analysis, a case represents a patient’s hospital
admission, and events are the procedures billed to the
patient. We used sequence numbers to indicate the
order of procedures in the absence of stored start and
end times.
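As a minimal sketch of this event-log definition, the following Python snippet groups procedure rows into one trace per hospital admission and orders the events by sequence number. The rows are hypothetical, the column name seq_num is an assumption for the stored sequence number, and build_traces is an illustrative helper rather than part of any tool used in the study.

```python
from collections import defaultdict

# Hypothetical procedure rows: one row per billed procedure.
procedure_rows = [
    {"hadm_id": 101, "seq_num": 2, "long_title": "Synovectomy, shoulder"},
    {"hadm_id": 101, "seq_num": 1, "long_title": "Rotator cuff repair"},
    {"hadm_id": 102, "seq_num": 1, "long_title": "Arthroscopy, shoulder"},
    {"hadm_id": 102, "seq_num": 2, "long_title": "Other repair of shoulder"},
]


def build_traces(rows):
    """Map each case (hadm_id) to its procedures ordered by seq_num."""
    by_case = defaultdict(list)
    for row in rows:
        by_case[row["hadm_id"]].append((row["seq_num"], row["long_title"]))
    # Sorting the (seq_num, title) pairs restores the billed order within a case.
    return {case: [title for _, title in sorted(events)]
            for case, events in by_case.items()}


traces = build_traces(procedure_rows)
for case_id, trace in sorted(traces.items()):
    print(case_id, trace)
```

Because only an ordering is available, such a log supports control-flow analysis directly, whereas any duration-based analysis would need timestamps from another source.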
Upon curating the necessary data and storing it in
the appropriate BigQuery tables, these tables were ex-
ported as CSV files for subsequent analysis.
3.4 Data Processing
During this phase, CSV files encapsulating subgroup
data were imported into ProM and converted into
XES format. These XES files were visualized using
the “LogVisualiser (LogDialog)” plugin. Table 2
provides a summary of the case and event counts for
each subgroup, as generated by LogDialog. To discern
care pathway variations across patient cohorts,
additional filtering was applied using the “Filter Log
on Event Attribute Values” plugin, enabling the
exclusion of specific procedures from the care
pathways to identify distinct differences.
Table 1: ICD codes, versions and diagnoses for frozen shoulder.
ICD Code   ICD Version   Diagnoses
7260       9             Adhesive capsulitis of shoulder
M750       10            Adhesive capsulitis of shoulder
M7500      10            Adhesive capsulitis of unspecified shoulder
M7501      10            Adhesive capsulitis of right shoulder
M7502      10            Adhesive capsulitis of left shoulder
3.5 Mining and Analysis
This phase entailed identifying care path discrepan-
cies across subgroups in medication administration
and procedural adherence during ICU stays. Process
models were constructed using ProM¹ and Disco².
Process models for subgroup comparison were mined
using the following ProM plugins: “Mine with Induc-
tive Visual Miner”, “Mine Petri Net with Inductive
Miner”, and “Convert Petri Net to BPMN Diagram”.
The Inductive Miner was chosen for its superior
fitness, which quantifies the ability of the generated
process models to replicate the cases in the event
log (Bogarín et al., 2018). Initially, the “Mine with
Inductive Visual Miner” plugin was used to create an-
imations illustrating the sequence of processes. The
settings used were an “activities” slider at 1 and a
“paths” slider at 0.8, ensuring equivalence between
the Petri net and Inductive Visual Miner models. Sub-
sequently, the “Mine Petri Net with Inductive Miner”
plugin was used to generate static process models for
visual comparison, with a “noise threshold” set at 0.2
to accommodate minor deviations. Finally, the “Con-
vert Petri Net to BPMN Diagram” plugin was used
to convert Petri net models into BPMN diagrams for
analysis via BPMNDiffViz.
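To give an intuition for what a noise threshold does, the sketch below counts directly-follows pairs in a toy log and drops pairs that are rare relative to the strongest outgoing pair of the same activity. This is a simplified illustration of frequency-based filtering and not the ProM implementation, which applies its filtering inside a recursive discovery procedure; the traces, the function names, and the exact filtering rule are assumptions made for the example.

```python
from collections import Counter


def directly_follows(traces):
    """Count how often activity a is directly followed by activity b."""
    counts = Counter()
    for trace in traces:
        for a, b in zip(trace, trace[1:]):
            counts[(a, b)] += 1
    return counts


def filter_infrequent(dfg, noise_threshold):
    """Keep an edge only if its count reaches noise_threshold times the
    count of the most frequent edge leaving the same source activity."""
    strongest = {}
    for (source, _), count in dfg.items():
        strongest[source] = max(strongest.get(source, 0), count)
    return {edge: count for edge, count in dfg.items()
            if count >= noise_threshold * strongest[edge[0]]}


# Toy log: ten admissions follow the common path, one deviates.
toy_traces = ([["Arthroscopy, shoulder", "Rotator cuff repair"]] * 10
              + [["Arthroscopy, shoulder", "Synovectomy, shoulder"]])

dfg = directly_follows(toy_traces)
kept = filter_infrequent(dfg, noise_threshold=0.2)
print(kept)
```

With a threshold of 0.2, the single deviating path (1 occurrence against 10) falls below the cut-off of 2 and is removed, while a threshold of 0 would keep every observed path.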
The tool BPMNDiffViz³ can be used to calculate
graph similarity measures by comparing two Busi-
ness Process Model and Notation (BPMN) diagrams
and returning the minimal graph edit distance (GED).
GED is defined as the minimum number of operations
(e.g., insertions, deletions, or substitutions) required
to transform one graph into another (Skobtsov and
Kalenkova, 2019). In the context of process
modeling, a lower GED indicates greater similarity
between the two diagrams. However, the significance
of these scores depends on the specific application
and the thresholds defined by the user or domain.
BPMNDiffViz utilizes BPMN 2.0, one of the most
frequently used standards for process modeling
(Ivanov et al., 2015).
¹ https://promtools.org/
² https://fluxicon.com/disco/
³ https://pais.hse.ru/en/research/projects/CompBPMN/
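The idea behind an edit distance between two process graphs can be illustrated with a deliberately simplified Python sketch. It assumes that nodes carrying the same label are matched to each other and that every insertion or deletion costs one unit, so it only counts the symmetric difference of nodes and edges. A true minimal GED requires a search over possible node matchings, and BPMNDiffViz uses its own cost settings, so the numbers below are not comparable to the tool's scores. All names and the two toy models are hypothetical.

```python
def label_matched_edit_distance(nodes_a, edges_a, nodes_b, edges_b):
    """Count unit-cost edits turning graph A into graph B when nodes with
    identical labels are treated as matched (a simplification of GED)."""
    nodes_a, nodes_b = set(nodes_a), set(nodes_b)
    edges_a, edges_b = set(edges_a), set(edges_b)
    deleted = len(nodes_a - nodes_b) + len(edges_a - edges_b)  # only in A
    added = len(nodes_b - nodes_a) + len(edges_b - edges_a)    # only in B
    matched = len(nodes_a & nodes_b) + len(edges_a & edges_b)  # in both
    return {"matched": matched, "deleted": deleted,
            "added": added, "distance": deleted + added}


# Toy model A: start -> Rotator cuff repair -> Synovectomy, shoulder -> end
nodes_a = ["start", "Rotator cuff repair", "Synovectomy, shoulder", "end"]
edges_a = [("start", "Rotator cuff repair"),
           ("Rotator cuff repair", "Synovectomy, shoulder"),
           ("Synovectomy, shoulder", "end")]

# Toy model B: start -> Rotator cuff repair -> end
nodes_b = ["start", "Rotator cuff repair", "end"]
edges_b = [("start", "Rotator cuff repair"),
           ("Rotator cuff repair", "end")]

result = label_matched_edit_distance(nodes_a, edges_a, nodes_b, edges_b)
print(result)
```

In this toy case, one task and two flows are deleted and one flow is added, giving a distance of 4 with four matched elements.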
The results and their interpretations are provided
in the subsequent section.
3.6 Evaluation
In this phase, insights obtained from the previous
phase were leveraged to suggest improvements. Fur-
ther details on this phase can be found in Section 5.
3.7 Improvement and Support
During this phase, stakeholders—such as medical
professionals—determine the course of action for im-
plementing the improvements. This step was con-
ducted in collaboration with an expert physiotherapist
in the Netherlands to discuss and evaluate the research
findings (see discussion in Section 5).
4 CASE STUDY ON FROZEN
SHOULDER TREATMENT
In this section, we describe the case study. First,
we present subgroup demographics and data descrip-
tions. Next, we describe the results regarding care
path differences among various FS patient groups, fo-
cusing on gender and age-based variations.
4.1 Subgroup Demographics and Data
Description
In discerning care path dissimilarities among patient
groups, we formulated two guiding questions: (1)
What distinguishes the care paths of male and female
frozen shoulder patients? and (2) How do the care
paths of frozen shoulder patients aged below 40, be-
tween 40 and 60 inclusive, and above 60 differ?
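The subgroup boundaries in these two questions can be made explicit with a short sketch. The patient records below are invented, the gender codes "F" and "M" are an assumption about how the attribute is stored, and both helper functions are hypothetical.

```python
def age_group(anchor_age):
    """Assign the age subgroup used in question (2); 40 and 60 are inclusive."""
    if anchor_age < 40:
        return "below 40"
    if anchor_age <= 60:
        return "between 40 and 60"
    return "above 60"


def split_subgroups(patients):
    """Place each patient in one gender subgroup and one age subgroup."""
    groups = {}
    for patient in patients:
        gender_key = ("gender", patient["gender"])
        age_key = ("age", age_group(patient["anchor_age"]))
        groups.setdefault(gender_key, []).append(patient["subject_id"])
        groups.setdefault(age_key, []).append(patient["subject_id"])
    return groups


# Invented example patients.
patients = [
    {"subject_id": 1, "gender": "F", "anchor_age": 38},
    {"subject_id": 2, "gender": "M", "anchor_age": 40},
    {"subject_id": 3, "gender": "F", "anchor_age": 60},
    {"subject_id": 4, "gender": "M", "anchor_age": 72},
]

subgroups = split_subgroups(patients)
for key, members in sorted(subgroups.items()):
    print(key, members)
```

Every patient appears in exactly one gender subgroup and one age subgroup, so the gender subgroups span all ages and the age subgroups span both genders.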
Table 2: Number of cases and events per subgroup.
Subgroup                   Number of Cases   Number of Events
Female*                    29                61
Male*                      34                55
Age below 40**             8                 18
Age between 40 and 60**    39                73
Age above 60**             16                25
* Includes patients from all age groups
** Includes patients from both genders
Care path comparisons among subgroups used
three key terms: “parallel” for two procedures occur-
ring in any order, “sequence” for one procedure fol-
lowing another, and “exclusive” for scenarios where
only one of two procedures can occur. Visual com-
parisons were conducted using BPMNDiffViz with
the TabuSearch algorithm, set to a maximum of 100
expansions and a tabu list, to efficiently generate pre-
cise results faster than other algorithms (Skobtsov and
Kalenkova, 2019). Note that the BPMN diagrams use
blue to denote matched elements between the sub-
groups, green for elements to be added, and red for
elements to be deleted to transform one diagram into
another.
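The three terms can be illustrated by a small heuristic that labels the relation between two procedures from observed traces. This sketch is not how the Inductive Miner derives its operators; it is a simplified reading of the definitions above, using the first occurrence of each procedure in a trace, and the traces and function name are made up for the example.

```python
def relation(traces, a, b):
    """Label two procedures as exclusive, sequence, or parallel.

    exclusive: a and b never occur in the same trace
    sequence:  whenever both occur, they appear in one fixed order
    parallel:  both orders are observed
    """
    together = [t for t in traces if a in t and b in t]
    if not together:
        return "exclusive"
    a_first = any(t.index(a) < t.index(b) for t in together)
    b_first = any(t.index(b) < t.index(a) for t in together)
    return "parallel" if a_first and b_first else "sequence"


RCR = "Rotator cuff repair"
SYN = "Synovectomy, shoulder"
REP = "Other repair of shoulder"
DIV = "Division of joint capsule, ligament, or cartilage, shoulder"
XRAY = "Skeletal x-ray of shoulder and upper arm"
MRI = "Magnetic resonance imaging of other and unspecified sites"

# Invented traces, one list per hospital admission.
example_traces = [
    [RCR, SYN],
    [RCR, REP, SYN],
    [REP, DIV],
    [DIV, REP],
    [XRAY],
    [MRI],
]

print(relation(example_traces, RCR, SYN))   # fixed order in every shared trace
print(relation(example_traces, REP, DIV))   # both orders observed
print(relation(example_traces, XRAY, MRI))  # never observed together
```

With small subgroups, such labels rest on very few traces, so a relation observed as exclusive may simply reflect a combination that did not occur in the sample.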
The process models, created in ProM and Disco
for subgroup process comparison and bottleneck
analysis, are available in a GitHub repository⁴.
4.2 Gender-Based Variations
Visual comparison of the care paths for male and fe-
male FS patients using BPMNDiffViz yielded a final
score of 167. Statistics are provided in Table 3, and
specific procedures performed exclusively on male or
female patients are listed in Table 4.
The procedure “Release right shoulder joint, open
approach” is performed on both male and female FS
patients. However, in male patients, it follows “Re-
pair right shoulder tendon, open approach”. In con-
trast, in female patients, it follows “Replacement of
right shoulder joint with reverse ball and socket syn-
thetic substitute, open approach”.
If performed on male patients, the procedure
“Rotator cuff repair” is always the first and is
exclusive of “Other local excision or destruction of
lesion of joint, shoulder”. In female patients, these
procedures can occur sequentially.
If performed, the procedure “Other arthrotomy,
shoulder” is always the first for male patients. It
can be performed sequentially with “Other repair of
shoulder”, but for female patients, it follows “Other
repair of shoulder”.
⁴ https://github.com/PriyaNaguine/Complete-Process-Models-Frozen-Shoulder
“Skeletal x-ray of shoulder and upper arm” is ex-
clusive to male patients, while “Magnetic resonance
imaging of other and unspecified sites” is exclusive to
female patients. Neither procedure is combined with
other procedures.
In male patients, “Other repair of shoulder”
can be performed in parallel with “Division of joint
capsule, ligament, or cartilage, shoulder” and in
sequence with “Rotator cuff repair”. These procedures
are sequential and exclusive for female patients, as
shown in Figures 1 and 2.
In male FS patients, if performed, “Synovectomy,
shoulder” is the final procedure, following “Rotator
cuff repair” as the first procedure. In female patients,
it is exclusive with “Rotator cuff repair”. These se-
quences are illustrated in Figure 1 and Figure 2.
4.3 Age-Based Variations
We compared the care paths for patients in different
age groups to identify variations. First, we compared
patients under 40 with those aged between 40 and 60,
using BPMNDiffViz with the TabuSearch algorithm,
which yielded a final score of 135. Statistics are
provided in Table 5, and procedures exclusive to
either age group are listed in Table 6.
Patients under 40 undergo “Release shoulder
joint” using a “Percutaneous endoscopic approach”,
whereas those aged 40–60 use an “External ap-
proach”.
Figure 3 and Figure 4 illustrate that “Other arthro-
tomy, shoulder” is sequential with “Other repair of
shoulder” for patients under 40, whereas for those
aged 40–60, these procedures are exclusive. Simi-
larly, “Synovectomy, shoulder” is sequential for pa-
tients aged 40–60 but exclusive for those under 40.
Next, we compared the care paths of patients aged
between 40 and 60 with those aged above 60, result-
ing in a final score of 142. Statistics are detailed in
Table 7, and differences in procedures are listed in
Table 8.
Table 3: Statistics for the comparison of the care paths between male and female patients.
                     Percentage of Elements   Number of Elements
Matched elements     37%                      35
Deleted elements*    33%                      31
Added elements*      30%                      28
* Refer to Table 4 for the differences in elements
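The percentages in the table can be related to the element counts with a short calculation. The sketch assumes that each percentage is the share of one category in the total number of matched, deleted, and added elements, rounded to a whole number; this assumption reproduces the reported figures, but the tool's exact computation is not stated in the text. The function name is hypothetical.

```python
def element_shares(matched, deleted, added):
    """Return each category's share of all compared elements, in whole percent."""
    total = matched + deleted + added
    counts = {"matched": matched, "deleted": deleted, "added": added}
    return {name: round(100 * count / total) for name, count in counts.items()}


# Element counts reported for the male/female comparison.
gender_shares = element_shares(matched=35, deleted=31, added=28)
print(gender_shares)
```

Under this reading, the 94 compared elements split into 35 matched (37%), 31 deleted (33%), and 28 added (30%).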
Table 4: Procedures performed on either male or female FS patients.
Procedure Female Male
Drainage of right shoulder joint, Percutaneous approach, Diagnostic
Excision of left shoulder bursa and ligament, Percutaneous endoscopic approach
Excision of right shoulder joint, Percutaneous endoscopic approach
Other total shoulder replacement
Release right shoulder joint, External approach
Repair of recurrent dislocation of shoulder
Repair right shoulder joint, Percutaneous endoscopic approach
Repair right shoulder tendon, Open approach
[BPMNDiffViz comparison view, models male_BPMN and female_BPMN. Tasks visible in the snapshot: Other repair of shoulder; Synovectomy, shoulder; Other arthrotomy, shoulder; Rotator cuff repair; Arthroscopy, shoulder; Division of joint capsule, ligament, or cartilage, shoulder.]
Figure 1: Snapshot of the BPMN diagram for female FS patients.
[BPMNDiffViz comparison view, model male_BPMN. Tasks visible in the snapshot: Other local excision or destruction of lesion of joint, shoulder; Rotator cuff repair; Other arthrotomy, shoulder; Division of joint capsule, ligament, or cartilage, shoulder; Other repair of shoulder; Arthroscopy, shoulder; Synovectomy, shoulder; Injection of steroid.]
Figure 2: Snapshot of the BPMN diagram for male FS patients.
Table 5: Statistics for the comparison of the care paths between patients aged below 40 and patients aged between 40 and 60.
                     Percentage of Elements   Number of Elements
Matched elements     49%                      35
Deleted elements*    10%                      7
Added elements*      41%                      29
* Refer to Table 6 for the differences in elements
Patients aged 60 and above undergo “Release
right shoulder joint” using an open approach, whereas
those aged 40–60 use an “External approach”.
For imaging procedures, patients aged 60 and
above receive “Skeletal x-ray of shoulder and upper
arm”, while those aged 40–60 undergo “Magnetic res-
onance imaging of other and unspecified sites”. These
procedures are exclusive, similar to the gender sub-
groups.
“Division of joint capsule, ligament, or cartilage,
shoulder” is optional for patients aged 60 and above
and can be performed in parallel with “Synovectomy,
shoulder”. In patients aged 40–60, these procedures
occur sequentially. This is detailed in Figures 5
and 6.
Table 6: Procedures performed on either patients aged below 40 or patients aged between 40 and 60.
Procedure   Age Below 40   Age Between 40 and 60
Drainage of Right Shoulder Joint, Percutaneous Approach, Diagnostic
Excision of Left Shoulder Bursa and Ligament, Percutaneous Endoscopic Approach
Excision of Right Shoulder Joint, Percutaneous Endoscopic Approach
Magnetic resonance imaging of other and unspecified sites
Other total shoulder replacement
Repair of recurrent dislocation of shoulder
Repair Right Shoulder Tendon, Open Approach
Rotator cuff repair
Table 7: Statistics for the comparison of the care paths between patients aged above 60 and patients aged between 40 and 60.
                     Percentage of Elements   Number of Elements
Matched elements     34%                      30
Deleted elements*    27%                      24
Added elements*      39%                      34
* Refer to Table 8 for the differences in elements
Table 8: Procedures performed on either patients aged above 60 or patients aged between 40 and 60.
Procedure   Age Between 40 and 60   Age Above 60
Drainage of Right Shoulder Joint, Percutaneous Approach, Diagnostic
Excision of Right Shoulder Joint, Percutaneous Endoscopic Approach
Injection of steroid
Magnetic resonance imaging of other and unspecified sites
Other arthrotomy, shoulder
Other total shoulder replacement
Repair of recurrent dislocation of shoulder
Repair Right Shoulder Joint, Percutaneous Endoscopic Approach
Replacement of Right Shoulder Joint with Reverse Ball and Socket Synthetic Substitute, Open Approach
Skeletal x-ray of shoulder and upper arm
[BPMNDiffViz comparison view, models age_below_40_BPMN and age between 40 and 60 BPMN; final score: 135. Tasks visible in the snapshot: Other repair of shoulder; Synovectomy, shoulder; Other arthrotomy, shoulder; Division of joint capsule, ligament, or cartilage, shoulder; Arthroscopy, shoulder; Excision of Left Shoulder Bursa and Ligament, Percutaneous Endoscopic Approach.]
Figure 3: Snapshot of the BPMN diagram for FS patients aged below 40.
[BPMNDiffViz comparison view. Tasks legible in the snapshot: Other repair of shoulder; Rotator cuff repair; Other arthrotomy, shoulder; Synovectomy, shoulder.]
Figure 4: Snapshot of the BPMN diagram for FS patients aged between 40 and 60.
Figure 5 and Figure 6 show that “Synovectomy,
shoulder” is sequential with “Rotator cuff repair” for
patients aged 40–60 but exclusive for those aged 60
and above. This exclusivity also applies to “Other
repair of shoulder” and “Arthroscopy, shoulder” in
relation to “Synovectomy, shoulder”.
[BPMNDiffViz comparison view, models age_above_60_BPMN and age_between_40_and_60_BPMN. Tasks visible in the snapshot: Division of joint capsule, ligament, or cartilage, shoulder; Other repair of shoulder; Rotator cuff repair; Arthroscopy, shoulder; Synovectomy, shoulder; Release Right Shoulder Joint, Open Approach.]
Figure 5: Snapshot of the BPMN diagram for FS patients aged above 60.
[BPMNDiffViz comparison view. Tasks visible in the snapshot: Other repair of shoulder; Rotator cuff repair; Other arthrotomy, shoulder; Synovectomy, shoulder; Division of joint capsule, ligament, or cartilage, shoulder; Arthroscopy, shoulder; Repair Right Shoulder Joint, Percutaneous Endoscopic Approach.]
Figure 6: Snapshot of the BPMN diagram for FS patients aged between 40 and 60.
After comparison, we conclude that common
paths emerge in the progression of a particular dis-
ease. In our study of FS, the start and end points of
the process are similar across many subpopulations.
5 DISCUSSION
This section discusses the implications of our study’s
findings and examines the methodological strengths
and limitations of our analysis.
5.1 Implications of Findings
The analysis of FS treatment procedures reveals vari-
ations that can be explored through a subpopulation-
based approach. By incorporating evaluations from
patients, public resources, and providers, our study
identifies demographic patterns and potential areas
for process improvement.
Background research highlights a scarcity of sci-
entific literature specifically addressing FS treatment
processes, likely due to the gap between clinical re-
search and applied practice, which can take up to 17
years to bridge (Robinson et al., 2020). To mitigate
this delay, practice-based research conducted by clin-
icians can serve as a crucial link between evidence-
based findings and real-world applications (Westfall
et al., 2007).
To enrich our analysis, we consulted experienced
physiotherapists to reflect on our findings. Their in-
sights underscored distinct demographic patterns in
FS patient populations and treatment outcomes. Ap-
proximately 70% of FS patients are female, possibly
reflecting a tendency among women to seek treatment
earlier than men. Although gender does not markedly
alter care paths, age significantly influences treatment
choices. FS predominantly affects individuals be-
tween 40 and 60 years old, with older patients (60+)
more likely to develop FS following shoulder trauma
and often less inclined toward surgical interventions
due to associated risks.
Regional differences in FS treatment approaches
were also observed. Patients may opt for hospitals
over physiotherapy clinics for broader care options
and perceived treatment comprehensiveness. The
psychological aspects of FS play a key role, as main-
taining a positive mindset has been associated with
improved recovery and increased patient adherence to
necessary movement protocols.
For FS diagnosis, reliance on imaging alone is of-
ten insufficient. While MRI is favored over X-rays for
assessing capsule thickness, FS diagnosis typically
requires confirmation of a reduction in shoulder mo-
bility exceeding 50%. The discussed subpopulation
approach can support more accurate diagnoses by re-
ducing the likelihood of false positives that arise from
mismatches between patient-reported symptoms and
imaging findings.
5.2 Methodological Strengths and Limitations
Our study adopts a research-oriented methodology fo-
cused on FS treatment, in contrast to the approach
of Rademaker et al. (2024), as detailed in
Table 9. Our process begins with an exploration of
FS techniques to identify relevant attributes within the
dataset, followed by the involvement of actual stake-
holders, such as physiotherapists in the Netherlands,
to ensure that our approach aligns with practical ob-
servations and needs. This collaboration with domain
experts emphasizes the importance of integrating clin-
ical expertise into the research.
In contrast, Rademaker et al. (2024) adopt a
more data-driven approach, emphasizing data selec-
tion, cleaning, and preparation for analysis using pro-
cess mining tools like ProM. Their study centers on
extracting actionable insights from the data itself,
with a particular focus on using the “Inductive Visual
Miner” for process mining. While they also incorpo-
rate literature research to identify relevant attributes
and subpopulations, their primary goal is to analyze
the data to improve the healthcare process without
specific stakeholder engagement.
ICEIS 2025 - 27th International Conference on Enterprise Information Systems
Table 9: Comparison of studies.
| Phase | This Paper | Rademaker et al. (2024) |
|---|---|---|
| Planning | Defines subgroups based on age and gender for FS cases in MIMIC-IV; includes literature review for background. | Selects general healthcare process (sepsis treatment); defines research goals, metrics, and tools; emphasizes scope and comparison metrics. |
| Extraction | Extracts FS-specific diagnostic and procedural data using ICD codes; creates subgroup datasets for ProM. | Focuses on cleaning and preparing sepsis data from ER admissions; excludes irrelevant information. |
| Processing | Organizes data in CSV format, imports into ProM, converts to XES; uses visual plugins and log filters; emphasizes case and event counts. | Employs iterative analysis with event aggregation, filtering, and log enrichment; uses dotted charts; imports data as XES into ProM for further filtering and subpopulation analysis. |
| Subpopulation Selection | Integrated during data extraction based on age and gender; no separate phase specified. | Dedicated phase with literature research; segments data using attributes like age and severity; utilizes data cubes and the “LogVisualiser” plugin for analysis. |
| Mining and Analysis | Uses ProM plugins: Inductive Visual Miner, Inductive Miner for Petri Nets; converts to BPMN diagrams; focuses on model fitness. | Uses Inductive Visual Miner; identifies resource usage, paths, and bottlenecks; emphasizes performance and conformance analysis. |
| Evaluation | Translates insights into actionable suggestions for FS care improvement. | Provides suggestions for future sepsis studies; aims to offer best practices for stakeholders. |
| Improvement and Support | Collaborates with stakeholders (physiotherapists) to discuss implementing findings. | Outlines future research plan; suggests methodology for guiding subsequent studies. |
Our study’s strengths include the integration of
domain expertise, which enhances the practical
relevance of our findings. By involving physiotherapists
in the analysis and interpretation of the data, we
ensure that the insights are informed by clinical reality
and are more likely to be useful in practice.
Additionally, our focus on a single condition, FS,
allows for a detailed examination of treatment pathways
and demographic variations.
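The Processing step summarized in Table 9 (organizing events in CSV form and reporting case and event counts) can be illustrated with a minimal sketch. The column names `case_id`, `activity`, and `timestamp`, as well as the sample rows, are hypothetical and are not taken from MIMIC-IV; the study itself relied on ProM rather than on code of this kind.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample log; column names and rows are illustrative only.
SAMPLE_CSV = """case_id,activity,timestamp
p1,Diagnosis,2020-01-01T09:00:00
p1,Injection,2020-01-03T10:00:00
p2,Diagnosis,2020-02-01T08:30:00
p1,Discharge,2020-01-05T12:00:00
p2,Discharge,2020-02-02T16:00:00
"""


def build_traces(csv_text):
    """Group events by case and order each case's events by timestamp."""
    events = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        events[row["case_id"]].append((row["timestamp"], row["activity"]))
    # ISO 8601 timestamps in a uniform format sort correctly as strings.
    return {
        case: [activity for _, activity in sorted(case_events)]
        for case, case_events in events.items()
    }


traces = build_traces(SAMPLE_CSV)
case_count = len(traces)
event_count = sum(len(trace) for trace in traces.values())
print(f"cases={case_count}, events={event_count}")
```

Each resulting trace is the ordered list of activities for one case, which is the basic unit that a discovery algorithm such as an inductive miner consumes.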
6 CONCLUSION
This paper presents a validated methodology for sub-
population analysis in healthcare using process min-
ing techniques, demonstrated through the analysis of
care pathways for frozen shoulder patients within the
MIMIC-IV dataset. By focusing on gender and age
demographics, our analysis categorizes patients into
subgroups (males versus females, and age groups
below 40, between 40 and 60, and above 60) to
reveal demographic-driven variations in care
paths.
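As a rough illustration of this stratification, the sketch below assigns patients to gender and age subgroups. The record fields (`patient_id`, `gender`, `age`) and the sample patients are hypothetical, and treating ages 40 and 60 as belonging to the middle band is an assumption, since the boundary handling is not specified here.

```python
from collections import defaultdict


def age_group(age):
    """Map an age in years to one of three age bands.

    Assumption: ages 40 and 60 both fall into the middle band.
    """
    if age < 40:
        return "below 40"
    if age <= 60:
        return "40-60"
    return "above 60"


def split_subpopulations(patients):
    """Return patient ids grouped by gender and, separately, by age band."""
    by_gender = defaultdict(list)
    by_age = defaultdict(list)
    for patient in patients:
        by_gender[patient["gender"]].append(patient["patient_id"])
        by_age[age_group(patient["age"])].append(patient["patient_id"])
    return dict(by_gender), dict(by_age)


# Hypothetical patients, for illustration only.
patients = [
    {"patient_id": "p1", "gender": "F", "age": 35},
    {"patient_id": "p2", "gender": "M", "age": 52},
    {"patient_id": "p3", "gender": "F", "age": 67},
]
by_gender, by_age = split_subpopulations(patients)
print(by_gender)
print(by_age)
```

Each subgroup's patient ids could then be used to filter the event log into one sub-log per subpopulation before process discovery.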
The subpopulation process comparison revealed
differences, as indicated by the highest GED of 167
between male and female FS care paths, followed by
a GED of 166 between patients aged above 60 and
those aged 40–60, and 135 between patients aged be-
low 40 and those aged 40–60. These findings show
the role of demographic factors in shaping healthcare
delivery, offering actionable insights for personalized
interventions. For example, the substantial GEDs
suggest that male and female patients, as well as older
and younger populations, may benefit from tailored
treatment protocols. However, the clinical implica-
tions of these variations require further contextual-
ization through diverse stakeholder engagement and
deeper analysis of causal factors.
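To make the GED figures more concrete, the following sketch computes a simplified graph edit distance between two directly-follows graphs. It assumes that activities with identical labels are matched to each other and that every node or edge insertion or deletion costs 1; under these assumptions the distance reduces to the size of the symmetric differences of the node and edge sets. This is an illustration only, not the tool or cost model used in this study, and the example traces are hypothetical.

```python
def directly_follows_graph(traces):
    """Build (nodes, edges) where an edge (a, b) means b directly follows a."""
    nodes, edges = set(), set()
    for trace in traces:
        nodes.update(trace)
        edges.update(zip(trace, trace[1:]))
    return nodes, edges


def simple_ged(graph_a, graph_b):
    """Unit-cost edit distance with label-based node matching (assumption)."""
    nodes_a, edges_a = graph_a
    nodes_b, edges_b = graph_b
    # Nodes or edges present in only one graph must be inserted or deleted.
    return len(nodes_a ^ nodes_b) + len(edges_a ^ edges_b)


# Hypothetical traces for two subpopulations.
group_a = [["Diagnosis", "Injection", "Discharge"]]
group_b = [["Diagnosis", "Physiotherapy", "Discharge"], ["Diagnosis", "Discharge"]]

graph_a = directly_follows_graph(group_a)
graph_b = directly_follows_graph(group_b)
distance = simple_ged(graph_a, graph_b)
print(distance)
```

A larger value indicates that more activities or transitions appear in only one of the two subgroup models, which is the intuition behind comparing care paths by GED.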
This study contributes to the field of process mining
by (i) applying PM²HC for subpopulation analysis,
and (ii) demonstrating how demographic stratification
can uncover inefficiencies and inform targeted
interventions in healthcare. Despite its contributions,
the study is limited by the use of data from a single
ICU and the inherent constraints of the MIMIC-IV
dataset, which restrict the generalizability of findings
and the precise attribution of procedures to FS treatment.
Future research should address these limitations by
incorporating multi-institutional datasets, refining
methods to disentangle treatment-specific procedures,
and expanding the methodology to other diseases.
Future studies on the evaluation of PM²HC for
subpopulation analysis are needed to advance the
scientific body of knowledge.
ACKNOWLEDGEMENTS
We would like to express our gratitude to the physio-
therapists at Fysiotherapie Polman in Enschede (the
Netherlands) for their valuable discussions and in-
sights on the frozen shoulder case study.
REFERENCES
Alrawabdeh, J., Alzu’bi, M., Alzyoud, M., Odeh, N.,
Hamadneh, Y., Mian, H., Mohyuddin, G. R., Kelkar,
A. H., Goodman, A. M., Chakraborty, R., Russler-
Germain, D. A., Mehra, N., Baggio, D., Cliff, E.
R. S., and Al Hadidi, S. (2023). Characteristics of
post hoc subgroup analyses of oncology clinical tri-
als: a systematic review. JNCI Cancer Spectrum,
7(6):pkad100.
Amatya, A. K., Fiero, M. H., Bloomquist, E. W., Sinha,
A. K., Lemery, S. J., Singh, H., Ibrahim, A.,
Donoghue, M., Fashoyin-Aje, L. A., de Claro, R. A.,
Gormley, N. J., Amiri-Kordestani, L., Sridhara, R.,
Theoret, M. R., Kluetz, P. G., Pazdur, R., Beaver, J. A.,
and Tang, S. (2021). Subgroup analyses in oncology
trials: Regulatory considerations and case examples.
Clinical Cancer Research, 27(21):5753–5756.
Aspland, E., Gartner, D., and Harper, P. (2021). Clinical
pathway modelling: a literature review. Health Sys-
tems, 10(1):1–23.
Becker, A.-K., Dörr, M., Felix, S. B., Frost, F., Grabe, H. J.,
Lerch, M. M., Nauck, M., Völker, U., Völzke, H., and
Kaderali, L. (2021). From heterogeneous healthcare
data to disease-specific biomarker networks: A hier-
archical Bayesian network approach. PLoS Computa-
tional Biology, 17(2):e1008735.
Bogarín, A., Cerezo, R., and Romero, C. (2018). Discover-
ing learning processes using inductive miner: A case
study with learning management systems (LMSs). Psi-
cothema, 30:322–329.
Campbell, S. K. (2013). Use of care paths to improve pa-
tient management. Physical & Occupational Therapy
in Pediatrics, 33(1):27–38.
Chen, R. J., Wang, J. J., Williamson, D. F., Chen, T. Y.,
Lipkova, J., Lu, M. Y., Sahai, S., and Mahmood, F.
(2023). Algorithmic fairness in artificial intelligence
for medicine and healthcare. Nature Biomedical En-
gineering, 7(6):719–742.
Dasaradharami Reddy, K. and Gadekallu, T. R. (2023).
A comprehensive survey on federated learning tech-
niques for healthcare informatics. Computational In-
telligence and Neuroscience, 2023(1):8393990.
de Boer, T. R., Arntzen, R. J., Bekker, R., Buurman, B. M.,
Willems, H. C., and van der Mei, R. D. (2024). Pro-
cess mining on national health care data for the dis-
covery of patient journeys of older adults. Journal
of the American Medical Directors Association, page
105333.
D’Orsi, G. M., Via, A. G., Frizziero, A., and Oliva, F.
(2012). Treatment of adhesive capsulitis: A review.
Muscles, ligaments and tendons journal, 2(2):70–78.
Evans-Lacko, S., Jarrett, M., McCrone, P., and Thornicroft,
G. (2010). Facilitators and barriers to implementing
clinical care pathways. BMC Health Services Re-
search, 10:1–6.
Goldberger, A., Amaral, L., Glass, L., Hausdorff, J.,
Ivanov, P., Mark, R., Mietus, J., Moody, G., Peng, C.,
and Stanley, H. (2000). PhysioBank, PhysioToolkit,
and PhysioNet: Components of a new research re-
source for complex physiologic signals. Circulation,
101(23):e215–e220.
Gonzalez, R. A. and Sol, H. G. (2012). Validation and de-
sign science research in information systems. In Re-
search methodologies, innovations and philosophies
in software systems engineering and information sys-
tems, pages 403–426. IGI Global.
Guo, C. and Chen, J. (2023). Big data analytics in health-
care. In Knowledge technology and systems: Toward
establishing knowledge systems science, pages 27–70.
Springer.
Guzzo, A., Rullo, A., and Vocaturo, E. (2022). Process
mining applications in the healthcare domain: A com-
prehensive review. Wiley Interdisciplinary Reviews:
Data Mining and Knowledge Discovery, 12(2):e1442.
Huang, Z., Lu, X., Duan, H., and Fan, W. (2013). Summa-
rizing clinical pathways from event logs. Journal of
Biomedical Informatics, 46(1):111–127.
Ivanov, S. Y., Kalenkova, A. A., and Aalst, W. M. P. (2015).
BPMNDiffViz: A tool for BPMN models comparison.
1418:35–39.
Johnson, A., Bulgarelli, L., Pollard, T., Horng, S., Celi,
L. A., and Mark, R. (2021). MIMIC-IV (version 1.0).
Johnson, A. E., Bulgarelli, L., Shen, L., Gayles, A., Sham-
mout, A., Horng, S., Pollard, T. J., Hao, S., Moody,
B., Gow, B., et al. (2023). MIMIC-IV, a freely acces-
sible electronic health record dataset. Scientific Data,
10(1):1.
Koorevaar, R., Riet, E., Ipskamp, M., and Bulstra, S.
(2017). Incidence and prognostic factors for post-
operative frozen shoulder after shoulder surgery: A
prospective cohort study. Archives of Orthopaedic and
Trauma Surgery, 137.
Ma, F., Ye, M., Luo, J., Xiao, C., and Sun, J. (2021). Ad-
vances in mining heterogeneous healthcare data. In
Proceedings of the 27th ACM SIGKDD Conference on
Knowledge Discovery & Data Mining, pages 4050–
4051.
Mans, R. S., Schonenberg, M. H., Song, M., van der Aalst,
W. M. P., and Bakker, P. J. M. (2009). Applica-
tion of process mining in healthcare–a case study in
a Dutch hospital. In Biomedical Engineering Systems
and Technologies, pages 425–438. Springer.
Mena-del Horno, S., Dueñas, L., Lluch, E., Louw, A.,
Luque-Suarez, A., Mertens, M. G., Fuentes-Aparicio,
L., and Balasch-Bernat, M. (2022). A central ner-
vous system focused treatment program for people
with frozen shoulder: A feasibility study. Interna-
tional Journal of Environmental Research and Public
Health, 19(5).
Neame, M. T., Chacko, J., Surace, A. E., Sinha, I. P.,
and Hawcutt, D. B. (2019). A systematic review of
the effects of implementing clinical pathways sup-
ported by health information technologies. Jour-
nal of the American Medical Informatics Association,
26(4):356–363.
Neviaser, A. S. and Hannafin, J. A. (2010). Adhesive cap-
sulitis: A review of current treatment. The American
Journal of Sports Medicine, 38(11):2346–2356.
Partington, A., Wynn, M., Suriadi, S., Ouyang, C., and
Karnon, J. (2015). Process mining for clinical pro-
cesses: A comparative analysis of four Australian hos-
pitals. ACM Transactions on Management Informa-
tion Systems (TMIS), 5(4):1–18.
Pereira, G., Santos, E., and Maceno, M. (2020). Process
mining project methodology in healthcare: A case
study in a tertiary hospital. Network Modeling Analy-
sis in Health Informatics and Bioinformatics, 9.
Rababah, E. M., Abu Tariah, H., Halalsheha, R., and
Abo Kebar, M. (2020). Frozen shoulder: Pathogene-
sis, diagnosis and treatment. Journal of Kerman Uni-
versity of Medical Sciences, 27(5):447–455.
Rademaker, F. M., Bemthuis, R. H., Jayasinghe, J., and
Bukhsh, F. A. (2024). Analyzing sepsis treatment
variations in subpopulations with process mining. In
26th International Conference on Enterprise Informa-
tion Systems, pages 85–94.
Robinson, T., Bailey, C., Morris, H., Burns, P., Melder, A.,
Croft, C., Spyridonidis, D., Bismantara, H., Skouteris,
H., and Teede, H. (2020). Bridging the research-
practice gap in healthcare: A rapid review of research
translation centres in England and Australia. Health
research policy and systems, 18(117).
Rotter, T., de Jong, R. B., Lacko, S. E., Ronellenfitsch, U.,
and Kinsman, L. (2019). Clinical pathways as a qual-
ity strategy. Improving Healthcare Quality in Europe,
page 309.
Rotter, T., Kugler, J., Koch, R., Gothe, H., Twork, S., van
Oostrum, J. M., and Steyerberg, E. W. (2008). A sys-
tematic review and meta-analysis of the effects of clin-
ical pathways on length of stay, hospital costs and pa-
tient outcomes. BMC Health Services Research, 8:1–
15.
Scholte, M., Heidkamp, J., Hannink, G., Merkx, M. A.
W. T., Grutters, J. P. C., and Rovers, M. M. (2023).
Care pathway analysis to inform the earliest stages
of technology development: scoping oncological in-
dications in need of innovation. Value in Health,
26(12):1744–1753.
Skobtsov, A. and Kalenkova, A. (2019). Efficient algo-
rithms for finding differences between process mod-
els. In 2019 Ivannikov Ispras Open Conference (IS-
PRAS), pages 60–66.
Sohail, S. A., Bukhsh, F. A., and van Keulen, M. (2021).
Multilevel privacy assurance evaluation of healthcare
metadata. Applied Sciences, 11(22):10686.
Valero-Ramon, Z., Fernandez-Llatas, C., Valdivieso, B.,
and Traver, V. (2020). Dynamic models support-
ing personalised chronic disease management through
healthcare sensors with interactive process mining.
Sensors, 20(18):5330.
van der Aalst, W. M. P. (2011). Process mining: discovery,
conformance and enhancement of business processes,
volume 2. Springer.
van der Aalst, W. M. P. (2016). Data science in action.
Springer.
Vanhaecht, K., Bollmann, M., Bower, K., Gallagher, C.,
Gardini, A., Guezo, J., Jansen, U., Massoud, R.,
Moody, K., Sermeus, W., et al. (2006). Prevalence
and use of clinical pathways in 23 countries–an inter-
national survey by the European Pathway Association.
Journal of Integrated Care Pathways, 10(1):28–34.
West, B. T., Berglund, P., and Heeringa, S. G. (2008).
A closer examination of subpopulation analysis of
complex-sample survey data. The Stata Journal,
8(4):520–531.
Westfall, J. M., Mold, J., and Fagnan, L. (2007). Practice-
Based Research—“Blue Highways” on the NIH
Roadmap. JAMA, 297(4):403–406.
Wieringa, R. and Moralı, A. (2012). Technical action re-
search as a validation method in information systems
design science. In International Conference on De-
sign Science Research in Information Systems, pages
220–238. Springer.
Zreik, N. H., Malik, R. A., and Charalambous, C. P. (2016).
Adhesive capsulitis of the shoulder and diabetes: A
meta-analysis of prevalence. Muscles, ligaments and
tendons journal, 6(1):26–34.