Moodle2EventLog: A Tool for Pedagogically-Driven Log Enrichment and
Analysis
Noura Joudieh (1, a), Wil M. P. van der Aalst (2, b), Ronan Champagnat (1, c), Mourad Rabah (1, d) and Samuel Nowakowski (3, e)
(1) L3i, La Rochelle University, La Rochelle, France
(2) PADS, RWTH Aachen University, Aachen, Germany
(3) LORIA, Lorraine University, Nancy, France
{noura.joudieh, ronan.champagnat,mourad.rabah}@univ-lr.fr, wvdaalst@pads.rwth-aachen.de,
Keywords:
Process Mining, Educational Process Mining, Moodle, Learning Analytics, Quality of Education.
Abstract:
Learning Management Systems like Moodle generate detailed logs from student interactions, offering signif-
icant potential for learning analytics and educational process mining. However, raw logs capture interaction-
based actions rather than actual learning processes, limiting their pedagogical relevance. To address this, we
developed Moodle2EventLog, a tool that automates the cleaning, preprocessing, and semantic enrichment of
Moodle logs. The tool operates in two modules: the first cleans and structures logs by generating event logs
with key elements (case IDs, activities, timestamps), and the second enriches them by grouping low-level
events into context-aware sub-processes and mapping them to “Semantic Activities” based on Bloom’s Taxonomy.
We tested Moodle2EventLog on logs from 65 Computer Science courses at Frederick University (471
students) from 2018–2022, and one course from La Rochelle University (36 students) in 2023, which serves
as the use case in this paper. The enriched logs enabled deeper pedagogical analysis, such as identifying learn-
ing phase frequencies, studying specific activities and resource usage, and extracting semantically informed
learner profiles linked to performance. Evaluation and instructor feedback validated the tool’s effectiveness,
demonstrating its ability to transform raw logs into pedagogically rich data, enabling the discovery of learning
paths and providing insights unattainable with original Moodle logs.
1 INTRODUCTION
The widespread use of Learning Management Sys-
tems (LMSs) has transformed e-learning, generating
huge amounts of data, including logs that capture
every interaction between students, instructors, and
course materials. This data offers significant potential
for Learning Analytics (LA), Educational Process
Mining (EPM) (Bogarín et al., 2017), and Data
Mining with objectives such as understanding student
profiles, analyzing learning strategies, correlating on-
line behaviors with academic performance (Bey and
Champagnat, 2022), providing personalized recommendations
(Joudieh et al., 2023), and visualizing student
learning paths (Álvarez et al., 2016). Understanding
the learning process of students while covering
a course is critical for both educators and learners.
A learning process refers to the sequence of
phases that a learner undergoes to acquire a course
or skill, involving interactions with educational materials,
engaging in discussions, practicing newly acquired
knowledge, and applying it in new contexts.
By uncovering these processes, educators can understand
how students are studying their course, allowing
them to improve course design and enhance the students’
learning experience. However, the effectiveness
of these analyses relies on data quality, leading
to significant research on ensuring high-quality educational
data (Umer et al., 2022).

a https://orcid.org/0000-0003-3142-2962
b https://orcid.org/0000-0002-0955-6940
c https://orcid.org/0000-0001-5256-5706
d https://orcid.org/0000-0001-8136-5949
e https://orcid.org/0000-0001-7845-5425

Moodle, short for Modular Object-Oriented Dy-
namic Learning Environment, is a widely used
open-source LMS that offers a flexible platform for
creating, managing, and delivering online courses.
Its adaptability and community-driven development
make it popular across educational institutions (Cole
and Foster, 2007), supporting a wide range of teaching
methodologies, including online and blended
learning. Despite Moodle’s rich feature set, its logs
primarily capture system interactions rather than the
actual learning processes of students, and thus lack
pedagogical relevance in their raw form.

Joudieh, N., van der Aalst, W. M. P., Champagnat, R., Rabah, M. and Nowakowski, S.
Moodle2EventLog: A Tool for Pedagogically-Driven Log Enrichment and Analysis.
DOI: 10.5220/0013327300003932
Paper published under CC license (CC BY-NC-ND 4.0)
In Proceedings of the 17th International Conference on Computer Supported Education (CSEDU 2025) - Volume 1, pages 452-463
ISBN: 978-989-758-746-7; ISSN: 2184-5026
Proceedings Copyright © 2025 by SCITEPRESS Science and Technology Publications, Lda.

Learning, however, has a process-oriented nature
(Gašević et al., 2015), and Process Mining (PM)
(Van der Aalst, 2016) has emerged as a valuable field
for understanding these learning processes (Reimann
et al., 2014). It bridges the gap between data analysis
and process modeling and is used to discover, monitor, and
improve real processes by extracting knowledge from
event logs available in information systems. EPM
in particular focuses on analyzing sequences of student
activities, such as interactions with e-learning
platforms, to identify patterns that influence educational
outcomes (Bogarín et al., 2017). This specialized
field helps teachers understand and impact the
learning processes by visualizing how students en-
gage with their learning environment and detecting
bottlenecks or successful strategies.
To address the dual challenge of enhancing the
pedagogical relevance of Moodle logs and preparing
them for analysis, we propose a method for their
semantic enrichment and put it into action via
Moodle2EventLog, a tool designed to automate the
cleaning, preprocessing, and enrichment of Moodle
logs. It operates in two modules: the first generates
structured event logs by capturing essential elements
such as case IDs, activities, and timestamps
from raw Moodle logs, and the second enriches the clean
logs by grouping low-level events into context-aware
sub-processes and mapping them to Semantic Activities,
which are higher-order learning processes derived from
Bloom’s Taxonomy. We define Semantic Activities,
such as “Study,” “Exercise,” “Synthesize,” and “Assess,”
to reflect the pedagogical intent behind student
actions extracted from logs. Through
this transformation, the logs capture key stages of
cognitive development, offering instructors a more insightful
view of students’ learning behaviors and processes.
This enriched data serves as a valuable input
source for learning analytics, educational process
mining, and data mining.
By using Moodle2EventLog, instructors and re-
searchers can perform advanced pedagogical anal-
yses, such as identifying learning phase frequen-
cies, studying specific activities and resource usage,
and extracting semantically informed learner profiles
linked to performance. Moreover, the enriched logs
are suitable for process mining, facilitating the dis-
covery of process models that visualize students’
learning paths. This dual benefit—enhancing learning
analytics and supporting process mining—provides a
more comprehensive understanding of student behav-
ior in e-learning environments.
Our research is thus guided by the following ques-
tions:
RQ1. How can Moodle logs be transformed and
enriched to be pedagogically relevant for extract-
ing the learning process of students in a course?
RQ2. What impact does the categorization of
low-level events into higher-order semantic activ-
ities have on the analysis of student learning pro-
cesses?
RQ3. How does Moodle2EventLog facilitate the
use of enriched event logs for learning analytics
and process mining, and what are the implications
for improving instructional design?
The structure of this paper is as follows: Section
2 outlines key concepts and background knowledge.
Section 3 reviews related work, emphasizing the im-
portance of data quality and methods for process-
ing educational logs, particularly those from Moodle.
Section 4 describes the proposed tool, its architecture,
functionalities, and how Semantic Activities are de-
fined and mapped to Moodle logs. To demonstrate its
effectiveness, Section 5 presents a case study, show-
casing the tool’s application, insights, analysis, and
evaluation. Finally, Section 6 concludes the paper
with a discussion of findings and future directions.
2 PRELIMINARIES
2.1 Process Mining
PM (Van der Aalst, 2016) analyzes sequences of
events to reveal the execution of activities in business
or educational settings. The main types of PM are:
Process Discovery: Extracting a process model
from event logs.
Conformance Checking: Comparing the event log
with a predefined model to identify deviations.
Enhancement: Improving an existing process
model using event logs to add new perspectives,
like performance metrics.
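
To make process discovery concrete, the following minimal Python sketch computes the counts behind a directly-follows graph, one of the simplest discovered models. The traces and activity names are hypothetical, and the function is an illustration rather than part of Moodle2EventLog.

```python
from collections import Counter

def directly_follows(traces):
    """Count how often activity a is immediately followed by activity b."""
    dfg = Counter()
    for trace in traces:
        # Pair each activity with its successor inside the same trace.
        for a, b in zip(trace, trace[1:]):
            dfg[(a, b)] += 1
    return dfg

# Hypothetical traces: one ordered list of activity names per case.
traces = [
    ["view lecture", "attempt quiz", "submit assignment"],
    ["view lecture", "submit assignment"],
    ["view lecture", "attempt quiz", "attempt quiz"],
]
dfg = directly_follows(traces)
for (a, b), count in sorted(dfg.items()):
    print(f"{a} -> {b}: {count}")
```

Each counted pair becomes an arc of the graph, weighted by its frequency.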
2.2 Event Log
An event log is a structured collection of data that
records specific activities or events as they occur
within an information system. Formally, an event
log $L = \{t_1, t_2, \ldots, t_k\}$ is a set of $k$ traces, where each
trace $t_i$ ($1 \le i \le k$) is a sequence of $n_i$ consecutive events
$t_i = \langle e_{i1}, e_{i2}, \ldots, e_{in_i} \rangle$ made by the same case id. For
process mining, an event log typically consists of the
following main columns:
Case ID: An identifier for each process instance.
Activity: The specific action or task performed,
such as submitting an assignment.
Timestamp: The exact date and time when the ac-
tivity occurred.
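
As an illustration of how these three columns yield traces, the sketch below groups rows by case ID and orders each group by timestamp. The rows and the ISO timestamp format are assumptions made for the example.

```python
from collections import defaultdict
from datetime import datetime

def build_traces(events):
    """Group (case_id, activity, timestamp) rows into time-ordered traces."""
    by_case = defaultdict(list)
    for case_id, activity, ts in events:
        by_case[case_id].append((datetime.fromisoformat(ts), activity))
    # Sort each case's events by time and keep only the activity names.
    return {
        case: [activity for _, activity in sorted(rows, key=lambda r: r[0])]
        for case, rows in by_case.items()
    }

# Hypothetical rows, deliberately out of order.
events = [
    ("s1", "Submit assignment", "2023-03-01T10:05:00"),
    ("s2", "View lecture", "2023-03-01T09:30:00"),
    ("s1", "View lecture", "2023-03-01T09:00:00"),
]
log = build_traces(events)
```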
Event logs are commonly stored and shared in the
XES (eXtensible Event Stream) format, an open stan-
dard that supports multiple attributes, hierarchical
structures, and extensions, making it ideal for com-
plex PM analyses.
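
The following sketch writes a minimal XES document with Python's standard library. It uses the standard attribute keys `concept:name` and `time:timestamp`, but omits the extension and global declarations that complete XES files usually carry. The function name and sample data are hypothetical.

```python
import xml.etree.ElementTree as ET

def to_xes(traces):
    """Serialize {case_id: [(activity, iso_timestamp), ...]} as a minimal XES string."""
    log = ET.Element("log", {"xes.version": "1.0"})
    for case_id, events in traces.items():
        trace = ET.SubElement(log, "trace")
        # The trace-level concept:name carries the case ID.
        ET.SubElement(trace, "string", {"key": "concept:name", "value": str(case_id)})
        for activity, ts in events:
            event = ET.SubElement(trace, "event")
            ET.SubElement(event, "string", {"key": "concept:name", "value": activity})
            ET.SubElement(event, "date", {"key": "time:timestamp", "value": ts})
    return ET.tostring(log, encoding="unicode")

xes = to_xes({"s1": [("Study P", "2023-03-01T09:00:00+00:00")]})
```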
2.3 Moodle Logs
Moodle logs typically include:
Time: Timestamp of the logged event.
User full name: The user performing the action.
Affected User: The user targeted by the event.
Component: Moodle module where the event oc-
curred (e.g., Assignment, File).
Event Context: The course or resource associated
with the event.
Event Name: The executed action.
Description: Brief details, including user IDs and
resources.
Origin: Source of the event (e.g., web, ...).
IP Address: Originating IP address.
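
A log with these fields can be read with Python's `csv` module, as sketched below. The header spellings and the sample row are assumptions based on the field list above; an actual export may differ by Moodle version and language.

```python
import csv
import io

# Hypothetical one-row export; the header names are assumed, not guaranteed.
raw = io.StringIO(
    'Time,User full name,Affected user,Event context,Component,'
    'Event name,Description,Origin,IP address\n'
    '"1/03/23, 09:00",Jane Doe,-,File: Lecture 1,File,Course module viewed,'
    '"The user with id \'2\' viewed the \'resource\' activity '
    'with course module id \'10\'.",web,10.0.0.1\n'
)

# DictReader maps each row to its header names and handles quoted commas.
rows = list(csv.DictReader(raw))
first = rows[0]
print(first["Time"], "|", first["Component"], "|", first["Event name"])
```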
3 RELATED WORKS
EPM Using Moodle Logs
EPM has paved the way for a rich body of research
that combines the strengths of data mining and pro-
cess analysis to uncover insights from educational
event data (Costa et al., 2020). Among the various
sources of educational data, Moodle logs have been a
focal point in numerous studies for revealing valuable
insights into student learning behaviors (Wafda et al.,
2022). For instance, in an effort to better understand
students’ learning processes, (Juhaňák et al., 2019)
applied process mining to extract patterns from stu-
dents’ online quiz activities within an LMS. Similarly,
(Cenka and Anggun, 2022) conducted weekly assess-
ments of student activities over a semester, uncover-
ing the most frequently accessed features, usage pat-
terns, and the relationships between engagement and
academic performance. Further, (Real et al., 2021)
used both PM and Sequential Pattern Mining (SPM)
to explore learning paths in an introductory program-
ming course. By analyzing Moodle event logs, this
study delved into specific activities, their sequence,
and the actions performed by students, uncovering
distinct behaviors and learning strategies.
Educational Data Quality
Despite the potential of Moodle logs for educational
analysis, the quality of insights drawn depends heav-
ily on the quality of the event logs. These logs of-
ten contain noise, making it necessary to filter and
enrich the data to extract meaningful learning pro-
cesses (Suriadi et al., 2017). Addressing this issue,
(Umer et al., 2022) highlighted the importance and
challenge of ensuring high-quality educational data
for EPM and LA. They developed methods to extract
standalone activities from Moodle’s database and re-
formatted them to explicitly link learner data to pro-
cess instances, thereby converting process-unaware
logs into process-oriented event logs with a focus on
quiz-taking activities. Similarly, (Aulia and Waspada,
2019) introduced an application for the preprocess-
ing and exploratory data analysis of Moodle logs,
using heuristic filtering and visualization techniques
like flow control and dotted charts. However, despite
these advancements, further improvements to the fil-
tering techniques have yet to be explored.
The importance of proper data cleaning and prepa-
ration in Moodle logs cannot be overstated, as fail-
ure to do so can lead to overly complex or unstruc-
tured process models when applying PM algorithms
(Etinger et al., 2018). One key challenge in process
mining with Moodle data is creating models that ac-
curately reflect general student behaviors without be-
ing too large or complex for teachers and students to
interpret (Bogarin et al., 2014). In educational con-
texts, the comprehensibility of models is critical, as it
ensures that both students and teachers can effectively
monitor learning processes and use the feedback for
improvements (Romero et al., 2016).
From Micro-Interactions to High-Level
Learning Actions
In addition to Moodle-based research, several stud-
ies have focused on analyzing micro-interactions and
low-level data, primarily in the context of Massive
Open Online Courses (MOOCs), which differ signif-
icantly from Moodle in terms of their data structure
and log information. For example, (Yu et al., 2021)
investigated how interactive navigational behaviors in
connectivist MOOCs can be used to measure learning
indicators such as engagement, progress, and achieve-
ment. Their analysis involved browser events like
page loads, as well as mouse and keyboard interac-
tions data. By applying sequence pattern mining and
thematic analysis, they were able to transform these
low-level interactions into higher-order behaviors.
These methodologies have also been applied to
Self-Regulated Learning (SRL) strategies (Song et al.,
2024). In such studies, trace data from various
MOOCs and LMSs is first transformed into learn-
ing actions, which are then mapped to SRL processes
using a pattern dictionary (Osakwe et al., 2024). A
trace parser typically implements these mappings by
comparing sequences of learning actions to a prede-
fined pattern dictionary to identify SRL processes.
For example, in (Maldonado et al., 2018), trace data
from a MOOC was organized into six distinct learning
actions, such as “Video-Lecture Begin” and “Assessment
Pass,” with process mining used to identify
the most frequent sequences of these actions. By
comparing these sequences to SRL theories, the re-
searchers developed a pattern dictionary that mapped
the actions to SRL processes like elaboration, evalu-
ation, help-seeking, and task exploration. Similarly,
(Li et al., 2024) employed a theoretically informed
trace parser based on Bannert’s SRL framework, cat-
egorizing SRL processes into cognition, metacogni-
tion, and emotion. Their trace parser consisted of an
action library to convert raw trace data into learning
actions and a process library to label these actions
with specific SRL processes. Using Moodle logs,
(Cerezo et al., 2020) assessed students’ SRL skills
in an online Spanish undergraduate course by analyzing
21,629 events. During preprocessing, four key
attributes were selected (time, anonymized student
IDs, actions, and action details), and duplicates,
irrelevant records, and nonessential actions
like calendar checks were filtered out. The authors refined 42 default
Moodle actions into 16 SRL-relevant ones, grouped
into five categories (Planning, Learning, Executing,
Review, and Forum Peer Learning) aligned with Zimmerman’s
SRL model phases. This approach enabled
analysis of student behaviors in relation to SRL
theory, segmenting logs into pass/fail categories and
course units. In contrast to the authors’ approach of
directly mapping individual Moodle events to high-
level SRL categories, our method avoids this direct
mapping as it isolates events from the broader con-
text of the sub-processes they form. Instead, we first
transform the log to extract the sub-processes that a
specific resource undergoes with a particular student,
which collectively contribute to the full process of
the student’s journey through the course. These sub-
processes, enriched with their contextual information,
are then mapped to higher-level activities, guided by
Bloom’s taxonomy, which we found more relevant
given our focus on learning path recommendations
within our framework (Joudieh et al., 2023).
Summary and Proposed Contributions
In summary, Moodle logs provide rich data for edu-
cational analysis, offering insights into student behav-
ior, learning paths, and engagement to enhance the ed-
ucational process. However, the effectiveness of such
analyses depends greatly on the quality and richness
of Moodle log data. While existing research has ad-
vanced data preparation and abstraction techniques,
challenges specific to Moodle logs remain.
Our proposed tool addresses these by semantically
enriching Moodle logs, transforming raw files into
pedagogically meaningful event logs ready for pro-
cess mining (XES format) without manual interven-
tion or deep expertise in Moodle’s structure. This
enrichment enables process models to illustrate stu-
dents’ pedagogical learning paths rather than simple
material interactions. Beyond EPM, the tool supports
analyses such as studying learner behaviors, extract-
ing study patterns, developing learner profiles, and
examining learner performance and resource usage.
Thus, our work contributes to improving the qual-
ity and value of educational data, specifically Moodle
logs, enabling analyses that improve student learning
experiences and support educators in refining teach-
ing strategies and course design.
4 Moodle2EventLog
This section provides a detailed overview of
Moodle2EventLog, its core components, and their
functionalities. The overall architecture is illustrated in
Figure 1.
In a nutshell, the tool processes a log file down-
loaded from a Moodle course in Comma-Separated
Value (CSV) format. A configuration file (referred
to as ”Config File” in Figure 1) accompanies the log
and specifies metadata such as column names, file
structure, output location, language (currently, only
English is supported), time format, and enrollment
method (Moodle or external). Using this information,
the initial log file is processed in Module 1. The output
of this module is a clean, filtered, student-centered
Moodle event log with the key columns necessary for
process mining, as discussed in Section 2. While
Module 1 generates a well-structured event log, Mod-
ule 2 of the tool is dedicated to transforming and en-
riching this log with semantic information. The out-
put of Module 2 can be one or more files provided
in the XES standard, ready for use with any process
mining discovery algorithm, as well as in CSV format
for other types of analysis. The number of XES/CSV
files generated is determined by the configuration file,
because a distinct XES/CSV file is created for each of
the different activities. The following sections explain
these two modules in detail.
4.1 Module 1: Cleaning and
Preprocessing
This module relies on an understanding of Moodle logs
(their structure and the information they contain) to
clean them and prepare them as event logs. In this
module, two key pieces of information are extracted:
the Moodle IDs of users in the log and their roles.
As discussed in Section 2, a Moodle log includes all
users interacting with a course. However, to focus on
student behaviors and paths, the logs must be filtered
to include only student actions.
First, the Moodle ID attribute is added by pars-
ing the description field, which details user actions. If
the action is administrative (e.g., ”Item Created with
id...”), a dummy value is assigned to the Moodle ID.
For actions involving students (e.g., ”The user with
id 1...”), the student’s Moodle ID is extracted. While
Moodle logs can include multiple IDs for various re-
sources, this tool focuses exclusively on student ac-
tivities. To determine user roles, two methods are
used depending on the enrollment system. If enroll-
ment is managed externally, the log lacks role infor-
mation, so filtering is done using the ”User full name”
field, excluding non-students. If enrollment occurs
within Moodle, the ”Role assigned” event is parsed
(e.g., ”The user with id ’1’ assigned the role with id
’5’ to the user with id ’2’”) to identify student roles.
The Moodle IDs are matched with their correspond-
ing role IDs, filtering for id ’5’, which typically de-
notes students.
After filtering, the log includes only student inter-
actions, with the Moodle ID as the case id, uniquely
identifying each student as a process instance.
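
The Moodle-enrollment path of Module 1 can be sketched as below. This is a minimal illustration rather than the tool's implementation: the regular expressions follow the description patterns quoted above, while the dummy ID value, the function names, and the sample rows are hypothetical.

```python
import re

USER_ID = re.compile(r"^The user with id '?(\d+)'?")
ROLE_ASSIGNED = re.compile(
    r"The user with id '(\d+)' assigned the role with id '(\d+)' "
    r"to the user with id '(\d+)'"
)
STUDENT_ROLE_ID = "5"  # typically denotes students; may differ per site
DUMMY_ID = "-1"        # hypothetical placeholder for administrative events

def moodle_id(description):
    """Return the acting user's ID, or a dummy value for non-user events."""
    match = USER_ID.match(description)
    return match.group(1) if match else DUMMY_ID

def student_ids(rows):
    """Collect the IDs of users who were assigned the student role."""
    students = set()
    for row in rows:
        if row["Event name"] == "Role assigned":
            match = ROLE_ASSIGNED.search(row["Description"])
            if match and match.group(2) == STUDENT_ROLE_ID:
                students.add(match.group(3))
    return students

def to_student_log(rows):
    """Keep only student events and attach the Moodle ID as case_id."""
    students = student_ids(rows)
    kept = []
    for row in rows:
        case_id = moodle_id(row["Description"])
        if case_id in students:
            kept.append({**row, "case_id": case_id})
    return kept

rows = [  # hypothetical log rows
    {"Event name": "Role assigned",
     "Description": "The user with id '1' assigned the role with id '5' "
                    "to the user with id '2'."},
    {"Event name": "Course module viewed",
     "Description": "The user with id '2' viewed the 'resource' activity."},
    {"Event name": "Course module viewed",
     "Description": "The user with id '3' viewed the 'resource' activity."},
    {"Event name": "Item created",
     "Description": "Item created with id '7'."},
]
student_log = to_student_log(rows)
```

Here the role-assignment row is itself dropped, because its acting user (id 1) is not a student.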
4.2 Module 2: Transformation and
Enrichment
Structurally, the output of Module 1 is ready for use
as input for process mining algorithms and other anal-
yses. However, to move beyond low-level interaction
records, it is necessary to elevate and enrich these
logs to support higher-level analysis of the learning
process. To accomplish this, Module 2 maps Moo-
dle interaction events to the pedagogical actions a stu-
dent takes while learning. This approach focuses on
capturing how a student learns a course rather than
merely how they navigate it through Moodle, as indi-
cated by the Event Name in the original logs.
Moodle is composed of various components
where each can be viewed as a process instance with
its own sequence of events (Rotelli and Monreale,
2023). Typically, one would examine the process of a
specific component to understand its sequence of in-
teractions. However, in this case, the student is treated
as the process instance. We observe the student’s
learning process as a sequence of sub-processes, each
corresponding to interactions with different contexts
within Moodle. With this in mind, the enrichment
step is preceded by a transformation step for context-
aware grouping of events that represents these sub-
processes. As illustrated in Figure 2, events are
grouped by case id and event context, maintaining
the sequence across different contexts. For instance,
case id 1 interacts with event context c1 three times at
times t1, t2 and t6, interrupted by an interaction with
context c2 at time t3. Thus, the interaction of case id
1 with c1 is divided into two different groups, which
remain separate due to the intervening interaction with
another context. The timestamp of each group is then set
to the start time of its first event.
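
The context-aware grouping can be sketched with `itertools.groupby`, which merges only consecutive items sharing a key. Integer timestamps stand in for t1, t2, t3 and t6, and the tuple layout and event names are assumptions made for the example.

```python
from itertools import groupby

def group_by_context(events):
    """Group each case's consecutive same-context events.

    events: (case_id, context, event_name, timestamp) tuples.
    Returns (case_id, context, [event names], start_time) tuples.
    """
    groups = []
    # Order events per case and by time so that runs are contiguous.
    ordered = sorted(events, key=lambda e: (e[0], e[3]))
    for (case_id, context), run in groupby(ordered, key=lambda e: (e[0], e[1])):
        run = list(run)
        # The group takes the timestamp of its first event.
        groups.append((case_id, context, [e[2] for e in run], run[0][3]))
    return groups

events = [
    (1, "c1", "Course module viewed", 1),
    (1, "c1", "Course module viewed", 2),
    (1, "c2", "Quiz attempt started", 3),
    (1, "c1", "Course module viewed", 6),
]
groups = group_by_context(events)
```

The result contains three groups (c1 at time 1, c2 at time 3, c1 at time 6), so the two visits to c1 stay separate.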
In the enrichment step, each group of events is
mapped to one of the Semantic Activities outlined in
Table 1 by a rule-based approach explained in what
follows. An optional column, “Peda Activity”, can be
added and returned as a new XES/CSV file, combining
the semantic activity with the original event context
to provide a finer level of granularity, such as indicating
that “a student is studying lecture 1”. While
the enrichment considers the student as the case id, the
user can select other IDs as the case id for the
analysis. Currently, however, Module 1 of the tool
extracts only student IDs.
4.2.1 The Upbringing of “Semantic Activities”
Figure 1: Moodle2EventLog Architecture.
Figure 2: Event log transformation and enrichment with Semantic and Pedagogical Activities.

Benjamin Bloom’s Taxonomy of Learning Domains,
developed in 1956, categorizes learning into three domains:
cognitive, affective, and psychomotor. Bloom
views these domains as progressive, with learners
advancing through six stages in each domain as
their knowledge, attitudes, and skills evolve (Bloom,
1956). While the affective and psychomotor do-
mains are important, the cognitive domain—focused
on intellectual capabilities like knowledge and ’think-
ing’—is most measurable through digital interactions
in LMSs. For this reason, Bloom’s Taxonomy, a
framework organizing cognitive objectives into six hi-
erarchical levels, is crucial for analyzing student data
from Moodle logs. Bloom’s Taxonomy (Anderson
and Krathwohl, 2001) classifies cognitive processes
into Remembering, Understanding, Applying, Ana-
lyzing, Evaluating, and Creating, guiding educators
in defining learning outcomes (Fastiggi, 2019; Sha-
batura, 2022). By mapping Moodle logs to these cog-
nitive levels, teachers can see how students’ interac-
tions correspond to deeper cognitive processes, vali-
dating engagement with course material in alignment
with these objectives.
Using Bloom’s Revised Taxonomy, we mapped
Moodle activities to cognitive levels, as shown in Ta-
ble 1. Activities like viewing course resources corre-
spond to Remembering and Understanding (semantic
activity Study), while exercises align with Applying
and Analyzing (semantic activity Exercise). Higher-
order cognitive tasks such as ”Evaluating” and ”Cre-
ating” are translated into the semantic activities As-
sess and Synthesize, respectively, often associated
with project work or knowledge reflection and eval-
uation. Additional activities like View, Feedback,
and Interact occur during the learning process but
are not directly tied to Bloom’s levels, instead sup-
porting the overall learning progression. For activ-
ities like Study, Exercise and Assess we differenti-
ate between passive and active modes (denoted by
P and A, respectively). For example, downloading
lecture materials is considered a passive action, as it
does not guarantee completion, while submitting an
assignment is an active action, signifying completion.
By integrating Bloom’s framework into Moodle logs,
we transform the system-based interaction data into a
rich pedagogical tool. This allows instructors to not
just see what students are doing, but how their ac-
tions align with different stages of learning, providing
deeper insight into the cognitive progression of each
student and validating learning outcomes at both indi-
vidual and course levels.
4.2.2 Rule-Based Semantic Activity Extraction
After the transformation step, events with the same
context and case id are grouped into an eventList, as
previously explained. This eventList, along with the
context and component information recorded in the
Moodle logs, is used in the subsequent rule-based approach
(Table 2), which applies a set of rules to determine
the appropriate Semantic Activity for each contextualized
subprocess (eventList + component + context).

Table 1: Semantic Activities to Bloom’s Taxonomy.
Semantic Activity          | Bloom’s Level        | Meaning: Explanation
Study (Passive/Active)     | Remember, Understand | Course review: Acquiring and comprehending knowledge.
Exercise (Passive/Active)  | Apply, Analyze       | Practical work: Solving problems and applying knowledge.
Assess (Passive/Active)    | Evaluate             | Evaluation: Testing understanding and self-reflection.
Synthesize                 | Create               | Project/Practical: Applying knowledge in new contexts.
View                       | N/A                  | Exploration: Exploring course materials.
Feedback                   | N/A                  | Feedback: Receiving grades or comments.
Interact                   | N/A                  | Interaction: Participating in chats or forums.

To develop the mapping algorithm, we built our
knowledge of Moodle components and processes from its
official website, various studies (Rotelli and Monreale,
2023; Costa et al., 2020; Nammakhunt et al.,
2023), and a set of Moodle logs that we collected.
The algorithm consists of a series of rules
structured in an if-else format. Essentially, each group
of events associated with a case id, together with the
context and component, is translated into a semantic
activity. This translation is achieved by identifying
key events, parsing event contexts for relevant keywords,
and considering specific components.
To facilitate this process, we defined a set of keyword
dictionaries relevant to certain semantic activities,
such as “exercise,” “assess,” “interact,” “feedback,”
and “outline.” For instance, the “exercise” dictionary
includes terms like “lab”, “exercise”, “worksheet”,
“homework”, ..., while the “outline” dictionary
identifies terms like “outline”, “syllabus”,
“agenda”, “description”, ... to differentiate between
viewing course descriptions and actual studying.
Some components consistently lead to the same
semantic activity; for example, both Forum and Chat
are always mapped to the ”Interact” activity. How-
ever, certain event names, such as ”Course Module
Viewed,” can indicate different semantic activities de-
pending on the resource type, identified through key-
word searches. For instance, viewing an example
test file suggests a passive assessment activity, while
viewing a lab exercise sheet indicates a passive ex-
ercise activity. Joining a Zoom session, on the other
hand, is categorized as an active study activity. Events
that do not signify any pedagogical value are labeled
as ”Others” and can be filtered out later by instructors.
These semantic activities assist teachers in inter-
preting student behavior, identifying areas needing
support, and adjusting teaching strategies to better
meet student needs. For example, if students engage
in passive exercises (”Exercise P”) without making
active submissions (”Exercise A”), this may indicate
several potential issues: the course design might lack
sufficient opportunities for active engagement and
submission, students could be struggling to complete
the assignments, or additional study materials may be
necessary to help them grasp the concepts.
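
A check of this kind could be sketched as follows, assuming each student's enriched trace is available as a list of semantic activity labels. The function name and sample traces are hypothetical.

```python
from collections import Counter

def passive_only(traces, passive="Exercise P", active="Exercise A"):
    """Return students who viewed exercises but never submitted one."""
    flagged = []
    for student, activities in traces.items():
        counts = Counter(activities)
        if counts[passive] > 0 and counts[active] == 0:
            flagged.append(student)
    return sorted(flagged)

# Hypothetical enriched traces, one list of semantic activities per student.
traces = {
    "s1": ["Study P", "Exercise P", "Exercise A"],
    "s2": ["Exercise P", "Exercise P"],
    "s3": ["Study P"],
}
at_risk = passive_only(traces)
```

Only s2 is flagged here: s1 also submitted an exercise, and s3 never opened one.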
5 Moodle2EventLog IN ACTION
To illustrate the proposed semantic enrichment for
Moodle logs using the Moodle2EventLog tool, we
analyzed logs from two universities: 471 students
across 65 computer science courses at Frederick University,
Cyprus (2018–2022), and four log files from a
2022–2023 course at La Rochelle University, France.
Detailed results from Frederick University are pre-
sented in our previous work (Joudieh et al., 2024),
where we applied trace clustering on semantically en-
riched traces to demonstrate how semantic activities
can establish learner profiles from discovered process
models, focusing on a novel Trace Clustering Algo-
rithm. In this paper, we focus on a 2023 Process
Mining course for Master 1 students at La Rochelle
University, which lasted three months and involved
36 students. The course included three assignments,
a practical test and a course test, both conducted as
Multiple Choice Questions (MCQ) exams, and a final
project.
This section presents insights from the semantic
enrichment applied to this course, followed by ad-
vanced analyses, such as trace clustering and the ex-
traction of learner profiles using semantic activities.
These profiles are linked to students’ grades in each
cluster. Finally, we evaluate Moodle2EventLog by
comparing input and output logs and sharing instruc-
tor feedback on their experiences with the tool.
5.1 Insights Brought by the Semantic
Enrichment
This section explores the analyses possible with
the enriched Moodle logs produced by Moo-
dle2EventLog, focusing on model discovery, statisti-
cal analysis of semantic activities, and an examination
of study behavior using dotted charts.
Figure 3 presents the directly-follows graph discovered
using the original Moodle event names from
the cleaned event log generated by Module 1 of the
tool, whereas Figure 4 presents the graph discovered
using semantic activities. This comparison
highlights the primary advantage of our tool:
CSEDU 2025 - 17th International Conference on Computer Supported Education
Table 2: Rule-based Mapping Algorithm for Extracting Semantic Activities.
Input: eventList, context, component. Output: semantic activity (SA).
Define: exercise, assess, view, interact, feedback, outline, project
Rule      Condition and Semantic Activity Mapping
Rule 1    If 'Course activity completion updated' in eventList:
              If context contains exercise: SA = 'Exercise A'
              Else if context contains assess: SA = 'Assess A'
              Else if context contains interact: SA = 'Interact'
              Else: SA = 'Study A'
Rule 2    If component in {Assignment, File submissions}:
              If any of {A submission has been submitted, File uploaded,
              Submission updated} in eventList:
                  If context contains project: SA = 'Synthesize'
                  Else: SA = 'Exercise A'
              Else if 'Course module viewed' in eventList: SA = 'Exercise P'
              Else: SA = 'others submission viewing'
Rule 3    If component = 'Quiz':
              If any of {Quiz attempt started/submitted or updated} in eventList:
                  If context contains project: SA = 'Synthesize'
                  Else: SA = 'Assess A'
              Else if 'Course module viewed' in eventList: SA = 'Assess P'
              Else: SA = 'others quiz attempt viewing'
Rule 4    If context = 'other': SA = 'others'
Rule 5    If component in {Forum, Chat, Choice}: SA = 'Interact'
Rule 6    If component in {H5P Package, H5P}: SA = 'Study P'
Rule 7    If component in {Feedback, Overview report, User report}: SA = 'Feedback'
Rule 8    If component in {Scheduler, User tours}: SA = 'View'
Rule 9    If component = 'Lesson':
              If 'Question answered' in eventList: SA = 'Exercise A'
              Else if any of {Lesson started, Lesson resumed} in eventList: SA = 'Study A'
              Else if any of {Message viewed, Group message sent} in eventList: SA = 'Interact'
              Else if 'Question viewed' in eventList: SA = 'Exercise P'
              Else if context contains: exercise SA = 'Exercise A', assess SA = 'Assess A'
Rule 10   If any of {Course module viewed, Zip archive of folder downloaded} in eventList:
              If context contains: exercise SA = 'Exercise P', assess SA = 'Assess P',
              outline SA = 'View', feedback SA = 'Feedback', interact SA = 'Interact',
              otherwise SA = 'Study P'
Rule 11   If component = 'System':
              If any of {Badge listing viewed, User graded} in eventList: SA = 'Feedback'
              Else: SA = 'View'
Rule 12   If component = 'Zoom meeting':
              If 'Clicked join meeting button' in eventList: SA = 'Study A'
              Else: SA = 'Study P'
Rule 13   Else: SA = 'others'
it simplifies process models while providing a peda-
gogically relevant illustration of the underlying learn-
ing processes in the course. The analysis of the di-
rectly follows graph reveals several insights into stu-
dent learning behaviors. The most common learning
paths begin with Study P, followed by Exercise P and
Synthesize, indicating a typical sequence of learning,
practice, and assessment. Feedback loops between
Exercise P and Synthesize, as well as Assess A and
Assess P, suggest iterative cycles of practice and as-
sessment. Also, alternative paths in the graph imply
that students may take different routes based on indi-
vidual needs or preferences. Figure 5 illustrates the
frequency of each semantic activity, showing that the
two most prevalent activities are passive studying and
exercising, with exercising being the more frequent.
This is justified by the course structure, which in-
cludes two exercise sessions for each lecture. Figure 6
provides another perspective, allowing teachers to fo-
cus on specific activities and visualize student paths
through a dotted chart. Each line in the chart repre-
sents a student’s learning path filtered for the Study P
activity, with colors indicating different resources or
event contexts. Notably, the chart shows random ac-
cess behavior on December 4th and 5th, which teach-
ers confirmed coincided with the quiz period when
students review all course materials in preparation.
Analyses such as these can be offered to teachers by
integrating the tool with Moodle via a dashboard,
while researchers gain an effective means to apply
process mining and educational analysis to Moodle logs.
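The two analyses discussed above, the directly follows graph and the frequency of semantic activities, can be illustrated with a minimal sketch. The traces below are hypothetical and are not taken from the course log; the function names are ours and do not describe the tool's actual implementation.

```python
from collections import Counter

def directly_follows(traces):
    """Count how often activity a is immediately followed by activity b."""
    dfg = Counter()
    for trace in traces.values():
        for a, b in zip(trace, trace[1:]):
            dfg[(a, b)] += 1
    return dfg

def activity_frequencies(traces):
    """Count the occurrences of each semantic activity over all traces."""
    return Counter(act for trace in traces.values() for act in trace)

# Hypothetical enriched traces: one list of semantic activities per student.
traces = {
    "s1": ["Study P", "Exercise P", "Synthesize", "Exercise P"],
    "s2": ["Study P", "Exercise P", "Assess P", "Assess A"],
    "s3": ["Study P", "Study P", "Exercise P", "Synthesize"],
}

dfg = directly_follows(traces)
freq = activity_frequencies(traces)

for (a, b), n in dfg.most_common(3):
    print(f"{a} -> {b}: {n}")
```

Each key of `dfg` corresponds to one arc of a directly follows graph and its count to the arc frequency, while `freq` holds the kind of per-activity totals plotted in a frequency chart.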
5.2 Extracting Semantic Learner
Profiles
In this part, we present the results of an advanced
analysis to extract learner profiles from the enriched
logs. We apply trace clustering using the Improved
FSS encoding, as described in (Joudieh et al., 2024),
and Hierarchical Agglomerative Clustering (HAC)
Figure 3: Process Model Before Enrichment using Moodle
Event Name.
Figure 5: Frequency of Semantic Activities.
with Ward linkage, a bottom-up approach that merges
similar clusters. The dendrogram in Figure 7 suggests
a cut, indicated by the red line, which results in four
distinct clusters.
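To make the bottom-up merging concrete, the following is a minimal pure-Python sketch of agglomerative clustering with the Ward criterion. It is not the implementation used in this work: the two-dimensional vectors are hypothetical stand-ins for encoded traces (the Improved FSS encoding is not reproduced here), and stopping at `k` clusters plays the role of the dendrogram cut.

```python
from itertools import combinations

def centroid(points):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]

def ward_cost(a, b):
    """Increase in within-cluster sum of squares if clusters a and b merge."""
    ca, cb = centroid(a), centroid(b)
    dist2 = sum((x - y) ** 2 for x, y in zip(ca, cb))
    return len(a) * len(b) / (len(a) + len(b)) * dist2

def ward_clustering(vectors, k):
    """Start from singletons and merge the cheapest pair until k clusters remain."""
    clusters = [[i] for i in range(len(vectors))]
    while len(clusters) > k:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: ward_cost(
                [vectors[m] for m in clusters[ij[0]]],
                [vectors[m] for m in clusters[ij[1]]],
            ),
        )
        clusters[i] = clusters[i] + clusters[j]  # i < j, so deleting j is safe
        del clusters[j]
    return clusters

# Hypothetical trace vectors (illustrative values only).
profiles = [
    (0.48, 0.46), (0.47, 0.47),
    (0.56, 0.39), (0.57, 0.38),
    (0.42, 0.52), (0.43, 0.51),
    (0.52, 0.42), (0.53, 0.41),
]
clusters = ward_clustering(profiles, k=4)
print(clusters)
```

The sketch recomputes centroids at every step for readability; for real logs, a library implementation of hierarchical clustering would be the more practical choice.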
Table 3 provides a summary of the resulting clus-
ters and their profiles. For each cluster, it details the
number of traces, trace lengths, the relative frequency
of studying and exercising, as well as the average
theoretical (course grade) and practical (lab grade)
scores. The relative frequency of a semantic activity
is calculated as the proportion of its occurrences rel-
ative to the total activities within the cluster. Study-
ing and exercising were specifically chosen for anal-
ysis because other activities, such as assessment and
synthesis tasks (e.g., the group project), are typically
mandatory or collaborative, making them less reflec-
tive of individual differences.
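The relative frequency defined above can be sketched as follows. As an assumption of this sketch, active and passive variants (e.g., Study A and Study P) are grouped under their first word; the traces are hypothetical.

```python
from collections import Counter

def relative_frequencies(cluster_traces):
    """Share (in %) of each activity family among all events of a cluster."""
    # Assumption: 'Study A' and 'Study P' both count towards 'Study', etc.
    counts = Counter(
        act.split()[0] for trace in cluster_traces for act in trace
    )
    total = sum(counts.values())
    return {family: 100 * n / total for family, n in counts.items()}

# Hypothetical cluster containing two enriched traces.
cluster = [
    ["Study P", "Exercise P", "Exercise A", "Synthesize"],
    ["Study P", "Study A", "Exercise P", "Assess A"],
]
shares = relative_frequencies(cluster)
print(shares)
```

Because every event of the cluster is counted once, the shares of all families sum to 100%.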
Table 3: Resulting Cluster Analysis and Profiles.
                                        Cluster 0   Cluster 1   Cluster 2   Cluster 3
Number of Students                      12          5           8           11
Trace Length [Min-Max]                  [58-123]    [141-212]   [56-86]     [108-160]
Mean Lab Grade                          12.4        11.6        9.4         10.9
Mean Course Grade                       12          13.8        10          13.4
Relative Frequency of Exercise (in %)   46          42.1        51.4        39.3
Relative Frequency of Study (in %)      47.6        52.7        42.5        56.3
Analyzing the profiles of the clusters reveals dis-
tinct learning patterns and their impact on perfor-
mance. Cluster 0, the largest group, demonstrates
a balanced approach to studying and exercising, re-
sulting in relatively consistent grades across both the
course and lab exams. In contrast, Clusters 1 and 3
Figure 4: Process Model Using Semantic Activities.
Figure 6: Dotted Chart for Study P.
Figure 7: Dendrogram from HAC with ward linkage.
emphasize studying more than exercising, which
correlates with higher performance in the course MCQ
(mean course grades of 13.8 and 13.4). However, this
focus comes at the expense of their lab grades (11.6
and 10.9, against 12.4 for Cluster 0), with Cluster 3
scoring lower than Cluster 1. This difference may be
attributed to Cluster 1’s slightly higher engagement
in exercises (42.1% against 39.3%), which likely
supports better practical understanding.
Table 4: A Comparison between the Input and Output of
Moodle2EventLog.
                           Input Moodle Log   Output Moodle Event Log
Number of Cases            45                 36
Number of Event Classes    66                 6
Number of Events           19422              7188
Trace Length [Min-Max]     [7-1706]           [56-212]
Cluster 2, however, shows a different learning
behavior characterized by a lower emphasis on studying
relative to exercising. This suggests that students in
this cluster may tend to dive into exercises without
adequate preparation or understanding of the course
materials. Consequently, their grades in both MCQ
exams are the lowest of the four clusters (10 in the
course exam and 9.4 in the lab exam), indicating
potential gaps in understanding and knowledge
retention. Particularly concerning is the failing
grade in the lab exam, suggesting a lack of
foundational understanding or practical application
skills. This analysis highlights the importance of a
balanced approach to studying and exercising. A focus
on thorough preparation before engaging in exercises
appears to be crucial for
before engaging in exercises appears to be crucial for
achieving consistent performance in both course as-
sessments and lab exams. The use of semantic activ-
ities in this analysis provided a deeper understanding
of students’ performance by linking their learning ac-
tions to outcomes, which the raw Moodle event names
could not have achieved.
5.3 Evaluation of Moodle2EventLog
For evaluation, the tool was run on the course logs,
with "Semantic Activity" selected as the "activity"
attribute of the final output files.
5.3.1 Raw vs Enriched Event Logs
To evaluate Moodle2EventLog, we compared the input
and output files, as shown in Table 4. The number of
cases decreased from 45 to 36 because the tool filters
out non-student users, as confirmed by the course
instructors. Additionally, the reduction in event
classes from 66 (Event name in the input) to 6
(Semantic Activity in the output) demonstrates the
tool’s success in abstracting event-level details.
This abstraction is also evident in the reduced log
length (from 19,422 to 7,188 events) and in the
narrower range of trace lengths (from [7-1706] to
[56-212]). The time taken to process the input CSV
file and generate the XES output file for this course
log was 17 seconds.
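The quantities compared in Table 4 can be computed from any event log with a few lines of code. The sketch below assumes a log given as (case, activity) pairs; the two toy logs are hypothetical and only illustrate the before/after comparison.

```python
def summarize(events):
    """Summary statistics for a log given as (case_id, activity) pairs."""
    lengths = {}
    for case_id, _ in events:
        lengths[case_id] = lengths.get(case_id, 0) + 1
    return {
        "cases": len(lengths),
        "event_classes": len({activity for _, activity in events}),
        "events": len(events),
        "trace_length_min_max": (min(lengths.values()), max(lengths.values())),
    }

# Hypothetical raw log (Moodle event names, including a non-student user).
raw = [
    ("u1", "Course viewed"), ("u1", "Course module viewed"),
    ("u1", "Quiz attempt started"), ("u2", "Course viewed"),
    ("teacher", "Course viewed"),
]
# Hypothetical enriched log (semantic activities, students only).
enriched = [("u1", "Study P"), ("u1", "Assess A"), ("u2", "Study P")]

before, after = summarize(raw), summarize(enriched)
print(before)
print(after)
```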
5.3.2 Instructors’ Feedback
To evaluate the effectiveness of Moodle2EventLog
and its alignment with our research questions, feed-
back was collected via a questionnaire from five in-
structors of the Process Mining course who used the
tool to analyze their Moodle course log data. The
evaluation consisted of seven questions answered
using a 5-point Likert scale, where instructors rated
various aspects of the tool from 1 (strongly disagree)
to 5 (strongly agree). Additionally, three open-ended
questions allowed instructors to reflect on their
experiences with the tool (the questionnaire can be
accessed via this link). The results of the Likert scale questions,
summarized in Table 5, provide insights into the tool’s
strengths and highlight areas for further improvement.
The tool’s ability to help instructors identify new
patterns in student behavior received an average score
of 4.2 (SD = 0.83), indicating that the categorization
of low-level events into semantic activities revealed
insights that were not previously visible to instructors.
This is a key finding relevant to RQ2, as it highlights
the significant impact of semantic enrichment on the
analysis of student interactions with course content,
making the learning process more understandable.
Regarding the interpretability of the enriched
event logs, the tool was rated highly, with a score
of 4.4 (SD = 0.54). This further emphasizes how
the semantic categorization simplifies the logs, mak-
ing them easier for instructors to use in their anal-
ysis, thereby supporting RQ2 and RQ3. However,
when asked about the tool’s ability to influence teach-
ing strategies, instructors provided a lower rating of
3.2 (SD = 0.83). While the tool provided valuable
insights, further refinement is needed to ensure that
these insights are actionable enough to directly inform
instructional design, as explored in RQ3.
In terms of accuracy, the tool received a score of
3.6 (SD = 0.54) for capturing how students interacted
with course materials, suggesting some room for im-
provement. Nonetheless, the process models gener-
ated from the semantic activities were rated as ped-
agogically relevant (4.2, SD = 0.83) and simplified
(4.4, SD = 0.89), indicating that the semantic activi-
ties facilitated a clearer and more useful understand-
ing of student behavior. This reinforces the impor-
tance of log enrichment for generating meaningful
pedagogical models, as discussed in RQ2 and RQ3.
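For readers who wish to reproduce this kind of summary on their own questionnaire data, the sketch below computes the mean and standard deviation per question. The ratings are hypothetical, since individual responses are not reported here, and the use of the sample standard deviation (n - 1 in the denominator) is an assumption of this sketch.

```python
from statistics import mean, stdev

# Hypothetical 5-point Likert ratings from five respondents per question.
ratings = {
    "question_a": [4, 4, 5, 5, 3],
    "question_b": [4, 5, 4, 5, 4],
}

# Mean rounded to one decimal, sample standard deviation to two decimals.
summary = {
    question: (round(mean(values), 1), round(stdev(values), 2))
    for question, values in ratings.items()
}

for question, (avg, sd) in summary.items():
    print(f"{question}: mean = {avg}, SD = {sd}")
```

With only five respondents, the choice between the sample and the population standard deviation changes the reported value noticeably, so it is worth stating which one is used.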
Table 5: Instructors’ Feedback Likert Scale Results.
Evaluation                              Mean   Standard Deviation (SD)
Reflecting Learning Process             4.0    0.70
Identifying New Patterns                4.2    0.83
Interpretability of Event Logs          4.4    0.54
Influencing Teaching Strategies         3.2    0.83
Accuracy of Captured Data               3.6    0.54
Process Models Pedagogical Relevance    4.2    0.83
Process Models Simplicity               4.4    0.89
In addition to the structured feedback, instructors
provided valuable insights through open-ended
questions regarding the tool’s functionality. When
asked for suggestions on enhancing the tool’s ability
to provide more meaningful insights, instructors
emphasized the need for more detailed statistical data
on user learning sequences, such as the duration
between events and time of day. They also highlighted
the importance of refining semantic activities to make
them more precise and meaningful. For instance,
distinguishing between different types of "study"
activities (such as passive study related to reading
lectures versus exploring external resources) could
significantly improve clarity. Furthermore, instructors noted
the potential for enriched logs to provide insights
into students’ engagement with course materials from
other courses, which could influence their current as-
signments. This feedback suggests that while the cur-
rent semantic activities enrich the models, there is a
need for further refinement to better capture indicators
of student engagement. Regarding expectations for
additional analyses from enriched Moodle logs, in-
structors expressed interest in tracking the time spent
working on courses outside of class and linking com-
pleted work to learner progress through assessments.
In summary, the instructor feedback shows that
Moodle2EventLog is a valuable tool for enriching
Moodle logs, making them more pedagogically
relevant and improving their interpretability for
learning analytics and process mining. While the tool has
demonstrated strong results in helping instructors un-
cover new insights into student behavior and gener-
ating simplified models of learning processes, addi-
tional work is required to ensure that these insights
can be integrated into instructional design and teach-
ing strategies.
6 CONCLUSION
Moodle generates extensive data invaluable for an-
alyzing student behavior, learning profiles, and en-
gagement. This data helps educators refine course de-
sign and supports students in improving their learning
experiences. While previous studies have examined
aspects like quiz behavior and resource usage, chal-
lenges with data quality, cleaning, and preparation
persist. The granularity of Moodle data further com-
plicates extracting meaningful insights. To address
these issues, we developed Moodle2EventLog, a tool
that converts raw Moodle logs into student-centered
event logs optimized for process mining. Validated
across multiple courses, the tool cleans and trans-
forms logs by filtering non-student activities and ag-
gregating detailed events into semantic activities, en-
abling clearer analysis and insightful visualizations of
learning patterns.
Despite its benefits, Moodle2EventLog has lim-
itations. The tool could be enhanced by expand-
ing language support, integrating natural language
processing for advanced semantic extraction, and
adding features for analyzing additional Moodle re-
sources. While our research focused on computer sci-
ence courses, adaptations may be needed for other
domains. Improving the user interface would also
increase accessibility for educators and researchers.
Additionally, Moodle’s logging limitations, such as
inadequate recording of H5P packages and lessons,
may lead to misinterpretations of student engagement,
and certain user interactions are only captured in site
logs, not course logs.
Feedback from instructors indicates that while the
semantic enrichment of Moodle logs yields meaning-
ful pedagogical insights and simplifies process model
interpretation, there is a demand for more granular in-
sights—particularly regarding student transitions be-
tween learning activities. Future iterations of Moo-
dle2EventLog will prioritize refining these aspects to
enhance clarity and educational relevance. In con-
clusion, our work demonstrates that by transforming
and enriching Moodle logs, we can significantly im-
prove their pedagogical relevance, ultimately support-
ing more effective learning analytics and process min-
ing applications.
REFERENCES
Anderson, L. W. and Krathwohl, D. R. (2001). A taxon-
omy for learning, teaching, and assessing: A revision
of Bloom’s taxonomy of educational objectives: com-
plete edition. Addison Wesley Longman, Inc.
Aulia, D. and Waspada, I. (2019). The design of exploratory
application and preprocessing of event log data in lms
moodle-based online learning activities for process
mining. Khazanah Informatika: Jurnal Ilmu Kom-
puter dan Informatika, 5:124–133.
Bey, A. and Champagnat, R. (2022). Analyzing Student
Programming Paths using Clustering and Process
Mining. In Proceedings of the 14th Int. Conf.
on Computer Supported Edu., pages 76–84.
Bloom, B. (1956). A taxonomy of cognitive objectives. New
York: McKay.
Bogarin, A., Romero, C., Cerezo, R., and Sanchez-
Santillan, M. (2014). Clustering for improving educa-
tional process mining. In Proceedings of the 4th Int.
Conf. on Learning Analytics And Knowledge, pages
11–15.
Bogarín, A., Cerezo, R., and Romero, C. (2017). A survey
on educational process mining. Wiley Interdisciplinary
Reviews: Data Mining and Knowledge Discovery, 8.
Cenka, N. and Anggun, B. (2022). Analysing student be-
haviour in a learning management system using a pro-
cess mining approach. Knowledge Management & E-
Learning: An Int. Journal, 14:62–80.
Cerezo, R., Bogarín, A., Esteban, M., and Romero, C.
(2020). Process mining for self-regulated learning
assessment in e-learning. Journal of Computing in
Higher Edu., 32(1):74–88.
Cole, J. and Foster, H. (2007). Using Moodle: Teaching
with the popular open source course management system.
O’Reilly Media, Inc.
Costa, J., Azevedo, A., and Rodrigues, L. (2020). Edu-
cational process mining based on moodle courses: a
review of literature. In 20th Conf. of the Portuguese
Association for Information Systems (CAPSI).
Etinger, D., Orehovacki, T., and Babic, S. (2018). Ap-
plying process mining techniques to learning manage-
ment systems for educational process model discovery
and analysis. In Intelligent Human Systems Integra-
tion (IHSI 2018), pages 420–425. Springer.
Fastiggi, W. (2019). Applying bloom’s taxonomy to the
classroom. Technology for Learners.
Gašević, D., Dawson, S., and Siemens, G. (2015). Let’s
not forget: Learning analytics are about learning.
TechTrends, 59:64–71.
Joudieh, N., Eteokleous, N., Champagnat, R., Rabah, M.,
and Nowakowski, S. (2023). Employing a process
mining approach to recommend personalized adap-
tive learning paths in blended-learning environments.
In 12th Int. Conf. in Open and Distance Learning,
Athens, Greece, volume 12.
Joudieh, N., Trabelsi, M., Champagnat, R., Rabah, M., and
Eteokleous, N. (2024). Using trace clustering to group
learning scenarios: An adaptation of fss-encoding to
moodle logs use case. In Proceedings of the 16th Int.
Conf. on Computer Supported Edu., pages 247–254.
Juhaňák, L., Zounek, J., and Rohlíková, L. (2019). Using
process mining to analyze students’ quiz-taking
behavior patterns in a learning management system.
Computers in Human Behavior, 92:496–506.
Li, T., Fan, Y., Srivastava, N., Zeng, Z., Li, X., Khosravi, H.,
Tsai, Y.-S., Swiecki, Z., and Gašević, D. (2024). Analytics
of planning behaviours in self-regulated learning:
Links with strategy use and prior knowledge.
In Proceedings of the 14th Learning Analytics and
Knowledge Conf., pages 438–449.
Maldonado, J., Pérez-Sanagustín, M., Kizilcec, R. F.,
Morales, N., and Munoz-Gama, J. (2018). Mining
theory-based patterns from big data: Identifying self-
regulated learning strategies in massive open online
courses. Computers in Human Behavior, 80:179–196.
Nammakhunt, A., Porouhan, P., and Premchaiswadi, W.
(2023). Creating and collecting e-learning event logs
to analyze learning behavior of students through pro-
cess mining. Int. Journal of Information and Edu.
Technology, 13(2):211–222.
Osakwe, I., Chen, G., Fan, Y., Rakovic, M., Singh, S.,
Molenaar, I., and Gašević, D. (2024). Measurement of
self-regulated learning: Strategies for mapping trace
data to learning processes and downstream analysis
implications. In Proceedings of the 14th Learning
Analytics and Knowledge Conf., pages 563–575.
Real, E. M., Pimentel, E. P., and Braga, J. C. (2021). Anal-
ysis of learning behavior in a programming course us-
ing process mining and sequential pattern mining. In
2021 IEEE Frontiers in Edu. Conf. (FIE), pages 1–9.
Reimann, P., Markauskaite, L., and Bannert, M. (2014).
e-research and learning theory: What do sequence and
process mining methods contribute? British Journal
of Educational Technology, 45(3):528–540.
Romero, C., Cerezo, R., Bogarín, A., and Sánchez-Santillán,
M. (2016). Educational Process Mining:
A Tutorial and Case Study Using Moodle Data Sets,
chapter 1, pages 1–28. John Wiley and Sons, Ltd.
Rotelli, D. and Monreale, A. (2023). Processing and under-
standing moodle log data and their temporal dimen-
sion. Journal of Learning Analytics, 10:126–141.
Shabatura, J. (2022). Using bloom’s taxonomy to write ef-
fective learning outcomes. University of Arkansas.
Song, Y., Oliveira, E., Kirley, M., and Thompson, P. (2024).
A case study on university student online learning pat-
terns across multidisciplinary subjects. In Proceed-
ings of the 14th Learning Analytics and Knowledge
Conf., pages 936–942.
Suriadi, S., Andrews, R., ter Hofstede, A., and Wynn, M.
(2017). Event log imperfection patterns for process
mining: Towards a systematic approach to cleaning
event logs. Information Systems, 64:132–150.
Umer, R., Susnjak, T., Mathrani, A., and Suriadi, S.
(2022). Data quality challenges in educational process
mining: Building process-oriented event logs from
process-unaware online learning systems. Int. Journal
of Business Information Systems, 39:569–592.
Van der Aalst, W. (2016). Process mining: data science in
action. Springer.
Wafda, F., Usagawa, T., and Mahendrawathi, E. (2022).
Systematic literature review on process mining in
learning management system. IEEE Int. Conf. on In-
dustry 4.0, Artificial Intelligence, and Communica-
tions Technology (IAICT), pages 160–166.
Yu, H., Harper, S., and Vigo, M. (2021). Modeling micro-
interactions in self-regulated learning: A data-driven
methodology. Int. Journal of Human-Computer Stud-
ies, 151:102625.
Álvarez, P., Fabra, J., Hernández, S., and Ezpeleta, J.
(2016). Alignment of teacher’s plan and students’ use
of lms resources: Analysis of moodle logs. In 15th
Int. Conf. on Information Technology Based Higher
Education and Training (ITHET), pages 1–8.