Evaluating a GPT-4 and Retrieval-Augmented Generation-Based
Conversational Agent to Enhance Learning Experience in a MOOC
Fatma Miladi¹, Valéry Psyché¹, Awa Diattara², Nour El Mawas³ and Daniel Lemire¹
¹TELUQ University, 5800 rue Saint-Denis, Montreal, QC H2S 3L5, Canada
²LANI, Gaston Berger University, Saint-Louis, Senegal
³Université de Lorraine, Crem, F-57000 Metz, France
{fatma.miladi, valery.psyche, daniel.lemire}@teluq.ca, awa.diattara@ugb.edu.sn, nour.el-mawas@univ-lorraine.fr
Keywords:
Conversational Agent, Educational Chatbot, Generative AI, Retrieval-Augmented Generation (RAG), Massive
Open Online Course (MOOC), Artificial Intelligence in Education.
Abstract:
Massive Open Online Courses (MOOCs) face significant challenges due to low completion rates, primarily
caused by insufficient personalized support for learners. To address this, we developed a pedagogical AI-
powered conversational agent enhanced with Retrieval-Augmented Generation (RAG) to provide real-time,
contextually relevant support. Our evaluation with 25 learners demonstrated a statistically significant knowl-
edge gain in the experimental group compared to the control group. Additionally, the agent achieved a high
System Usability Scale (SUS) score. These findings highlight the potential of AI technologies to enhance
online learning environments and inform future research on their role as learning companions in distance ed-
ucation.
1 INTRODUCTION
Massive Open Online Courses (MOOCs) allow stu-
dents worldwide to learn at their own pace and on
flexible schedules. This flexibility has contributed
to the rapid growth in the popularity of MOOCs.
However, despite high enrollment rates, the comple-
tion rate of MOOCs remains low. On average, less
than 10% of learners complete a MOOC (Yin et al.,
2019), raising concerns about the effectiveness of
these courses in terms of learner retention and suc-
cess. One of the key challenges contributing to these
low completion rates is the lack of personalized sup-
port during the course, which is crucial for learner
retention and success.
A significant issue is the lack of instructor feed-
back in online courses, which leaves learners without
the guidance they need to stay motivated and engaged
in their learning. This absence of direct interaction,
combined with limited opportunities for teamwork or
group interaction, contributes to learner demotivation
and lower retention rates (Hone and El Said, 2016).
Although MOOCs typically include features such
as discussion forums to facilitate social interaction
among learners, participation remains low, with only
5% to 12% of learners actively engaging in these dis-
cussions (Chiu and Hew, 2018). Additionally, the in-
structor’s involvement in these forums is often min-
imal, leaving many learners without timely support.
This challenge is further complicated by the fact that
many participants feel unsure how to initiate mean-
ingful conversations and may be hesitant or shy to en-
gage.
Generative Artificial Intelligence (GAI) has
emerged as a promising solution to these challenges.
Specifically, models based on Generative Pre-trained
Transformers (GPTs) leverage vast amounts of data to
generate human-like text responses. These technolo-
gies are increasingly being used in various settings,
including education (Adeshola and Adepoju, 2024;
Mariani et al., 2023). However, despite its potential,
research on the application of Generative AI in edu-
cation, particularly in the context of MOOCs, is still
in its early stages (Chiu, 2024).
To bridge this gap, we designed and implemented
a pedagogical conversational agent leveraging GPT
with Retrieval-Augmented Generation (RAG). This
integration enables the agent to deliver contextually
accurate and course-specific responses by retrieving
information from a database of documents used in
the course design. This capability aims to enhance
knowledge acquisition and foster a supportive learn-
ing environment by providing relevant and precise in-
formation in real time. Specifically, we address the
following research questions:
RQ1: Does the use of a GPT- and RAG-enhanced
conversational agent by learners in the MOOC
affect their knowledge acquisition?
RQ2: Can a conversational agent enhanced by
GPT and RAG fulfill learners’ expectations in terms
of usability?
This paper is structured as follows: Section 2
provides a literature review on chatbots powered by
LLMs and RAG in education. Section 3 presents the
design of the conversational agent. Section 4 presents
the research methodology. Section 5 details the re-
sults of the quantitative analyses. Finally, Section 6
discusses the findings, and Section 7 concludes the
paper with implications for future research.
2 LITERATURE REVIEW
This section provides an overview of LLM-based
conversational agents in education, highlighting their
benefits and challenges. It then introduces the RAG
approach and examines how it improves the fac-
tual accuracy and contextual relevance of chatbot re-
sponses in educational settings.
2.1 LLM-Based Conversational Agents
in Education
The emergence of LLMs, such as ChatGPT, has sig-
nificantly enhanced educational tools by providing
richer, more adaptive interactions tailored to diverse
learner needs. Abdelghani et al. (2022) demonstrated
that GPT-3 fosters critical thinking in children by gen-
erating learning hints, which stimulate curiosity and
improve knowledge retention. Similarly, Xie et al.
(2024) found that LLM-based chatbots enhance au-
tonomy for learners seeking social interaction. How-
ever, for those focused on knowledge acquisition, fre-
quent interactions may reduce autonomy. This high-
lights the need to balance emotional support and cog-
nitive guidance for effective learning.
Despite these advantages, LLMs face challenges,
particularly their tendency to generate incorrect or bi-
ased information, known as hallucinations (Ji et al.,
2023). In educational settings, such errors can mis-
lead learners and compromise learning quality. To
mitigate this issue, Retrieval-Augmented Generation
can improve accuracy by retrieving relevant external
information, reducing hallucinations, and enhancing
response reliability (Shuster et al., 2021).
2.2 Theoretical Foundations of RAG
RAG, introduced by Lewis et al. (2020), enhances
LLM reliability by integrating external knowledge re-
trieval into the generation process. It follows three
main stages: indexing, retrieval, and generation (Gao
et al., 2023).
In the indexing stage, text from various sources is
processed and transformed into numerical vector rep-
resentations using an embedding model. These vec-
tors encode the semantic meaning of the text, enabling
the system to efficiently organize and store informa-
tion in a database for retrieval.
The retrieval stage begins when a user submits a
query. The system converts the query into a vector
representation using the same embedding model ap-
plied during indexing. It then compares this vector
with stored vectors, identifying the most relevant text
sections based on similarity scores.
In the generation stage, the retrieved text sections
are combined with the user’s query to form a context-
enriched prompt. This prompt is then processed by an
LLM, which generates a response that is more accu-
rate and contextually relevant.
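To make these three stages concrete, the following is a minimal, self-contained sketch in Python. It is illustrative only: the embed function is a word-count stand-in for a real embedding model, and the passages and query are invented examples rather than material from the MOOC.

```python
# Toy illustration of the three RAG stages; a real system would use a learned
# embedding model rather than the bag-of-words vectors used here.
from collections import Counter
import math

def embed(text):
    """Stand-in embedding: a bag-of-words vector (real systems use neural embeddings)."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

# Stage 1 - indexing: encode and store each course passage.
passages = [
    "Symbolic AI represents knowledge with explicit rules and logic.",
    "Connectionist AI learns patterns with artificial neural networks.",
]
index = [(p, embed(p)) for p in passages]

# Stage 2 - retrieval: encode the query with the same model and rank by similarity.
query = "How does connectionist AI work?"
best_passage, _ = max(index, key=lambda item: cosine(embed(query), item[1]))

# Stage 3 - generation: build a context-enriched prompt for the LLM.
prompt = f"Context:\n{best_passage}\n\nQuestion: {query}\nAnswer using the context above."
print(prompt)  # This prompt would then be passed to the generator (e.g., GPT-4).
```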
2.3 RAG-Based Conversational Agents
in Education
Recent advancements in RAG have shown a promis-
ing ability to improve the accuracy and relevance of
chatbot responses in education. Taneja et al. (2024)
introduced Jill Watson, a virtual teaching assistant
that uses RAG to retrieve relevant course materials,
thereby reducing hallucinations and enhancing re-
sponse quality. The study compared Jill Watson to
virtual assistants not enhanced by RAG, demonstrat-
ing a clear improvement in response quality and a re-
duction in errors. Similarly, Yan et al. (2024) demon-
strated how the chatbot VizChat uses RAG to enhance
learning analytics dashboards, providing accurate and
transparent explanations of visual data, reducing er-
rors, and improving user comprehension. Likewise,
Liu et al. (2024) developed the CS50 Duck, a GPT-4-
based conversational agent enhanced with RAG to
support students in the CS50 course. It outperformed Chat-
GPT alone by providing more accurate and course-
relevant responses. In parallel, Wang et al. (2023)
developed ChatEd, a conversational agent for higher
education that combines contextual information re-
trieval with ChatGPT. Its evaluation focused on rel-
evance, accuracy, and usefulness. Compared to Chat-
GPT alone, ChatEd performed better on these criteria
by leveraging a contextual database to align responses
with course content. Likewise, Miladi et al. (2024)
examined the impact of RAG integration in GPT-4
and GPT-3.5 on response accuracy in an AI MOOC.
Their findings showed that RAG-enhanced models
outperformed their standard counterparts.
However, despite these promising advancements,
current research primarily focuses on technical met-
rics such as accuracy, contextual relevance, and re-
sponse clarity. These studies often overlook an
in-depth exploration of the direct impact of RAG-
enhanced language models on learning in real edu-
cational environments, such as MOOCs. Our study
addresses this gap by evaluating the effect of a RAG-
enhanced agent on learners’ knowledge acquisition
and usability.
3 MODEL DESIGN
We designed a conversational agent model based on
the RAG technique (Gao et al., 2023) integrated with
GPT-4. The model aims to enhance user interac-
tion by combining the retrieval of relevant informa-
tion from a specialized database with the generative
capabilities of large language models. Figure 1 il-
lustrates the architecture of our GPT-RAG conversa-
tional agent, which consists of seven key stages.
1. Collection and Standardization of Documents
(Figure 1 (a)). We extracted documents from the
MOOC on artificial intelligence (Psyché, 2020)
as the primary source of information, including
explanatory texts, video transcripts, and tables.
These sources were converted into a uniform plain
text format to ensure consistency for further pro-
cessing.
2. Document Segmentation (Figure 1 (b)). The pre-
processed documents were divided into smaller
segments using Langchain’s recursive character-
based text splitter. Each segment was set to 2000
characters with a 200-character overlap to main-
tain context, following the parameters defined by
Aymeric Roucher (https://huggingface.co/learn/cookbook/en/rag_evaluation); the code sketch after this list uses the same parameters.
3. Embedding Model (Figure 1 (c)). The seg-
mented text was transformed into numerical rep-
resentations, called embeddings, using OpenAI’s
text-embedding-ada-002 model (Neelakantan
et al., 2022). These embeddings capture the
meaning of the text, allowing the system to find
relevant information based on similarity in mean-
ing rather than just matching words.
4. Knowledge Base (Figure 1 (d)). The generated
embeddings were stored in a structured knowl-
edge base. This enables the system to retrieve rel-
evant information efficiently when a learner asks
a question.
5. Query Processing (Figure 1 (e)). When a learner
submits a question, it is transformed into an em-
bedding vector using the same embedding model
as in stage (c). This transformation allows the sys-
tem to compare the meaning of the question with
the stored information in the Knowledge Base,
even if the exact words do not match.
6. Semantic Search (Figure 1 (f)). The system com-
pares the numerical representation of the question
with the stored vectors using cosine similarity (Vi-
jaymeena and Kavitha, 2016). It then selects the
three most relevant text segments to provide con-
text for generating a response.
7. Enriched Prompt and Response Generation
with GPT-4 (Figure 1 (g)). The selected text seg-
ments are combined with the original question to
create an enriched prompt, which is then sent to
GPT-4. This ensures that the response is based on
reliable sources, which can help reduce errors and
enhance accuracy and contextual relevance.
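As a rough illustration of how stages (b) through (g) could be wired together, the sketch below uses LangChain and the OpenAI API with the parameters stated above (2,000-character chunks, 200-character overlap, text-embedding-ada-002, top-3 retrieval, GPT-4). It is a hedged reconstruction, not the project's actual code: the FAISS vector store, the input file name, and the prompt wording are assumptions.

```python
# Hedged sketch of stages (b)-(g) of the pipeline described above.
# Assumptions (not stated in the paper): FAISS as the vector store, the input
# file name, and the exact prompt wording.
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_openai import OpenAIEmbeddings, ChatOpenAI
from langchain_community.vectorstores import FAISS

# (b) Segmentation: 2000-character chunks with a 200-character overlap.
splitter = RecursiveCharacterTextSplitter(chunk_size=2000, chunk_overlap=200)
chunks = splitter.split_text(open("module1_course_material.txt", encoding="utf-8").read())

# (c)-(d) Embedding and knowledge base. text-embedding-ada-002 vectors are
# unit-normalized, so FAISS's default distance ranking matches cosine ranking.
embeddings = OpenAIEmbeddings(model="text-embedding-ada-002")
knowledge_base = FAISS.from_texts(chunks, embeddings)

# (e)-(f) Query embedding and semantic search for the three most similar chunks.
question = "What distinguishes symbolic AI from connectionist AI?"
context_docs = knowledge_base.similarity_search(question, k=3)

# (g) Enriched prompt and response generation with GPT-4.
context = "\n\n".join(doc.page_content for doc in context_docs)
prompt = ("Answer the learner's question using only the course excerpts below.\n\n"
          f"Excerpts:\n{context}\n\nQuestion: {question}")
answer = ChatOpenAI(model="gpt-4").invoke(prompt)
print(answer.content)
```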
4 RESEARCH METHODOLOGY
Our research is based on a MOOC focused on artifi-
cial intelligence (Psych
´
e, 2020). The course is struc-
tured into four modules, each covering different as-
pects of AI: general AI concepts, symbolic AI, con-
nectionist AI, and AI applications in education. This
study concentrates specifically on the first module.
We employed a quantitative data collection tech-
nique to address the research questions. Data were
gathered through questionnaires and analysed using
descriptive statistics to answer RQ1 and RQ2. This
approach was selected to provide a clear overview of
the data and support the analysis of experimental out-
comes, thereby improving the study’s replicability.
Ethical considerations were a key aspect of this
study. To ensure data privacy, access to collected data
was restricted to authorized personnel only. All par-
ticipants provided informed consent, and the study
received approval from TELUQ University’s Ethics
Committee (approval no. 10/2023).
4.1 Research Participants
The present study involved a sample of master’s and
bachelor’s degree students in Informatics at a public
university in Senegal. Initially, there were 42 students
in total, but 17 students did not complete the exper-
iment for personal reasons. Consequently, the final
number of research participants was 25. These par-
ticipants were randomly divided into a control group
(CG) (n=12; four females and eight males) and an ex-
perimental group (EG) (n=13; five females and eight
males), with participants’ ages ranging from 19 to 23.
4.2 Research Procedures
At the beginning of the study, students from both the
CG and EG completed a pre-test to assess their under-
standing of artificial intelligence concepts. The ex-
perimental group watched a short tutorial on the con-
versational agent before using it in Module 1 of the
AI MOOC. In contrast, the control group completed
the same module without access to the conversational
agent.
All participants worked individually and au-
tonomously at their own pace, with three days to com-
plete the task. To ensure timely completion, email re-
minders were sent on the second day to those who had
not yet finished.
At the end of the experiment, all participants took
a post-test to evaluate whether the chatbot signifi-
cantly enhanced their knowledge acquisition. Addi-
tionally, participants in the experimental group com-
pleted a System Usability Scale (SUS) questionnaire
to assess the chatbot’s usability.
The experimental procedure is illustrated in Fig-
ure 2, which provides a simplified overview of the key steps in
the study. This figure highlights the sequence of activ-
ities, including pre-tests, post-tests, and the usability
questionnaire conducted with the experimental group.
4.3 Research Instruments
The study employed various instruments to assess
participants’ knowledge acquisition and chatbot us-
ability. To evaluate learners’ understanding of AI
in this MOOC, both groups completed a pre-test be-
fore the experiment and a post-test after Module 1 to
measure knowledge acquisition. The tests included
single-choice and short-answer questions, covering
the same concepts to ensure consistency. The results
helped address RQ1.
The System Usability Scale (SUS) (Brooke, 1996)
was chosen for its simplicity, shortness, and reliabil-
ity, even with a small sample size (Tullis and Stetson,
2004). The SUS consists of 10 statements, each rated
on a 5-point Likert scale from “Strongly Disagree”
(1 point) to “Strongly Agree” (5 points), producing
a single usability score between 0 and 100. Higher
scores indicate better usability. Odd-numbered state-
ments reflect positive attitudes, while even-numbered
statements reflect negative perceptions of the system.
Responses to the SUS questionnaire were collected
from 13 learners in the experimental group, who in-
teracted with the conversational agent. This data pro-
vided insights to answer RQ2.
5 RESULTS
This section presents the quantitative analysis of the
chatbot’s impact on knowledge acquisition and us-
ability. Knowledge acquisition was measured through
pre- and post-tests, while the chatbot’s usability was
evaluated using the SUS.
5.1 Knowledge Gain Results
To evaluate knowledge acquisition in both the con-
trol and experimental groups, pre- and post-test as-
sessments were conducted. The results, illustrated in
Figure 3, show the percentage of knowledge gained
by both groups. Initially, their average pre-test
scores were similar (72%), indicating comparable
prior knowledge levels.
After the learning activity, the experimental
group, which used the chatbot, showed a knowledge
gain of 17 percentage points, while the control group,
without the chatbot, gained 10 percentage points. These
results indicate that the chatbot had a positive effect
on knowledge acquisition.
Statistical analysis confirmed these findings. Both
groups showed improvement in their post-test scores,
but the experimental group exhibited a more sub-
stantial increase. The statistical analyses of pre-
test scores confirm that the control and experimen-
tal groups follow a normal distribution (Shapiro-Wilk
test, p > 0.05) and have homogeneous variances
(Levene’s test, p > 0.05). These conditions allow for
the application of a Student’s t-test, which is appro-
priate for comparing the means of two independent
groups when distributions are normal and variances
are equivalent.
The t-test revealed no significant difference be-
tween the pre-test scores of the two groups (p =
0.99 > 0.05), indicating that both groups had similar
levels of knowledge before the experiment (Table 1a).
However, a significant difference was observed in
the post-test scores (p = 0.017 < 0.05), indicating that
the conversational agent enhanced knowledge acqui-
sition (Table 1b). The effect size was large (d = 1.02),
indicating a substantial difference between the two
groups.
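For transparency, the following sketch shows how this analysis could be reproduced with SciPy, assuming the individual scores were available; the score lists below are placeholders, not the study's data.

```python
# Hypothetical illustration of the reported analysis; the score lists are
# placeholders, not the study's actual data.
import numpy as np
from scipy import stats

control = [7, 8, 9, 8, 7, 9, 8, 8, 9, 7, 8, 10]             # n = 12 (placeholder)
experimental = [9, 8, 10, 9, 9, 8, 10, 9, 9, 8, 9, 10, 9]   # n = 13 (placeholder)

# Preconditions: normality (Shapiro-Wilk) and equal variances (Levene).
print(stats.shapiro(control).pvalue, stats.shapiro(experimental).pvalue)
print(stats.levene(control, experimental).pvalue)

# Independent-samples Student's t-test (equal variances assumed).
t_stat, p_value = stats.ttest_ind(experimental, control, equal_var=True)

# Cohen's d using the pooled standard deviation.
n1, n2 = len(experimental), len(control)
pooled_sd = np.sqrt(((n1 - 1) * np.var(experimental, ddof=1)
                     + (n2 - 1) * np.var(control, ddof=1)) / (n1 + n2 - 2))
d = (np.mean(experimental) - np.mean(control)) / pooled_sd
print(t_stat, p_value, d)
```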
5.2 SUS Results
A total of 13 responses were collected from the SUS
questionnaire. Table 2 presents the detailed results for
each questionnaire item, including the mean, median,
and standard deviations for the responses.
Based on Brooke (1996), the overall SUS score is
calculated by first adjusting the scores for both odd-
and even-numbered questions. For the odd-numbered
questions (questions 1, 3, 5, 7, and 9), 1 is subtracted
from each score, and the resulting values are summed
to compute the variable X. Similarly, for the even-
numbered questions (questions 2, 4, 6, 8, and 10),
each score is subtracted from 5, and these adjusted
values are summed to compute the variable Y. The fi-
nal SUS score is obtained by adding X and Y together
and then multiplying the sum by 2.5, yielding a score
that ranges from 0 to 100.
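The scoring rule described above can be expressed compactly; the function below is a straightforward implementation of Brooke's procedure, and the sample response vector is invented for illustration.

```python
def sus_score(responses):
    """Compute the SUS score (0-100) from ten Likert ratings (1-5),
    ordered from item 1 to item 10, following Brooke (1996)."""
    odd = sum(r - 1 for r in responses[0::2])   # items 1, 3, 5, 7, 9 -> X
    even = sum(5 - r for r in responses[1::2])  # items 2, 4, 6, 8, 10 -> Y
    return (odd + even) * 2.5

# Illustrative example (not an actual participant's answers):
print(sus_score([5, 2, 5, 1, 4, 2, 4, 2, 4, 3]))  # -> 80.0
```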
For our chatbot, the final SUS score was calcu-
lated as 80.4, indicating a high level of usability. SUS
scores among learners ranged from 52.5 to 95 out of
100. Half of the users scored between 75 and 85, with
a median score of 82.5.
Figure 3: Average Pre- and Post-Test Scores for the EG and CG for Module 1 of the MOOC.
Table 1: Analysis of knowledge acquisition in Pre-test and Post-test.
(a) Pre-test
Group N Mean Standard deviations Median P-value
Control 12 7.2 2.19 7.5 0.99
Experimental 13 7.2 1.9 8
(b) Post-test
Group N Mean Standard deviations Median P-value
Control 12 8.2 1.14 8 0.017
Experimental 13 8.9 1.03 9
6 DISCUSSION
The findings suggest that the GPT-4-based chatbot
enhanced with RAG improved knowledge acquisi-
tion. This improvement can be explained by the chat-
bot’s ability to provide contextually relevant support
in real time. By retrieving information from exter-
nal sources, RAG reinforced the chatbot’s generative
capabilities, aiming to provide responses that were
both accurate and adapted to learners’ needs. This
enhanced response quality likely helped clarify diffi-
cult concepts, contributing to the observed increase in
knowledge gain.
Our results align with Slade et al. (2024), who
evaluated a RAG-based tutoring system for writing
assignments in an introductory psychology course.
Their findings show that students using the system
scored significantly higher on a post-test, suggest-
ing improved knowledge retention. Similarly, Ko
et al. (2024) investigated the integration of RAG with
LLMs to enhance students’ understanding and appli-
cation of complex programming concepts. Their re-
sults indicate that learners using RAG achieved bet-
ter results in solving unfamiliar problems, suggesting
improved knowledge transfer and deeper conceptual
understanding.
To address the second research question on chat-
bot usability, we used the SUS questionnaire. The
SUS score obtained for our conversational agent is
80.4. According to Bangor et al. (2009), this corre-
sponds to a “B” grade on the SUS rating scale. In
terms of acceptability, the chatbot is classified as
“Acceptable”, and in adjective ratings, it falls under the
“Good” category (see Figure 4). These results indi-
cate that the chatbot is well received by learners and
has strong potential to enhance user experience in ed-
ucational settings.
This work is part of a paradigm change related to
generative AI, marked by an increased use of con-
versational agents in learning, particularly in asyn-
chronous distance learning contexts. These environ-
ments require a high degree of autonomy from learn-
ers, and conversational agents could represent a sig-
nificant advancement in pedagogical support.
Table 2: SUS questionnaire and statistics for each item.
Question | Statement | Mean | Median | Standard deviation
1 | I think that I would like to use this conversational agent. | 4.46 | 4 | 0.50
2 | I found the conversational agent unnecessarily complex. | 1.85 | 2 | 0.86
3 | I thought the conversational agent was easy to use. | 4.54 | 5 | 0.63
4 | I think that I would need the support of a technical person to be able to use this conversational agent. | 1.15 | 1 | 0.36
5 | I found the various functions in this conversational agent were well integrated. | 3.85 | 4 | 0.77
6 | I thought there was too much inconsistency in this conversational agent. | 1.54 | 1 | 0.84
7 | I would imagine that most people would learn to use this conversational agent very quickly. | 4.38 | 4 | 0.62
8 | I found the conversational agent very cumbersome to use. | 2.15 | 2 | 1.10
9 | I felt very confident using the conversational agent. | 4.31 | 4 | 0.72
10 | I needed to learn a lot of things before I could get going with this conversational agent. | 2.69 | 3 | 1.43
Figure 4: SUS Bangor Scale (Bangor et al., 2009) and SUS score for conversational agent (Mean Value).
In this context, conversational agents function as
learning companions, as envisioned by Chan and
Baskin (1988), providing adaptive support based on
learners’ needs. They leverage their superior knowl-
edge while remaining susceptible to occasional errors.
Rather than replacing teachers or human experts, they
function as interactive learning companions, particu-
larly in contexts with limited instructional support.
This companion role is especially crucial in non-
credit distance courses, such as MOOCs, where learn-
ers must navigate content independently. By deliv-
ering contextualized and tailored responses, RAG-
enhanced conversational agents help sustain learner
engagement, mitigating the risk of dropout in online
education.
7 CONCLUSIONS
This study suggests that a GPT-4-powered conver-
sational agent enhanced with RAG improves knowl-
edge acquisition in MOOCs. By delivering real-
time, contextually relevant support, the chatbot ap-
pears to support learners’ understanding of course
content and promote a more engaging learning expe-
rience. The results indicate a statistically significant
improvement in knowledge gain, along with positive
learner perceptions of usability, reinforcing the poten-
tial of RAG-enhanced AI in online education.
Despite promising results, this study has limita-
tions, notably a small, single-institution sample that
restricts generalizability, particularly in the context
of MOOCs, where large-scale dynamics are essen-
tial. Additionally, the short study duration limited the
ability to assess long-term learning effects. Future re-
search should incorporate a larger and more diverse
participant group, extend the study period, and fur-
ther evaluate the chatbot’s effectiveness in large-scale
MOOC environments.
Future work will focus on designing an empa-
thetic conversational agent based on LLMs and RAG,
capable of detecting learners’ emotions in real time
and adapting its interactions accordingly. By tailoring
responses to learners’ emotions and needs, the agent
could enhance engagement, persistence, and learning
outcomes. Further development will refine its emo-
tion recognition capabilities to optimize interactions
and create a more adaptive and enriching educational
experience.
REFERENCES
Abdelghani, R., Wang, Y., Yuan, X., Wang, T., Sauzéon, H., and Oudeyer, P. (2022). Gpt-3-driven pedagogical
agents for training children’s curious question-asking
skills. arXiv preprint arXiv:2211.
Adeshola, I. and Adepoju, A. P. (2024). The opportunities
and challenges of chatgpt in education. Interactive
Learning Environments, 32(10):6159–6172.
Bangor, A., Kortum, P., and Miller, J. (2009). Determining
what individual sus scores mean: Adding an adjective
rating scale. Journal of usability studies, 4(3):114–
123.
Brooke, J. (1996). Sus: A quick and dirty usability scale.
Usability Evaluation in Industry.
Chan, T.-W. and Baskin, A. B. (1988). Studying with the
prince: The computer as a learning companion. In
Proceedings of the International Conference on Intel-
ligent Tutoring Systems, volume 194200.
Chiu, T. K. (2024). Future research recommendations
for transforming higher education with generative
ai. Computers and Education: Artificial Intelligence,
6:100197.
Chiu, T. K. and Hew, T. K. (2018). Factors influencing peer
learning and performance in mooc asynchronous on-
line discussion forum. Australasian Journal of Edu-
cational Technology, 34(4).
Gao, Y., Xiong, Y., Gao, X., Jia, K., Pan, J., Bi, Y., Dai, Y.,
Sun, J., and Wang, H. (2023). Retrieval-augmented
generation for large language models: A survey. arXiv
preprint arXiv:2312.10997.
Hone, K. S. and El Said, G. R. (2016). Exploring the factors
affecting mooc retention: A survey study. Computers
& Education, 98:157–168.
Ji, Z., Lee, N., Frieske, R., Yu, T., Su, D., Xu, Y., Ishii, E.,
Bang, Y. J., Madotto, A., and Fung, P. (2023). Survey
of hallucination in natural language generation. ACM
Computing Surveys, 55(12):1–38.
Ko, H.-T., Liu, Y.-K., Tsai, Y.-C., and Suen, S. (2024). En-
hancing python learning through retrieval-augmented
generation: A theoretical and applied innovation in
generative ai education. In International Conference
on Innovative Technologies and Learning, pages 164–
173. Springer.
Lewis, P., Perez, E., Piktus, A., Petroni, F., Karpukhin,
V., Goyal, N., Küttler, H., Lewis, M., Yih, W.-t.,
Rocktäschel, T., et al. (2020). Retrieval-augmented
generation for knowledge-intensive nlp tasks. Ad-
vances in Neural Information Processing Systems,
33:9459–9474.
Liu, R., Zenke, C., Liu, C., Holmes, A., Thornton, P., and
Malan, D. J. (2024). Teaching cs50 with ai: leveraging
generative artificial intelligence in computer science
education. In Proceedings of the 55th ACM Techni-
cal Symposium on Computer Science Education V. 1,
pages 750–756.
Mariani, M. M., Hashemi, N., and Wirtz, J. (2023). Arti-
ficial intelligence empowered conversational agents:
A systematic literature review and research agenda.
Journal of Business Research, 161:113838.
Miladi, F., Psyché, V., and Lemire, D. (2024). Leverag-
ing gpt-4 for accuracy in education: A comparative
study on retrieval-augmented generation in moocs. In
International Conference on Artificial Intelligence in
Education, pages 427–434. Springer.
Neelakantan, A., Xu, T., Puri, R., Radford, A., Han,
J. M., Tworek, J., Yuan, Q., Tezak, N., Kim, J. W.,
Hallacy, C., et al. (2022). Text and code embed-
dings by contrastive pre-training. arXiv preprint
arXiv:2201.10005.
Psyché, V. (2020). Clom-Motsia: MOOC sur l’intelligence
artificielle. https://clom-motsia.teluq.ca/, last ac-
cessed Jan 17 2024.
Shuster, K., Poff, S., Chen, M., Kiela, D., and Weston, J.
(2021). Retrieval augmentation reduces hallucination
in conversation. arXiv preprint arXiv:2104.07567.
Slade, J. J., Hyk, A., and Gurung, R. A. (2024). Trans-
forming learning: Assessing the efficacy of a retrieval-
augmented generation system as a tutor for introduc-
tory psychology. In Proceedings of the Human Fac-
tors and Ergonomics Society Annual Meeting, vol-
ume 68, pages 1827–1830. SAGE Publications Sage
CA: Los Angeles, CA.
Taneja, K., Maiti, P., Kakar, S., Guruprasad, P., Rao, S., and
Goel, A. K. (2024). Jill watson: A virtual teaching
assistant powered by chatgpt. In International Con-
ference on Artificial Intelligence in Education, pages
324–337. Springer.
Tullis, T. S. and Stetson, J. N. (2004). A comparison of
questionnaires for assessing website usability. In Us-
ability professional association conference, volume 1,
pages 1–12. Minneapolis, USA.
Vijaymeena, M. and Kavitha, K. (2016). A survey on simi-
larity measures in text mining. Machine Learning and
Applications: An International Journal, 3(2):19–28.
Wang, K., Ramos, J., and Lawrence, R. (2023). Chated:
a chatbot leveraging chatgpt for an enhanced learn-
ing experience in higher education. arXiv preprint
arXiv:2401.00052.
Xie, Z., Wu, X., and Xie, Y. (2024). Can interaction with
generative artificial intelligence enhance learning au-
tonomy? a longitudinal study from comparative per-
spectives of virtual companionship and knowledge ac-
quisition preferences. Journal of Computer Assisted
Learning.
Yan, L., Zhao, L., Echeverria, V., Jin, Y., Alfredo, R.,
Li, X., Gašević, D., and Martinez-Maldonado, R.
(2024). Vizchat: enhancing learning analytics dash-
boards with contextualised explanations using multi-
modal generative ai chatbots. In International Con-
ference on Artificial Intelligence in Education, pages
180–193. Springer.
Yin, S., Shang, Q., Wang, H., and Che, B. (2019). The
analysis and early warning of student loss in mooc
course. In Proceedings of the ACM Turing Celebra-
tion Conference-China, pages 1–6.