Exploring Extended Reality (XR) in Teaching AI:
A Comparative Study of XR and Desktop Environments
Miguel A. Feijoo-Garcia (1,a), Yiqun Zhang (1,b), Yiyin Gu (2,c), Alejandra J. Magana (1,d), Bedrich Benes (2,e) and Voicu Popescu (2,f)
1 Department of Computer and Information Technology, Purdue University, 401 N. Grant St., West Lafayette, IN 47907, U.S.A.
2 Department of Computer Science, Purdue University, 305 N. University St., West Lafayette, IN 47907, U.S.A.
{mfeijoog, zhan2897, gu251, admagana, bbenes, popescu}@purdue.edu
a https://orcid.org/0000-0001-5648-9966
b https://orcid.org/0009-0000-0958-0943
c https://orcid.org/0009-0004-6612-0563
d https://orcid.org/0000-0001-6117-7502
e https://orcid.org/0000-0002-5293-2112
f https://orcid.org/0000-0002-8767-8724
Keywords:
Extended Reality, Educational Technology, Immersive Learning, User Experience, Human-Computer
Interaction.
Abstract:
Artificial Intelligence (AI) concepts are abstract and difficult to understand. This paper explores how immersive technologies, such as extended reality (XR), can offer interactive learning experiences that enhance the educational outcomes of teaching fundamental concepts in artificial intelligence. This comparative study analyzes the effectiveness of an XR environment for introducing Neural Networks and Handwritten Digit Recognition by comparing the engagement, user experience, and learning outcomes of students using XR headsets (Meta Quest 3) to those of students using a traditional desktop setup. Engagement, usability, and user satisfaction were measured using standardized metrics, including the System Usability Scale (SUS), Immersion Presence Questionnaire (IPQ), User Satisfaction Questionnaire (USQ), and Net Promoter Score (NPS), with a diverse group of 56 participants. The findings indicate that participants in the XR group reported higher levels of engagement and immersion than those in the desktop setting. They also reported higher levels of satisfaction and were more likely to recommend the experience for educational purposes than users in the Desktop group, suggesting that XR technology increases motivation and may thus improve learning. Nevertheless, XR users reported limitations such as discomfort and unfamiliarity with immersive technologies. In addition, this study highlights how XR can help transform science, technology, engineering, and mathematics education.
1 INTRODUCTION
Artificial Intelligence (AI) can be abstract and difficult for learners to understand. Modern AI generally involves deep neural networks and natural language processing, which have distinct complexities, characteristics, limitations, and principles (Pham and Sampson, 2022). Consequently, users cannot easily see or comprehend the internal operations of the algorithms, causing a gap between the technology and its users (Kim, 2023) and adding to the difficulty of teaching and learning systemic and critical thinking about AI concepts from a holistic perspective
(Feijóo-García et al., 2021). Due to the abstract nature of AI, students may not feel prepared to interact with it, leading to a lack of confidence (Chen et al., 2020; Pedró, 2020). Considering these educational challenges in teaching and learning AI, it is necessary to devise innovative educational methods that demystify the complexity of AI and help learners engage with its concepts in a learner-centered way.
Technology is a driving force for better education (Guo et al., 2021; Zwoliński et al., 2022). Emerging technologies such as extended reality (XR) have helped create learner-centered environments in education (Rangel-de Lázaro and Duart, 2023; Kuleto et al., 2021). XR combines computer software with wearable devices to produce interactive settings that blend real and digital components (Gu et al., 2024). Interactions can occur through immersive headsets, augmented reality (AR) glasses, or mobile devices
(Kosko et al., 2021). XR includes AR, virtual reality (VR), and mixed reality (MR) (Kharvari and Kaiser, 2022; Jagatheesaperumal et al., 2024). AR integrates real and digital objects, VR offers simulated environments, and MR facilitates interaction between physical and digital elements (Ferreira and Qureshi, 2020). The affordability and efficiency of XR make it a practical and versatile tool for education, further demonstrating its potential (Kharvari and Kaiser, 2022). XR applications span various fields in education, including STEM, architecture, management, medicine, and art (Zhang et al., 2024; Chiang and Liu, 2023; Kharvari and Kaiser, 2022; Zwoliński et al., 2022; López-Ojeda and Hurley, 2021; Koukopoulos et al., 2022). As a tool that promotes educational sustainability, XR has gained worldwide interest in research and practice (Guo et al., 2021).
Using XR in education is beneficial as it can foster soft skills such as teamwork and problem-solving, as well as technical skills (Gu et al., 2024; Guilbaud et al., 2021a). Further benefits include enhanced content sharing and knowledge acquisition (Idrees et al., 2022), personalized learning experiences (Fernández-Batanero et al., 2024), and better support for students with special needs (Meccawy, 2022). However, its implementation has limitations, including a lack of motivation or technical ability among instructors, cybersickness, and the high costs of purchasing virtual learning platforms (Zhang et al., 2024; Obeidallah et al., 2023).
As XR technology evolves, educators are encouraged to adopt it to create engaging and effective learning experiences (Guilbaud et al., 2021a). Immersive technologies like VR are increasingly utilized in computer science education, where they support teaching the fundamentals of abstract concepts in STEM fields (Zhang et al., 2024). Nevertheless, the teaching process must be incremental, integrating new topics while reinforcing prior knowledge and involving new technologies as instructional resources.
Several standardized questionnaires assess the effectiveness and user experience of educational tools. The System Usability Scale (SUS) (Grier et al., 2013) is often used to evaluate usability and satisfaction and to identify strengths and areas for improvement (Romeike, 2019). Similarly, the User Satisfaction Questionnaire (USQ) emphasizes perceived usability and the quality of the user's experience. The Immersion Presence Questionnaire (IPQ) (Schwind et al., 2019) is used to determine the immersive quality of virtual environments, and the Net Promoter Score (NPS) (Baquero, 2022) provides evidence of student advocacy and satisfaction. Overall, the results of these assessments help optimize immersive technologies to enhance engagement and improve learning experiences and outcomes in computer science education and related fields (Kara et al., 2021).
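To make the aggregation of such instruments concrete, the following is a minimal sketch of the standard SUS scoring rule: odd-numbered items contribute the rating minus 1, even-numbered items contribute 5 minus the rating, and the sum is scaled by 2.5 to a 0-100 range. The function and the example ratings are illustrative and not drawn from the study's data.

```python
def sus_score(ratings):
    """Compute the standard SUS composite (0-100) from ten 1-5 Likert ratings."""
    if len(ratings) != 10:
        raise ValueError("SUS requires exactly ten item ratings")
    contributions = [
        (r - 1) if i % 2 == 0 else (5 - r)  # index 0, 2, ... are odd-numbered items (SUS1, SUS3, ...)
        for i, r in enumerate(ratings)
    ]
    return sum(contributions) * 2.5

# Hypothetical respondent: ratings for SUS1..SUS10.
print(sus_score([4, 2, 4, 2, 4, 2, 4, 2, 4, 2]))  # -> 75.0
```

An average of such per-respondent scores is what is typically compared against the commonly cited benchmark of 68.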
Immersive technologies and simulation-based experiences can be incorporated to increase student engagement and the understanding of complex concepts. This extends to interactive virtual experiences that encourage active participation and practice, resulting in better learning outcomes (Zhang et al., 2024). At the same time, evidence-based decision-making is important for improving design and addressing challenges (Brown et al., 2010), ensuring decisions are grounded in facts rather than assumptions (Feijoo-Garcia et al., 2024). Therefore, the following question arises: What is the effectiveness of XR environments in enhancing student engagement and user satisfaction compared to traditional desktop learning setups?
2 METHODS
2.1 Context and Participants
This study is based on survey responses collected from N = 56 participants on how they experienced an educational approach that used either an XR environment or a traditional desktop setup to introduce Neural Networks (NN) and Handwritten Digit Recognition. The study was carried out between the spring and summer of 2024. One questionnaire collected demographic information and assessed prior experience with VR and XR, including familiarity with topics such as NN and Handwritten Digit Recognition. A second questionnaire, administered to participants after the intervention, included different scales to assess their experiences in the XR and desktop environments; these scales were chosen for their effectiveness in measuring user experience from different perspectives, ensuring that the overall analysis captures comprehensive user insights. Thus, the questionnaire included:
System Usability Scale (SUS). Evaluates user
satisfaction with the XR experience, covering us-
age frequency, perceived complexity, ease of use,
need for technical support, function integration,
consistency, learning curve, and user confidence
and comfort. The questionnaire comprises ten el-
ements, rated on a scale [1, 5].
Immersion Presence Questionnaire (IPQ). Assesses participants' immersion and satisfaction in the XR environment, focusing on visual quality, interaction, realism, disconnection from surroundings, presence, and overall engagement. It includes twelve elements, rated on a scale of [-3, 3].
User Satisfaction Questionnaire (USQ). Mea-
sures user satisfaction in the XR environment, em-
phasizing usability, functionality, and overall ex-
perience through three elements, rated on a scale
[1, 5].
Net Promoter Score (NPS). Assesses participants' likelihood or willingness to recommend the experience to others, reflecting overall satisfaction and user advocacy regarding the system. This score is rated on a scale of [0, 10] (a scoring sketch follows this list).
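As a concrete illustration of the two common ways to aggregate these 0-10 ratings, the sketch below computes both the mean likelihood rating (the form reported later in this paper) and the classic promoter-minus-detractor NPS; the example ratings are hypothetical.

```python
def mean_likelihood(ratings):
    """Average 0-10 likelihood-to-recommend rating, as reported in this study."""
    return sum(ratings) / len(ratings)

def classic_nps(ratings):
    """Classic NPS: % promoters (9-10) minus % detractors (0-6), on [-100, 100]."""
    promoters = sum(1 for r in ratings if r >= 9)
    detractors = sum(1 for r in ratings if r <= 6)
    return 100 * (promoters - detractors) / len(ratings)

ratings = [10, 9, 8, 7, 6, 9]          # hypothetical responses
print(mean_likelihood(ratings))        # -> 8.17 (approximately)
print(classic_nps(ratings))            # -> 33.3 (3 promoters, 1 detractor out of 6)
```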
The age of the participants ranged from 18 to 32 years (M = 22.2, SD = 3.9); 67.9% were male and 32.1% female. Many reported limited VR headset usage, with 14 (25.0%) having never used one and 17 (30.4%) having used one only once. Similarly, 32 participants (57.1%) had never used XR, and 13 (23.2%) had used it once. Most participants were Computer Science majors (41, 73.2%), and the majority were undergraduate students (30, 53.6%).
In the XR group (29, 51.8%), most were male (18, 62.1%), undergraduate students (20, 68.9%), and Computer Science majors (18, 62.1%). Many had used VR headsets only once (11, 37.9%), and many had never used XR (15, 51.7%). While 17 participants (58.6%) were familiar with Neural Networks (NN), 18 (62.1%) were unfamiliar with Handwritten Digit Recognition. In the Desktop group (27, 48.2%), on the other hand, there was also a male majority (20, 74.1%), with fewer undergraduate students (10, 37.1%) and most studying Computer Science (23, 85.2%). Like the XR group, many had limited VR headset experience, with 8 participants (29.6%) having used them fewer than five times and another 8 (29.6%) never having used them. Familiarity with NN and Handwritten Digit Recognition was relatively balanced, with 13 participants (48.1%) familiar with Handwritten Digit Recognition and 14 (51.9%) with NN (see Table 1).
2.2 Data Collection
Data collection sessions lasted up to one hour and in-
volved no more than two participants at a time. Upon
arrival, participants completed a consent form. De-
pending on the experimental condition, they were pro-
vided with either VR headsets (i.e., Meta Quest 3) or
a laptop (see Figure 1). All participants used tablets to respond to the demographic, pre-test, and post-test questionnaires, with confidentiality ensured through unique numeric identifiers assigned to each participant.
Table 1: Participant Demographics and Characteristics.

Variable           XR (n=29)    Desktop (n=27)  Total (N=56)
Gender
  Male             18 (62.1%)   20 (74.1%)      38 (67.9%)
  Female           11 (37.9%)   7 (25.9%)       18 (32.1%)
  Other            0 (0%)       0 (0%)          0 (0%)
VR Headset Usage
  Never            6 (20.7%)    8 (29.6%)       14 (25.0%)
  Once             11 (37.9%)   6 (22.2%)       17 (30.4%)
  <5 times         7 (24.1%)    8 (29.6%)       15 (26.8%)
  >5 times         3 (10.3%)    3 (11.1%)       6 (10.7%)
  Frequently       2 (6.9%)     2 (7.4%)        4 (7.1%)
XR Usage
  Never            15 (51.7%)   17 (62.9%)      32 (57.1%)
  Once             8 (27.6%)    5 (18.5%)       13 (23.2%)
  <5 times         4 (13.8%)    1 (3.7%)        5 (8.9%)
  >5 times         1 (3.5%)     2 (7.4%)        3 (5.4%)
  Frequently       1 (3.5%)     2 (7.4%)        3 (5.4%)
Major
  Computer Sc.     18 (62.1%)   23 (85.2%)      41 (73.2%)
  Data Science     3 (10.3%)    1 (3.7%)        4 (7.1%)
  Computer Tech.   0 (0%)       1 (3.7%)        1 (1.8%)
  Other            8 (27.6%)    2 (7.4%)        10 (17.9%)
Role
  Undergraduate    20 (68.9%)   10 (37.1%)      30 (53.6%)
  Graduate         9 (31.1%)    17 (62.9%)      26 (46.4%)
Familiarity with Neural Networks
  No               12 (41.4%)   13 (48.1%)      25 (44.6%)
  Yes              17 (58.6%)   14 (51.9%)      31 (55.4%)
Familiarity with Handwritten Digit Recognition
  No               18 (62.1%)   14 (51.9%)      32 (57.1%)
  Yes              11 (37.9%)   13 (48.1%)      24 (42.9%)

Note: Values are frequencies with percentages in parentheses; any discrepancies in percentages are due to rounding.
The steps in this study were as follows (see Figure 2): (1) participants review and sign the consent form [S1]; (2) participants use the provided tablets to answer the demographic questionnaire [S2]; (3) participants use the tablets to answer the pre-test questionnaire [S3]; (4) six educational slides teach participants how to use the headset to complete the study [VR headset users only] [S4]; (5) participants learn the topic, either wearing the headset [VR headset users only] or using the designated laptop [Desktop users only], and respond to six multiple choice questions during the intervention [both groups] [S5]; (6) participants use the tablets to answer the post-test questionnaire [S6].
The research team guided the participants through
each procedure, with the preparation phase lasting 15
minutes, the learning phase lasting roughly 15 min-
utes, and the final step lasting 5 to 10 minutes. The team was available to answer all inquiries throughout the experimental session. The start times for the learning phase were staggered by 10 minutes for the two participants using VR headsets in the same session to minimize distractions during verbal response questions. The participants then completed a post-questionnaire with 26 items drawn from the four instruments described above: the XR group responded to all 26 items, while the Desktop group responded to 14 items covering three of the four instruments, excluding the IPQ. Figure 2 depicts the outline of the lesson design for this comparative study.

Figure 1: XR and Desktop/Laptop Environments used for the experimental approach.
2.3 Ethical Considerations
This study has been approved by Purdue University’s
Institutional Review Board under IRB-2024-57. A
four-page consent form was designed to inform par-
ticipants about the purpose, duration, confidentiality,
benefits, risks, and other pertinent information of the
study. Participants who chose to take part signed and dated the consent form before the session began and could withdraw at any time.
3 RESULTS
3.1 Lesson Design
In this study, a lesson was designed on the foundational concepts of Neural Networks and Handwritten Digit Recognition. The study compared the performance of two groups of participants exposed to the same lesson in two settings (i.e., the XR and Desktop environments), using a fully connected network trained on the MNIST data set (Deng, 2012). Participants began by answering five of 14 multiple choice questions assessing their prior knowledge of Neural Networks.
These questions covered introductory concepts such as the definition of a Neural Network, the roles and functions of nodes and layers, and data processing sequences. Participants were also asked about concepts and practices such as supervised and unsupervised learning, the impact of network architecture on energy consumption, and the significance of hardware components like CPUs, GPUs, and TPUs. For example, participants faced questions such as What is a layer in a neural network? and How are GPUs, CPUs, and TPUs important for neural networks or CNNs? All questions were closed-ended, either multiple choice or True/False, to gauge foundational knowledge of neural network theory.

Figure 2: Outline of the Lesson Design for the Comparative Study between XR and Desktop Environments.
Further analysis of the 14 multiple choice questions showed a generally high level of understanding among participants, with an average of 80.7% correct responses. Most questions were answered correctly, but some had correct-response rates below 60%, indicating knowledge gaps: topics that need more explanation to support understanding.
Moreover, an independent samples t-test compared mean differences between the XR and Desktop groups regarding "correctness" and "time spent responding to questions" (in seconds). Welch's test was employed due to the lack of equal variances, as indicated by Levene's test for "correctness" (refer to Table 2).

Table 2: Independent Samples T-Test.

              t     df    p    Mean Diff.  SE Diff.  Cohen's d  SE Cohen's d
Correctness   -.6   54    .7   .1          .3        .2         .3
Duration      -4.4  34.7  .0   -47.1       10.8      -1.2       .3

Note. For all tests, the alternative hypothesis specifies that group Desktop is less than group XR.
Note. Welch's t-test.
Regarding correctness, both groups had similar mean scores (4.2 for XR and 4.1 for Desktop) with no significant difference (p = .7), indicating comparable prior knowledge of Neural Networks. Nevertheless, the XR group took significantly longer to complete the pre-questionnaire than the Desktop group (68.7 seconds versus 33.4 seconds; p < .001). This extended duration might result from XR participants' lack of familiarity with Neural Networks, which would demand more cognitive effort.
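For readers who want to reproduce this kind of analysis, the following is a minimal sketch of the reported pipeline: Levene's test for equality of variances, a one-sided Welch's t-test with the Desktop < XR alternative, and Cohen's d from the pooled standard deviation. The two arrays are made-up stand-ins for the per-participant measurements, not the study's data.

```python
import numpy as np
from scipy import stats

# Hypothetical response times in seconds (not the study's data).
desktop = np.array([30.1, 28.4, 35.2, 40.0, 31.7])
xr = np.array([65.3, 70.1, 72.8, 61.0, 69.4])

# Levene's test: a small p-value suggests unequal variances,
# motivating Welch's t-test instead of Student's t-test.
print(stats.levene(desktop, xr))

# Welch's one-sided t-test (alternative: Desktop < XR),
# matching the direction stated in Tables 2 and 7.
t, p = stats.ttest_ind(desktop, xr, equal_var=False, alternative="less")
print(f"t = {t:.2f}, p = {p:.4f}")

# Cohen's d from the pooled standard deviation.
pooled_sd = np.sqrt(((len(desktop) - 1) * desktop.var(ddof=1)
                     + (len(xr) - 1) * xr.var(ddof=1))
                    / (len(desktop) + len(xr) - 2))
d = (desktop.mean() - xr.mean()) / pooled_sd
print(f"Cohen's d = {d:.2f}")
```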
On the other hand, during the intervention, the lesson asked six multiple choice questions assessing learning outcomes. These questions focused on foundational concepts related to Neural Networks and Handwritten Digit Recognition, for example, how many input neurons there are in a network handling MNIST images and what the weights in a neural network represent. Other questions addressed what ReLU does when given a negative input, what property ReLU is used to break (i.e., linearity), and which of the predefined test cases (0, 1, 3, and 8) has the least confidence in prediction. These questions assessed student understanding of and interest in the material presented in the lesson.
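To ground these questions, the sketch below is a minimal forward pass through a fully connected network of the kind the lesson covers. The 784-128-10 layer sizes and the random weights are illustrative assumptions (the paper does not specify the lesson network's architecture); the sketch shows the 784 input neurons for 28x28 MNIST images, ReLU zeroing negative inputs, and prediction confidence as the maximum softmax probability.

```python
import numpy as np

rng = np.random.default_rng(0)

# A 28x28 MNIST image flattened -> 784 input neurons (random here, for illustration).
x = rng.random(784)

# Hypothetical layer sizes: 784 inputs -> 128 hidden units -> 10 output classes.
W1, b1 = rng.normal(0, 0.01, (128, 784)), np.zeros(128)
W2, b2 = rng.normal(0, 0.01, (10, 128)), np.zeros(10)

def relu(z):
    # ReLU returns 0 for any negative input, breaking the network's linearity.
    return np.maximum(0, z)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

hidden = relu(W1 @ x + b1)         # weights scale inputs; biases shift them
probs = softmax(W2 @ hidden + b2)  # class probabilities for digits 0-9

prediction = int(probs.argmax())
confidence = float(probs.max())    # a low maximum probability = low prediction confidence
print(prediction, round(confidence, 3))
```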
Table 3 depicts the scores for the multiple choice questions during the intervention. The XR group scored a mean of 4.2 (SE Mean = .2, SD = 1.2), indicating high score variability; the skewness of -1.2 suggests that most participants scored above the mean, with a few low outliers. XR scores ranged from 1.0 to 6.0. In contrast, the Desktop group obtained a mean score of 4.4 (SE Mean = .1, SD = .8), indicating less variability and more consistent performance. The skewness of zero indicates a near-normal distribution, with scores spread symmetrically around the mean, and the kurtosis of -.4 indicates fewer extreme values, that is, fewer very high or very low scores. Desktop scores ranged from 3.0 to 6.0.

Table 3: Multiple Choice Questions Scores.

Statistic        XR (n=29)   Desktop (n=27)
Mean             4.2         4.4
SE Mean          .2          .1
Std. Deviation   1.2         .8
Skewness         -1.2        0
SE Skewness      .4          .4
Kurtosis         1.2         -.4
SE Kurtosis      1.9         1.9

Although the Desktop group showed a higher average score and less variability, the XR group's lower mean and greater variability may stem from their unfamiliarity with XR technology. As participants become more accustomed to XR, their performance may improve, potentially narrowing the gap with the Desktop group.
Furthermore, user engagement was assessed in both the XR and Desktop contexts, with session time treated as an important metric for analyzing engagement and persistence; longer sessions may indicate deeper engagement with the subject matter.
In addition, Table 4 depicts the results for session time. Using the XR headset, the XR group had a mean session time of 723.7 seconds (SE Mean = 33.4, SD = 179.8), indicating considerable variability. The skewness of -.2 suggests a near-normal distribution, meaning that session times were mostly balanced around the average, and a kurtosis of -.9 suggests a slightly flatter distribution with fewer extreme values, that is, a more uniform experience within the XR group than within the Desktop group. XR session times ranged from 359.0 to 1009.0 seconds.
However, the mean session duration for Desktop users was 578.5 seconds (SE Mean = 40.1, SD = 212.2), showing greater variability: session durations varied considerably within the Desktop group. The skewness of .8 indicates a right-skewed distribution, meaning most session times were shorter, with a few much longer outliers, and a kurtosis of 1.5 indicates heavier tails, implying more extreme session times at both ends. Desktop session times ranged from 239.0 to 1237.0 seconds.

Table 4: Session Time (in seconds).

Statistic        XR (n=29)   Desktop (n=27)
Mean             723.7       578.5
SE Mean          33.4        40.1
Std. Deviation   179.8       212.2
Skewness         -.2         .8
SE Skewness      .4          .4
Kurtosis         -.9         1.5
SE Kurtosis      1.9         1.9
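The descriptive statistics in Tables 3 and 4 (mean, standard error, skewness, kurtosis) can be reproduced with a short script like the sketch below. The session-time array is a made-up stand-in for the study data, and scipy's default kurtosis is the excess (normal = 0) variant, matching the near-zero values reported here.

```python
import numpy as np
from scipy import stats

# Hypothetical session times in seconds (not the study's data).
times = np.array([640.0, 720.5, 810.2, 559.9, 905.3, 701.8])

print("Mean:", times.mean())
print("SE Mean:", times.std(ddof=1) / np.sqrt(len(times)))
print("Std. Deviation:", times.std(ddof=1))
print("Skewness:", stats.skew(times))      # sign indicates the direction of the tail
print("Kurtosis:", stats.kurtosis(times))  # excess kurtosis; 0 = normal-like tails
```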
3.2 Evaluation: Usability and
Satisfaction
The results show that participants using the XR setting reported higher levels of usability, satisfaction, and likelihood to recommend the system than those using the Desktop version. The XR group (n = 29) experienced consistently positive interactions, while Desktop users (n = 27) faced more usability challenges and lower satisfaction levels. Detailed findings are provided below for each scale assessing user experience in the two environments, both focused on Neural Networks and Handwritten Digit Recognition.
The results of the System Usability Scale (SUS) (Grier et al., 2013) indicate that users found the XR system more engaging and easier to navigate. For instance, the statement I think that I would like to use this system frequently (SUS1) received a mean score of 3.3 in the XR group versus 3.2 in the Desktop group, showing a slightly stronger inclination to use the XR system. Although both groups rated the statement I thought the system was easy to use (SUS3) similarly, the Desktop group exhibited greater variability, reflecting less consistent experiences.
Moreover, as indicated in Table 5, the XR group achieved an average SUS score of 60.1, somewhat below the commonly cited benchmark of 68. This indicates acceptable perceived usability while suggesting room for enhancement. The average SUS score for the Desktop group, 57.9, indicates greater challenges. Thus, while both systems require enhancements, the XR system showed better overall usability than the Desktop system.
Table 5: System Usability Scale.

Descriptive Statistics: XR (n=29)
            SUS1  SUS2  SUS3  SUS4  SUS5  SUS6  SUS7  SUS8  SUS9  SUS10
Mean        3.3   2.4   3.6   2.5   3.8   2.0   2.2   2.9   2.8   2.2
SE Mean     .1    .1    .1    .2    .1    .1    .1    .2    .2    .2
Std. Dev.   1.0   1.0   1.0   1.3   .9    .9    1.0   1.1   1.2   1.3
Skewness    .1    .4    -.7   .6    -.5   .9    .4    .2    -.4   .8
SE Skew.    .3    .3    .3    .3    .3    .3    .3    .3    .3    .3
Kurtosis    -.7   -1.0  .1    -.7   -.4   .2    -.8   -.6   -1.1  -.6
SE Kurt.    .6    .6    .6    .6    .6    .6    .6    .6    .6    .6

Descriptive Statistics: Desktop (n=27)
            SUS1  SUS2  SUS3  SUS4  SUS5  SUS6  SUS7  SUS8  SUS9  SUS10
Mean        3.2   2.2   3.6   2.3   3.7   2.2   2.6   3.3   2.3   2.3
SE Mean     .2    .2    .2    .2    .2    .2    .2    .2    .3    .3
Std. Dev.   1.0   1.0   1.0   1.1   .9    1.1   1.0   1.0   1.3   1.3
Skewness    .3    .5    -.4   .6    -.5   .7    -.2   -.3   .4    .5
SE Skew.    .5    .5    .5    .5    .5    .5    .5    .5    .5    .5
Kurtosis    -.8   -.7   -.8   -.4   -.3   -.6   -1.3  -.4   -1.3  -1.2
SE Kurt.    .9    .9    .9    .9    .9    .9    .9    .9    .9    .9

The User Satisfaction Questionnaire (USQ) results indicate high overall satisfaction, particularly among XR participants, who reported greater satisfaction than Desktop users. The XR group achieved a mean score of 4.1 for "satisfaction with the XR experience" (USQ1), with a standard deviation of .8. In contrast, the Desktop group reported a mean satisfaction score of 3.6 with a higher standard deviation of .9, indicating more variability. For satisfaction with visual quality (USQ3), the XR group scored higher, at 3.9, than the Desktop group, at 3.6. This suggests a better perceived visual quality and a more consistent positive perception in the XR environment, indicating effective user expectation management, whereas the lower Desktop scores indicate areas needing improvement (see Table 6).
In general, participants who used the XR system were very satisfied, with an overall average rating of 4.1 out of 5. Desktop users had an average rating of 3.6; although users in this group were generally satisfied, this suggests room for improvement in visual appeal and user-friendliness.
Table 6: User Satisfaction Questionnaire.

              USQ1   USQ2   USQ3
Descriptive Statistics: XR (n=29)
Mean          4.1    4.0    3.9
SE Mean       .1     .2     .2
Std. Dev.     .8     1.1    .9
Skewness      -.1    -1.0   -.3
SE Skewness   .4     .4     .4
Kurtosis      -1.2   .6     -.7
SE Kurtosis   .9     .9     .9

Descriptive Statistics: Desktop (n=27)
Mean          3.6    3.6    3.6
SE Mean       .2     .2     .2
Std. Dev.     .8     1.2    1.1
Skewness      -1.2   -.3    -.7
SE Skewness   .5     .5     .5
Kurtosis      .6     -.7    -.0
SE Kurtosis   .9     .9     .9

On the other hand, findings from the Immersive Presence Questionnaire (IPQ) (Schwind et al., 2019) show that participants in the XR group experienced a high sense of presence. For instance, the item How aware were you of the real-world surroundings while navigating in the virtual world? (IPQ1) had a mean of 1.8 and a standard deviation of .9, indicating high immersion and reduced awareness of the real world. In contrast, the item How real did the virtual world seem to you? (IPQ4) had a mean score of .6 and a standard deviation of 1.6, indicating varied perceptions of realism. This variation may stem from individual sensitivities to immersive experiences. The overall IPQ score of 3.48 among the 29 XR participants (the Desktop group did not complete the IPQ) demonstrates a solid experience of presence when users are engaged in the virtual world, while also indicating room to improve the consistency of engagement and realism across users.
Furthermore, the average likelihood of the 56 participants recommending their experience was 7.2, with clear differences between groups. XR users displayed more enthusiasm, with an average NPS rating of 7.9, showing an increased likelihood of recommending the experience. In contrast, Desktop users averaged 6.5, indicating the need for improvement to match the recommendability of the XR version.
Another independent samples t-test was conducted to investigate the effects of the XR environment, compared to the Desktop version, on user perceptions of the instructional design. This analysis examined the System Usability Scale (SUS), User Satisfaction Questionnaire (USQ), and Net Promoter Score (NPS) in both the XR and Desktop groups. The hypothesis was that participants in the XR environment would report higher scores in perceived presence, system usability, and overall satisfaction than those using the Desktop version (refer to Table 7).
This analysis indicates that users of the XR system reported significantly higher user satisfaction (measured by the USQ) and a greater likelihood of recommending the system (indicated by the NPS) than Desktop users, suggesting that XR participants were more satisfied and more inclined to recommend their experience with the system. However, the two groups did not differ significantly in system usability (as assessed by the SUS), indicating similar usability ratings.

Table 7: Independent Samples T-Test.

        t     df    p     Mean Diff.  SE Diff.  Cohen's d  SE Cohen's d
System Usability Scale
SUS1    -.7   54    .2    -.2         .3        -.2        .3
SUS2    -.9   54    .2    -.3         .3        -.3        .3
SUS3    .4    53.8  .7    .1          .3        .1         .3
SUS4    -1.2  53.7  .1    -.4         .3        -.3        .3
SUS5    -.1   54    .5    -.0         .3        -.0        .3
SUS6    1.0   48.8  .8    .3          .3        .3         .3
SUS7    2.9   43.5  1.0   .7          .3        .8         .3
SUS8    3.0   53.6  1.0   .8          .3        .8         .3
SUS9    -3.3  45.1  .0    -1.0        .3        -.9        .3
SUS10   .3    52.5  .6    .1          .3        .1         .3
User Satisfaction Questionnaire
USQ1    -2.1  53.2  .0    -.4         .2        -.6        .3
USQ2    -1.4  53    .1    -.4         .3        -.4        .3
USQ3    -1.2  51.8  .1    -.3         .3        -.3        .3
Net Promoter Score
NPS     -2.4  47.7  .0    -1.4        .6        -.7        .3

Note. For all tests, the alternative hypothesis specifies that group Desktop is less than group XR.
Note. Welch's t-test.
For this independent samples t-test, Welch’s test
approach was used due to unequal variances indicated
by Levene’s test for items SUS7 and SUS9, particu-
larly regarding ease of learning and confidence. Al-
though normality tests showed significant deviations
(p < .05), the Central Limit Theorem supports the
idea that the distribution of sample means approxi-
mates normality as sample sizes grow larger (Lakens,
2022).
The System Usability Scale (SUS) analysis did not show significant differences between the XR and Desktop groups for most items. However, XR users felt more confident using the system, as reflected in their responses about user confidence (SUS9), where XR users scored higher (t = 3.3, p < .01) with a large effect size (Cohen's d = .9).
In the User Satisfaction Questionnaire (USQ), significant differences were observed for the first item (USQ1), with a p-value below .05, and effect sizes of -.6 and -.4 for USQ1 and USQ2 show that XR users had a better overall experience and rated the visual quality higher than Desktop users. There was also a slight, non-significant difference for ease of use and visual quality (USQ2 and USQ3), with p-values of .1, suggesting a trend toward XR users finding the system easier to use and being more satisfied with the visual quality of the environment.
Finally, for the Net Promoter Score (NPS), a significant difference (p < .01) with a medium effect size (Cohen's d = .7) was found, meaning that XR users were more likely to recommend the system. Overall, XR users reported higher satisfaction, greater confidence, and a greater likelihood of recommending their experience with the system.
4 DISCUSSION
Involving immersive technologies such as XR in education to introduce complex concepts like AI-related topics (e.g., neural networks or handwritten digit recognition) requires a mix of sensory engagement, interactivity, and relevance. The literature indicates that high-quality visuals, realistic simulations, and interactive features help learners actively explore complex topics (e.g., AI concepts), making them easier to understand and remember (Marougkas et al., 2023). However, it is important to avoid overwhelming students with too much information at once, so technology should support learning without causing distractions (Skulmowski, 2024). Thus, gradually introducing XR technologies and providing guidance can help students focus on their learning outcomes.
The findings indicate that XR environments boost engagement and promote deeper cognitive processing via meaningful interactions with the content; visualizations are particularly useful for comprehending complex ideas such as Neural Networks (Zhang et al., 2024). In general, users found the XR system easier to navigate, allowing them to focus on learning rather than struggling with the technology. This is supported by the System Usability Scale (SUS) scores, where participants in the XR group had an average score of 60.1, compared to 57.9 for participants in the Desktop group, suggesting that XR users had a more positive experience and increased participation in their learning tasks.
Moreover, desktop interfaces often lack the immersive elements found in XR, leading to a broader range of user experiences that may hinder the educational progress of some students. Based on the findings presented in this paper, the interactive quality of XR enables students to work with 3D models, improving their comprehension and retention of complex ideas, which gives this technology a notable edge over conventional approaches.
However, despite XR participants reporting more engagement, as evidenced by longer session lengths and higher user satisfaction, this did not translate into better learning outcomes, as the XR and Desktop groups provided similar numbers of correct responses. The variability in XR user performance suggests that not all participants benefited equally from the immersive experience, with some struggling with the technology. Factors such as the novelty of XR technology can distract participants from the educational content, and the learning curve associated with XR can hinder material absorption (Alnagrat et al., 2022). To overcome these challenges, students may need specific training (i.e., scaffolding) and support to use the technology properly and stay focused on the learning objectives. It is also important to reflect on the training that educators need when adopting immersive technologies such as XR: educators must understand how to use these technologies effectively in their lessons, how to create engaging activities, and how to support students in using them.
Despite these challenges, technologies like XR
have the potential to greatly increase student engage-
ment. They help students learn and retain complex
concepts more easily by making them more acces-
sible, improving motivation, and sparking interest
(Guilbaud et al., 2021b). Integrating XR into educa-
tion could create a more dynamic and effective learn-
ing environment (Zhang et al., 2024). However, fur-
ther research is needed, using larger sample sizes and
more diverse STEM topics, to compare XR with tra-
ditional learning methods and their impact on learn-
ing outcomes. Additionally, it is important to explore
how immersive technologies like XR can support dif-
ferent learning styles, particularly for students who
may need extra time or specific support to adapt to
this technology. Future studies should focus on how
XR can be used to personalize learning experiences,
tailoring them to individual needs. Therefore, when
used effectively, immersive technologies, such as XR,
have great potential to enhance learning experiences
and improve educational outcomes.
5 CONCLUSIONS, LIMITATIONS,
AND RECOMMENDATIONS
The results of this study show differences in learning between XR environments and traditional desktop settings, highlighting their distinct advantages and disadvantages in teaching complex AI concepts. While both technological approaches aimed to introduce complex topics, such as Neural Networks and Handwritten Digit Recognition, through interactive tasks, students' experiences and outcomes differed. XR technology made learning more engaging and easier to use, but some challenges need to be considered.
The outcomes of this research reveal that the variability in performance among XR users may arise from their unfamiliarity with the technology (Bautista et al., 2023). Although performance may improve with more exposure, this raises concerns about the initial learning process and differences in adaptability among students, particularly those less familiar with technology (Parong and Mayer, 2021). Thus, training and scaffolding are needed to support adaptation to new immersive technologies in education.
Moreover, desktop environments are generally considered more intuitive and, unlike XR environments, do not require special equipment, making it possible for students without XR access to participate (Zhang et al., 2024). XR equipment also carries a higher price, which makes it difficult to acquire. Fortunately, the performance/price ratio of XR headsets continues to improve; for example, our XR environment can now be deployed on a $300 Meta Quest 3S.
The authors acknowledge that this research investigated only the quantitative perspective. Therefore, further research will analyze the effectiveness of immersive technologies, such as XR, in greater depth and through different lenses. For this, multiple methods will be used to identify and reflect on the interplay between overall performance and students' rationales when measuring learning outcomes. This future research aims to provide a more complete understanding of how immersive technologies impact learning outcomes, reflecting on the variability in students' experiences and on how these technologies can be used more effectively in education.
ACKNOWLEDGMENTS
The authors thank all participants who voluntarily and
actively contributed to the study with their partici-
pation. This material is based upon work supported
by the National Science Foundation under Grants
2417510, 2212200, 2219842, 2309564, and 2318657.
REFERENCES
Alnagrat, A., Ismail, R. C., Idrus, S. Z. S., and Alfaqi, R.
M. A. (2022). A review of extended reality (xr) tech-
nologies in the future of human education: Current
trend and future opportunity. Journal of Human Cen-
tered Technology, 1(2):81–96.
Baquero, A. (2022). Net Promoter Score (NPS) and customer satisfaction: relationship and efficient management. Sustainability, 14(4):2011.
Bautista, L., Maradei, F., and Pedraza, G. (2023). Strate-
gies to reduce visual attention changes while learning
and training in extended reality environments. Inter-
national Journal on Interactive Design and Manufac-
turing (IJIDeM), 17(1):17–43.
Brown, N. J., Furtak, E. M., Timms, M., Nagashima, S. O.,
and Wilson, M. (2010). The evidence-based reason-
ing framework: Assessing scientific reasoning. Edu-
cational Assessment, 15(3-4):123–141.
Chen, L., Chen, P., and Lin, Z. (2020). Artificial intelligence in education: A review. IEEE Access, 8:75264–75278.
Chiang, Y.-C. and Liu, S.-C. (2023). The effects of ex-
tended reality technologies in stem education on stu-
dents’ learning response and performance. Journal of
Baltic Science Education, 22(4):568–578.
Deng, L. (2012). The MNIST database of handwritten digit images for machine learning research [best of the web]. IEEE Signal Processing Magazine, 29(6):141–142.
Feijoo-Garcia, M. A., Holstrom, M. S., Magana, A. J., and
Newell, B. A. (2024). Simulation-based learning and
argumentation to promote informed design decision-
making processes within a first-year engineering tech-
nology course. Sustainability, 16(7).
Feijóo-García, M. A., Ramírez-Arévalo, H. H., and Feijóo-García, P. G. (2021). Collaborative strategy for software engineering courses at a South American university. In Proceedings of the 13th International Conference on Computer Supported Education - Volume 1: CSEDU, pages 266–273. INSTICC, SciTePress.
Fernández-Batanero, J. M., Montenegro-Rueda, M., Fernández-Cerero, J., and López-Meneses, E. (2024). Extended reality as an educational resource in the primary school classroom: An interview of drawbacks and opportunities. Computers, 13(2).
Ferreira, J. M. M. and Qureshi, Z. I. (2020). Use of xr tech-
nologies to bridge the gap between higher education
and continuing education. In 2020 IEEE Global En-
gineering Education Conference, pages 913–918.
Grier, R. A., Bangor, A., Kortum, P., and Peres, S. C.
(2013). The system usability scale: Beyond stan-
dard usability testing. In Proceedings of the human
factors and ergonomics society annual meeting, vol-
ume 57, pages 187–191. SAGE Publications Sage CA:
Los Angeles, CA.
Gu, Y., Feijoo-Garcia, M. A., Zhang, Y., Magana, A. J.,
Benes, B., and Popescu, V. (2024). An xr environ-
ment for ai education: Design and first implementa-
tion. In 2024 IEEE Conference on Virtual Reality and
3D User Interfaces Abstracts and Workshops (VRW),
pages 157–162. IEEE.
Guilbaud, P., Guilbaud, T. C., and Jennings, D. (2021a). Ex-
tended reality, pedagogy, and career readiness: A re-
view of literature. In Chen, J. Y. C. and Fragomeni, G.,
editors, Virtual, Augmented and Mixed Reality, pages
595–613.
Guilbaud, P., Guilbaud, T. C., and Jennings, D. (2021b).
Extended reality, pedagogy, and career readiness:
A review of literature. In International Confer-
ence on Human-Computer Interaction, pages 595–
613. Springer.
Guo, X., Guo, Y., and Liu, Y. (2021). The development
of extended reality in education: Inspiration from the
research literature. Sustainability, 13(24).
Idrees, A., Morton, M., and Dabrowski, G. (2022). Advanc-
ing extended reality teaching and learning opportuni-
ties across the disciplines in higher education. In 2022
8th International Conference of the Immersive Learn-
ing Research Network, pages 1–8.
Jagatheesaperumal, S. K., Ahmad, K., Al-Fuqaha, A.,
and Qadir, J. (2024). Advancing education through
extended reality and internet of everything enabled
metaverses: Applications, challenges, and open is-
sues. IEEE Transactions on Learning Technologies,
17:1120–1139.
Kara, A., Mintu-Wimsatt, A., and Spillan, J. E. (2021). An
application of the net promoter score in higher edu-
cation. Journal of Marketing for Higher Education,
pages 1–24.
Kharvari, F. and Kaiser, L. E. (2022). Impact of extended
reality on architectural education and the design pro-
cess. Automation in Construction, 141.
Kim, S.-W. (2023). Change in attitude toward artificial
intelligence through experiential learning in artificial
intelligence education. International Journal on Ad-
vanced Science, Engineering & Information Technol-
ogy, 13(5).
Kosko, K. W., Ferdig, R. E., and Roche, L. (2021). Concep-
tualizing a shared definition and future directions for
extended reality (xr) in teacher education. Journal of
Technology and Teacher Education, 29(3):257–277.
Koukopoulos, D., Dafiotis, P., Sylaiou, S., Koukoulis, K.,
and Fidas, C. (2022). Xr technologies for self-
regulated student exhibitions in art education: Survey
and first design considerations. In 2022 International
Conference on Interactive Media, Smart Systems and
Emerging Technologies, pages 1–8.
Kuleto, V., Ilić, M. P., Stanescu, M., Ranković, M., Šević, N. P., Păun, D., and Teodorescu, S. (2021). Extended reality in higher education, a responsible innovation approach for generation Y and generation Z. Sustainability, 13(21).
Lakens, D. (2022). Sample size justification. Collabra:
psychology, 8(1):33267.
López-Ojeda, W. and Hurley, R. A. (2021). Extended-reality technologies: An overview of emerging applications in medical education and clinical care. The Journal of Neuropsychiatry and Clinical Neurosciences, 33(3):A4–177.
Marougkas, A., Troussas, C., Krouska, A., and
Sgouropoulou, C. (2023). Virtual reality in edu-
cation: a review of learning theories, approaches
and methodologies for the last decade. Electronics,
12(13):2832.
Meccawy, M. (2022). Creating an immersive xr learning
experience: A roadmap for educators. Electronics,
11(21).
Obeidallah, R., Ahmad, A. A., and Qutishat, D. (2023).
Challenges of extended reality technology in higher
education: A review. International Journal of Emerg-
ing Technologies in Learning, 18(14).
Parong, J. and Mayer, R. E. (2021). Cognitive and affec-
tive processes for learning science in immersive vir-
tual reality. Journal of Computer Assisted Learning,
37(1):226–241.
Pedró, F. (2020). Applications of artificial intelligence to higher education: possibilities, evidence, and challenges. IUL Research, 1(1):61–76.
Pham, S. T. and Sampson, P. M. (2022). The develop-
ment of artificial intelligence in education: A review
in context. Journal of Computer Assisted Learning,
38(5):1408–1421.
Rangel-de Lázaro, G. and Duart, J. M. (2023). You can handle, you can teach it: Systematic review on the use of extended reality and artificial intelligence technologies for online higher education. Sustainability, 15(4).
Romeike, R. (2019). The role of computer science edu-
cation for understanding and shaping the digital so-
ciety. In Sustainable ICT, Education and Learning:
IFIP WG 3.4 International Conference, SUZA 2019,
Zanzibar, Tanzania, April 25–27, 2019, Revised Se-
lected Papers 1, pages 167–176. Springer.
Schwind, V., Knierim, P., Haas, N., and Henze, N. (2019).
Using presence questionnaires in virtual reality. In
Proceedings of the 2019 CHI conference on human
factors in computing systems, pages 1–12.
Skulmowski, A. (2024). Learning by doing or doing
without learning? the potentials and challenges of
activity-based learning. Educational Psychology Re-
view, 36(1):28.
Zhang, Y., Feijoo-Garcia, M. A., Gu, Y., Popescu, V.,
Benes, B., and Magana, A. J. (2024). Virtual and
augmented reality in science, technology, engineer-
ing, and mathematics (stem) education: An umbrella
review. Information, 15(9).
Zwoliński, G., Kamińska, D., Laska-Leśniewicz, A., Haamer, R. E., Vairinhos, M., Raposo, R., Urem, F., and Reisinho, P. (2022). Extended reality in education and training: Case studies in management education. Electronics, 11(3).