Clustering Diagnostic Assessment of Students with the K-Means

Algorithm Based on Talents and Interests

Mesra Betty Yel

, Syahril Efendi

, Muhammad Zarlis

and Saib Suwilo

Graduate Program of Computer Science, Department of Computer Science, Universitas Sumatera Utara,

Medan, Indonesia

Department of Computer Science, Universitas Sumatera Utara, Medan, Indonesia

Information System Management Department, BINUS Graduate Program – Master of Information System Management,

Bina Nusantara University, Jakarta, Indonesia

Department of Mathematics, Universitas Sumatera Utara, Medan, Indonesia

Keywords:

Differentiated E-Learning, Diagnostic Assessment Classiﬁcation, K-Means Clustering Diagnostic

Assessment.

Abstract:

E-Learning is a way of teaching and learning online in virtual classrooms which gives experiences, changes

and needs according to technological developments and new learning paradigms with the ﬂexibility given to

educators in formulating learning designs and assessments. This research contributes to producing a student

classiﬁcation model using the k-means clustering algorithm to be applied to differentiated e-learning, which

can accommodate all user needs according to abilities, interests, and talents by using artiﬁcial intelligence.

The result is a differentiated e-learning model is produced to classify diagnostic assessments. Based on test

data on 20 students, it was successfully classiﬁed into 3 clusters, namely students in class 1 with a trend of X

values being in the do not know the value, Y with a very ignorant value while the Y value is in a value between

not knowing and partially understanding with a total of 9 people. Students are in grade 3 with a trend with

a value of X being between not knowing and partial understanding, Y with a partially understanding value,

while the Y value is at a very ignorant value with a total of 6 people with an accuracy level of f1 test score

100%.

1 INTRODUCTION

The development of technology and artiﬁcial intel-

ligence, especially in the Industrial Revolution 4.0

era, covers almost all ﬁelds, one of which is learn-

ing or e-learning with a machine learning approach.

E-Learning is a way of teaching and learning online

in a virtual classroom via a computer or mobile phone

with a wireless network. In eveloping e-Learning, it is

very important to determine the success factors. The

development of e-Learning quality in an institution

requires good standardization (Jung, 2011). There are

two literature-based and data-deﬁned approaches to

the automatic detection of learning styles. A data-

driven strategy aims to build classiﬁers based on data.

A literature-based approach uses user models to de-

rive clues and generates simple if-then rules to detect

learning styles (Rasheed and Wahid, 2021). Various

efforts to expand access and improve the quality of

education delivery have not resulted in satisfactory

learning outcomes (Ree et al., 2018). However, the

expansion of access to education has not been fully

proportional to the improvement and equity in the

quality of education. The results of PISA in 2018 sur-

vey showed that 60% to 70% of students in Indone-

sia are still below the minimum proﬁciency standards

in science, mathematics, and reading. The disparity

in the quality of education between regions is also

still an issue. Virtual Reality (VR) is useful for in-

creasing student engagement and retention rates, for

some topics, compared to traditional learning tools

such as books, and videos. However, a student can

still be distracted and detached due to various fac-

tors including unwanted stress, thoughts, and noise

(Asish et al., 2022). Recent research, particularly in

the ﬁelds of psychology and human-computer inter-

action, shows that text and audio- based learning is

effective. According to the Modality Principle, on-

screen speech is superior to on-screen text for learn-

ing (Butcher, 2017) in the case of complex graphical

Yel, M., Efendi, S., Zarlis, M. and Suwilo, S.

Clustering Diagnostic Assessment of Students with the K-Means Algorithm Based on Talents and Interests.

DOI: 10.5220/0012457200003848

Paper published under CC license (CC BY-NC-ND 4.0)

In Proceedings of the 3rd International Conference on Advanced Information Scientiﬁc Development (ICAISD 2023), pages 295-301

ISBN: 978-989-758-678-1

295

representations that include dual channel processing

in working memory. Sarune (Baceviciute et al., 2020)

found that reading texts from virtual books was su-

perior to listening to learning, speciﬁcally for knowl-

edge retention, but found no signiﬁcant difference for

knowledge transfer. Han (Han et al., 2022) proposed

several intervention strategies to increase students’ at-

tention and their ﬁndings indicated that instruction

from real-world teachers can be transferred to virtual

classrooms. Various problems in achieving quality

resources, especially vocational education, including

the disparity in the quality of education between re-

gions, the competence of teachers in Indonesia is also

inadequate, it is believed that the teaching model of

teachers in Indonesia is still not on target, the lack of

readiness of students in participating in learning, ei-

ther due to lack of nutrition from childhood, minimal

family welfare conditions, as well as a lack of basic

literacy skills, lack of competence and motivation of

teachers in teaching, lack of learning resources, man-

agement and governance of education has not devel-

oped well. Learning competencies must be achieved

by students at each stage of development in primary

and secondary education. The learning outcomes con-

tain several competencies within the scope of the ma-

terial in the form of a comprehensive narrative then

adjust the mapping of learning outcomes in the devel-

opmental stages of students who are classiﬁed in the

age phase (Wickramasinghe, 2022).

The presence of the concept of machine learn-

ing brings major changes in various ﬁelds, especially

in terms of prediction and classiﬁcation. Due to

the recent advances in the ﬁeld of artiﬁcial intelli-

gence, there has been an increase in research on how

machine learning can help in detection, and predic-

tion (Nordin et al., 2022). Machine learning and

artiﬁcial intelligence techniques have proven help-

ful when pragmatic for complex problems and ﬁelds

such as energy optimization, workﬂow scheduling,

video games, and cloud computing (Kumar et al.,

2022). Data heterogeneity and large data volumes

create many problems regarding digital system speed

and data storage security. The solution can be found

in artiﬁcial intelligence technologies, speciﬁcally ma-

chine learning (Kuklin et al., 2023; Alani and Tawﬁk,

2022). In research on skin cancer classiﬁcation, the

SVM method excels in classiﬁcation results, namely

KNN at 92.70%, SVM at 93.70%, Decision tree (DT)

at 89.5%, and boosted tree (BT) at 84.30% (Victor

and Ghalib, 2017). Machine learning can be federated

on patient datasets with the same set of variables but

separated across stores. But federated learning can-

not handle situations where different data types for a

given patient are vertically segregated across organi-

zations and when matching patient IDs across multi-

ple institutions is difﬁcult (Liu et al., 2022; Sukanya

et al., 2022). Recently, a segmentation method based

on multi-task learning was proposed, and it can group

more than two images simultaneously and easily add

different types of previous images. The use of mobile

technology has an important role in educational insti-

tutions, including achieving distance learning goals.

Various media can also be used to support the imple-

mentation of E-learning. For example, virtual classes

use Google Classroom, Google Meet, Zoom, Ed-

modo, and Schoology services and instant messaging

applications such as WhatsApp. E-learning is a form

of learning model that is facilitated and supported by

the use of information and communication technol-

ogy (Moore, 2016).

To achieve effective learning, students need to be

classiﬁed based on their talents and interests before

starting the teaching and learning process so that ed-

ucators know students’ talents and interests and pro-

vide learning materials according to their needs, it is

necessary to classify students based on diagnostic as-

sessments.

2 RESEARCH FRAMEWORK

This research is focused on classifying students ac-

cording to their interests and talents so that they can

be used for differentiated e-learning models according

to the needs and interests of students by study learning

style models, dimensions, values, and their combi-

nations to identify students’ competencies, strengths,

weaknesses. The proposed methodology is a machine

learning approach for early detection and the results

of evaluating the development of the learning process

of students, as a research framework using the follow-

ing concepts:

Figure 1: Research Framework.

Beginning with the determination of learning out-

comes which become standards and learning targets

that must be achieved by students who will be tested

during a diagnostic assessment. Diagnostic assess-

ment according to the number of subjects to be tested

ICAISD 2023 - International Conference on Advanced Information Scientiﬁc Development

296

and the weighting is complete understanding, partial

understanding and not understanding. The results of

the diagnostic assessment were classiﬁed and clus-

tered using the k-mean algorithm for proximity Eu-

clidean distance. Cluster results become an alterna-

tive solution in classifying students, which then learn-

ing materials are adapted to each cluster. Continu-

ously evaluate and cluster student development to de-

velop and improve the quality of student learning.

3 RESULT AND DISCUSSION

The most important issue in the discussion about clas-

siﬁcation is the determination of similarity, namely

the degree to which the student data considered re-

sembles the data of other participants. In fact the

measure of similarity is very important for clustering.

(Keogh and Kasetty, ; Fu, 2011; Serra and Arcos, ;

Lauwers and Moor, 2017). Grouping sets W(s) re-

quires understanding the distance with the length w.

There are several possibilities and options for mea-

suring distances in ﬁnding rules. The simplest option

is to treat the subsequence of length w as the element

of R

and then apply the Euclidean distance (that is,

metric L

). (Jiang et al., 2019) has proven empiri-

cally that the Euclidean distance is unbeatable. Eu-

clidean distance is a method that is parameter free,

fast computation time and suitable for various data

mining optimizations such as indexing. The meaning

is, for ˆx = (x

,...,x

) and ˆy = (y

,..., y

) deﬁned

d (ˆx, ˆy) =

∑

− y

)

1/2

(1)

As a metric in grouping. Other metrics include

common metrics L

that are deﬁned with

( ˆx, ˆy) =

∑

− y

)

1/p

(2)

For p ≥ 1 and L

∞

= max

− y

In various uses, we want to obtain a subsequent

shape as the main factor determining the distance.

This means that two subsequences can essentially

have the same shape even though they have different

amplitudes and baselines. One way to achieve this is

to normalize the subsequences and then apply metrics

in the normalized subsequence. State the sequence

version ˆx normalized with κ (ˆx), deﬁned distance be-

tween ˆx and ˆy by

d (ˆx, ˆy) = L

(κ ( ˆx)) −(κ( ˆy)) (3)

Normalization can be done in a number of ways

κ( ˆx)

= x

− E ˆx

(where E ˆx

is the expected or aver-

age value of the sequence value), which results in the

average value of the sequence being 0. Can also be

used κ(ˆx )

= (x

−E ˆx

)/D ˆx (where D ˆx is the variation

of the sequence), which will force sequence mean to

0 and variance to 1. To evaluate the quality of cluster-

ing using cross entropy (Geyer et al., 2019) which is

stated as follows:

Crossentropy =

∑

j=i



SDB



−

∑

i=1

i j

log(P

i j

)

(4)

K : number of clusters

: number of sequences in j cluster

m : the number of classes in the sequence

database

i j

: probability randomly j cluster

SDB : database sequences

Normalized mutual information (NMI) is one sig-

niﬁcant comparison measure for evaluating cluster or

algorithm results. This can help researchers to assess

the performance and analysis of improvement of an

algorithm. Speciﬁcally determined as follows. Sup-

pose C

and C

are a set of class labels and a set

of cluster labels calculated using the clustering algo-

rithm, then NMI between C

and C

is:

NMI (C

) =

H (C

) − H (C

)

H (C

)



= 1 +

I (C

)

H (C

)



(5)

where H(P), H(P,Q) and I(P,Q) represents the

entropy, joint entropy and mutual information vari-

ables P dan Q. When C

and C

independent from

one variable to another, NMI(C

) = 1, because

H(C

) = H(C

) + H(C

) must be met. The

greater the NMI(C

) the more accurate the clus-

ter results will be (Umatani et al., 2023). Sampling

efﬁciency depends on the choice of sampling den-

sity. A sampling density close to optimal can be found

through the application of the cross entropy method.

The cross-entropy method is an adaptive sampling ap-

proach that determines the sampling density by min-

imizing the Kullback-Leibler divergence between the

theoretically optimal needs sampling density and a se-

lected group of parametric distributions(Marelli and

Sudret, 2018).

Clustering Diagnostic Assessment of Students with the K-Means Algorithm Based on Talents and Interests

297

4 DIFFERENTIATED

ELEARNING MODELS

The resulting differentiated e-learning model is de-

scribed by the following formula.

ET =

∑

i−1

∑

i=1

i j

∑

i=1

i j

∑

j=1

i j

∑

k=1



− c



∑

n=1

−

∑

n=1

∑

m=1

k (x

)

(6)

ET : differentiated elearning

X : matrix element data set

y : The value of the elements of the diagnostic

assessment

c : centroid distance

N : number of super vector

a : super vector

t : classiﬁcation data

k : number of classes

Available data is denoted as x

i j

, where k is the

number of participants and l is the number of di-

agnostic assessment variables so that x

i j

∈ k,l, clus-

ters formed after the diagnostic assessment is denoted

by p, then c ∈ p while the respective labels are de-

noted n

∈

{

−1,+1

}

f orn = 1,2,. . . ,N, where N is

the number of data labels.

1. Perform a Diagnostic Assessment

A =

∑

i=1

∑

j=1

i j

(7)

A : diagnostic assessment

i j

: identify the competencies, strengths,

weaknesses of participant data

∑

i=1

∑

j=1

i j

> 1,i = 1, j = 1...k (8)

where k=data set or the number of students carry-

ing out a diagnostic assessment while l is the com-

petency variable being tested. Diagnostic assess-

ment data is data from the results of a student’s

diagnostic assessment of a particular subject.

2. Normalization with individual max value matrix

i j

∑

i=1

i j

(9)



(maxP

i j

| jεk,i = 1,2,3,...,n)



−



(minP

i j

| jεk,i = 1,2,3,...,n)



P : Diagnostic assessment value

carried out to seek advantage with the poten-

tial value of competence, strength, where as N

−

function to ﬁnd the value of the cost or constraints

to be optimized. By normalizing the matrix, the

average and comparison of the diagnostic assess-

ment values between participants were produced.

3. Determine the max value of all participants

i j

∑

j=1

i j

(10)



(maxS

i j

| jεk,i = 1,2,3,...,n)



−



(minS

i j

| jεk,i = 1,2,3,...,n)



used for the maximum value of the diagnos-

tic assessment of all participants, meanwhile NS

−

to determine the minimum value.

4. Carry out classiﬁcation and clustering based on

talent interests

d =

∑

k=1

i j

− v

i j

)

(11)

Each element has a potential corresponding con-

straint NS

and constraints NS

−

, where each ele-

ment is taken into consideration in classifying ac-

cording to interests, talents, strengths and compe-

tencies, strengths and weaknesses. For each el-

ement that has similarities, it will form a group

based on the distance function d to determine the

proximity distance.

The model found produces a classiﬁcation with the

K-Means algorithm clustering approach, with the fol-

lowing stages:

1. Diagnostic Assessment. Diagnostic assessment

is the result of student assessment with certain

ﬁelds and criteria before the learning process be-

gins. Test data using cognitive diagnostic assess-

ment data is shown in the table 1.

2. Normalization. The process of normalization is

to compare the results of the diagnostic assess-

ment between participants, with normalization a

comparison value is produced between students

along with the maximum and minimum scores of

students with the formula:

i j

∑

i=1

i j

and S

i j

∑

j=1

i j

Then the matrix normalization of the results of

the diagnostic assessment is produced in the table

ICAISD 2023 - International Conference on Advanced Information Scientiﬁc Development

298

Table 1: Diagnostic assessment data.

ID X Y Z

1 fully understand fully understand Partly Understand

2 Don’t know Partly Understand Partly Understand

3 fully understand Don’t know Partly Understand

4 fully understand Partly Understand Don’t know

5 fully understand Don’t know fully understand

6 Don’t know Partly Understand fully understand

7 Partly Understand Don’t know fully understand

8 Don’t know Partly Understand Don’t know

9 Partly Understand fully understand Partly Understand

10 Don’t know fully understand Don’t know

11 Don’t know Don’t know fully understand

12 Partly Understand Partly Understand fully understand

13 fully understand Don’t know fully understand

14 fully understand fully understand Don’t know

15 Don’t know fully understand Partly Understand

16 Partly Understand Don’t know Don’t know

17 Partly Understand Don’t know fully understand

18 Don’t know Partly Understand fully understand

19 fully understand Don’t know Partly Understand

20 fully understand Partly Understand Don’t know

Table 2: Matrix normalization.

ID Tranformation Normalization

x y z x y z

1 3 3 2 1.463414634 1.62162 0.95238

2 1 2 2 0.487804878 1.08108 0.95238

3 3 1 2 1.463414634 0.54054 0.95238

4 3 2 1 1.463414634 1.08108 0.47619

5 3 1 3 1.463414634 0.54054 1.42857

6 1 2 3 0.487804878 1.08108 1.42857

7 2 1 3 0.975609756 0.54054 1.42857

8 1 2 1 0.487804878 1.08108 0.47619

9 2 3 2 0.975609756 1.62162 0.95238

10 1 3 1 0.487804878 1.62162 0.47619

11 1 1 3 0.487804878 0.54054 1.42857

12 2 2 3 0.975609756 1.08108 1.42857

13 3 1 3 1.463414634 0.54054 1.42857

14 3 3 1 1.463414634 1.62162 0.47619

15 1 3 2 0.487804878 1.62162 0.95238

16 2 1 1 0.975609756 0.54054 0.47619

17 2 1 3 0.975609756 0.54054 1.42857

18 1 2 3 0.487804878 1.08108 1.42857

19 3 1 2 1.463414634 0.54054 0.95238

20 3 2 1 1.463414634 1.08108 0.47619

3. Cluster Classiﬁcation Model.

(a) The basic concept of K-Means is iterative

search for cluster centers.

(b) The cluster center is determined based on the

distance of each data to the cluster center.

the data to be clustered, x

i j

(i = 1,...,n; j =

1,..., m) with n is the amount of data to be clus-

tered and m is the number of variables.

(d) At the beginning of the iteration, the center of

each cluster is assigned independently (arbitrar-

ily), c

k j

(k = 1,...,K; j = 1,..., m).

(e) Then the distance between each data and each

cluster center is calculated.

(f) To calculate the i-th data distance (X

) at the

center of the k-cluster (C

), named (d

), the Eu-

clidean formula can be used, namely:

d =

∑

k=1



− z



(g) A data will be a member of the J-cluster if the

data distance to the center of the J-cluster is the

smallest compared to the distance to the center

of the other clusters.

(h) Next, group the data that are members of each

cluster.

(i) The new cluster center value can be calculated

by ﬁnding the average value of the data that is

a member of the cluster.

Table 3: Model formulation data sample.

X Y Z

1.463414634 1.621621622 0.952380952

0.487804878 1.081081081 0.952380952

1.463414634 0.540540541 0.952380952

1.463414634 1.081081081 0.476190476

1.463414634 0.540540541 1.428571429

0.487804878 1.081081081 1.428571429

0.975609756 0.540540541 1.428571429

0.487804878 1.081081081 0.476190476

0.975609756 1.621621622 0.952380952

0.487804878 1.621621622 0.476190476

0.487804878 0.540540541 1.428571429

0.975609756 1.081081081 1.428571429

1.463414634 0.540540541 1.428571429

1.463414634 1.621621622 0.476190476

0.487804878 1.621621622 0.952380952

0.975609756 0.540540541 0.476190476

0.975609756 0.540540541 1.428571429

0.487804878 1.081081081 1.428571429

1.463414634 0.540540541 0.952380952

1.463414634 1.081081081 0.476190476

1. Grouping the data into 3 clusters K = 3. Suppose

the cluster center is set arbitrarily

: (1.463414634, 1.621621622, 0.952380952);

: (0.487804878, 0.540540541, 0.952380952);

: (0.952380952, 1.081081081, 1.621621622)

2. Calculate the distance of each data to each cluster

center, for example, to calculate the distance of

the ﬁrst data (No.1) to the ﬁrst cluster center is:

(a) K1 = (1.463414634 + 1.463414634)

+ (1.621621622 + 1.621621622)

(0.952380952 + 0.952380952)

(b) K2 = (0.487804878 + 1.463414634)

+ (0.540540541 + 1.621621622)

(0.952380952 + 0.952380952)

+ (1.081081081 + 0.952380952)

(1.621621622 + 0.952380952)

Clustering Diagnostic Assessment of Students with the K-Means Algorithm Based on Talents and Interests

299

The complete distance calculation results are nor-

malized data are classiﬁed based on similarity, where

each element is considered in classifying accord-

ing to interests, talents, strengths and competencies,

strengths and weaknesses formed in 3 clusters with a

cluster center.

Cluster 1:

X : 0.9756

Y : 0.5405

Z : 1.4286

Cluster 2 :

X : 0.4878

Y : 1.0811

Z : 0.9524

Cluster 3 :

X : 1.4634

Y : 1.6216

Z : 0.4762.

Figure 2: Cluster Diagnostic Assessment.

The results of the student’s diagnostic assessment

were clustered into 3 clusters, cluster 1 where the

value of X was in the do not know value, Y was

in the value of not knowing very much, while the

value of Y was in the value between not knowing

and partially understanding. Cluster 2 is where the

X value is very ignorant, Y is partially understood,

while Y is partially understood. Cluster 3 is where

the value of X is in the value between not knowing

and partially understanding, Y with the value of

partially understanding while the value of Y is in the

value of not knowing very much. Follow-up of the

results of the diagnostic assessment cluster becomes

a decision support for educators to provide learning

materials to students according to their abilities,

talents and interests which are represented by the

variables X, Y and Z. Formed 3 clusters, namely:

Cluster 1

There are 9 students in cluster 1 with an X value of

not knowing, Y with a very ignorant value while the

Table 4: Results of cluster 1 diagnostic assessment.

ID x y z Max Min Cluster

3 1.463414634 0.54054 0.95238 3 1 1

5 1.463414634 0.54054 1.42857 3 1 1

7 0.975609756 0.54054 1.42857 3 1 1

11 0.487804878 0.54054 1.42857 3 1 1

12 0.975609756 1.08108 1.42857 3 2 1

13 1.463414634 0.54054 1.42857 3 1 1

16 0.975609756 0.54054 0.47619 2 1 1

17 0.975609756 0.54054 1.42857 3 1 1

19 1.463414634 0.54054 0.95238 3 1 1

Y value is between not knowing and partially under-

standing, the average max value= 3 and minimum=1.

Table 5: Results of cluster 2 diagnostic assessment.

ID x y z Max Min Cluster

2 0.487804878 1.08108 0.95238 2 1 2

6 0.487804878 1.08108 1.42857 3 1 2

8 0.487804878 1.08108 0.47619 2 1 2

15 0.487804878 1.62162 0.95238 3 1 2

18 0.487804878 1.08108 1.42857 3 1 2

Cluster 2

There are 5 students in cluster 2 with X values being

very ignorant, Y with partial understanding scores

while Y scores are partially understanding scores,

average max score = 2.8 and minimum = 1.

Cluster 3

There are 6 students in cluster 3 with an X value be-

tween not knowing and partial understanding, Y with

a partial understanding value while the Y value is very

ignorant, max average value = 3 and minimum = 1.2.

Table 6: Results of cluster 3 diagnostic assessment.

ID x y z Max Min Cluster

1 1.463414634 1.62162 0.95238 3 2 3

4 1.463414634 1.08108 0.47619 3 1 3

9 0.975609756 1.62162 0.95238 3 2 3

10 0.487804878 1.62162 0.47619 3 1 3

14 1.463414634 1.62162 0.47619 3 1 3

20 1.463414634 1.08108 0.47619 3 1 3

5 CONCLUSIONS

The existence of e-learning system with a new

paradigm that provides ﬂexibility for educators to for-

mulate learning designs and assessments according to

the characteristics and needs of students can be opti-

mized by applying artiﬁcial intelligence using the k-

means clustering algorithm.

The results of the classiﬁcation of the diagnostic

assessment can accommodate all the needs of students

according to abilities, interests, and talents by using

artiﬁcial intelligence in identifying the level of abili-

ties and needs of participants to get a label after the

diagnostic assessment.

ICAISD 2023 - International Conference on Advanced Information Scientiﬁc Development

300

Based on test data on 20 people, it was success-

fully classiﬁed into 3 clusters, namely students in

class 1 with a trend of X values being in the do not

know the value, Y with a very ignorant value while

the Y value is in a value between not knowing and par-

tially understanding with the total 9 people. Students

in class 2 with the trend that the value of X is in the

very ignorant value, Y with a partial understanding

value while the Y value is in the partial understanding

value with a total of 5 people. Students are in class

3 with a trend with a value of X being between not

knowing and partially understanding, Y with a partial

understanding value while the Y value is at a very ig-

norant value with a total of 6 people with an accuracy

level of the f1 test score of 100%.

REFERENCES

Alani, M. and Tawﬁk, H. (2022). Phishnot: A cloud-based

machine-learning approach to phishing url detection.

Comput. Networks, 218:109407,. doi:.

Asish, S., Kulshreshth, A., and Borst, C. (2022). Detect-

ing distracted students in educational vr environments

using machine learning on eye gaze data. Comput.

Graph, 109:75–87,. doi:.

Baceviciute, S., Mottelson, A., Terkildsen, T., and Makran-

sky, G. (2020). Investigating representation of text and

audio in educational vr using learning outcomes and

eeg. In Proceedings of the 2020 CHI Conference on

Human Factors in Computing Systems, pages 1–13,.

Butcher, K. (2017). The multimedia principle. In The

Cambridge handbook of multimedia learning, page

174–205. Cambridge University Press, New York,

NY, US, 2nd edition.

Fu, T. (2011). Engineering applications of artiﬁcial intelli-

gence a review on time series data mining. Eng. Appl.

Artif. Intell, 24(1):164–181,.

Geyer, S., Papaioannou, I., and Straub, D. (2019). Cross

entropy-based importance sampling using gaussian

densities revisited. Struct. Saf, 76:15–27,. doi:.

Han, Y., Miao, Y., Lu, J., Guo, M., and Xiao, Y. (2022). Ex-

ploring intervention strategies for distracted students

in vr classrooms.

Jiang, G., Wang, W., and Zhang, W. (2019). A novel dis-

tance measure for time series: Maximum shifting cor-

relation distance. Pattern Recognit. Lett, 117:58–65,.

doi:.

Jung, I. (2011). The dimensions of e-learning quality: From

the learner’s perspective. Educ. Technol. Res. Dev,

59(4):445–464,.

Keogh, E. and Kasetty, S. On the need for time series data

mining benchmarks : A survey and empirical demon-

stration.

Kuklin, V., Alexandrov, I., Polezhaev, D., and Tatarkanov,

A. (2023). Prospects for developing digital

telecommunication complexes for storing and ana-

lyzing media data. Bull. Electr. Eng. Informatics,

12(3):1536–1549,.

Kumar, Y., Kaul, S., and Hu, Y.-C. (2022). Machine learn-

ing for energy-resource allocation, workﬂow schedul-

ing and live migration in cloud computing: State-

of-the-art survey. Sustain. Comput. Informatics Syst,

36:100780,. doi:.

Lauwers, O. and Moor, B. (2017). A time series distance

measure for efﬁcient clustering of input / output sig-

nals by their underlying dynamics.

Liu, D., Fox, K., Weber, G., and Miller, T. (2022). Confed-

erated learning in healthcare: Training machine learn-

ing models using disconnected data separated by in-

dividual, data type and identity for large-scale health

system intelligence. J. Biomed. Inform, 134:104151,.

doi:.

Marelli, S. and Sudret, B. (2018). An active-learning algo-

rithm that combines sparse polynomial chaos expan-

sions and bootstrap for structural reliability analysis.

Struct. Saf, 75:67–74,. doi:.

Moore, D. (2016). E-learning and the science of instruc-

tion: Proven guidelines for consumers and design-

ers of multimedia learning. Educ. Technol. Res. Dev,

54(2).

Nordin, N., Zainol, Z., Noor, M., and Chan, L. (2022).

Suicidal behaviour prediction models using machine

learning techniques: A systematic review. Artif. In-

tell. Med, 132:102395,. doi:.

Rasheed, F. and Wahid, A. (2021). Learning style detection

in e-learning systems using machine learning tech-

niques. Expert Syst. Appl, 174(February).

Ree, J., Muralidharan, K., Pradhan, M., and Rogers, H.

(2018). Double for nothing? experimental evidence

on an unconditional teacher salary increase in indone-

sia. Q. J. Econ, 133(2).

Serra, J. and Arcos, J. An empirical evaluation of similarity

measures for time series classiﬁcation.

Sukanya, J., Gandhi, K., and Palanisamy, V. (2022). An

assessment of machine learning algorithms for health-

care analysis based on improved mapreduce. Adv.

Eng. Softw, 173:103285,. doi:.

Umatani, R., Imai, T., Kawamoto, K., and Kunimasa, S.

(2023). Time series clustering with an em algorithm

for mixtures of linear gaussian state space models.

Pattern Recognit, pages 109375,. doi:.

Victor, A. and Ghalib, M. (2017). Automatic detection and

classiﬁcation of skin cancer. Int. J. Intell. Eng. Syst,

10(3):444–451,.

Wickramasinghe, I. (2022). Applications of machine learn-

ing in cricket: A systematic review. Mach. Learn. with

Appl, 10:100435,. doi:.

Clustering Diagnostic Assessment of Students with the K-Means Algorithm Based on Talents and Interests

301