conducted based on the students’ performance on a
preliminary test given to each student during his/her
first interaction with the system. Then, the k-means
algorithm takes as input multiple student
characteristics, which are described below, and serves
as the means of initializing the new student model
based on recognized similarities between the new
student and past students who belong to the same
stereotype category.
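As an illustration of the initialization step described above, the following minimal Python sketch clusters past students of the same stereotype with k-means and seeds a new student's model from the nearest centroid. It is not the system's implementation: the feature vectors (a preliminary-test score and a count of previously studied languages), the stereotype labels, the function names `kmeans` and `initial_model`, and the choice k=2 are all assumptions made for the example.

```python
import math


def kmeans(points, k, iterations=50):
    """Plain k-means (Lloyd's algorithm) on equal-length numeric tuples."""
    # Deterministic start for this sketch: the first k points are the centroids.
    centroids = [tuple(p) for p in points[:k]]
    assignment = None
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        new_assignment = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_assignment == assignment:
            break  # converged: no point changed cluster
        assignment = new_assignment
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(
                    sum(col) / len(members) for col in zip(*members)
                )
    return centroids, assignment


def initial_model(new_student, past_students, k=2):
    """Seed a new student's model from the nearest centroid among
    past students that share the new student's stereotype."""
    peers = [
        s["features"]
        for s in past_students
        if s["stereotype"] == new_student["stereotype"]
    ]
    if not peers:
        # No comparable past students: fall back to the student's own features.
        return tuple(new_student["features"])
    centroids, _ = kmeans(peers, min(k, len(peers)))
    return min(centroids, key=lambda c: math.dist(c, new_student["features"]))


# Hypothetical data: (preliminary-test score, previously studied languages).
past_students = [
    {"stereotype": "novice", "features": (0.2, 1.0)},
    {"stereotype": "novice", "features": (0.3, 1.0)},
    {"stereotype": "novice", "features": (0.8, 3.0)},
    {"stereotype": "novice", "features": (0.9, 3.0)},
    {"stereotype": "expert", "features": (0.95, 4.0)},
]
new_student = {"stereotype": "novice", "features": (0.25, 1.0)}
seed = initial_model(new_student, past_students, k=2)
```

Seeding from a centroid rather than from a single past student is one possible design choice; it averages out individual outliers within the cluster.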
This paper is organized as follows. First, we
present the related scientific work. In sections 3 and
4, we discuss our system’s architecture, namely
machine learning in student modelling and the k-means
clustering algorithm. Finally, in section 5, we
discuss the usability of centroid-based clustering for
user models and present our future plans.
2 RELATED WORK
Computer-assisted language teaching is a significant
field in language
learning. User modeling has already been applied in
a wide variety of scientific areas, including
educational software for language instruction.
Machine learning techniques have been applied to
user modeling problems for acquiring models of
users. In this section, we outline the scientific
progress of student modeling with respect to Machine
Learning and CALL (Computer Assisted Language
Learning).
Basile et al. (2011) proposed exploiting machine
learning techniques to improve and adapt the set of
user model stereotypes by making use of logs of user
interactions with the system. To do this, a
clustering technique is used to create a set of user
model prototypes; then, an induction module is run on
these aggregated classes in order to improve a set of
rules aimed at classifying new and unseen users.
Their approach exploited the knowledge extracted by
analyzing log interaction data without requiring
explicit feedback from the user.
Nino (2009) presented a snapshot of what has been
investigated in terms of the relationship between
machine translation (MT) and foreign language (FL)
teaching and learning. Moreover, the author outlined
some of the implications of the use of MT and of
free online MT for FL learning. Friaz-Martinez et al.
(2007) investigated which human factors are
responsible for the behavior and the stereotypes of
digital library users, so that considering these
factors for personalization can be justified. To this
end, the authors studied whether there is a
statistically significant relationship between the
stereotypes created by robust clustering and each
human factor, including cognitive styles, levels of
expertise and gender differences. Virvou and Chrysafiadi (2006)
described a web-based educational application for
individualized instruction on the domain of
programming and algorithms. Their system
incorporates a user model, which relies on
stereotypes, the determination of which is based on
the knowledge level of the learner. Liccheli et al.
(2004) focused on machine learning approaches for
inducing student profiles, based on Inductive Logic
Programming and on methods using numeric
algorithms, to be exploited in this environment.
Moreover, the authors carried out an experimental
session comparing the effectiveness of these methods,
along with an evaluation of their efficiency, in
order to decide how best to exploit them in the
induction of student profiles. Tsiriga and
Virvou (2004) introduced the ISM framework for
the initialization of the student model in Web-based
ITSs, which is a methodology that uses an
innovative combination of stereotypes and the
distance weighted k-nearest neighbor algorithm to
set initial values for all aspects of the student model.
SignMT was implemented by Ditcharoen et al.
(2010) to translate sentences/phrases from different
sources in four steps: word transformation, word
constraint, word addition and word ordering.
Finally, Virvou and Troussas (2011)
described a ubiquitous e-learning tutoring system for
multiple language learning, called CAMELL
(Computer-Assisted Multilingual E-Language
Learning). It is a post-desktop model of human-
computer interaction in which students “naturally”
interact with the system in order to get used to
electronically supported learning. Their system
presents advances in user modeling, error proneness
and user interface design.
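The distance-weighted k-nearest neighbor idea mentioned above in connection with Tsiriga and Virvou (2004) can be illustrated with a short Python sketch. It is a generic example rather than the ISM framework itself: the inverse-squared-distance weighting, the function name `weighted_knn_estimate`, and the sample feature vectors and values are assumptions made for the illustration.

```python
import math


def weighted_knn_estimate(query, neighbours, k=3):
    """Estimate a value for `query` as the inverse-squared-distance
    weighted mean of the values of its k nearest neighbours.

    `neighbours` is a list of (feature_tuple, value) pairs.
    """
    if not neighbours:
        raise ValueError("at least one neighbour is required")
    # Keep the k neighbours closest to the query point.
    ranked = sorted(neighbours, key=lambda n: math.dist(query, n[0]))[:k]
    weighted_sum = 0.0
    weight_total = 0.0
    for features, value in ranked:
        d = math.dist(query, features)
        if d == 0:
            return value  # exact match: reuse that neighbour's value
        w = 1.0 / d ** 2  # closer neighbours count more
        weighted_sum += w * value
        weight_total += w
    return weighted_sum / weight_total


# Hypothetical past students: (feature vector, known mastery level in [0, 1]).
past = [((0.0,), 0.0), ((3.0,), 1.0)]
estimate = weighted_knn_estimate((1.0,), past, k=2)
```

Here the nearer neighbour dominates the estimate, which is the behavior that distinguishes the distance-weighted variant from a plain majority or mean over the k neighbours.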
However, after a thorough investigation of the
related scientific literature, we found no
implementation of multilingual educational systems
that combine student modeling and machine learning.
Hence, we implemented a prototype system, which
incorporates intelligence in its diagnostic
component, accounts for students’ proneness to
errors, and provides error diagnosis and advice based
on students’ needs.
3 MACHINE LEARNING IN USER MODELING
Student modeling can undoubtedly benefit from