can be found in Manouselis et al. (2011). A common
approach to analysing educational data is
educational data mining (see Romero and Ventura
(2007)), which analyses data in order to
understand student behaviour. These
techniques can reveal useful information to teachers
and help them design or modify the structure of
courses. Students can also use the discovered
knowledge to support their own studies. Nowadays,
researchers use educational data mining techniques
mostly to guide student learning efforts, develop or
refine student models, measure the effects of individual
interventions, or improve teaching support.
One of the most important problems addressed in
educational settings is understanding what
influences student performance. The task involves
predicting students' grades or the difficulties
students have with courses. This information can
identify students with greater potential and also
those who may require timely help from teachers or
peers to fare well in the course.
Researchers usually mine data stored in
university information systems, mostly attributes
such as grades, gender, field of study or age.
Thai Nghe et al. (2007) concluded that decision
trees gave better results than Bayesian networks.
Vialardi et al. (2009) aimed to select courses for
students so that they would obtain good exam results.
Course difficulty was compared with student
potential; both variables were computed from
grades. An extension of this work can be found in
Vialardi et al. (2010), where the analysis was based
on profile similarity. The results were satisfactory,
but the number of false positives was too high: it is
worse to recommend a course that students enrol in
and fail than to miss a course that they could pass.
The solution was to resample the data, which
lowered the accuracy but significantly decreased the
false positive errors.
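The trade-off described above, lower accuracy in exchange for fewer false positives, can be illustrated with a minimal sketch. The labels below are invented toy data (1 = student passes / course is recommended, 0 = student fails / course is not recommended); they are not results from Vialardi et al., and the function names are hypothetical.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels, where 1 is the positive class."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == 1 and truth == 1:
            tp += 1
        elif pred == 1 and truth == 0:
            fp += 1  # recommended, but the student would fail
        elif pred == 0 and truth == 0:
            tn += 1
        else:
            fn += 1  # not recommended, but the student would pass
    return tp, fp, tn, fn


def accuracy(tp, fp, tn, fn):
    """Share of all predictions that are correct."""
    total = tp + fp + tn + fn
    return (tp + tn) / total if total else 0.0


def false_positive_rate(tp, fp, tn, fn):
    """Share of actual negatives that were wrongly predicted positive."""
    negatives = fp + tn
    return fp / negatives if negatives else 0.0


# Toy data (assumption): true outcomes and two hypothetical classifiers.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 1, 1]
liberal = [1, 1, 1, 1, 1, 1, 0, 0, 1, 1]       # recommends readily
conservative = [1, 1, 0, 0, 0, 0, 0, 0, 1, 0]  # recommends cautiously

for name, y_pred in (("liberal", liberal), ("conservative", conservative)):
    counts = confusion_counts(y_true, y_pred)
    print(name, accuracy(*counts), false_positive_rate(*counts))
```

In this toy setting the cautious classifier is less accurate overall but makes no false-positive recommendations, which is the cost asymmetry the paragraph argues for.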
Another common topic in educational data mining
is the prediction of student drop-out.
Dekker et al. (2009) explored the possibilities of
this task. It is similar to student
performance analysis, but the focus is on
overall performance and on the student's chance of
successfully completing their studies.
Our previous work also explored drop-out
prediction (Bayer et al. (2012)). We collected useful
information about students’ studies and applied
educational data mining methods to this data. We
then created a sociogram from the social data,
applied social network analysis methods to it,
and obtained new attributes such as centrality,
degree or popularity. When we enriched the
original study-related data with these social
attributes and employed educational data mining
methods again, the classification accuracy
increased from 82.5% to 93.7%.
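As a rough illustration of how social attributes can be derived from a sociogram and attached to study-related records, the following sketch computes degree and normalised degree centrality in plain Python. It assumes an undirected graph and invented student identifiers and grades; the actual graph construction and attribute definitions used in Bayer et al. (2012) are not given here, so all names are hypothetical.

```python
from collections import defaultdict


def build_sociogram(ties):
    """Build an undirected adjacency map from (student, student) pairs."""
    graph = defaultdict(set)
    for a, b in ties:
        if a == b:
            continue  # ignore self-ties
        graph[a].add(b)
        graph[b].add(a)
    return graph


def degree_centrality(graph):
    """Degree divided by the maximum possible degree (n - 1)."""
    n = len(graph)
    if n < 2:
        return {node: 0.0 for node in graph}
    return {node: len(neigh) / (n - 1) for node, neigh in graph.items()}


def enrich(records, graph):
    """Return copies of the study records with social attributes added."""
    centrality = degree_centrality(graph)
    enriched = []
    for rec in records:
        sid = rec["student"]
        enriched.append(dict(
            rec,
            degree=len(graph.get(sid, ())),            # 0 if student has no ties
            degree_centrality=centrality.get(sid, 0.0),
        ))
    return enriched


# Toy data (assumption): four students with ties, one without.
ties = [("s1", "s2"), ("s1", "s3"), ("s2", "s3"), ("s3", "s4")]
records = [
    {"student": "s3", "avg_grade": 1.8},
    {"student": "s5", "avg_grade": 2.9},
]
sociogram = build_sociogram(ties)
enriched_records = enrich(records, sociogram)
```

The enriched records can then be passed to any classifier in place of the original study-related attributes alone.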
Marquez-Vera et al. (2011) used questionnaires
to obtain detailed information about students’ lives
directly from the students, because this type of data,
e.g. family size, smoking habits or time spent
exercising, is not present in the information system.
These data can improve predictions of
student failure.
In this work, we applied data mining methods to
explore the study-related data. Unlike Marquez-Vera
et al. (2011), who depended on answers from a
questionnaire, we used confirmed and complete data
from the university information system. Compared
with Thai Nghe et al. (2007), we tested a broader
spectrum of machine learning algorithms: Bayesian
and instance-based learners, decision trees and
various rule-based learners. We further
extended the method of Vialardi et al. (2009) by
adding social data. In this way we were able to
consider students' data together with information
about their friends, which allowed us to increase
prediction accuracy.
3 A RECOMMENDER SYSTEM PROPOSAL
Students are interested in information resources and
learning tasks that would improve their skills and
knowledge. The recommender system should therefore
monitor their duties and show them either an easy or
an interesting way to graduate.
The proposed recommender system consists of
three parts: a data extraction module that extracts
data from the Information System of Masaryk
University (IS MU) database; a pre-processing and
analytical part that allows the user to select relevant
features, to compute new ones, to obtain basic
statistics about those features, and to run machine
learning algorithms; and a presentation module that
selects important knowledge and presents it to the
user.
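The three-part structure can be sketched as a small pipeline. This is only an illustrative skeleton under stated assumptions: the extraction step takes in-memory rows instead of querying the IS MU database, the machine learning step is omitted, and all function names, field names and values are hypothetical.

```python
from statistics import mean


def extract(source_rows):
    """Extraction module: stand-in for a database query; copies the raw rows."""
    return [dict(row) for row in source_rows]


def analyse(records, features, derived=None):
    """Analytical part: select features, compute new ones, gather basic statistics."""
    derived = derived or {}
    table = []
    for rec in records:
        row = {name: rec[name] for name in features}  # feature selection
        for name, fn in derived.items():              # newly computed features
            row[name] = fn(rec)
        table.append(row)
    stats = {}
    if table:
        for name, value in table[0].items():
            if isinstance(value, (int, float)):       # statistics for numeric features only
                values = [row[name] for row in table]
                stats[name] = {"min": min(values), "max": max(values),
                               "mean": mean(values)}
    return table, stats


def present(stats, keep):
    """Presentation module: show only the statistics chosen as important."""
    return [f"{name}: mean={stats[name]['mean']:.2f}"
            for name in keep if name in stats]


# Toy rows (assumption): two students with invented values.
rows = [
    {"student": "s1", "credits": 30, "failed": 1, "enrolled": 6},
    {"student": "s2", "credits": 24, "failed": 3, "enrolled": 6},
]
records = extract(rows)
table, stats = analyse(
    records,
    features=["student", "credits"],
    derived={"fail_ratio": lambda r: r["failed"] / r["enrolled"]},
)
report = present(stats, keep=["credits", "fail_ratio"])
```

Keeping the three parts as separate functions mirrors the modular proposal: the data source, the feature set and the presentation can each be changed without touching the others.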
3.1 Use of the System
The proposed system will recommend mandatory
courses and associated prerequisite courses. Elective
and optional courses will be selected according to
the student's potential with respect to vacancies in
the timetable. The system will recommend
interesting, beneficial and achievable courses for