parent's job transfer, job factors, inability to continue
their education, tuition fees, and many other factors.
The factors behind student resignation vary, and in
several cases no reason for the resignation was
recorded. In general, however, student resignation is
associated with factors such as a poor student
Achievement Index Rating, low attendance rates, and
parents' income, which affects the ability to pay
tuition fees.
Based on this, a study was conducted to identify the
factors that influence the level of student
resignation at the University, using the C4.5
algorithm. The results of this study are expected to
help the college anticipate student resignation so
that its level does not become too high.
2 DATA MINING
Data mining is a term often used to describe the
search for and discovery of knowledge in a database.
One difficulty in defining data mining is that it
inherits many aspects and techniques from various
established fields of science.
Data mining is a process that uses statistical
techniques, mathematics, artificial intelligence and
machine learning to extract and identify useful
information and related knowledge from various
large databases (Silitonga, Parasian, Irene Sri
Morina, 2018). According to the Gartner Group, data
mining is the process of finding meaningful
relationships, patterns, and trends by examining
large collections of data held in storage, using
pattern recognition techniques as well as statistical
and mathematical techniques (Larose, 2005).
One of the data mining techniques is
classification. Classification is the process of finding
a model or function that describes and distinguishes
concepts or classes of data, so that the class of an
object whose label is unknown can be estimated. The
model itself can be a set of if-then rules, a decision
tree, a mathematical formula, or a neural network.
2.1 Classification
Classification is a process in data mining that is
used to find models or functions that explain or
differentiate concepts or data classes (Saputra, Rizal,
2014). Classification of data is used to estimate the
class of an object whose label is unknown (Sharma,
Jitendra, Sanjeev, 2013).
Classification in data mining places objects into
one of several predetermined categories. It is widely
used to predict the class label of new objects: a
model is built from a training set whose records
carry known class labels, and the model is then used
to classify new data (Breiman, et al., 1984). The
stages of classification in data mining consist of (Lior
Rokach & Oded Maimon, 2005):
1. Building the model. In this stage, a model is
created to solve the classification problem for
the classes or attributes in the data. The model
is built from a training set, that is, sample
data from the problem at hand in which both the
attributes and the classes are already fully
known.
2. Applying the model. In this stage, the model
built previously is used to determine the
attribute/class of new data whose
attribute/class is not yet known.
3. Evaluation. In this stage, the results of
applying the model in the previous stage are
evaluated using measured parameters to
determine whether the model is acceptable.
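The three stages can be illustrated with a minimal Python sketch. It is not the method used in this study: the model is a deliberately simple one-attribute rule learner rather than C4.5, and the records, the attribute names (attendance, achievement_index) and the class labels (resign, stay) are hypothetical values chosen only for illustration. Accuracy is assumed here as the evaluation parameter.

```python
from collections import Counter, defaultdict

# Hypothetical training and test records: (attributes, class label).
# These values are illustrative only and are not data from the study.
TRAIN = [
    ({"attendance": "low", "achievement_index": "low"}, "resign"),
    ({"attendance": "low", "achievement_index": "high"}, "resign"),
    ({"attendance": "high", "achievement_index": "low"}, "stay"),
    ({"attendance": "high", "achievement_index": "high"}, "stay"),
    ({"attendance": "low", "achievement_index": "low"}, "resign"),
    ({"attendance": "high", "achievement_index": "high"}, "stay"),
]
TEST = [
    ({"attendance": "low", "achievement_index": "high"}, "resign"),
    ({"attendance": "high", "achievement_index": "low"}, "stay"),
    ({"attendance": "high", "achievement_index": "high"}, "resign"),
]


def build_model(rows, attribute):
    """Stage 1: learn one if-then rule per observed value of `attribute`."""
    counts = defaultdict(Counter)
    for attrs, label in rows:
        counts[attrs[attribute]][label] += 1
    # Each rule maps an attribute value to its most frequent class.
    rules = {value: c.most_common(1)[0][0] for value, c in counts.items()}
    # Fallback class for attribute values never seen during training.
    default = Counter(label for _, label in rows).most_common(1)[0][0]
    return {"attribute": attribute, "rules": rules, "default": default}


def apply_model(model, attrs):
    """Stage 2: predict the class of a record whose label is unknown."""
    return model["rules"].get(attrs[model["attribute"]], model["default"])


def evaluate(model, rows):
    """Stage 3: fraction of held-out records classified correctly."""
    correct = sum(apply_model(model, attrs) == label for attrs, label in rows)
    return correct / len(rows)


model = build_model(TRAIN, "attendance")
accuracy = evaluate(model, TEST)
print(model["rules"], accuracy)
```

Keeping the test records separate from the training records is what makes the third stage meaningful: the model is judged on data it did not see while being built.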
2.2 Decision Tree
Decision trees and decision rules are data mining
methodologies that are widely applied to
classification problems (Arcega, et al., 2013). Data
mining is a term often used to describe the discovery
of knowledge in a database (Silitonga, Parasian,
2017).
The main strength of the decision tree is its
ability to break down a complex decision-making
process into simpler ones (Sarma, Sunil, 2013).
Through the decision tree, decision-makers can
interpret the solution to a problem more easily
(Dai, W., Ji, W., 2014). Besides this, decision trees
are useful for exploring data and finding hidden
relationships between several input variables and
an output variable. The decision tree thus combines
data exploration and modelling.
Decision tree representations are considered a
logical method that is often used in discussions
of applied statistics and machine learning (Ling,
Charles X., et al., 2004). Decision tree construction
uses supervised learning, a learning process in
which new data is classified based on existing
training samples.
A decision tree consists of nodes, which are attributes
of the sample data. The branches that come out of a
node are the values or outcomes associated with that
attribute (node), while the leaves of the decision
tree show the class of the tested data sample.
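The node, branch and leaf structure described above might be represented as in the following Python sketch. The tree is written by hand rather than learned from data, and the attribute names, branch values and class labels are hypothetical; the sketch only shows how a sample is routed from the root to a leaf.

```python
from dataclasses import dataclass, field
from typing import Dict, Union


@dataclass
class Leaf:
    label: str  # class assigned to every sample that reaches this leaf


@dataclass
class Node:
    attribute: str  # attribute of the sample data tested at this node
    # One branch per attribute value, leading to a sub-tree or a leaf.
    branches: Dict[str, Union["Node", Leaf]] = field(default_factory=dict)


def classify(tree, sample):
    """Follow the branch matching the sample's value until a leaf is reached."""
    while isinstance(tree, Node):
        tree = tree.branches[sample[tree.attribute]]
    return tree.label


# Hypothetical hand-built tree; it is not a result of the study.
TREE = Node("attendance", {
    "low": Leaf("resign"),
    "high": Node("achievement_index", {
        "low": Leaf("resign"),
        "high": Leaf("stay"),
    }),
})

print(classify(TREE, {"attendance": "high", "achievement_index": "high"}))
```

In this sketch a value with no matching branch raises a KeyError; a practical implementation would need a policy for unseen values, such as falling back to the majority class.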